debian 10 安装_如何在Debian 10上安装和使用ClickHouse

news/2024/7/4 9:38:00

debian 10 安装

介绍 (Introduction)

ClickHouse is an open-source, column-oriented analytics database created by Yandex for OLAP and big data use cases. ClickHouse’s support for real-time query processing makes it suitable for applications that require sub-second analytical results. ClickHouse’s query language is a dialect of SQL that enables powerful declarative querying capabilities while offering familiarity and a smaller learning curve for the end user.

ClickHouse是Yandex为OLAP和大数据使用案例创建的开源,面向列的分析数据库。 ClickHouse对实时查询处理的支持使其适合需要亚秒级分析结果的应用程序。 ClickHouse的查询语言是SQL的一种方言,可启用强大的声明性查询功能,同时为最终用户提供熟悉程度和较小的学习曲线。

Column-oriented databases store records in blocks grouped by columns instead of rows. By not loading data for columns absent in the query, column-oriented databases spend less time reading data while completing queries. As a result, these databases can compute and return results much faster than traditional row-based systems for certain workloads, such as OLAP.

面向列的数据库将记录存储在按列而不是行分组的块中。 通过不加载查询中不存在的列的数据,面向列的数据库在完成查询时花费更少的时间读取数据。 因此,对于某些工作负载(例如OLAP),这些数据库可以比传统的基于行的系统更快地计算和返回结果。

Online Analytics Processing (OLAP) systems allow for organizing large amounts of data and performing complex queries. They are capable of managing petabytes of data and returning query results quickly. In this way, OLAP is useful for work in areas like data science and business analytics.

在线分析处理 (OLAP)系统允许组织大量数据并执行复杂的查询。 它们能够管理PB级数据并快速返回查询结果。 这样,OLAP可用于数据科学和业务分析等领域的工作。

In this tutorial, you’ll install the ClickHouse database server and client on your machine. You’ll use the DBMS for typical tasks and optionally enable remote access from another server so that you’ll be able to connect to the database from another machine. Then you’ll test ClickHouse by modeling and querying example website-visit data.

在本教程中,您将在计算机上安装ClickHouse数据库服务器和客户端。 您将使用DBMS执行典型的任务,并有选择地启用来自另一台服务器的远程访问,以便能够从另一台计算机连接到数据库。 然后,您将通过建模和查询示例网站访问数据来测试ClickHouse。

先决条件 (Prerequisites)

  • One Debian 10 with a sudo enabled non-root user and firewall setup. You can follow the initial server setup tutorial to create the user and set up the firewall.

    一台启用了sudo Debian 10启用了非root用户和防火墙设置。 您可以按照初始服务器设置教程来创建用户并设置防火墙。

  • (Optional) A secondary Debian 10 with a sudo enabled non-root user and firewall setup. You can follow the initial server setup tutorial.

    (可选)具有启用了sudo非Debian用户和防火墙设置的辅助Debian 10。 您可以按照初始服务器设置教程进行操作 。

第1步-安装ClickHouse (Step 1 — Installing ClickHouse)

In this section, you will install the ClickHouse server and client programs using apt.

在本节中,您将使用apt安装ClickHouse服务器和客户端程序。

First, SSH into your server by running:

首先,通过运行SSH进入服务器:

  • ssh sammy@your_server_ip

    ssh sammy @ your_server_ip

dirmngr is a server for managing certificates and keys. It is required for adding and verifying remote repository keys, install it by running:

dirmngr是用于管理证书和密钥的服务器。 添加和验证远程存储库密钥是必需的,请通过运行以下命令进行安装:

  • sudo apt install dirmngr

    须藤apt install dirmngr

Yandex maintains an APT repository that has the latest version of ClickHouse. Add the repository’s GPG key so that you’ll be able to securely download validated ClickHouse packages:

Yandex维护一个具有最新版本的ClickHouse的APT存储库。 添加存储库的GPG密钥,以便您可以安全地下载经过验证的ClickHouse软件包:

  • sudo apt-key adv --keyserver keyserver.ubuntu.com --recv E0C56BD4

    须藤apt-key adv --keyserver keyserver.ubuntu.com --recv E0C56BD4

You will see output similar to the following:

您将看到类似于以下内容的输出:


   
Output
Executing: /tmp/apt-key-gpghome.JkkcKnBAFY/gpg.1.sh --keyserver keyserver.ubuntu.com --recv E0C56BD4 gpg: key C8F1E19FE0C56BD4: public key "ClickHouse Repository Key <milovidov@yandex-team.ru>" imported gpg: Total number processed: 1 gpg: imported: 1

The output confirms it has successfully verified and added the key.

输出确认已成功验证并添加了密钥。

Add the repository to your APT repositories list by executing:

通过执行以下操作,将存储库添加到您的APT存储库列表中:

  • echo "deb http://repo.yandex.ru/clickhouse/deb/stable/ main/" | sudo tee /etc/apt/sources.list.d/clickhouse.list

    回声“ deb http://repo.yandex.ru/clickhouse/deb/stable/ main /” | sudo tee /etc/apt/sources.list.d/clickhouse.list

Here you’ve piped the output of echo to sudo tee so that this output can print to a root-owned file.

在这里,您已将echo的输出通过管道传递到sudo tee以便此输出可以打印到根拥有的文件。

Now, run apt update to update your packages:

现在,运行apt update来更新您的软件包:

  • sudo apt update

    sudo apt更新

The clickhouse-server and clickhouse-client packages will now be available for installation.

clickhouse-serverclickhouse-client软件包现在将可用于安装。

As of ClickHouse version 19.13.3, certain OpenSSL 1.1.1 configurations such as MinProtocol and CipherVersion are not read correctly. In order to workaround this incompatibility, modify the OpenSSL config file and comment out the ssl_conf = ssl_sect line in /etc/ssl/openssl.cnf.

从ClickHouse版本19.13.3开始,某些OpenSSL 1.1.1配置(例如MinProtocol和CipherVersion)无法正确读取。 为了解决此不兼容问题,请修改OpenSSL配置文件,并在/etc/ssl/openssl.cnf ssl_conf = ssl_sect行。

Edit the configuration file by executing:

通过执行以下操作来编辑配置文件:

  • sudo nano /etc/ssl/openssl.cnf

    须藤nano /etc/ssl/openssl.cnf

Then comment out the line containing ssl_conf = ssl_sect, so it looks like the following file:

然后注释掉包含ssl_conf = ssl_sect的行,因此它看起来像以下文件:

/etc/ssl/openssl.cnf
/etc/ssl/openssl.cnf
...

tsa_name        = yes   # Must the TSA name be included in the reply?
                # (optional, default: no)
ess_cert_id_chain   = no    # Must the ESS cert id chain be included?
                # (optional, default: no)
ess_cert_id_alg     = sha1  # algorithm to compute certificate
                # identifier (optional, default: sha1)
[default_conf]
#ssl_conf = ssl_sect

[ssl_sect]
...

Now that the OpenSSL config has been patched, you’re ready to install the ClickHouse server and client packages. Install them with:

既然已经修补了OpenSSL配置,那么您就可以安装ClickHouse服务器和客户端软件包了。 通过以下方式安装它们:

  • sudo apt install clickhouse-server clickhouse-client

    sudo apt安装clickhouse-server clickhouse-client

During the installation, you will be asked to set a password for the default ClickHouse user.

在安装过程中,系统将要求您为默认的ClickHouse用户设置密码。

You’ve installed the ClickHouse server and client successfully. You’re now ready to start the database service and ensure that it’s running correctly.

您已经成功安装了ClickHouse服务器和客户端。 现在,您可以启动数据库服务并确保其正常运行。

步骤2 —启动服务 (Step 2 — Starting the Service)

The clickhouse-server package that you installed in the previous section creates a systemd service, which performs actions such as starting, stopping, and restarting the database server. systemd is an init system for Linux to initialize and manage services. In this section you’ll start the service and verify that it is running successfully.

您在上一节中安装的clickhouse-server软件包将创建一个systemd服务,该服务执行诸如启动,停止和重新启动数据库服务器的操作。 systemd是Linux用来初始化和管理服务的初始化系统。 在本部分中,您将启动服务并验证它是否已成功运行。

Start the clickhouse-server service by running:

通过运行以下命令来启动clickhouse-server服务:

  • sudo service clickhouse-server start

    sudo服务clickhouse-服务器启动

The previous command will not display any output. To verify that the service is running successfully, execute:

上一条命令将不显示任何输出。 要验证服务是否成功运行,请执行:

  • sudo service clickhouse-server status

    sudo服务clickhouse-服务器状态

You’ll see output similar to the following:

您将看到类似于以下内容的输出:


   
Output
● clickhouse-server.service - ClickHouse Server (analytic DBMS for big data) Loaded: loaded (/etc/systemd/system/clickhouse-server.service; enabled; vendor preset: enabled) Active: active (running) since Sat 2018-12-22 07:23:20 UTC; 1h 9min ago Main PID: 27101 (ClickHouse-serv) Tasks: 34 (limit: 1152) CGroup: /system.slice/ClickHouse-server.service └─27101 /usr/bin/ClickHouse-server --config=/etc/ClickHouse-server/config.xml

The output notes that the server is running.

输出结果表明服务器正在运行。

You have successfully started the ClickHouse server and will now be able to use the clickhouse-client CLI program to connect to the server.

您已经成功启动了ClickHouse服务器,现在将能够使用clickhouse-client CLI程序连接到服务器。

第3步-创建数据库和表 (Step 3 — Creating Databases and Tables)

In ClickHouse, you can create and delete databases by executing SQL statements directly in the interactive database prompt. Statements consist of commands following a particular syntax that tell the database server to perform a requested operation along with any data required. You create databases by using the CREATE DATABASE table_name syntax. To create a database, first start a client session by running the following command:

在ClickHouse中,可以通过在交互式数据库提示中直接执行SQL语句来创建和删除数据库。 语句由遵循特定语法的命令组成,这些命令告诉数据库服务器执行请求的操作以及所需的任何数据。 您可以使用CREATE DATABASE table_name语法创建数据库。 要创建数据库,请首先通过运行以下命令启动客户端会话:

  • clickhouse-client --password

    clickhouse-client --password

You will be asked to enter the password you had set during the installation—enter it to successfully start the client session.

系统将要求您输入在安装过程中设置的密码-输入密码以成功启动客户端会话。

The previous command will log you in to the client prompt where you can run ClickHouse SQL statements to perform actions such as:

前面的命令将使您登录到客户端提示符,您可以在其中运行ClickHouse SQL语句以执行诸如以下操作:

  • Creating, updating, and deleting databases, tables, indexes, partitions, and views.

    创建,更新和删除数据库,表,索引,分区和视图。

  • Executing queries to retrieve data that is optionally filtered and grouped using various conditions.

    执行查询以检索使用各种条件过滤和分组的数据。

In this step, with the ClickHouse client ready for inserting data, you’re going to create a database and table. For the purposes of this tutorial, you’ll create a database named test, and inside that you’ll create a table named visits that tracks website-visit durations.

在这一步中,准备好ClickHouse客户端插入数据后,您将创建一个数据库和表。 在本教程中,您将创建一个名为test的数据库,并在其中创建一个名为visits的表,该表可跟踪网站访问的持续时间。

Now that you’re inside the ClickHouse command prompt, create your test database by executing:

现在,您位于ClickHouse命令提示符下,通过执行以下操作创建test数据库:

  • CREATE DATABASE test;

    创建数据库测试 ;

You’ll see the following output that shows that you have created the database:

您将看到以下输出,显示您已创建数据库:


   
Output
CREATE DATABASE test Ok. 0 rows in set. Elapsed: 0.003 sec.

A ClickHouse table is similar to tables in other relational databases; it holds a collection of related data in a structured format. You can specify columns along with their types, add rows of data, and execute different kinds of queries on tables.

ClickHouse表类似于其他关系数据库中的表。 它以结构化格式保存相关数据的集合。 您可以指定列及其类型,添加数据行,并在表上执行各种查询。

The syntax for creating tables in ClickHouse follows this example structure:

在ClickHouse中创建表的语法遵循以下示例结构:

CREATE TABLE table_name
(
    column_name1 column_type [options],
    column_name2 column_type [options],
    ...
) ENGINE = engine

The table_name and column_name values can be any valid ASCII identifiers. ClickHouse supports a wide range of column types; some of the most popular are:

table_namecolumn_name值可以是任何有效的ASCII标识符。 ClickHouse支持多种列类型; 一些最受欢迎的是:

  • UInt64: used for storing integer values in the range 0 to 18446744073709551615.

    UInt64 :用于存储0到18446744073709551615之间的整数值。

  • Float64: used for storing floating point numbers such as 2039.23, 10.5, etc.

    Float64 :用于存储浮点数,例如2039.23、10.5等。

  • String: used for storing variable length characters. It does not require a max-length attribute since it can store arbitrary lengths.

    String :用于存储可变长度字符。 它不需要max-length属性,因为它可以存储任意长度。

  • Date: used for storing dates that follow the YYYY-MM-DD format.

    Date :用于存储遵循YYYY-MM-DD格式的日期。

  • DateTime: used for storing dates coupled with time and follows the YYYY-MM-DD HH:MM:SS format.

    DateTime :用于存储带有时间的日期,并遵循YYYY-MM-DD HH:MM:SS格式。

After the column definitions, you specify the engine used for the table. In ClickHouse, Engines determine the physical structure of the underlying data, the table’s querying capabilities, its concurrent access modes, and support for indexes. Different engine types are suitable for different application requirements. The most commonly used and widely applicable engine type is MergeTree.

在列定义之后,您可以指定用于表的引擎。 在ClickHouse中, 引擎确定基础数据的物理结构,表的查询功能,并发访问模式以及对索引的支持。 不同的发动机类型适合于不同的应用要求。 最常用和广泛适用的引擎类型是MergeTree

Now that you have an overview of table creation, you’ll create a table. Start by confirming the database you’ll be modifying:

现在,您已经对表创建有了概述,接下来将创建一个表。 首先确认要修改的数据库:

  • USE test;

    USE 测试 ;

You will see the following output showing that you have switched to the test database from the default database:

您将看到以下输出,显示您已从default数据库切换到test数据库:


   
Output
USE test Ok. 0 rows in set. Elapsed: 0.001 sec.

The remainder of this guide will assume that you are executing statements within this database’s context.

本指南的其余部分将假定您正在该数据库的上下文中执行语句。

Create your visits table by running this command:

通过运行以下命令来创建您的visits表:

  • CREATE TABLE visits (

    创建表访问 (

  • id UInt64,

    ID UInt64 ,

  • duration Float64,

    持续时间Float64 ,

  • url String,

    网址字串 ,

  • created DateTime

    创建的DateTime

  • ) ENGINE = MergeTree()

    )引擎= MergeTree()

  • PRIMARY KEY id

    主键编号

  • ORDER BY id;

    按id订购;

Here’s a breakdown of what the command does. You create a table named visits that has four columns:

这是该命令的功能细分。 您创建一个名为visits的表,该表具有四列:

  • id: The primary key column. Similarly to other RDBMS systems, a primary key column in ClickHouse uniquely identifies a row; each row should have a unique value for this column.

    id :主键列。 与其他RDBMS系统类似,ClickHouse中的主键列唯一标识一行; 每行对于此列应具有唯一的值。

  • duration: A float column used to store the duration of each visit in seconds. float columns can store decimal values such as 12.50.

    duration :一个float列,用于存储每次访问的持续时间(以秒为单位)。 float列可以存储十进制值,例如12.50。

  • url: A string column that stores the URL visited, such as http://example.com.

    url :一个字符串列,用于存储访问的URL,例如http://example.com

  • created: A date and time column that tracks when the visit occurred.

    created :一个日期和时间列,用于跟踪访问发生的时间。

After the column definitions, you specify MergeTree as the storage engine for the table. The MergeTree family of engines is recommended for production databases due to its optimized support for large real-time inserts, overall robustness, and query support. Additionally, MergeTree engines support sorting of rows by primary key, partitioning of rows, and replicating and sampling data.

在列定义之后,将MergeTree指定为表的存储引擎。 建议将MergeTree引擎家族用于生产数据库,因为它对大型实时插入,总体鲁棒性和查询支持进行了优化支持。 此外,MergeTree引擎支持按主键对行进行排序,对行进行分区以及复制和采样数据。

If you intend to use ClickHouse for archiving data that is not queried often or for storing temporary data, you can use the Log family of engines to optimize for that use-case.

如果打算使用ClickHouse归档不经常查询的数据或存储临时数据,则可以使用Log引擎引擎针对该用例进行优化。

After the column definitions, you’ll define other table-level options. The PRIMARY KEY clause sets id as the primary key column and the ORDER BY clause will store values sorted by the id column. A primary key uniquely identifies a row and is used for efficiently accessing a single row and efficient colocation of rows.

在列定义之后,您将定义其他表级选项。 PRIMARY KEY子句将id设置为主键列,而ORDER BY子句将存储按id列排序的值。 主键唯一地标识一行,并用于有效访问单行和行的有效并置。

On executing the create statement, you will see the following output:

执行create语句时,您将看到以下输出:


   
Output
CREATE TABLE visits ( id UInt64, duration Float64, url String, created DateTime ) ENGINE = MergeTree() PRIMARY KEY id ORDER BY id Ok. 0 rows in set. Elapsed: 0.010 sec.

In this section, you’ve created a database and a table to track website-visit data. In the next step, you’ll insert data into the table, update existing data, and delete that data.

在本部分中,您已经创建了一个数据库和一个表来跟踪网站访问数据。 在下一步中,您将数据插入表中,更新现有数据,然后删除该数据。

步骤4 —插入,更新和删除数据和列 (Step 4 — Inserting, Updating, and Deleting Data and Columns)

In this step, you’ll use your visits table to insert, update, and delete data. The following command is an example of the syntax for inserting rows into a ClickHouse table:

在此步骤中,您将使用visits表来插入,更新和删除数据。 以下命令是在ClickHouse表中插入行的语法示例:

INSERT INTO table_name VALUES (column_1_value, column_2_value, ....);

Now, insert a few rows of example website-visit data into your visits table by running each of the following statements:

现在,通过运行以下每个语句,将几行示例网站访问数据插入到您的visits表中:

  • INSERT INTO visits VALUES (1, 10.5, 'http://example.com', '2019-01-01 00:01:01');

    INSERT INTO 访问 VALUES( 1,10.5,'http://example.com','2019-01-01 00:01:01' );

  • INSERT INTO visits VALUES (2, 40.2, 'http://example1.com', '2019-01-03 10:01:01');

    INSERT INTO 访问 VALUES( 2,40.2,'http://example1.com','2019-01-03 10:01:01' );

  • INSERT INTO visits VALUES (3, 13, 'http://example2.com', '2019-01-03 12:01:01');

    INSERT INTO 访问 VALUES( 3,13,'http://example2.com','2019-01-03 12:01:01' );

  • INSERT INTO visits VALUES (4, 2, 'http://example3.com', '2019-01-04 02:01:01');

    INSERT INTO 访问 VALUES( 4,2,'http://example3.com','2019-01-04 02:01:01' );

You’ll see the following output repeated for each insert statement.

您将看到每个插入语句重复以下输出。


   
Output
INSERT INTO visits VALUES Ok. 1 rows in set. Elapsed: 0.004 sec.

The output for each row shows that you’ve inserted it successfully into the visits table.

每行的输出表明您已成功将其插入到visits表中。

Now you’ll add an additional column to the visits table. When adding or deleting columns from existing tables, ClickHouse supports the ALTER syntax.

现在,您将在visits表中添加一列。 在现有表中添加或删除列时,ClickHouse支持ALTER语法。

For example, the basic syntax for adding a column to a table is as follows:

例如,将列添加到表的基本语法如下:

ALTER TABLE table_name ADD COLUMN column_name column_type;

Add a column named location that will store the location of the visits to a website by running the following statement:

通过运行以下语句,添加一个名为location的列,该列将存储访问网站的位置:

  • ALTER TABLE visits ADD COLUMN location String;

    ALTER TABLE 访问 ADD COLUMN 位置String ;

You’ll see output similar to the following:

您将看到类似于以下内容的输出:


   
Output
ALTER TABLE visits ADD COLUMN location String Ok. 0 rows in set. Elapsed: 0.014 sec.

The output shows that you have added the location column successfully.

输出显示您已经成功添加了location列。

As of version 19.13.3, ClickHouse doesn’t support updating and deleting individual rows of data due to implementation constraints. ClickHouse has support for bulk updates and deletes, however, and has a distinct SQL syntax for these operations to highlight their non-standard usage.

从版本19.13.3开始,由于实施限制,ClickHouse不支持更新和删除单独的数据行。 ClickHouse支持批量更新和删除,并且为这些操作提供了独特SQL语法,以突出它们的非标准用法。

The following syntax is an example for bulk updating rows:

以下语法是批量更新行的示例:

ALTER TABLE table_name UPDATE  column_1 = value_1, column_2 = value_2 ...  WHERE  filter_conditions;

You’ll run the following statement to update the url column of all rows that have a duration of less than 15. Enter it into the database prompt to execute:

您将运行以下语句来更新duration小于15的所有行的url列。将其输入数据库提示以执行:

  • ALTER TABLE visits UPDATE url = 'http://example2.com' WHERE duration < 15;

    ALTER TABLE 访问 UPDATE url =' http://example2.com ' 持续时间<15 ;

The output of the bulk update statement will be as follows:

批量更新语句的输出如下:


   
Output
ALTER TABLE visits UPDATE url = 'http://example2.com' WHERE duration < 15 Ok. 0 rows in set. Elapsed: 0.003 sec.

The output shows that your update query completed successfully. The 0 rows in set in the output denotes that the query did not return any rows; this will be the case for any update and delete queries.

输出显示更新查询成功完成。 输出0 rows in set0 rows in set表示查询未返回任何行; 任何更新和删除查询都会如此。

The example syntax for bulk deleting rows is similar to updating rows and has the following structure:

批量删除行的示例语法类似于更新行,并具有以下结构:

ALTER TABLE table_name DELETE WHERE filter_conditions;

To test deleting data, run the following statement to remove all rows that have a duration of less than 5:

要测试删除数据,请运行以下语句以删除duration小于5所有行:

  • ALTER TABLE visits DELETE WHERE duration < 5;

    ALTER TABLE 访问 DELETE WHERE 持续时间<5 ;

The output of the bulk delete statement will be similar to:

批量删除语句的输出将类似于:


   
Output
ALTER TABLE visits DELETE WHERE duration < 5 Ok. 0 rows in set. Elapsed: 0.003 sec.

The output confirms that you have deleted the rows with a duration of less than five seconds.

输出确认您已删除持续时间少于五秒的行。

To delete columns from your table, the syntax would follow this example structure:

要从表中删除列,语法应遵循以下示例结构:

ALTER TABLE table_name DROP COLUMN column_name;

Delete the location column you added previously by running the following:

通过运行以下命令删除先前添加的location列:

  • ALTER TABLE visits DROP COLUMN location;

    ALTER TABLE 访问 DROP COLUMN的位置 ;

The DROP COLUMN output confirming that you have deleted the column will be as follows:

确认已删除该列的DROP COLUMN输出如下:


   
Output
ALTER TABLE visits DROP COLUMN location String Ok. 0 rows in set. Elapsed: 0.010 sec.

Now that you’ve successfully inserted, updated, and deleted rows and columns in your visits table, you’ll move on to query data in the next step.

现在,您已经成功地在visits表中插入,更新和删除了行和列,接下来将继续查询数据。

第5步-查询数据 (Step 5 — Querying Data)

ClickHouse’s query language is a custom dialect of SQL with extensions and functions suited for analytics workloads. In this step, you’ll run selection and aggregation queries to retrieve data and results from your visits table.

ClickHouse的查询语言是SQL的自定义方言,具有适用于分析工作负载的扩展名和功能。 在此步骤中,您将运行选择查询和汇总查询以从visits表中检索数据和结果。

Selection queries allow you to retrieve rows and columns of data filtered by conditions that you specify, along with options such as the number of rows to return. You can select rows and columns of data using the SELECT syntax. The basic syntax for SELECT queries is:

选择查询使您可以检索按指定条件过滤的数据的行和列,以及诸如要返回的行数之类的选项。 您可以使用SELECT语法SELECT数据的行和列。 SELECT查询的基本语法为:

SELECT func_1(column_1), func_2(column_2) FROM table_name WHERE filter_conditions row_options;

Execute the following statement to retrieve url and duration values for rows where the url is http://example.com.

执行以下语句以检索urlhttp://example.com行的urlduration值。

  • SELECT url, duration FROM visits WHERE url = 'http://example2.com' LIMIT 2;

    SELECT网址,持续时间FROM 访问次数,其中 URL ='http://example2.com'LIMIT 2;

You will see the following output:

您将看到以下输出:


   
Output
SELECT url, duration FROM visits WHERE url = 'http://example2.com' LIMIT 2 ┌─url─────────────────┬─duration─┐ │ http://example2.com │ 10.5 │ └─────────────────────┴──────────┘ ┌─url─────────────────┬─duration─┐ │ http://example2.com │ 13 │ └─────────────────────┴──────────┘ 2 rows in set. Elapsed: 0.013 sec.

The output has returned two rows that match the conditions you specified. Now that you’ve selected values, you can move on to executing aggregation queries.

输出已返回符合您指定条件的两行。 选择值之后,就可以继续执行聚合查询了

Aggregation queries are queries that operate on a set of values and return single output values. In analytics databases, these queries are run frequently and are well optimized by the database. Some aggregate functions supported by ClickHouse are:

聚合查询是对一组值进行操作并返回单个输出值的查询。 在分析数据库中,这些查询会经常运行,并由数据库很好地优化。 ClickHouse支持的一些汇总功能是:

  • count: returns the count of rows matching the conditions specified.

    count :返回符合指定条件的行数。

  • sum: returns the sum of selected column values.

    sum :返回选定列值的总和。

  • avg: returns the average of selected column values.

    avg :返回所选列值的平均值。

Some ClickHouse-specific aggregate functions include:

一些特定于ClickHouse的聚合函数包括:

  • uniq: returns an approximate number of distinct rows matched.

    uniq :返回匹配的不同行的大约数量。

  • topK: returns an array of the most frequent values of a specific column using an approximation algorithm.

    topK :使用近似算法返回特定列的最频繁值的数组。

To demonstrate the execution of aggregation queries, you’ll calculate the total duration of visits by running the sum query:

为了演示聚合查询的执行,您将通过运行sum查询来计算访问的总持续时间:

  • SELECT SUM(duration) FROM visits;

    从访问中选择总和(持续时间) ;

You will see output similar to the following:

您将看到类似于以下内容的输出:


   
Output
SELECT SUM(duration) FROM visits ┌─SUM(duration)─┐ │ 63.7 │ └───────────────┘ 1 rows in set. Elapsed: 0.010 sec.

Now, calculate the top two URLs by executing:

现在,通过执行以下命令来计算前两个URL:

  • SELECT topK(2)(url) FROM visits;

    从访问中选择topK(2)(url) ;

You will see output similar to the following:

您将看到类似于以下内容的输出:


   
Output
SELECT topK(2)(url) FROM visits ┌─topK(2)(url)──────────────────────────────────┐ │ ['http://example2.com','http://example1.com'] │ └───────────────────────────────────────────────┘ 1 rows in set. Elapsed: 0.010 sec.

Now that you have successfully queried your visits table, you’ll delete tables and databases in the next step.

现在您已经成功查询了visits表,接下来将删除表和数据库。

第6步-删除表和数据库 (Step 6 — Deleting Tables and Databases)

In this section, you’ll delete your visits table and test database.

在本部分中,您将删除您的visits表并test数据库。

The syntax for deleting tables follows this example:

删除表的语法遵循以下示例:

DROP TABLE table_name;

To delete the visits table, run the following statement:

要删除visits表,请运行以下语句:

  • DROP TABLE visits;

    DROP TABLE 访问 ;

You will see the following output declaring that you’ve deleted the table successfully:

您将看到以下输出,声明您已成功删除该表:


   
output
DROP TABLE visits Ok. 0 rows in set. Elapsed: 0.005 sec.

You can delete databases using the DROP database table_name syntax. To delete the test database, execute the following statement:

您可以使用DROP database table_name语法删除数据库。 要删除test数据库,请执行以下语句:

  • DROP DATABASE test;

    DROP DATABASE 测试 ;

The resulting output shows that you’ve deleted the database successfully.

结果输出表明您已成功删除数据库。


   
Output
DROP DATABASE test Ok. 0 rows in set. Elapsed: 0.003 sec.

You’ve deleted tables and databases in this step. Now that you’ve created, updated, and deleted databases, tables, and data in your ClickHouse instance, you’ll enable remote access to your database server in the next section.

您已在此步骤中删除了表和数据库。 现在,您已经在ClickHouse实例中创建,更新和删除了数据库,表和数据,接下来的部分将使您能够远程访问数据库服务器。

步骤7 —设置防火墙规则(可选) (Step 7 — Setting Up Firewall Rules (Optional))

If you intend to only use ClickHouse locally with applications running on the same server, or do not have a firewall enabled on your server, you don’t need to complete this section. If instead, you’ll be connecting to the ClickHouse database server remotely, you should follow this step.

如果打算仅将ClickHouse与在同一服务器上运行的应用程序一起在本地使用,或者在服务器上未启用防火墙,则无需完成本节。 如果相反,您将远程连接到ClickHouse数据库服务器,则应遵循此步骤。

Currently your server has a firewall enabled that disables your public IP address accessing all ports. You’ll complete the following two steps to allow remote access:

当前,您的服务器启用了防火墙,该防火墙禁止您的公共IP地址访问所有端口。 您将完成以下两个步骤以允许远程访问:

  • Modify ClickHouse’s configuration and allow it to listen on all interfaces.

    修改ClickHouse的配置,并允许其在所有接口上侦听。

  • Add a firewall rule allowing incoming connections to port 8123, which is the HTTP port that the ClickHouse server runs.

    添加一个防火墙规则,以允许进入端口8123传入连接,端口8123是ClickHouse服务器运行的HTTP端口。

If you are inside the database prompt, exit it by typing CTRL+D.

如果您在数据库提示中,请输入CTRL+D退出它。

Edit the configuration file by executing:

通过执行以下操作来编辑配置文件:

  • sudo nano /etc/clickhouse-server/config.xml

    须藤nano /etc/clickhouse-server/config.xml

Then uncomment the line containing <!-- <listen_host>0.0.0.0</listen_host> -->, like the following file:

然后取消注释包含<!-- <listen_host>0.0.0.0</listen_host> --> ,例如以下文件:

/etc/clickhouse-server/config.xml
/etc/clickhouse-server/config.xml
...
 <interserver_http_host>example.yandex.ru</interserver_http_host>
    -->

    <!-- Listen specified host. use :: (wildcard IPv6 address), if you want to accept connections both with IPv4 and IPv6 from everywhere. -->
    <!-- <listen_host>::</listen_host> -->
    <!-- Same for hosts with disabled ipv6: -->
    <listen_host>0.0.0.0</listen_host>

    <!-- Default values - try listen localhost on ipv4 and ipv6: -->
    <!--
    <listen_host>::1</listen_host>
    <listen_host>127.0.0.1</listen_host>
    -->
...

Save the file and exit. For the new configuration to apply restart the service by running:

保存文件并退出。 要应用新配置,请运行以下命令重新启动服务:

  • sudo service clickhouse-server restart

    sudo服务clickhouse-服务器重启

You won’t see any output from this command. ClickHouse’s server listens on port 8123 for HTTP connections and port 9000 for connections from clickhouse-client. Allow access to both ports for your second server’s IP address with the following command:

您将看不到此命令的任何输出。 ClickHouse的服务器在端口8123上侦听HTTP连接,在端口9000上侦听来自clickhouse-client连接。 使用以下命令允许访问第二个服务器的IP地址的两个端口:

  • sudo ufw allow from second_server_ip/32 to any port 8123

    sudo ufw允许从second_server_ip / 32到任何端口8123

  • sudo ufw allow from second_server_ip/32 to any port 9000

    sudo ufw允许从second_server_ip / 32到任何端口9000

You will see the following output for both commands that shows that you’ve enabled access to both ports:

这两个命令都将看到以下输出,表明您已启用对两个端口的访问:


   
Output
Rule added

ClickHouse will now be accessible from the IP that you added. Feel free to add additional IPs such as your local machine’s address if required.

现在可以从您添加的IP访问ClickHouse。 如果需要,可以随意添加其他IP,例如本地计算机的地址。

To verify that you can connect to the ClickHouse server from the remote machine, first follow the steps in Step 1 of this tutorial on the second server and ensure that you have the clickhouse-client installed on it.

要验证您可以从远程计算机连接到ClickHouse服务器,请首先在第二台服务器上按照本教程的步骤1中的步骤进行操作,并确保已在其上安装clickhouse-client

Now that you have logged in to the second server, start a client session by executing:

现在您已经登录到第二台服务器,通过执行以下命令启动客户端会话:

  • clickhouse-client --host your_server_ip --password

    clickhouse-client --host your_server_ip --password

You will see the following output that shows that you have connected successfully to the server:

您将看到以下输出,显示您已成功连接到服务器:


   
Output
ClickHouse client version 19.13.3.26 (official build). Password for user (default): Connecting to your_server_ip:9000 as user default. Connected to ClickHouse server version 19.13.3 revision 54425. hostname :)

In this step, you’ve enabled remote access to your ClickHouse database server by adjusting your firewall rules.

在此步骤中,您已通过调整防火墙规则启用了对ClickHouse数据库服务器的远程访问。

结论 (Conclusion)

You have successfully set up a ClickHouse database instance on your server and created a database and table, added data, performed queries, and deleted the database. Within ClickHouse’s documentation you can read about their benchmarks against other open-source and commercial analytics databases and general reference documents.

您已经在服务器上成功设置了ClickHouse数据库实例,并创建了数据库和表,添加了数据,执行了查询并删除了该数据库。 在ClickHouse的文档,你可以了解他们的基准,对其他的开源和商业分析数据库和一般的参考文献 。

Further features ClickHouse offers include distributed query processing across multiple servers to improve performance and protect against data loss by storing data over different shards.

ClickHouse提供的其他功能包括跨多个服务器的分布式查询处理,以通过将数据存储在不同的分片上来提高性能并防止数据丢失。

翻译自: https://www.digitalocean.com/community/tutorials/how-to-install-and-use-clickhouse-on-debian-10

debian 10 安装


http://www.niftyadmin.cn/n/3649343.html

相关文章

虚拟机安装Kali Linux操作系统

▣ 博主主站地址&#xff1a;微笑涛声 【www.cztcms.cn】 Kali Linux是基于Debian的Linux发行版&#xff0c; 设计用于数字取证操作系统。每一季度更新一次。由Offensive Security Ltd维护和资助。最先由Offensive Security的Mati Aharoni和Devon Kearns通过重写BackTrack来完成…

指定textview中的某部分进行点击

CharSequence str"没有任何数据&#xff0c;请登录!";SpannableString spannableString1 new SpannableString(str);spannableString1.setSpan(new ClickableSpan(){Overridepublic void onClick(View widget) {Intent intentnew Intent(getActivity(),LoginActivit…

Android重力感应代码例子

android中的很多游戏的游戏都使用了重力感应的技术&#xff0c;就研究了一下重力感应 以屏幕的左下方为原点&#xff0c;箭头指向的方向为正。从-10到10&#xff0c;以浮点数为等级单位&#xff0c;想象以下情形&#xff1a; 手机屏幕向上&#xff08;z轴朝天&#xff09;水平…

react中创建一个组件_如何在React中创建社交关注组件

react中创建一个组件介绍 (Introduction) When you’re building a web site, you’ll often want to share your Social Media accounts for visitors to follow. In this tutorial, you’ll create a Social Follow component in React, using the social media icons provid…

软件框架之imageloader的使用

Imageloader的使用 一、特点&#xff1a; 1&#xff09;多线程下载图片&#xff0c;图片可以来源于网络&#xff0c;文件系统&#xff0c;项目文件夹assets中以及drawable中等 2&#xff09;支持随意的配置ImageLoader&#xff0c;例如线程池&#xff0c;图片下载器&#xff…

实验三(OSPF)7 8

解题思路&#xff1a; 先配置好路由的环回及规划好IP地址&#xff0c;确保正确&#xff1b; &#xff08;由于r8模拟为运营商&#xff0c;因此r1,r2,r3各写一条缺省指向r8 并测试&#xff09; hub-spoke网络结构&#xff0c;需要在r1-r2-r3建立隧道0配置MGRE-多点通用路由协…

HarmonyOS 2.0 手机版使用初体验 ——手机开发者 (Beta版)

12月16日上午10点&#xff0c;华为在北京举办华为开发者日暨HarmonyOS2.0手机开发者Beta版发布活动。华为此次宣布面向手机开发者开放完整的HarmonyOS 2.0系统能力、丰富的API&#xff08;应用开发接口&#xff09;&#xff0c;以及强大的开发工具DevEco Studio等技术装备&…

angular4前后端分离_如何在Angular 4+中使用Apollo客户端GraphQL

angular4前后端分离Apollo Client is the flexible, community-driven GraphQL client for Angular, JavaScript, and native platforms. It is designed from the ground up to make it easy to build UI components that fetch data with GraphQL. This article is a quick s…