Configuring ClickHouse Keeper
ClickHouse Keeper is a coordination service built into ClickHouse Server that supports ClickHouse replication and horizontal scalability across nodes and clusters, so you don’t have to install and configure ZooKeeper outside the ClickHouse infrastructure. In this blog post, we explain how to build and configure ClickHouse Keeper across 3 Linux nodes to evaluate distributed operations.
Configuring Nodes with Keeper settings
Step 1: Install ClickHouse on 3 Linux nodes – we will call them clickhousen1, clickhousen2 and clickhousen3.
Step 2: Add the following setting on each node to allow external communication through the network interface:
<listen_host>0.0.0.0</listen_host>
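On recent ClickHouse releases this setting lives under the <clickhouse> root element of config.xml, or in a drop-in file under /etc/clickhouse-server/config.d/. A minimal sketch of such a drop-in; the file name network.xml is just our choice:

<!-- /etc/clickhouse-server/config.d/network.xml (hypothetical file name) -->
<clickhouse>
    <!-- Listen on all interfaces so the other nodes can reach this server -->
    <listen_host>0.0.0.0</listen_host>
</clickhouse>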
Step 3: Configure ClickHouse Keeper on all three servers, setting <server_id> to “1” for clickhousen1, “2” for clickhousen2 and “3” for clickhousen3:
<keeper_server>
    <tcp_port>9181</tcp_port>
    <server_id>1</server_id>
    <log_storage_path>/var/lib/clickhouse/coordination/log</log_storage_path>
    <snapshot_storage_path>/var/lib/clickhouse/coordination/snapshots</snapshot_storage_path>
    <coordination_settings>
        <operation_timeout_ms>10000</operation_timeout_ms>
        <session_timeout_ms>30000</session_timeout_ms>
        <raft_logs_level>warning</raft_logs_level>
    </coordination_settings>
    <raft_configuration>
        <server>
            <id>1</id>
            <hostname>clickhousen1.domain.com</hostname>
            <port>9444</port>
        </server>
        <server>
            <id>2</id>
            <hostname>clickhousen2.domain.com</hostname>
            <port>9444</port>
        </server>
        <server>
            <id>3</id>
            <hostname>clickhousen3.domain.com</hostname>
            <port>9444</port>
        </server>
    </raft_configuration>
</keeper_server>
A detailed description of the configuration parameters is given below (source: https://clickhouse.com/docs/en/guides/sre/clickhouse-keeper/):
Parameter | Description | Example |
---|---|---|
tcp_port | Port used by clients of Keeper | 9181 (default; equivalent to 2181 in ZooKeeper) |
server_id | Unique identifier for each ClickHouse Keeper server, used in the Raft configuration | 1 |
coordination_settings | Section for parameters such as timeouts | timeouts: 10000, log level: trace |
server | Definition of a server participating in the cluster | list of each server definition |
raft_configuration | Settings for each server in the Keeper cluster | server and settings for each |
id | Numeric ID of the server for Keeper services | 1 |
hostname | Hostname, IP or FQDN of each server in the Keeper cluster | clickhousen1.domain.com |
port | Port to listen on for interserver Keeper connections | 9444 |
Step 4: Enable the ZooKeeper component, which will use the ClickHouse Keeper engine (a sample <zookeeper> section is sketched after the table below):
Parameter | Description | Example |
---|---|---|
node | List of nodes for ClickHouse Keeper connections | settings entry for each server |
host | Hostname, IP or FQDN of each ClickHouse Keeper node | clickhousen1.domain.com |
port | ClickHouse Keeper client port | 9181 |
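Based on the parameters above, a minimal <zookeeper> section pointing ClickHouse at the three Keeper instances would look like the sketch below (the hostnames are the three nodes from this setup; adjust them to your environment):

<zookeeper>
    <!-- 9181 is the Keeper client port (tcp_port) configured in Step 3 -->
    <node>
        <host>clickhousen1.domain.com</host>
        <port>9181</port>
    </node>
    <node>
        <host>clickhousen2.domain.com</host>
        <port>9181</port>
    </node>
    <node>
        <host>clickhousen3.domain.com</host>
        <port>9181</port>
    </node>
</zookeeper>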
Step 5: Restart ClickHouse Server and verify that each Keeper instance is running. Execute the following command on each server; the ruok command returns imok if Keeper is running and healthy:
# echo ruok | nc localhost 9181; echo
imok
The system database has a table named zookeeper that exposes the metadata stored in your ClickHouse Keeper instances. Let’s query the table:
SELECT * FROM system.zookeeper WHERE path IN ('/', '/clickhouse')
The table will look like this:
┌─name───────┬─value─┬─czxid─┬─mzxid─┬───────────────ctime─┬───────────────mtime─┬─version─┬─cversion─┬─aversion─┬─ephemeralOwner─┬─dataLength─┬─numChildren─┬─pzxid─┬─path────────┐
│ clickhouse │       │   618 │   579 │ 2022-05-22 13:17:11 │ 2022-05-22 13:17:11 │       0 │        2 │        0 │              0 │          0 │           2 │  5693 │ /           │
│ task_queue │       │   791 │   681 │ 2022-05-22 13:17:11 │ 2022-05-22 13:17:11 │       0 │        1 │        0 │              0 │          0 │           1 │   126 │ /clickhouse │
│ tables     │       │  2139 │  1259 │ 2022-05-22 13:17:11 │ 2022-05-22 13:17:11 │       0 │        3 │        0 │              0 │          0 │           3 │  6461 │ /clickhouse │
└────────────┴───────┴───────┴───────┴─────────────────────┴─────────────────────┴─────────┴──────────┴──────────┴────────────────┴────────────┴─────────────┴───────┴─────────────┘
How to configure a cluster in ClickHouse?
In this example, we configure a very simple cluster with just 2 shards and a single replica each, placed on 2 of the nodes (please update the configuration on clickhousen1 and clickhousen2); the third node is used only to provide a quorum for ClickHouse Keeper. The following cluster definition builds 1 shard on each node, for a total of 2 shards with no replication, so some data will land on clickhousen1 and the rest on clickhousen2:
<cluster_2shards_1Repl>
    <shard>
        <replica>
            <host>clickhousen1.domain.com</host>
            <port>9000</port>
            <user>default</user>
            <password>ChistaDATA@12345</password>
        </replica>
    </shard>
    <shard>
        <replica>
            <host>clickhousen2.domain.com</host>
            <port>9000</port>
            <user>default</user>
            <password>ChistaDATA@12345</password>
        </replica>
    </shard>
</cluster_2shards_1Repl>
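For ClickHouse to recognise this cluster, the definition normally sits inside the <remote_servers> section of config.xml (or a config.d drop-in). A minimal sketch of the surrounding structure, with the cluster body exactly as shown above:

<clickhouse>
    <remote_servers>
        <cluster_2shards_1Repl>
            <!-- shard and replica definitions as listed above -->
        </cluster_2shards_1Repl>
    </remote_servers>
</clickhouse>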
Parameter | Description |
---|---|
shard | Larger datasets are split into smaller chunks and stored across multiple data nodes for scalability and reliability |
replica | Additional copies of the data kept across nodes for performance (mostly reads), scalability and reliability |
host | Hostname, IP or FQDN of the server |
port | Port used to connect to the server (9000 is the native protocol port) |
user | User used to log in to the cluster instances |
password | Password for the connection user |
Step 6: Restart ClickHouse Server and validate that the cluster was created successfully:
SHOW clusters;
Note: if the above steps completed successfully, you should see the cluster listed as shown below:
cluster_2shards_1Repl
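If you want more detail than SHOW CLUSTERS provides, the system.clusters table exposes the shard and replica layout; a quick sketch of such a query (the selected columns are just an illustrative subset):

SELECT cluster, shard_num, replica_num, host_name, port
FROM system.clusters
WHERE cluster = 'cluster_2shards_1Repl';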
Creating a distributed table to validate clustering
Step 1: Create a new database using ClickHouse Client on clickhousen1. Because the statement uses ON CLUSTER, the database will be created on both nodes:
CREATE DATABASE cdat1 ON CLUSTER 'cluster_2shards_1Repl';
Step 2: Create a new table in the cdat1 database:
CREATE TABLE cdat1.tab1 ON CLUSTER 'cluster_2shards_1Repl'
(
    `id` UInt64,
    `column1` String
)
ENGINE = MergeTree
ORDER BY column1
Step 3: On clickhousen1, insert the following rows:
INSERT INTO cdat1.tab1 (id, column1) VALUES (50, 'dec'), (51, 'jan')
Step 4: On clickhousen2, insert the following rows:
INSERT INTO cdat1.tab1 (id, column1) VALUES (53, 'mar'), (54, 'feb')
Please note: a SELECT on each node shows only the data stored on that node:
clickhousen1
┌─id─┬─column1─┐
│ 50 │ dec     │
│ 51 │ jan     │
└────┴─────────┘
clickhousen2
┌─id─┬─column1─┐
│ 53 │ mar     │
│ 54 │ feb     │
└────┴─────────┘
Step 5: To query the data on both shards together, you can create a Distributed table:
CREATE TABLE cdat1.dist_table
(
    id UInt64,
    column1 String
)
ENGINE = Distributed(cluster_2shards_1Repl, cdat1, tab1)
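The Distributed table itself stores no data and only needs to exist on the node where you run the federated queries. If you also want to INSERT through it and have ClickHouse spread the rows across both shards automatically, the engine accepts a sharding key as an optional fourth argument; a sketch under that assumption (the table name dist_table_rand is only for illustration):

CREATE TABLE cdat1.dist_table_rand ON CLUSTER 'cluster_2shards_1Repl'
(
    id UInt64,
    column1 String
)
ENGINE = Distributed(cluster_2shards_1Repl, cdat1, tab1, rand())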
Note: you can query the distributed table (cdat1.dist_table) to return all four rows of data from the two shards:
SELECT * FROM cdat1.dist_table

┌─id─┬─column1─┐
│ 50 │ dec     │
│ 51 │ jan     │
└────┴─────────┘
┌─id─┬─column1─┐
│ 53 │ mar     │
│ 54 │ feb     │
└────┴─────────┘
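As a quick sanity check, an aggregate over the distributed table should cover both shards; assuming the inserts above succeeded, the query below should report 4 rows:

SELECT count() FROM cdat1.dist_table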
Summary
The objective of this blog post was to walk you through the step-by-step installation and configuration of ClickHouse Keeper. Thanks for reading; your comments are welcome.
References: https://clickhouse.com/docs/en/guides/sre/