
How to configure ClickHouse Keeper?



ClickHouse Keeper is a coordination service built into ClickHouse Server that implements ClickHouse replication for horizontal scalability across nodes and clusters, so you do not need to install and configure ZooKeeper outside the ClickHouse infrastructure. In this post, we explain how to build and configure ClickHouse Keeper across 3 Linux nodes to evaluate distributed operations.

Configuring Nodes with Keeper settings
▬▬▬▬▬▬▬▬▬▬▬▬▬

Step 1: Install ClickHouse on 3 Linux nodes – we will call them clickhousen1, clickhousen2 and clickhousen3.
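The installation commands themselves are not shown in this post; as a minimal sketch for Debian/Ubuntu, assuming the official ClickHouse apt repository from packages.clickhouse.com has already been configured as described in the install guide, the packages can be installed on each node like this:

# run on each of clickhousen1, clickhousen2 and clickhousen3
sudo apt-get update
sudo apt-get install -y clickhouse-server clickhouse-client
sudo systemctl enable clickhouse-server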

Step 2: Add the following setting on each node to allow external communication through the network interface:

<listen_host>0.0.0.0</listen_host>
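The post does not say which file this setting goes into; one common approach (shown here as an assumption, with listen.xml as a hypothetical file name) is a small override file under /etc/clickhouse-server/config.d/, wrapped in the <clickhouse> root element. The Keeper settings from Step 3 can be placed in a similar override file:

<!-- /etc/clickhouse-server/config.d/listen.xml (example file name) -->
<clickhouse>
    <listen_host>0.0.0.0</listen_host>
</clickhouse>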

Step 3: Configure ClickHouse Keeper on all three servers, updating the <server_id> setting with the value “1” for clickhousen1, “2” for clickhousen2 and “3” for clickhousen3:

<keeper_server>
    <tcp_port>9181</tcp_port>
    <server_id>1</server_id>
    <log_storage_path>/var/lib/clickhouse/coordination/log</log_storage_path>
    <snapshot_storage_path>/var/lib/clickhouse/coordination/snapshots</snapshot_storage_path>

    <coordination_settings>
        <operation_timeout_ms>10000</operation_timeout_ms>
        <session_timeout_ms>30000</session_timeout_ms>
        <raft_logs_level>warning</raft_logs_level>
    </coordination_settings>

    <raft_configuration>
        <server>
            <id>1</id>
            <hostname>clickhousen1.domain.com</hostname>
            <port>9444</port>
        </server>
        <server>
            <id>2</id>
            <hostname>clickhousen2.domain.com</hostname>
            <port>9444</port>
        </server>
        <server>
            <id>3</id>
            <hostname>clickhousen3.domain.com</hostname>
            <port>9444</port>
        </server>
    </raft_configuration>
</keeper_server>

A detailed description of the configuration parameters is copied below (source: https://clickhouse.com/docs/en/guides/sre/clickhouse-keeper/):

Parameter | Description | Example
tcp_port | Port to be used by clients of Keeper | 9181 (default; the equivalent of 2181 in ZooKeeper)
server_id | Unique identifier for each ClickHouse Keeper server, used in the Raft configuration | 1
coordination_settings | Section for parameters such as timeouts | timeouts: 10000, log level: trace
server | Definition of each participating server | list of each server definition
raft_configuration | Settings for each server in the Keeper cluster | server and settings for each
id | Numeric id of the server for Keeper services | 1
hostname | Hostname, IP or FQDN of each server in the Keeper cluster | clickhousen1.domain.com
port | Port to listen on for interserver Keeper connections | 9444

Step 4: Enable the ZooKeeper component; it will use the ClickHouse Keeper engine:
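The configuration snippet for this step is missing from the post; the following sketch, modeled on the ClickHouse Keeper guide cited above, points the <zookeeper> section at the three Keeper nodes:

<zookeeper>
    <node>
        <host>clickhousen1.domain.com</host>
        <port>9181</port>
    </node>
    <node>
        <host>clickhousen2.domain.com</host>
        <port>9181</port>
    </node>
    <node>
        <host>clickhousen3.domain.com</host>
        <port>9181</port>
    </node>
</zookeeper>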

Parameter | Description | Example
node | List of nodes for ClickHouse Keeper connections | a settings entry for each server
host | Hostname, IP or FQDN of each ClickHouse Keeper node | clickhousen1.domain.com
port | ClickHouse Keeper client port | 9181

Step 5: Restart ClickHouse Server:

Restart ClickHouse and verify that each Keeper instance is running. Execute the following command on each server. The ruok command returns imok if Keeper is running and healthy:


# echo ruok | nc localhost 9181; echo
imok
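ClickHouse Keeper also supports other four-letter-word commands; for example, mntr should report the server state (leader or follower) together with basic metrics. A quick check, assuming the same client port:

# echo mntr | nc localhost 9181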

The system database has a table named zookeeper that contains the details of your ClickHouse Keeper instances. Let’s view the table:

SELECT *
FROM system.zookeeper
WHERE path IN ('/', '/clickhouse')

The table will look like this:

┌─name───────┬─value─┬─czxid─┬─mzxid─┬───────────────ctime─┬───────────────mtime─┬─version─┬─cversion─┬─aversion─┬─ephemeralOwner─┬─dataLength─┬─numChildren─┬─pzxid─┬─path────────┐
│ clickhouse │       │   618 │   579 │ 2022-05-22 13:17:11 │ 2022-05-22 13:17:11 │       0 │        2 │        0 │              0 │          0 │           2 │  5693 │ /           │
│ task_queue │       │   791 │   681 │ 2022-05-22 13:17:11 │ 2022-05-22 13:17:11 │       0 │        1 │        0 │              0 │          0 │           1 │   126 │ /clickhouse │
│ tables     │       │  2139 │  1259 │ 2022-05-22 13:17:11 │ 2022-05-22 13:17:11 │       0 │        3 │        0 │              0 │          0 │           3 │  6461 │ /clickhouse │
└────────────┴───────┴───────┴───────┴─────────────────────┴─────────────────────┴─────────┴──────────┴──────────┴────────────────┴────────────┴─────────────┴───────┴─────────────┘

How to configure a cluster in ClickHouse?

In this example, we are configuring a very simple cluster with just 2 shards and a single replica of the data on 2 of the nodes (please update the configuration on clickhousen1 and clickhousen2). The third node is used to build a quorum for ClickHouse Keeper. The following cluster builds 1 shard on each node for a total of 2 shards with no replication, so some data will be on node 1 and other data will be on node 2:

 

<cluster_2shards_1Repl>
    <shard>
        <replica>
            <host>clickhousen1.domain.com</host>
            <port>9000</port>
            <user>default</user>
            <password>ChistaDATA@12345</password>
        </replica>
    </shard>
    <shard>
        <replica>
            <host>clickhousen2.domain.com</host>
            <port>9000</port>
            <user>default</user>
            <password>ChistaDATA@12345</password>
        </replica>
    </shard>
</cluster_2shards_1Repl>
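The post shows the cluster element on its own; as a sketch (the file name cluster.xml is just an example), the definition would typically sit under <remote_servers> in the server configuration, for instance in an override file under /etc/clickhouse-server/config.d/:

<!-- /etc/clickhouse-server/config.d/cluster.xml (example file name) -->
<clickhouse>
    <remote_servers>
        <cluster_2shards_1Repl>
            <!-- shard and replica definitions as shown above -->
        </cluster_2shards_1Repl>
    </remote_servers>
</clickhouse>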
Parameter | Description
shard | Larger datasets are split into smaller chunks and stored across multiple data nodes for scalability and reliability
replica | Additional copies of the data are retained across nodes for performance (mostly READs), scalability and reliability
host | Hostname, IP or FQDN of the server
port | Endpoint of the service, used for communication
user | User for login
password | Password for successful authorisation/connectivity

Step 6: Restart ClickHouse Server and validate the successful creation of the cluster:

SHOW clusters;

Note: if the above steps completed successfully, you should see the cluster details as copied below:

cluster_2shards_1Repl
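For a more detailed view of the shard and replica layout, you can also query the system.clusters table (this query is added here for illustration and is not part of the original post):

SELECT cluster, shard_num, replica_num, host_name, port
FROM system.clusters
WHERE cluster = 'cluster_2shards_1Repl';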

Creating a distributed table to validate clustering

Step 1: Create a new database using ClickHouse Client on clickhousen1. The ON CLUSTER clause will create the database on both nodes:

CREATE DATABASE cdat1 ON CLUSTER 'cluster_2shards_1Repl';

Step 2: Create a new table in the cdat1 database:

CREATE TABLE cdat1.tab1 on cluster 'cluster_2shards_1Repl'
(
    `id` UInt64,
    `column1` String
)
ENGINE = MergeTree
ORDER BY column1

Step 3: Insert the following rows on clickhousen1:

INSERT INTO cdat1.tab1
    (id, column1)
VALUES
    (50, 'dec'),
    (51, 'jan')

Step 4: Insert the following rows on clickhousen2:

INSERT INTO cdat1.tab1
    (id, column1)
VALUES
    (53, 'mar'),
    (54, 'feb')

Please note: the SELECTs on each node show only the data stored on that node.
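For example (the query is shown here for illustration; it is not in the original post), running the following on each node returns only the locally stored rows:

SELECT *
FROM cdat1.tab1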

clickhousen1 

┌─id─┬─column1─┐
│ 50 │ dec     │
│ 51 │ jan     │
└────┴─────────┘

clickhousen2 

┌─id─┬─column1─┐
│ 53 │ mar     │
│ 54 │ feb     │
└────┴─────────┘

Step 5: To represent the data on both shards, you can create a Distributed table:

CREATE TABLE cdat1.dist_table (
    id UInt64,
    column1 String
)
ENGINE = Distributed(cluster_2shards_1Repl,cdat1,tab1)

Note:

You can query the distributed table (cdat1.dist_table) to return all four rows of data from the two shards:

SELECT *
FROM cdat1.dist_table


┌─id─┬─column1─┐
│ 50 │ dec     │
│ 51 │ jan     │
└────┴─────────┘
┌─id─┬─column1─┐
│ 53 │ mar     │
│ 54 │ feb     │
└────┴─────────┘
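Note that the Distributed table above has no sharding key, so it is intended for reads; to also INSERT through the distributed table across more than one shard, a sharding key (for example rand()) is needed as a fourth engine argument. A hedged sketch of such a variant (the table name dist_table_rw is hypothetical):

CREATE TABLE cdat1.dist_table_rw
(
    id UInt64,
    column1 String
)
ENGINE = Distributed(cluster_2shards_1Repl, cdat1, tab1, rand())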

Summary

The objective of this blog post is to explain the step-by-step installation and configuration of ClickHouse Keeper. Thanks for reading; your comments are welcome.

 

References: https://clickhouse.com/docs/en/guides/sre/

