Previous post: Setup ClickHouse Cluster Replication with Zookeeper
Using a single-node ZooKeeper instance is fine during development and testing. In production, however, it is recommended to run an odd number of nodes (three or more) so the ensemble can always maintain a quorum.
You can also add or remove ZooKeeper nodes as the cluster size and load change. To apply configuration changes without a full outage, modify and restart one node at a time while the remaining nodes continue serving traffic; this keeps the service available throughout (a minimal rolling-restart sketch is shown below).
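As a sketch of that rolling-restart procedure (assuming the zookeeper1–zookeeper3 hostnames used later in this post, SSH access to each node, and the zkServer.sh path from the Ubuntu package), you can restart one node at a time and wait until it answers `ruok` before moving on:

```bash
#!/usr/bin/env bash
# Hypothetical rolling restart: restart one ZooKeeper node at a time and
# wait for it to answer the "ruok" four-letter-word command before continuing.
set -euo pipefail

for host in zookeeper1 zookeeper2 zookeeper3; do
  echo "Restarting ZooKeeper on ${host} ..."
  ssh "${host}" 'sudo -u zookeeper /usr/share/zookeeper/bin/zkServer.sh restart'

  # Poll until the restarted node reports "imok" on the client port.
  until [ "$(echo ruok | nc -w 2 "${host}" 2181)" = "imok" ]; do
    sleep 2
  done
  echo "${host} is healthy; moving on to the next node."
done
```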
It is also recommended to run the ZooKeeper cluster on dedicated nodes and to use fast storage (SSD) for dataDir and dataLogDir to avoid latency.
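A sketch of that storage layout, with hypothetical SSD mount points (the example zoo.cfg later in this post keeps the default /var/lib/zookeeper on a single volume):

```
# zoo.cfg fragment: snapshots and transaction logs on SSD-backed paths.
# /ssd/zookeeper/* are hypothetical mount points; adjust to your layout.
dataDir=/ssd/zookeeper/data
dataLogDir=/ssd/zookeeper/txnlog
```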
Configure and Start Zookeeper
Install the dependencies and ZooKeeper packages using apt-get, and use the hostnamectl utility to set a hostname on each node for easy identification.
```
root@ip-10-0-11-49:~# sudo apt-get update
root@ip-10-0-11-49:~# sudo apt-get install zookeeper netcat

root@ip-10-0-11-49:~# cat /etc/hosts
10.0.11.49 zookeeper1
10.0.8.194 zookeeper2
10.0.4.205 zookeeper3

root@ip-10-0-11-49:~# hostnamectl set-hostname zookeeper1
root@ip-10-0-8-194:~# hostnamectl set-hostname zookeeper2
root@ip-10-0-4-205:~# hostnamectl set-hostname zookeeper3
root@ip-10-0-11-49:~# reboot
```

Configure /etc/hosts on all nodes in the cluster.
Configure a three-node ZooKeeper cluster:
- The `myid` should be unique on each node.
- Keep the `zoo.cfg` file the same across all nodes in the cluster.
- Make the necessary optimizations as listed below.
- Start the service.
```
# Configure this id; it should be unique across the cluster.
root@zookeeper1:~# cat /etc/zookeeper/conf/myid
1
root@zookeeper2:~# cat /etc/zookeeper/conf/myid
2
root@zookeeper3:~# cat /etc/zookeeper/conf/myid
3

# Keep the configuration below the same on all nodes in the cluster (zookeeper1, zookeeper2, zookeeper3).
root@zookeeper1:~# sudo cat /etc/zookeeper/conf/zoo.cfg
tickTime=2000
initLimit=20
syncLimit=10
dataDir=/var/lib/zookeeper
clientPort=2181
maxSessionTimeout=60000000
maxClientCnxns=2000
server.1=zookeeper1:2888:3888
server.2=zookeeper2:2888:3888
server.3=zookeeper3:2888:3888
autopurge.purgeInterval=1
autopurge.snapRetainCount=10
4lw.commands.whitelist=*
preAllocSize=131072
snapCount=3000000

## Start the service
root@ip-10-0-11-49:~# sudo -u zookeeper /usr/share/zookeeper/bin/zkServer.sh restart
ZooKeeper JMX enabled by default
Using config: /etc/zookeeper/conf/zoo.cfg
Starting zookeeper ... STARTED
```
Verify that ZooKeeper is running by executing the commands below.
- Use the `ruok`, `mntr`, and `stat` commands.
- Try making a connection to a ZooKeeper node.
- Check node status using `/usr/share/zookeeper/bin/zkServer.sh status`.
```
echo ruok | nc localhost 2181
echo mntr | nc localhost 2181
echo stat | nc localhost 2181

# Connect to ZooKeeper
/usr/share/zookeeper/bin/zkCli.sh -server 127.0.0.1:2181

# Expect output like the below
root@zookeeper1:~# echo ruok | nc localhost 2181
imokroot@zookeeper1:~#

root@zookeeper1:~# echo mntr | nc localhost 2181
zk_version	3.4.13-6ubuntu4--1, built on Wed, 23 Mar 2022 13:40:46 +0000
zk_avg_latency	0
zk_max_latency	11
zk_min_latency	0
zk_packets_received	5108
zk_packets_sent	5115
zk_num_alive_connections	2
zk_outstanding_requests	0
zk_server_state	leader
zk_znode_count	191
zk_watch_count	7
zk_ephemerals_count	6
zk_approximate_data_size	20568
zk_open_file_descriptor_count	37
zk_max_file_descriptor_count	1048576
zk_fsync_threshold_exceed_count	0
zk_followers	2
zk_synced_followers	2
zk_pending_syncs	0
zk_last_proposal_size	36
zk_max_proposal_size	409
zk_min_proposal_size	32
Zookeeper version: 3.4.13-6ubuntu4--1, built on Wed, 23 Mar 2022 13:40:46 +0000
Clients:
 /0:0:0:0:0:0:0:1:34914[0](queued=0,recved=1,sent=0)
 /10.0.12.141:58302[1](queued=0,recved=2412,sent=2412)

root@zookeeper1:~# echo stat | nc localhost 2181
Zookeeper version: 3.4.13-6ubuntu4--1, built on Wed, 23 Mar 2022 13:40:46 +0000
Clients:
 /10.0.12.141:58302[1](queued=0,recved=2498,sent=2498)
 /0:0:0:0:0:0:0:1:39064[0](queued=0,recved=1,sent=0)

Latency min/avg/max: 0/0/11
Received: 5198
Sent: 5205
Connections: 2
Outstanding: 0
Zxid: 0x1000000d6
Mode: leader
Node count: 191
Proposal sizes last/min/max: 36/32/409
root@zookeeper1:~#

## Also, you can try making connections to test
root@zookeeper1:~# /usr/share/zookeeper/bin/zkCli.sh -server 127.0.0.1:2181
Connecting to 127.0.0.1:2181
Welcome to ZooKeeper!
JLine support is enabled

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: 127.0.0.1:2181(CONNECTED) 0]

## After successful installation you will see the node status, where one node is the leader and the other two act as followers
root@zookeeper1:~# /usr/share/zookeeper/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /etc/zookeeper/conf/zoo.cfg
Mode: follower
root@zookeeper1:~#

root@zookeeper2:~# /usr/share/zookeeper/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /etc/zookeeper/conf/zoo.cfg
Mode: leader

root@zookeeper3:~# /usr/share/zookeeper/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /etc/zookeeper/conf/zoo.cfg
Mode: follower
root@zookeeper3:~#
```
Configuring ClickHouse to use ZooKeeper
Let the ClickHouse server know about each ZooKeeper node in the cluster by making the changes below in the XML configuration file.
```
root@clickhouse1:/etc/clickhouse-server/config.d# cat zookeeper.xml
<yandex>
    <zookeeper>
        <node>
            <host>zookeeper1</host>
            <port>2181</port>
        </node>
        <node>
            <host>zookeeper2</host>
            <port>2181</port>
        </node>
        <node>
            <host>zookeeper3</host>
            <port>2181</port>
        </node>
        <session_timeout_ms>30000</session_timeout_ms>
        <operation_timeout_ms>10000</operation_timeout_ms>
        <!-- Optional. Chroot suffix. Should exist. -->
        <!-- <root>/path/to/zookeeper/node</root> -->
        <!-- Optional. ZooKeeper digest ACL string. -->
        <!-- <identity>user:password</identity> -->
    </zookeeper>

    <!-- Allow to execute distributed DDL queries (CREATE, DROP, ALTER, RENAME) on cluster. -->
    <!-- Works only if ZooKeeper is enabled. Comment it out if such functionality isn't required. -->
    <distributed_ddl>
        <!-- Path in ZooKeeper to queue with DDL queries -->
        <path>/clickhouse/task_queue/ddl</path>
        <!-- Settings from this profile will be used to execute DDL queries -->
        <!-- <profile>default</profile> -->
    </distributed_ddl>
</yandex>
```
Restart ClickHouse and check the status of the cluster.
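For example, on a systemd-based host (assuming the standard clickhouse-server service name and the cluster_demo_ash cluster name used in the verification below), something along these lines:

```bash
# Restart ClickHouse so it picks up zookeeper.xml.
sudo systemctl restart clickhouse-server

# Confirm the ZooKeeper connection works (lists children of the root znode).
clickhouse-client --query "SELECT name FROM system.zookeeper WHERE path = '/'"

# List the shards/replicas ClickHouse knows about for the cluster.
clickhouse-client --query "SELECT cluster, shard_num, replica_num, host_name FROM system.clusters WHERE cluster = 'cluster_demo_ash'"
```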
Verify Setup
```
clickhouse2 :) SELECT hostName(), * FROM clusterAllReplicas('cluster_demo_ash', system, one)

SELECT hostName(), * FROM clusterAllReplicas('cluster_demo_ash', system, one)

Query id: 212b066b-91c3-4ac9-84e3-3253fd4b50dc

┌─hostName()──┬─dummy─┐
│ clickhouse2 │     0 │
└─────────────┴───────┘
┌─hostName()──┬─dummy─┐
│ clickhouse1 │     0 │
└─────────────┴───────┘

2 rows in set. Elapsed: 0.005 sec.

clickhouse1 :) CREATE DATABASE IF NOT EXISTS Example_DB_3 ON CLUSTER `{cluster}`

CREATE DATABASE IF NOT EXISTS Example_DB_3 ON CLUSTER `{cluster}`

Query id: 65eee69c-e7ef-441a-bd97-6ab72d8b0e0d

┌─host────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ clickhouse2 │ 9000 │      0 │       │                   1 │                0 │
│ clickhouse1 │ 9000 │      0 │       │                   0 │                0 │
└─────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘

2 rows in set. Elapsed: 0.122 sec.

clickhouse1 :) CREATE TABLE Example_DB_3.product ON CLUSTER '{cluster}'
(
    created_at DateTime,
    product_id UInt32,
    category UInt32
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{cluster}/{shard}/Example_DB_3/product', '{replica}')
PARTITION BY toYYYYMM(created_at)
ORDER BY (product_id, toDate(created_at), category)
SAMPLE BY category;

CREATE TABLE Example_DB_3.product ON CLUSTER `{cluster}`
(
    `created_at` DateTime,
    `product_id` UInt32,
    `category` UInt32
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{cluster}/{shard}/Example_DB_3/product', '{replica}')
PARTITION BY toYYYYMM(created_at)
ORDER BY (product_id, toDate(created_at), category)
SAMPLE BY category

Query id: 2e57061c-9b89-474a-b987-a654986f1e5d

┌─host────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ clickhouse2 │ 9000 │      0 │       │                   1 │                0 │
│ clickhouse1 │ 9000 │      0 │       │                   0 │                0 │
└─────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘

2 rows in set. Elapsed: 0.285 sec.

clickhouse2 :) INSERT INTO Example_DB_3.product VALUES (now(),9832,11);

INSERT INTO Example_DB_3.product FORMAT Values

Query id: d5f7870f-76ce-4da4-acb8-22e67a599dab

Ok.

1 row in set. Elapsed: 0.010 sec.

clickhouse2 :) INSERT INTO Example_DB_3.product VALUES (now(),123,12);

INSERT INTO Example_DB_3.product FORMAT Values

Query id: 1bcea5df-9c62-4d2d-a6a5-6791204fa7e8

Ok.

1 row in set. Elapsed: 0.010 sec.

clickhouse2 :) INSERT INTO Example_DB_3.product VALUES (now(),0987,13);

INSERT INTO Example_DB_3.product FORMAT Values

Query id: 7d84d232-1ac2-45c7-909e-8aed69339e88

Ok.

1 row in set. Elapsed: 0.010 sec.

clickhouse1 :) select * from Example_DB_3.product;

SELECT * FROM Example_DB_3.product

Query id: 475fd914-2bbd-470a-956e-f90c836dbb24

┌──────────created_at─┬─product_id─┬─category─┐
│ 2023-04-24 08:02:17 │        123 │       12 │
└─────────────────────┴────────────┴──────────┘
┌──────────created_at─┬─product_id─┬─category─┐
│ 2023-04-24 08:02:12 │       9832 │       11 │
└─────────────────────┴────────────┴──────────┘
┌──────────created_at─┬─product_id─┬─category─┐
│ 2023-04-24 08:02:21 │        987 │       13 │
└─────────────────────┴────────────┴──────────┘

3 rows in set. Elapsed: 0.003 sec.
```
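You can also inspect the replication metadata that ClickHouse keeps in ZooKeeper through the system.zookeeper table. For the product table created above, a query along these lines (path assembled from the macros in its ENGINE clause, assuming shard 1) lists its znodes; the sample output that follows was taken from a separate test table under the default database:

```bash
clickhouse-client --query "SELECT name FROM system.zookeeper WHERE path = '/clickhouse/tables/cluster_demo_ash/1/Example_DB_3/product'"
```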
```
clickhouse2 :) select * from system.zookeeper WHERE path = '/clickhouse/tables/cluster_demo_ash/1/default/test' FORMAT Vertical

SELECT *
FROM system.zookeeper
WHERE path = '/clickhouse/tables/cluster_demo_ash/1/default/test'
FORMAT Vertical

Query id: 83ad70ff-1417-4a44-a16a-1a58d9fdd5d0

Row 1:
──────
name:  alter_partition_version
value:
path:  /clickhouse/tables/cluster_demo_ash/1/default/test

Row 2:
──────
name:  metadata
value: metadata format version: 1
date column:
sampling expression: userid
index granularity: 8192
mode: 0
sign column:
primary key: contractid, toDate(timestamp), userid
data format version: 1
partition key: toYYYYMM(timestamp)
granularity bytes: 10485760

path:  /clickhouse/tables/cluster_demo_ash/1/default/test

Row 3:
──────
name:  temp
value:
path:  /clickhouse/tables/cluster_demo_ash/1/default/test

Row 4:
──────
name:  table_shared_id
value: d4497dd6-48f3-4a78-ac0e-615cf9ff1450
path:  /clickhouse/tables/cluster_demo_ash/1/default/test

Row 5:
──────
name:  log
value:
path:  /clickhouse/tables/cluster_demo_ash/1/default/test

Row 6:
──────
name:  leader_election
value:
path:  /clickhouse/tables/cluster_demo_ash/1/default/test

Row 7:
──────
name:  columns
value: columns format version: 1
3 columns:
`timestamp` DateTime
`contractid` UInt32
`userid` UInt32

path:  /clickhouse/tables/cluster_demo_ash/1/default/test

Row 8:
──────
name:  blocks
value:
path:  /clickhouse/tables/cluster_demo_ash/1/default/test

Row 9:
──────
name:  async_blocks
value:
path:  /clickhouse/tables/cluster_demo_ash/1/default/test

Row 10:
───────
name:  nonincrement_block_numbers
value:
path:  /clickhouse/tables/cluster_demo_ash/1/default/test

Row 11:
───────
name:  replicas
value: last added replica: clickhouse1
path:  /clickhouse/tables/cluster_demo_ash/1/default/test

Row 12:
───────
name:  quorum
value:
path:  /clickhouse/tables/cluster_demo_ash/1/default/test

Row 13:
───────
name:  pinned_part_uuids
value: {"part_uuids":"[]"}
path:  /clickhouse/tables/cluster_demo_ash/1/default/test

Row 14:
───────
name:  block_numbers
value:
path:  /clickhouse/tables/cluster_demo_ash/1/default/test

Row 15:
───────
name:  mutations
value:
path:  /clickhouse/tables/cluster_demo_ash/1/default/test

Row 16:
───────
name:  part_moves_shard
value:
path:  /clickhouse/tables/cluster_demo_ash/1/default/test

16 rows in set. Elapsed: 0.004 sec.

clickhouse2 :)
```
To dockerize your application with the ClickHouse cluster and ZooKeeper nodes, please refer to https://github.com/ashwini-ahire7/clickhouse-replication-zookeeper_HA