Several things need to be checked before shutting down an actively working ClickHouse node. If you suddenly turn off a working node without paying attention to them, you may run into unwanted situations in your cluster. In this article, we explain the operations and settings that need to be checked before shutting down a node.
First of all, it is possible to shut down the server on the fly, but that would cause some of the running queries to fail. You can follow the steps below to avoid that:
Remove the server that is going to be shut down from the remote_servers section of config.xml on all servers.
Remove the server from the load balancer so that new queries no longer reach the node.
If the node is integrated with third-party tools such as Kafka or RabbitMQ, do not forget to detach them.
Wait until all running queries finish execution. You can check this with the query:
SHOW PROCESSLIST;
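The wait for running queries can also be scripted. The following is a minimal sketch, not a definitive implementation: `run_query` is a hypothetical callable (for example, a thin wrapper around whichever ClickHouse client you use) that takes SQL text and returns a list of rows, and the timeout values are arbitrary. It reads `system.processes`, the table behind `SHOW PROCESSLIST`, and filters out the monitoring query itself.

```python
import time
from typing import Callable, Sequence

# Hypothetical query runner: takes SQL text, returns a sequence of row tuples.
QueryRunner = Callable[[str], Sequence[tuple]]

# Lists queries that are still running, excluding this monitoring query.
RUNNING_QUERIES_SQL = (
    "SELECT query_id, user, elapsed, query "
    "FROM system.processes "
    "WHERE query NOT LIKE '%system.processes%'"
)


def wait_for_running_queries(
    run_query: QueryRunner,
    timeout_s: float = 300.0,
    poll_interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until no queries are running.

    Returns True once the process list is empty, or False if the
    timeout expires while queries are still running.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        rows = run_query(RUNNING_QUERIES_SQL)
        if not rows:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_interval_s)
```

A return value of False means queries were still running when the timeout expired, so the decision to wait longer or to proceed stays with the operator.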
Secondly, ensure there is no pending data in Distributed tables. Check the queue with the first query below; if rows are still pending, flush each affected table with the second:
SELECT * FROM system.distribution_queue;
SYSTEM FLUSH DISTRIBUTED table_name;
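The check and the flush can be combined into one step. This is a hedged sketch under stated assumptions: `run_query` is a hypothetical callable that takes SQL text and returns rows, the `data_files` column of `system.distribution_queue` is used to detect pending files, and table names are assumed not to contain backticks.

```python
from typing import Callable, List, Sequence

# Hypothetical query runner: takes SQL text, returns a sequence of row tuples.
QueryRunner = Callable[[str], Sequence[tuple]]

# Distributed tables that still have files waiting to be sent.
PENDING_SQL = (
    "SELECT database, table, data_files "
    "FROM system.distribution_queue "
    "WHERE data_files > 0"
)


def flush_pending_distributed(run_query: QueryRunner) -> List[str]:
    """Flush every Distributed table that still has pending files.

    Returns the quoted names of the tables that were flushed.
    """
    flushed: List[str] = []
    for database, table, _data_files in run_query(PENDING_SQL):
        name = f"`{database}`.`{table}`"
        # The queue can hold more than one row per table, so skip repeats.
        if name in flushed:
            continue
        run_query(f"SYSTEM FLUSH DISTRIBUTED {name}")
        flushed.append(name)
    return flushed
```

Running the function a second time and getting an empty list back is a simple way to confirm that nothing is left in the queue.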
Then run a sync replica query on the other replicas of the same shard, so that they catch up with their replication queue:
SYSTEM SYNC REPLICA db.table;
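When a replica hosts many replicated tables, issuing the statement table by table is tedious. A minimal sketch of a loop follows, assuming a hypothetical `run_query` callable connected to a replica that stays online; it lists the replicated tables from `system.replicas` and syncs each one. Table names are assumed not to contain backticks.

```python
from typing import Callable, List, Sequence

# Hypothetical query runner: takes SQL text, returns a sequence of row tuples.
QueryRunner = Callable[[str], Sequence[tuple]]

# Every replicated table known to the server this runner is connected to.
REPLICATED_TABLES_SQL = "SELECT database, table FROM system.replicas"


def sync_all_replicas(run_query: QueryRunner) -> List[str]:
    """Run SYSTEM SYNC REPLICA for each replicated table.

    Returns the quoted names of the tables that were synced, in order.
    """
    synced: List[str] = []
    for database, table in run_query(REPLICATED_TABLES_SQL):
        name = f"`{database}`.`{table}`"
        run_query(f"SYSTEM SYNC REPLICA {name}")
        synced.append(name)
    return synced
```

The loop is sequential on purpose: if one table fails to sync, the exception raised by the runner stops the procedure before the node is shut down.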
Now you can shut down your server.
NOTE: By default, the SYSTEM SHUTDOWN query does not wait for running queries to complete; it tries to kill all queries as soon as the server receives the shutdown signal. To change this behavior, enable the shutdown_wait_unfinished_queries setting.
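shutdown_wait_unfinished_queries is a server-level setting, so it is set in the server configuration (config.xml) rather than with a SET statement in a client session:
<shutdown_wait_unfinished_queries>1</shutdown_wait_unfinished_queries>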
shutdown_wait_unfinished_queries
Enables or disables waiting for unfinished queries when the server shuts down.
Possible values:
- 0: Disabled.
- 1: Enabled. The wait time equals the value of the shutdown_wait_unfinished setting.
Default value: 0.
For more information, please visit the official ClickHouse Docs.