ClickHouse Horizontal Scaling: Optimal Read-Write Split Configuration and Execution

Introduction

Scaling ClickHouse horizontally while optimizing the split between read and write operations is a multi-step process that involves setting up a cluster with sharded and replicated tables. Here’s a detailed guide, including configuration settings and a run-book.

Understanding ClickHouse Horizontal Scaling

  1. Sharding: Distributes the data across multiple nodes to balance the write load.
  2. Replication: Creates copies of data on different nodes to increase read capacity and fault tolerance.

Prerequisites

  • Multiple ClickHouse nodes (servers).
  • A ZooKeeper ensemble (or ClickHouse Keeper) for coordinating replicas.

Step 1: Configuring the Cluster

Cluster Configuration (config.xml)

  1. Define each shard and its replicas in the <remote_servers> section. Example for a cluster with 2 shards, each having 2 replicas (recent ClickHouse versions use <clickhouse> as the root element; older releases use <yandex>):
<clickhouse>
    ...
    <remote_servers>
        <my_cluster>
            <shard>
                <!-- With ReplicatedMergeTree tables, let the replicas sync
                     themselves; otherwise the Distributed table writes to
                     every replica and data is duplicated. -->
                <internal_replication>true</internal_replication>
                <replica>
                    <host>node1a</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>node1b</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>node2a</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>node2b</host>
                    <port>9000</port>
                </replica>
            </shard>
        </my_cluster>
    </remote_servers>
    ...
</clickhouse>

  2. Configure the ZooKeeper ensemble. Add this <zookeeper> section to config.xml:

<zookeeper>
    <node index="1">
        <host>zk1</host>
        <port>2181</port>
    </node>
    <node index="2">
        <host>zk2</host>
        <port>2181</port>
    </node>
    <node index="3">
        <host>zk3</host>
        <port>2181</port>
    </node>
</zookeeper>
  3. Repeat this configuration on all ClickHouse nodes and restart the service.
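
The {shard} and {replica} placeholders used in ReplicatedMergeTree table paths (Step 2) are expanded from a per-node <macros> section in config.xml, so each node must define its own values. A minimal sketch for the first replica of shard 1 (the identifier values are illustrative):

<macros>
    <shard>01</shard>
    <replica>node1a</replica>
</macros>

On node1b this would be <shard>01</shard> with <replica>node1b</replica>, and likewise for the nodes of shard 2.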

Step 2: Creating Sharded and Replicated Tables

  1. Create a ReplicatedMergeTree table on every node; this is the local, sharded storage. The {shard} and {replica} placeholders are filled in from each node's <macros> configuration:

CREATE TABLE mydb.my_table ON CLUSTER my_cluster (
    ...
) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/my_table', '{replica}')
ORDER BY ...

  2. Create a Distributed table that spans the entire cluster. It must be created after the local table, since AS mydb.my_table copies that table's schema:

CREATE TABLE mydb.my_table_global ON CLUSTER my_cluster AS mydb.my_table
ENGINE = Distributed(my_cluster, mydb, my_table, rand());

  3. The Distributed table automatically distributes data among shards according to its sharding key (here rand(); a deterministic expression such as a hash of a column keeps related rows on the same shard).
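
Putting the two statements together, a concrete sketch assuming a hypothetical events table (the database, table, and column names are illustrative):

CREATE TABLE mydb.events ON CLUSTER my_cluster (
    event_time DateTime,
    user_id UInt64,
    payload String
) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/events', '{replica}')
ORDER BY (user_id, event_time);

CREATE TABLE mydb.events_global ON CLUSTER my_cluster AS mydb.events
ENGINE = Distributed(my_cluster, mydb, events, cityHash64(user_id));

Here cityHash64(user_id) is used instead of rand() so that all rows for a given user land on the same shard, which keeps single-user queries shard-local.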

Step 3: Writing Data

  • Write operations should be directed to the Distributed table (my_table_global). It will distribute the inserts across the shards.
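
An insert through the Distributed table might look like this (the column names are illustrative):

-- Rows are routed to shards by the table's sharding key expression.
-- ClickHouse performs best with large batched inserts (thousands of
-- rows per INSERT) rather than many single-row inserts.
INSERT INTO mydb.my_table_global (event_time, user_id, payload)
VALUES (now(), 42, 'example');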

Step 4: Reading Data

  • Read operations can be directed either to the Distributed table, which queries the entire dataset, or directly to the local ReplicatedMergeTree tables for shard-local reads.
  • A load balancer in front of the replicas, or ClickHouse's built-in load_balancing setting, can be used to spread read queries across replicas.
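
For example, the replica chosen for each shard's sub-query can be influenced per query via the load_balancing setting (the default is random):

-- Query the whole dataset through the Distributed table, preferring
-- the replica whose hostname is closest to this server's.
SELECT count()
FROM mydb.my_table_global
SETTINGS load_balancing = 'nearest_hostname';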

Step 5: Monitoring and Maintenance

  • Regularly monitor query performance, shard balance, and replication lag.
  • Adjust sharding strategy based on data growth and query patterns.
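
Replication lag and shard balance can be checked from the system tables; a sketch using the cluster and table names from the earlier steps:

-- Per-table replication status on this node; a growing queue_size
-- or absolute_delay indicates a lagging replica.
SELECT database, table, is_readonly, queue_size, absolute_delay
FROM system.replicas;

-- Row counts per host, to spot shard imbalance.
SELECT hostName() AS host, count() AS rows
FROM clusterAllReplicas('my_cluster', mydb, my_table)
GROUP BY host;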

Best Practices

  • Test your configuration in a staging environment before deploying to production.
  • Monitor query performance and adjust indexes and table schemas as necessary.
  • Regularly back up your data.

Conclusion

This run-book outlines the basic steps to scale ClickHouse horizontally by setting up a sharded and replicated cluster. The key is to balance the write load across shards and distribute read queries among replicas. Continuous monitoring and adjustments are crucial for optimal performance in a dynamically scaling environment.

About Shiv Iyer
Open Source Database Systems Engineer with a deep understanding of Optimizer Internals, Performance Engineering, Scalability and Data SRE. Shiv currently is the Founder, Investor, Board Member and CEO of multiple Database Systems Infrastructure Operations companies in the Transaction Processing Computing and ColumnStores ecosystem. He is also a frequent speaker in open source software conferences globally.