ClickHouse MergeTree: Understanding Storage Policies and Load Balancing

Introduction

Optimizing data distribution and storage utilization is paramount when working with MergeTree tables in ClickHouse. This involves understanding and leveraging storage policies and load balancing mechanisms to ensure efficient data placement across disks and volumes. In this guide, we delve into the significance of storage policies, exploring load-balancing options and configuration settings.

When working with MergeTree tables in ClickHouse, data distribution is crucial and is primarily based on the storage policy implemented during table creation. The storage policy determines how the data is distributed across the underlying disks and volumes, ensuring efficient load balancing and optimal data placement.

Load balancing policies in MergeTree

Two load-balancing policies are available:
The round_robin policy – distributes new data across disks in a round-robin fashion, ensuring an even distribution of data across disks.
The least_used policy – distributes new data to the disk with the least amount of data written to it so far, ensuring that the disk with the most available space receives the newest data.

 

Let’s consider a scenario where you have a storage policy using an additional volume and underlying using three disks. In that case, By default, ClickHouse uses the round_robin policy for load balancing the rows; and data distribution happens between the disks in a round-robin manner. Each insert (or merge) creates a part on the next disk in the volume.

Alternatively, ClickHouse offers another load-balancing policy called “least_used.” This policy distributes new data to the disk with the least written data. By utilizing the “least_used” policy, ClickHouse ensures that the disk with the most available space receives the newest data, enabling optimal storage utilization.

Although ClickHouse defaults to the “round_robin” policy, you can specify the “least_used” policy by adding the <load_balancing>least_used</load_balancing> line to your storage configuration file, allowing you to customize data distribution according to your requirements.

To gain further insight into your setup and understand how data is distributed, you can execute the following queries –

  1. Retrieve information about the storage policies, including the policy name, volume name, and associated disks:
SELECT policy_name, volume_name, disks FROM system.storage_policies;
  1. Obtain details about the parts of a specific table, including the part name, disk name, and path, while ensuring they are active:
SELECT name, disk_name, path FROM system.parts WHERE (table = 'table_name') AND active;
  1. Fetch information about tables that match a specific name pattern, providing data paths, metadata path, and storage policy details:
SELECT name, data_paths, metadata_path, storage_policy FROM system.tables WHERE name LIKE '%table_name%';

It’s worth noting that while data is distributed across disks, the table metadata remains on the default disk.

Furthermore, when background merges occur, ClickHouse collects data from parts residing on different disks and generates a new larger part on one of the disks within the volume. This process also follows the round-robin approach, ensuring balanced data distribution and efficient storage management.

Adding to the above, when you have a storage policy defined It is important to note that when creating a storage policy, additional settings must be specified while creating the table to ensure that the table engine directs the data storage based on the defined storage configuration. The following scenario exemplifies this:

A client recently encountered an issue where data was not being directed to a newly added disk. Upon investigation, it was discovered that the table in question was missing the necessary setting for the storage_policy. By including the appropriate configuration, such problems can be avoided.

The configuration snippet below outlines the setup:

<storage_configuration>
    <disks>
        <default>
            <keep_free_space_bytes>1024</keep_free_space_bytes>
        </default>
        <data1>
            <path>/data1/clickhouse/</path>
        </data1>
    </disks>
    <policies>
        <op_data>
            <volumes>
                <op_data_volume>
                    <disk>data1</disk>
                </op_data_volume>
            </volumes>
        </op_data>
    </policies>
</storage_configuration>

This configuration establishes a basic storage configuration with a single storage policy named ‘op_data’. The ‘op_data’ policy utilizes the ‘data1’ disk for data storage. Additionally, a default disk setting is specified to reserve a small amount of free space on each disk.

To ensure that the data is directed correctly based on the storage configuration, it is important to include the following settings when creating the table:

CREATE TABLE IF NOT EXISTS TABLE_NAME
(
    column1          UInt8,
    ...
    column8          UInt32,
    column9          UInt32
)
    ENGINE = MergeTree PARTITION BY toYYYYMMDD(toDate(column7))
    PRIMARY KEY (column2, column8)
    storage_policy = 'op_data';

After including the ‘storage_policy’ setting in the table creation statement, the engine will correctly associate the table with the ‘op_data’ storage policy, ensuring the data is stored according to the defined configuration.

Conclusion

Understanding and utilizing storage policies and load balancing in ClickHouse’s MergeTree tables is crucial for optimizing system performance and data distribution. By implementing the appropriate policies and configurations, you can ensure efficient storage utilization and enhance query execution efficiency for better overall performance.

To know more about MergeTree performance in ClickHouse, do read the following articles:

 

You might also like:

About ChistaDATA Inc. 174 Articles
We are an full-stack ClickHouse infrastructure operations Consulting, Support and Managed Services provider with core expertise in performance, scalability and data SRE. Based out of California, Our consulting and support engineering team operates out of San Francisco, Vancouver, London, Germany, Russia, Ukraine, Australia, Singapore and India to deliver 24*7 enterprise-class consultative support and managed services. We operate very closely with some of the largest and planet-scale internet properties like PayPal, Garmin, Honda cars IoT project, Viacom, National Geographic, Nike, Morgan Stanley, American Express Travel, VISA, Netflix, PRADA, Blue Dart, Carlsberg, Sony, Unilever etc