Workload Scheduling In ClickHouse

Introduction

Workload scheduling lets ClickHouse regulate how concurrent workloads share resources. It is useful for balancing different kinds of work, such as production and development tasks, so that one workload does not degrade another.

Understanding Workload Scheduling in ClickHouse

When ClickHouse executes multiple queries simultaneously, these queries may compete for shared resources like disks. Workload scheduling involves applying constraints and policies to regulate resource utilization, ensuring fair access, and preventing any single workload from monopolizing resources.

Example of Workload Scheduling

Imagine a company with two types of workloads: production and development. Production workloads are critical to the company’s business, while development workloads are used for testing and development. The company could use workload scheduling to ensure that production workloads always have access to the resources they need, even when development workloads are resource-intensive. For example, the scheduler could be configured to give production workloads a higher priority than development ones, so that pending production requests are served first even if development requests are already waiting.

Benefits of Workload Scheduling

There are several benefits to using workload scheduling in ClickHouse:

  • Improved performance: Constraints and policies keep one workload from starving the others, so critical queries continue to get the resources they need under contention.
  • Reduced costs: Workload scheduling can help reduce the cost of running ClickHouse by keeping low-priority workloads from consuming capacity that critical workloads need.
  • Increased fairness: Each workload receives a share of a contended resource according to the configured policy, rather than according to which workload issues the most requests.

Scheduling Hierarchy

A scheduling hierarchy is configured for each resource: the root of the hierarchy represents the resource, and the leaves are queues that hold requests exceeding the resource's capacity. Currently, this mechanism applies mainly to remote disk IO. CPU and memory are managed separately, through thread pools, concurrent_threads_soft_limit_num, and memory overcommit settings.

Disk Configuration for IO Scheduling

To enable IO scheduling for a specific disk, read_resource and/or write_resource must be specified in its storage configuration. These settings tell ClickHouse which resource to use for the disk's read and write requests. The feature is especially useful for managing network bandwidth between different types of workloads, such as separating “production” and “development” activities.

Example Disk Configuration

<!-- read_resource and write_resource name the resources used to schedule this disk's read and write requests. -->
<clickhouse>
    <storage_configuration>
        ...
        <disks>
            <s3>
                <type>s3</type>
                <endpoint>https://clickhouse-public-datasets.s3.amazonaws.com/my-bucket/root-path/</endpoint>
                <access_key_id>your_access_key_id</access_key_id>
                <secret_access_key>your_secret_access_key</secret_access_key>
                <read_resource>network_read</read_resource>
                <write_resource>network_write</write_resource>
            </s3>
        </disks>
        ...
    </storage_configuration>
</clickhouse>

Marking Workloads in ClickHouse

Queries can be marked with the workload setting to distinguish between workload types. If the setting is not specified, the value “default” is used. To mark workloads consistently, the setting can also be assigned through settings profiles instead of being repeated in every query.

Example Queries with Workload Setting

SELECT count() FROM my_table WHERE value = 42 SETTINGS workload = 'production';
SELECT count() FROM my_table WHERE value = 13 SETTINGS workload = 'development';
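
A settings profile can carry the workload marking so that individual queries do not have to repeat it. The fragment below is a minimal sketch, assuming profiles are defined in the users configuration file. The profile name production_profile is hypothetical; workload is the same setting used in the queries above.

```xml
<clickhouse>
    <profiles>
        <!-- Hypothetical profile name; assign it to the users or roles that run production queries. -->
        <production_profile>
            <!-- Equivalent to adding SETTINGS workload = 'production' to each query. -->
            <workload>production</workload>
        </production_profile>
    </profiles>
</clickhouse>
```

Queries run under this profile would be marked as “production” unless a query sets workload explicitly.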

Resource Scheduling Hierarchy

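The scheduling subsystem views a resource as a hierarchy of scheduling nodes of three kinds: constraints such as inflight_limit and bandwidth_limit, which cap resource usage; policies such as fair and priority, which choose which child node is served next; and fifo queues, which form the leaves of the hierarchy and hold pending requests.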
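
To illustrate how these node types combine, the fragment below sketches a hierarchy in which a bandwidth_limit constraint sits at the root and a priority policy chooses between two fifo queues. It is a sketch under stated assumptions, not a recommended setup: the paths /prio, /prio/prod, and /prio/dev are hypothetical, the max_speed value is illustrative (assumed to be in bytes per second), and the max_speed and priority parameter names follow the ClickHouse documentation for these node types and should be checked against the server version in use.

```xml
<clickhouse>
    <resources>
        <network_read>
            <!-- Constraint at the root: caps overall bandwidth (illustrative value). -->
            <node path="/">
                <type>bandwidth_limit</type>
                <max_speed>10000000</max_speed>
            </node>
            <!-- Policy node: serves the child with the lowest priority value first. -->
            <node path="/prio">
                <type>priority</type>
            </node>
            <!-- Leaf queue for production requests (hypothetical path). -->
            <node path="/prio/prod">
                <type>fifo</type>
                <priority>0</priority>
            </node>
            <!-- Leaf queue for development requests; served only when no production request is pending. -->
            <node path="/prio/dev">
                <type>fifo</type>
                <priority>1</priority>
            </node>
        </network_read>
    </resources>
</clickhouse>
```

With a priority policy, a lower value means a higher priority, so development requests in this sketch wait whenever production requests are queued.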

Configuring IO Scheduling Hierarchies

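The configuration below outlines an IO scheduling hierarchy: an inflight_limit constraint at the root limits the number of concurrent requests, and a fair policy node beneath it distributes the resource among the queues defined under it for “production” and “development” workloads. Parts of the configuration are elided.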

<clickhouse>
    <resources>
        <network_read>
            <node path="/">
                <type>inflight_limit</type>
                <max_requests>100</max_requests>
            </node>
            <node path="/fair">
                <type>fair</type>
            </node>
            ...
        </network_read>
        <network_write>
            ...
        </network_write>
    </resources>
</clickhouse>
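
One way the nodes under /fair could be defined is sketched below. The path /fair/prod matches the classifier example later in this article; the path /fair/dev and the weight of 3 are assumptions chosen for illustration. Only network_read is shown, on the assumption that network_write would mirror it.

```xml
<clickhouse>
    <resources>
        <network_read>
            <!-- Root constraint: at most 100 requests in flight at once. -->
            <node path="/">
                <type>inflight_limit</type>
                <max_requests>100</max_requests>
            </node>
            <!-- Fair policy: shares the resource among its children by weight. -->
            <node path="/fair">
                <type>fair</type>
            </node>
            <!-- Production queue; the weight of 3 is an illustrative assumption. -->
            <node path="/fair/prod">
                <type>fifo</type>
                <weight>3</weight>
            </node>
            <!-- Development queue (hypothetical path); uses the default weight. -->
            <node path="/fair/dev">
                <type>fifo</type>
            </node>
        </network_read>
    </resources>
</clickhouse>
```

Under these assumptions, when both queues have pending requests, production would receive roughly three times the share of development. When only one queue is active, it can use the whole resource.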

Workload Classifiers in ClickHouse

Workload classifiers map the workload specified by a query to the leaf queues that should be used for each resource. This mapping is currently static: each of the “production”, “development”, and “default” workloads is assigned a fixed queue path per resource.

Example Workload Classifier Configuration

<clickhouse>
    <workload_classifiers>
        <production>
            <network_read>/fair/prod</network_read>
            <network_write>/fair/prod</network_write>
        </production>
        ...
    </workload_classifiers>
</clickhouse>
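
A fuller classifier section covering all three workloads might look like the sketch below. It assumes that a /fair/dev queue exists in each resource's hierarchy and that unmarked queries (the “default” workload) should share the development queue; both are illustrative choices rather than requirements.

```xml
<clickhouse>
    <workload_classifiers>
        <production>
            <network_read>/fair/prod</network_read>
            <network_write>/fair/prod</network_write>
        </production>
        <!-- Assumed path: requires a /fair/dev queue in each resource hierarchy. -->
        <development>
            <network_read>/fair/dev</network_read>
            <network_write>/fair/dev</network_write>
        </development>
        <!-- Queries without an explicit workload setting fall under "default". -->
        <default>
            <network_read>/fair/dev</network_read>
            <network_write>/fair/dev</network_write>
        </default>
    </workload_classifiers>
</clickhouse>
```

Routing “default” to the development queue keeps unmarked queries from competing with production at production's share.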

Conclusion

Workload scheduling in ClickHouse regulates how concurrent workloads share resources. Setting it up involves four steps: assigning resources to disks, marking queries with a workload, defining a scheduling hierarchy for each resource, and mapping workloads to leaf queues with classifiers. Together, these steps keep any single workload from monopolizing a shared resource and help maintain fair, predictable performance across workloads.

Reference: https://clickhouse.com/docs/en/operations/workload-scheduling