Table of Contents

How does ClickHouse Query Profiler work?

The core objective of ClickHouse Query Profiler is to measure the performance metrics of SQLs and understand where the application is spending most of the time, This helps to successfully complete ClickHouse performance audit/diagnostics/forensics by tracing CPU time against wall-clock time including idle time. At ChistaDATA we spend a considerable amount of time troubleshooting (of course lots of research included) ClickHouse performance and we are committed to building optimal & reliable ClickHouse infrastructure for customers worldwide.

Configuring ClickHouse Query Profiler

ClickHouse trace_log system table records the profiler operations, This setting is configured by default. But, The data on this table is valid only for a running server so after the ClickHouse server restarts, ClickHouse does not clean up this table and so all the persistent virtual memory addresses will become invalid. We have copied below the default server configuration file (config.xml) for your reference:

<trace_log>
    <database>system</database>
    <table>trace_log</table>
    <partition_by>toYYYYMM(event_date)</partition_by>
    <flush_interval_milliseconds>7500</flush_interval_milliseconds>
</trace_log>

ID.	Parameter Name	Description
1	database	Database for storing a table.
2	table	Table name.
3	partition_by	Custom partitioning key for a system table. Can't be used if engine defined.
4	engine	MergeTree Engine Definition for a system table. Can't be used if partition_by defined
5	flush_interval_milliseconds	Interval for flushing data from the buffer in memory to the table.

How to configure query_log and other _log tables for set up TTL, and/or some other cleanup procedures:

cat /etc/clickhouse-server/config.d/query_log.xml
<yandex>
    <query_log replace="1">
        <database>system</database>
        <table>query_log</table>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
        <engine>
ENGINE = MergeTree
PARTITION BY event_date
ORDER BY (event_time)
TTL event_date + interval 90 day
SETTINGS ttl_only_drop_parts=1
        </engine>
    </query_log>
</yandex>

The configuration parameters query_profiler_real_time_period_ns and query_profiler_cpu_time_period_ns dump stacktraces all the threads which execute the query. By default, it collects information only about queries when runs longer than 1 sec (and collects stacktraces every second).

To record how RAM is used by ClickHouse queries you can configure the setting memory_profiler_sample_probability

To read more about ClickHouse’s Query Profiler, do consider reading the following articles:

ChistaDATA Inc.

Enterprise-class 24*7 ClickHouse Consultative Support and Managed Services

How ClickHouse Query Profiler works ?

How does ClickHouse Query Profiler work?

Configuring ClickHouse Query Profiler

How does ClickHouse Query Profiler work?

Configuring ClickHouse Query Profiler

Related Articles

Mastering Performance Tuning in ClickHouse: Tips for Inspecting Statistics Objects

Runbook for Migration from Amazon Redshift to ChistaDATA Cloud for ClickHouse: Part 1

ClickHouse Performance Benchmarking – Brown University