ClickHouse Recommended Troubleshooting Metrics

Alert Name	Shell or SQL command	Severity
ClickHouse status	$ curl 'http://localhost:8123/' (returns Ok.)	Critical

Too many simultaneous queries. Maximum: 100 (by default)	select value from system.metrics where metric='Query'	Critical

Replication status	$ curl 'http://localhost:8123/replicas_status' (returns Ok.)	High

Read only replicas (reflected by replicas_status as well)	select value from system.metrics where metric='ReadonlyReplica'	High

Some replication tasks are stuck	select count() from system.replication_queue where num_tries > 100 or num_postponed > 1000	High

ZooKeeper is available	select count() from system.zookeeper where path='/'	Critical for writes

ZooKeeper exceptions	select value from system.events where event='ZooKeeperHardwareExceptions'	Medium

Other CH nodes are available	$ for node in `echo "select distinct host_address from system.clusters where host_name != 'localhost'" | curl 'http://localhost:8123/' --silent --data-binary @-`; do curl "http://$node:8123/" --silent ; done | sort -u (returns Ok.)	High

All CH clusters are available (i.e. every configured cluster has enough replicas to serve queries)	$ for cluster in `echo "select distinct cluster from system.clusters where host_name != 'localhost'" | curl 'http://localhost:8123/' --silent --data-binary @-` ; do clickhouse-client --query="select '$cluster', 'OK' from cluster('$cluster', system, one)" ; done	Critical

There are files in 'detached' folders	$ find /var/lib/clickhouse/data/*/*/detached/* -type d | wc -l; on 19.8+: select count() from system.detached_parts	Medium

Too many parts: number of parts is growing; inserts are being delayed; inserts are being rejected	select value from system.asynchronous_metrics where metric='MaxPartCountForPartition'; select value from system.events where event='DelayedInserts'; select value from system.metrics where metric='DelayedInserts'; select value from system.events where event='RejectedInserts'	Critical

Dictionaries: exception	select concat(name, ': ', last_exception) from system.dictionaries where last_exception != ''	Medium

ClickHouse has been restarted	select uptime(); select value from system.asynchronous_metrics where metric='Uptime'

DistributedFilesToInsert should not be always increasing	select value from system.metrics where metric='DistributedFilesToInsert'	Medium

A data part was lost	select value from system.events where event='ReplicatedDataLoss'	High

Data parts are not the same on different replicas	select value from system.events where event='DataAfterMergeDiffersFromReplica'; select value from system.events where event='DataAfterMutationDiffersFromReplica'	Medium
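
Several rows in the table above return a single number that has to be compared with a threshold. One way to turn such a check into an alert line is sketched below. It assumes the HTTP interface is reachable at http://localhost:8123/, as in the table. The names `CH_URL`, `ch_query` and `check_threshold` are hypothetical helpers introduced for illustration, and the thresholds are yours to choose.

```shell
#!/bin/sh
# Assumption: ClickHouse HTTP interface address; override via the environment.
CH_URL="${CH_URL:-http://localhost:8123/}"

# Hypothetical helper: send one SQL statement as the POST body, the same
# way the curl examples in the table do, and print the response.
ch_query() {
    printf '%s' "$1" | curl --silent --fail --data-binary @- "$CH_URL"
}

# Hypothetical helper: compare an observed value with a threshold.
# $1 = alert name, $2 = observed value, $3 = threshold, $4 = severity
# Returns 0 when within the threshold, 1 when above it, and 2 when the
# value is not a non-negative integer (for example, the query failed).
check_threshold() {
    case "$2" in
        ''|*[!0-9]*)
            printf 'UNKNOWN %s: non-numeric value "%s"\n' "$1" "$2"
            return 2
            ;;
    esac
    if [ "$2" -gt "$3" ]; then
        printf 'ALERT [%s] %s: %s > %s\n' "$4" "$1" "$2" "$3"
        return 1
    fi
    printf 'OK %s: %s\n' "$1" "$2"
}

# Example call (requires a running server, so it is left commented out):
# check_threshold "Read only replicas" \
#     "$(ch_query "select value from system.metrics where metric='ReadonlyReplica'")" \
#     0 High
```

Keeping the comparison separate from the query means the alert logic can be exercised with fixed numbers, without a server.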

The following queries are recommended to be included in monitoring:

SELECT * FROM system.replicas – Shows the state of each replicated table on the local server. For more information, see the ClickHouse guide on System Tables.

SELECT * FROM system.merges – Checks on the speed and progress of currently executed merges.

SELECT * FROM system.mutations ORDER BY create_time DESC – This is the source of information on the speed and progress of currently executed mutations.
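
For routine monitoring, `SELECT *` on system.mutations usually returns more than is needed. A narrower variant is sketched below: it lists unfinished mutations and reports those that carry an error. The function names `fetch_unfinished_mutations` and `report_failed_mutations` are hypothetical. The sketch assumes `clickhouse-client` is on the PATH and that the `is_done`, `parts_to_do` and `latest_fail_reason` columns are present in your ClickHouse version.

```shell
#!/bin/sh
# Hypothetical helper: list unfinished mutations, newest first, as TSV.
fetch_unfinished_mutations() {
    clickhouse-client --query="
        SELECT database, table, mutation_id, parts_to_do, latest_fail_reason
        FROM system.mutations
        WHERE is_done = 0
        ORDER BY create_time DESC
        FORMAT TSV"
}

# Hypothetical helper: read the TSV rows produced above on stdin and print
# one line per mutation that has a non-empty latest_fail_reason.
# Exits with status 1 if at least one such mutation was found.
report_failed_mutations() {
    awk -F '\t' '
        $5 != "" {
            printf "%s.%s %s: %s part(s) left, last error: %s\n", $1, $2, $3, $4, $5
            n++
        }
        END { exit (n > 0) }'
}

# Example pipeline (requires a running server, so it is left commented out):
# fetch_unfinished_mutations | report_failed_mutations
```

Because the report step only reads standard input, it can be checked against a saved TSV sample before it is wired into an alerting job.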

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

ChistaDATA Inc.
