1. Home
  2. Knowledge Base
  3. ClickHouse Performance
  4. Monitoring Replication Queue In ClickHouse

Monitoring Replication Queue In ClickHouse

system.replication_queue table contains information about tasks from replication queues stored in ClickHouse Keeper, or ZooKeeper, for tables in the ReplicatedMergeTree family.

You can run the following SQL statements to monitor the replication queue in ClickHouse.

Example 1;

SELECT
    database,
    table,
    type,
    max(last_exception),
    max(postpone_reason),
    min(create_time),
    max(last_attempt_time),
    max(last_postpone_time),
    max(num_postponed) AS max_postponed,
    max(num_tries) AS max_tries,
    min(num_tries) AS min_tries,
    countIf(last_exception != '') AS count_err,
    countIf(num_postponed > 0) AS count_postponed,
    countIf(is_currently_executing) AS count_executing,
    count() AS count_all
FROM system.replication_queue
GROUP BY
    database,
    table,
    type
ORDER BY count_all DESC

Example 2;

SELECT * FROM system.replication_queue LIMIT 1 FORMAT Vertical;
Row 1:
──────
database:               merge
table:                  visits_v2
replica_name:           mtgiga001-1t
position:               15
node_name:              queue-0009325559
type:                   MERGE_PARTS
create_time:            2020-12-07 14:04:21
required_quorum:        0
source_replica:         mtgiga001-1t
new_part_name:          20201130_121373_121384_2
parts_to_merge:         ['20201130_121373_121378_1','20201130_121379_121379_0','20201130_121380_121380_0','20201130_121381_121381_0','20201130_121382_121382_0','20201130_121383_121383_0','20201130_121384_121384_0']
is_detach:              0
is_currently_executing: 0
num_tries:              36
last_exception:         Code: 226, e.displayText() = DB::Exception: Marks file '/opt/clickhouse/data/merge/visits_v2/tmp_fetch_20201130_121373_121384_2/CounterID.mrk' does not exist (version 20.8.7.15 (official build))
last_attempt_time:      2020-12-08 17:35:54
num_postponed:          0
postpone_reason:
last_postpone_time:     1970-01-01 03:00:00

Columns:

  • database (String) — Name of the database.
  • table (String) — Name of the table.
  • replica_name (String) — Replica name in ClickHouse Keeper. Different replicas of the same table have different names.
  • position (UInt32) — Position of the task in the queue.
  • node_name (String) — Node name in ClickHouse Keeper.
  • type (String) — Type of the task in the queue, one of:
    • GET_PART — Get the part from another replica.
    • ATTACH_PART — Attach the part, possibly from our own replica (if found in the detached folder). You may think of it as a GET_PART with some optimizations as they’re nearly identical.
    • MERGE_PARTS — Merge the parts.
    • DROP_RANGE — Delete the parts in the specified partition in the specified number range.
    • CLEAR_COLUMN — NOTE: Deprecated. Drop specific column from specified partition.
    • CLEAR_INDEX — NOTE: Deprecated. Drop specific index from specified partition.
    • REPLACE_RANGE — Drop a certain range of parts and replace them with new ones.
    • MUTATE_PART — Apply one or several mutations to the part.
    • ALTER_METADATA — Apply alter modification according to global /metadata and /columns paths.
  • create_time (Datetime) — Date and time when the task was submitted for execution.
  • required_quorum (UInt32) — The number of replicas waiting for the task to complete with confirmation of completion. This column is only relevant for the GET_PARTS task.
  • source_replica (String) — Name of the source replica.
  • new_part_name (String) — Name of the new part.
  • parts_to_merge (Array (String)) — Names of parts to merge or update.
  • is_detach (UInt8) — The flag indicates whether the DETACH_PARTS task is in the queue.
  • is_currently_executing (UInt8) — The flag indicates whether a specific task is being performed right now.
  • num_tries (UInt32) — The number of failed attempts to complete the task.
  • last_exception (String) — Text message about the last error that occurred (if any).
  • last_attempt_time (Datetime) — Date and time when the task was last attempted.
  • num_postponed (UInt32) — The number of postponed tasks.
  • postpone_reason (String) — The reason why the task was postponed.
  • last_postpone_time (Datetime) — Date and time when the task was last postponed.
  • merge_type (String) — Type of the current merge. Empty if it’s a mutation.

 

Referance;

https://clickhouse.com/docs/en/operations/system-tables/replication_queue/

Was this article helpful?

Related Articles

CHISTADATA IS COMMITTED TO OPEN SOURCE SOFTWARE AND BUILDING HIGH PERFORMANCE COLUMNSTORES

In the spirit of freedom, independence and innovation. ChistaDATA Corporation is not affiliated with ClickHouse Corporation 

Need Support?

Can't find the answer you're looking for?
Contact Support

ChistaDATA Inc. Knowledge base is licensed under the Apache License, Version 2.0 (the “License”)

Copyright 2022 ChistaDATA Inc

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.