1. Home
  2. Knowledge Base
  3. ChistaDATA
  4. Convert existing MergeTree Table into Replication in ClickHouse
  1. Home
  2. Knowledge Base
  3. ClickHouse
  4. Convert existing MergeTree Table into Replication in ClickHouse
  1. Home
  2. Knowledge Base
  3. ClickHouse Performance
  4. Convert existing MergeTree Table into Replication in ClickHouse

Convert existing MergeTree Table into Replication in ClickHouse

Introduction

This knowledge-based article will guide you through converting an existing MergeTree table into a Replication in ClickHouse.

This conversion is beneficial for achieving high availability and load sharing by adding more resources to your database cluster.

Note: If you want to create a replication cluster from a single node or you want to add a new node into replication, you can follow this knowledge base article https://chistadata.com/?post_type=ht_kb&p=7152&preview=true

 

Table Overview

Before diving into the conversion process, let’s take a quick look at the structure of the original MergeTree table, named “orders,” which we will be replicating:

CREATE TABLE orders.orders 
(
    `topics` LowCardinality(String),
    `IdentifierC` LowCardinality(String),
    `TimestampNic` DateTime,
    `resetTimestamp` Nullable(Int64),
    `timestamp_event` Int64,
    `imessage` Int64,
    `sourceevent` Int32,
    `eventID` Int64,
    `StatusComplete` UInt8,
    `Event_type` LowCardinality(String),
    `Event_fields_keys` Array(LowCardinality(String)),
    `EFields_values` Array(String)
)
ENGINE = ReplacingMergeTree
PARTITION BY (IdentifierC, Event_type ,toDate(TimestampNic)
ORDER BY (IdentifierC, Event_type, TimestampNic, topics)
TTL TimestampNic + toIntervalDay(15) ;



┌─table─────────────────────────────────────┬─compressed_size─┬─uncompressed_size─┬─ratio─┐
│ orders                                    │ 10.65 GiB       │ 49.38 GiB         │  4.64 │
  • The table uses the ReplacingMergeTree engine, which efficiently handles data replacement and merging for time-series data.
  • It is partitioned by “IdentifierC” and “Event_type.”
  • The table is sorted by “IdentifierC,” “Event_type,” “TimestampNic,” and “topics” to improve query performance, particularly for range queries.
  • A Time-to-Live (TTL) setting is defined to remove data older than 15 days.

Conversion Steps 

Create a New Replicated Table:

First, create a new replicated table named “orders_replicated” using the ReplicatedReplacingMergeTree engine:

use orders;

CREATE TABLE orders.orders_replicated ON CLUSTER exp_prod_cluster
(
    `topics` LowCardinality(String),
    `IdentifierC` LowCardinality(String),
    `TimestampNic` DateTime,
    `resetTimestamp` Nullable(Int64),
    `timestamp_event` Int64,
    `imessage` Int64,
    `sourceevent` Int32,
    `eventID` Int64,
    `StatusComplete` UInt8,
    `Event_type` LowCardinality(String),
    `Event_fields_keys` Array(LowCardinality(String)),
    `EFields_values` Array(String)
)
ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{cluster}/orders/orders', '{replica}')
PARTITION BY (IdentifierC, Event_type ,toDate(TimestampNic))
ORDER BY (IdentifierC, Event_type, TimestampNic, topics)
TTL TimestampNic + toIntervalDay(15) 
SETTINGS index_granularity = 8192;

Prepare the Attach Partition Query:

Generate the SQL query to attach each partition from the original table to the newly created “orders_replicated” table. This step ensures that data is migrated properly.

use orders;


clickhouse01 :) SELECT DISTINCT 'ALTER TABLE orders_replicated ATTACH PARTITION ID \'' || partition_id || '\' FROM orders;' 
from system.parts 
WHERE table = 'orders' AND active
 
SELECT DISTINCT concat('ALTER TABLE orders_replicated ATTACH PARTITION ID \'', partition_id, '\' FROM orders;')
FROM system.parts
WHERE (table = 'orders') AND active
 
Query id: e5fcedsc-435g-56tf-7ujg-nnde237e2632
 
┌─concat('ALTER TABLE orders_replicated ATTACH PARTITION ID \'', partition_id, '\' FROM orders;')───┐
│ ALTER TABLE orders_replicated ATTACH PARTITION ID '3117c8f06hsgahgfds4567827t583940' FROM orders; │
│ ALTER TABLE orders_replicated ATTACH PARTITION ID 'fscgff5432gsbtvzda4132u308262aq1' FROM orders; │
│ ALTER TABLE orders_replicated ATTACH PARTITION ID '543sabfdacd07cxvsaxfd24567sfrsgb' FROM orders; │
│ ALTER TABLE orders_replicated ATTACH PARTITION ID '7231gsycdtwbhyyev8652ajudxgxfsbs' FROM orders; │
│ ALTER TABLE orders_replicated ATTACH PARTITION ID '28765rfsghn023wdcvghgfvssssxcfds' FROM orders; │
│ ALTER TABLE orders_replicated ATTACH PARTITION ID '1qsyxtgbsvtyxhju7yt562ygbshfsfsg' FROM orders; │
│ ALTER TABLE orders_replicated ATTACH PARTITION ID '654edfghfedcv1bqgsatrfgbssw23q11' FROM orders; │
│ ALTER TABLE orders_replicated ATTACH PARTITION ID '876543wwtyjhfdsxcvbnhtr231234444' FROM orders; │
│ ALTER TABLE orders_replicated ATTACH PARTITION ID '1qazsw23edcvfr45tgbhy7uj8ikm9yde' FROM orders; │
│ ALTER TABLE orders_replicated ATTACH PARTITION ID '32wsedcyegbvcdr567uyghwystrfgxhg' FROM orders; │
│ ALTER TABLE orders_replicated ATTACH PARTITION ID '2238765edftyhgfcvb1qaz2wdc5tghgf' FROM orders; │
│ ALTER TABLE orders_replicated ATTACH PARTITION ID '6543ewdfghgfed2wsdf4ed24rfvcxzss' FROM orders; │
│ ALTER TABLE orders_replicated ATTACH PARTITION ID '87654etyuhfrxfbhwysgdhdgddtdhbgg' FROM orders; │
│ ALTER TABLE orders_replicated ATTACH PARTITION ID '1qazxd34567trtyhgfdfghfrtyhbsvvv' FROM orders; │

---- output truncated -----

Execute the Partition Migration:

Execute the generated SQL query to move each partition to the replicated table:

clickhouse-client -u default --password='********' -d orders  -q 
"SELECT DISTINCT 'ALTER TABLE orders_replicated ATTACH PARTITION ID \'' || partition_id || '\' FROM orders;' 
from system.parts WHERE table = 'orders' format TabSeparatedRaw" |clickhouse-client -mn -u default --password='*********' -d orders

Verify Data Migration:

Check the data migration by comparing the row counts in the original and replicated tables.

Also, execute count on orders on other replicas and verify schema; data should replicate within a few minutes, depending on table size.

SELECT count(1) FROM orders_replicated
SELECT count(1) FROM orders

Rename Tables:

After verifying successful data migration, rename the tables to make the replicated table the primary one:

RENAME TABLE orders TO orders_old, orders_replicated TO orders ON CLUSTER exp_prod_cluster

 

Cleanup (Optional):

If everything works fine with the replicated table, you can keep the “orders_old” copy as a backup for 2-4 days. After ensuring the stability of the new setup, perform any necessary cleanup activities and drop the “orders_old” backup table.

 

Conclusion:

Following these steps, you can seamlessly convert an existing MergeTree table into a Replication in ClickHouse. This conversion enhances high availability and allows for efficient load sharing in your database cluster.

Was this article helpful?

Related Articles

CHISTADATA IS COMMITTED TO OPEN SOURCE SOFTWARE AND BUILDING HIGH PERFORMANCE COLUMNSTORES

In the spirit of freedom, independence and innovation. ChistaDATA Corporation is not affiliated with ClickHouse Corporation 

Need Support?

Can't find the answer you're looking for?
Contact Support

ChistaDATA Inc. Knowledge base is licensed under the Apache License, Version 2.0 (the “License”)

Copyright 2022 ChistaDATA Inc

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.