1. Home
  2. Knowledge Base
  3. ClickHouse Backup Strategies – Part II

ClickHouse Backup Strategies – Part II

In ClickHouse, replication does not protect the system from accidental deletes on tables or some records on the tables. For that reason, backup and recovery strategies are crucial for companies.

In this article, I will discuss the ClickHouse native“Backup/Restore” method for protecting database(s) and table(s) from accidental deletions.

You can find another method for a backup/restore strategy here.

Environment

OS : Ubuntu Linux 22.04

DB: ClickHouse v22.9.2.7

 

Preparations

First of all, we need to specify the backup/restore location where the ClickHouse server runs. To do that, we need to create a “backup_disk.xml” file under ClickHouse’s “config.d” folder. For the default installation, the full path is : “/etc/clickhouse-server/config.d/“.

As shown below, in our example, “/clickHouse/backups” is the backup location and named as “backups“. You can use or create any folder for backup and restore operations.

<clickhouse>
    <storage_configuration>
        <disks>
            <backups>
                <type>local</type>
        <path>/clickHouse/backups/</path>
            </backups>
        </disks>
    </storage_configuration>
    <backups>
        <allowed_disk>backups</allowed_disk>
    <allowed_path>/clickHouse/backups/</allowed_path>
    </backups>
</clickhouse>

 

We need to create a user and give permission for backup and restore operations. Therefore, we connect to the ClickHouse as the “default” user and create a backup user. But first, we have to check the “access_management” parameter in “/etc/clickhouse-server/users.xml“.

If this parameter is commented or the value for it is not “1”, we have to change the configuration as shown below and restart the ClickHouse DB.

<access_management>1</access_management>

 

Then, it is time to create the “backupAdmin” user and give it the related permissions as well.

clickhouse01 :) create user backupAdmin identified with SHA256_PASSWORD by '<your_password>';

clickhouse01 :) grant CREATE TABLE,CREATE DATABASE on *.* to backupAdmin;

clickhouse01 :) grant INSERT ON *.* to backupAdmin;

clickhouse01 :) grant backup on *.* to backupAdmin;

 

For using backup and restore tests, we create a database called “sampleDatasets” . After that, we create the “opensky” table under “sampleDatasets” and insert data as explained here. Use “default” or “admin” users for these operations.

Note: In the scope of these tests, we installed the 4 month data from the “opensky” dataset.

clickhouse01 :) create database sampleDatasets

clickhouse01 :) CREATE TABLE sampleDatasets.opensky
                (
                    callsign String,
                    number String,
                    icao24 String,
                    registration String,
                    typecode String,
                    origin String,
                    destination String,
                    firstseen DateTime,
                    lastseen DateTime,
                    day DateTime,
                    latitude_1 Float64,
                    longitude_1 Float64,
                    altitude_1 Float64,
                    latitude_2 Float64,
                    longitude_2 Float64,
                    altitude_2 Float64
                ) ENGINE = MergeTree ORDER BY (origin, destination, callsign);

 

Steps

Now, it is time to backup table with “backupAdmin” user.

clickhouse01 :) BACKUP TABLE sampleDatasets.opensky TO Disk('backups','opensky_20220929.bak');

BACKUP TABLE sampleDatasets.opensky TO Disk('backups', 'opensky_20220929.bak')

Query id: f3e01544-b48f-43c2-ba17-8077d187a986

┌─id───────────────────────────────────┬─status─────────┐
│ 9f9053bf-1a38-4f76-b260-12bacfcd5aca │ BACKUP_CREATED │
└──────────────────────────────────────┴────────────────┘

1 row in set. Elapsed: 4.323 sec. 

 

At the same time, while the backup is running, I deleted a record from the “opensky” table from another terminal with the “default” user and the backup did not affect or the table was not locked at that time.

clickhouse01 :) select count(*)  from sampleDatasets.opensky where registration='N661JM';

┌─count()─┐
│       1 │
└─────────┘

clickhouse01 :) set allow_experimental_lightweight_delete=1;


clickhouse01 :) delete from sampleDatasets.opensky where registration='N661JM';

Ok.

clickhouse01 :) select count(*)  from sampleDatasets.opensky where registration='N661JM';

┌─count()─┐
│       0 │
└─────────┘

 

You can check backup operations from “system.backups” system table with backup id.

clickhouse01 :) select id,status,start_time,end_time from system.backups where id='9f9053bf-1a38-4f76-b260-12bacfcd5aca';

SELECT
    id,
    status,
    start_time,
    end_time
FROM system.backups
WHERE id = '9f9053bf-1a38-4f76-b260-12bacfcd5aca'

Query id: d552f726-0ebc-445e-8d95-0ae0869d3940

┌─id───────────────────────────────────┬─status─────────┬──────────start_time─┬────────────end_time─┐
│ 9f9053bf-1a38-4f76-b260-12bacfcd5aca │ BACKUP_CREATED │ 2022-09-29 10:13:17 │ 2022-09-29 10:13:21 │
└──────────────────────────────────────┴────────────────┴─────────────────────┴─────────────────────┘

 

It is time to restore the table, so I expect to see the record with the ‘N661JM‘ registration ID,which was deleted while the backup was running, after the table is restored.

For that reason, I truncate or drop table(or both at time same time) with “default” user at first.

clickhouse01 :) truncate table sampleDatasets.opensky;

Ok.

clickhouse01 :) drop table sampleDatasets.opensky;

Ok.

 

Then, restore the table with “backupAdmin” user.

RESTORE TABLE sampleDatasets.opensky FROM Disk('backups', 'opensky_20220929.bak')

Query id: eb4e5033-a0ac-4931-aa68-083c096d22f4

┌─id───────────────────────────────────┬─status───┐
│ 56299e68-8fa0-4129-8dfe-a5511f2bdf47 │ RESTORED │
└──────────────────────────────────────┴──────────┘

 

Finally, I queried for the “opensky” table if it exists and has data.

clickhouse01 :) select count(*) from sampleDatasets.opensky;

┌──count()─┐
│ 11348850 │
└──────────┘

 

Afterward, I checked the record that I deleted during the backup process.

clickhouse01 :) select count(*) from sampleDatasets.opensky where registration='N661JM';

┌─count()─┐
│       1 │
└─────────┘

 

I have also tested “Truncate” or “Drop Table” while the backup is running. In all of the scenarios, I was able to restore the tables.

Backup and restore utilities are not only used for tables but also for databases and even materialized views and dictionaries as well. You can backup tables or databases full or incrementally, compress backups, and secure them with a password.

 

 

 

 

Was this article helpful?

CHISTADATA IS COMMITTED TO OPEN SOURCE SOFTWARE AND BUILDING HIGH PERFORMANCE COLUMNSTORES

In the spirit of freedom, independence and innovation. ChistaDATA Corporation is not affiliated with ClickHouse Corporation 

Need Support?

Can't find the answer you're looking for?
Contact Support

ChistaDATA Inc. Knowledge base is licensed under the Apache License, Version 2.0 (the “License”)

Copyright 2022 ChistaDATA Inc

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.