ClickHouse Parts and Partitions: Part 1

Introduction

In ClickHouse, a physical file on a disk that stores a portion of the table’s data is called a “part”. A “partition,” on the other hand, is a logical division of a table’s data that is created using a partition key. Partitioning can be used to improve the performance and manageability of large tables by allowing users to store, query, and manipulate a subset of the data more efficiently. In the first instalment of this blog series, we will discuss parts and partitions in ClickHouse, as well as their usage and differences.

Parts

As we mentioned, “parts” are physical files on the disk. By default, all data-related files are under the “/var/lib/clickhouse” directory. Every merge tree table in ClickHouse has a unique directory path for storing parts. You can access the actual location of parts, part names, partition information(if any), and some other valuable information from “system.parts” table. An example of the query result from system.parts table is shown below. Here in “part_type,”  “Wide” means that each column is stored in a separate file in a filesystem. On the other hand, “Compact” means all columns are stored in one file in a file-system. Also in “partition” column “tuple()” stands for this table is not partitioned.

SELECT
    substr(table, 1, 22),
    partition AS prt,
    name,
    part_type,
    path
FROM system.parts
WHERE database = 'sampleDatasets'
ORDER BY
    table ASC,
    partition ASC,
    name ASC

Query id: 2a9462a8-7e74-4c0f-a89a-0bd2a31d0a46

┌─substring(table, 1, 22)─┬─prt─────┬─name──────────┬─part_type─┬─path──────────────────────────────────────────────────────────────────────────────┐
│ .inner_id.ca612a5e-5ee  │ tuple() │ all_1_1_1     │ Compact   │ /var/lib/clickhouse/store/4c6/4c6dfe9a-b697-4f9a-928f-f829bf44fb5c/all_1_1_1/     │
│ opensky                 │ tuple() │ all_1_1_1_5   │ Wide      │ /var/lib/clickhouse/store/1af/1afc664b-0a25-443a-a317-532532434753/all_1_1_1_5/   │
│ opensky                 │ tuple() │ all_2_2_1_5   │ Wide      │ /var/lib/clickhouse/store/1af/1afc664b-0a25-443a-a317-532532434753/all_2_2_1_5/   │
│ opensky                 │ tuple() │ all_3_3_0_5   │ Wide      │ /var/lib/clickhouse/store/1af/1afc664b-0a25-443a-a317-532532434753/all_3_3_0_5/   │
│ opensky                 │ tuple() │ all_4_4_0_5   │ Compact   │ /var/lib/clickhouse/store/1af/1afc664b-0a25-443a-a317-532532434753/all_4_4_0_5/   │
│ openskyDeletedRecord    │ tuple() │ all_2_2_2     │ Compact   │ /var/lib/clickhouse/store/5b0/5b0e6758-ae46-4c25-9599-f2cddc362b5e/all_2_2_2/     │
│ opensky_1000000         │ tuple() │ all_1_1_0     │ Wide      │ /var/lib/clickhouse/store/a2a/a2aa62d7-65bb-4bb2-bc5b-2c3d1befa147/all_1_1_0/     │
│ opensky_freeze_restore  │ tuple() │ all_1_1_1     │ Wide      │ /var/lib/clickhouse/store/2d0/2d0e6ad7-6fd4-408e-8c18-b39416ea8ff1/all_1_1_1/     │
│ opensky_freeze_restore  │ tuple() │ all_2_2_1     │ Wide      │ /var/lib/clickhouse/store/2d0/2d0e6ad7-6fd4-408e-8c18-b39416ea8ff1/all_2_2_1/     │
│ opensky_freeze_restore  │ tuple() │ all_3_3_0     │ Wide      │ /var/lib/clickhouse/store/2d0/2d0e6ad7-6fd4-408e-8c18-b39416ea8ff1/all_3_3_0/     │
│ opensky_freeze_restore  │ tuple() │ all_4_4_0     │ Compact   │ /var/lib/clickhouse/store/2d0/2d0e6ad7-6fd4-408e-8c18-b39416ea8ff1/all_4_4_0/     │
│ opensky_redo_test       │ tuple() │ all_1_2_1     │ Compact   │ /var/lib/clickhouse/store/e99/e99f1ab9-6d21-4d01-8f02-9f2aa0e6b45a/all_1_2_1/     │
│ opensky_redo_test2      │ tuple() │ all_103_104_1 │ Compact   │ /var/lib/clickhouse/store/fac/facadfc3-1c5b-4775-bea7-53c262f7e237/all_103_104_1/ │
│ test_table              │ tuple() │ all_1_2_1     │ Compact   │ /var/lib/clickhouse/store/024/024a3a77-4ca9-4761-a2f1-4b36eb4ece0e/all_1_2_1/     │
└─────────────────────────┴─────────┴───────────────┴───────────┴───────────────────────────────────────────────────────────────────────────────────┘

You can also access table’s “parts” under “/var/lib/clickhouse/data/<DBNAME>/<TABLENAME>” folder as a symbolic link.

root@clickhouse01:/var/lib/clickhouse/data/sampleDatasets/opensky# ls -al
total 36
drwxr-x--- 7 clickhouse clickhouse 4096 Nov 21 10:00 .
drwxr-x--- 3 clickhouse clickhouse 4096 Oct 16 13:13 ..
drwxr-x--- 2 clickhouse clickhouse 4096 Nov 21 09:49 all_1_1_1_5
drwxr-x--- 2 clickhouse clickhouse 4096 Nov 21 09:49 all_2_2_1_5
drwxr-x--- 2 clickhouse clickhouse 4096 Nov 21 09:49 all_3_3_0_5
drwxr-x--- 2 clickhouse clickhouse 4096 Nov 21 09:49 all_4_4_0_5
drwxr-x--- 2 clickhouse clickhouse 4096 Nov  9 11:33 detached
-rw-r----- 1 clickhouse clickhouse    1 Oct 16 13:13 format_version.txt
-rw-r----- 1 clickhouse clickhouse  100 Nov 21 09:49 mutation_5.txt

As you can see examples above, “opensky” table under “sampleDatasets” has 4 four parts. Every part has its own directory, and their names start with “all_”. If we talk about “all_3_3_0_5” part,

  • 3 is the minimum number of the data block.
  • 3 is the maximum number of the data block.
  • 0 is the chunk level (the depth of the merge tree it is formed from).
  • 5 is the mutation version (if a part is mutated).

As another option, this information can be reached from the system.parts table as well.

SELECT
    name,
    partition_id,
    min_block_number,
    max_block_number,
    level,
    data_version
FROM system.parts
WHERE (database = 'sampleDatasets') AND (table = 'opensky') AND (name = 'all_3_3_0_5')

Query id: 0f2bb404-11a0-4df3-b1c1-7941698b9560

┌─name────────┬─partition_id─┬─min_block_number─┬─max_block_number─┬─level─┬─data_version─┐
│ all_3_3_0_5 │ all          │                3 │                3 │     0 │            5 │
└─────────────┴──────────────┴──────────────────┴──────────────────┴───────┴──────────────┘

Partitions

Like parts, you can also access partition information for merge tree tables from the system.parts” table. However, partition column is now represented as something other than a “tuple()“. For creating a partitioned table, first of all, you need to use the “PARTITION BY expr” clause when creating a table. For example, the “PARTITION BY toYYYYMMDD(start_time)” clause creates a daily partition with respect to the “start_time” column. In the following example, as you can see, the partition and part names are different. Also, part names do not start with “all,”  but instead start with the partition indicator. Partitioning is something like dividing logically, but parts are located as physical files. One partition can contain one or more parts.

SELECT
    partition,
    name,
    active
FROM system.parts
WHERE (table = 'backups') AND (database = 'RECMAN')


┌─partition─┬─name───────────┬─active─┐
│ 20221017  │ 20221017_1_1_0 │      1 │
│ 20221018  │ 20221018_2_2_0 │      1 │
│ 20221114  │ 20221114_3_3_0 │      1 │
└───────────┴────────────────┴────────┘

Generally, partitioning is used to improve query performance. It gives us the flexibility to manage a subset of data. You can directly query partition(s), drop them, detach them, and so on. We will discuss the manipulation of parts and partitions in the following blog articles.

You can query a specific partition by giving conditions in “where” clause or using hidden “_partition_id” column. Of course, It is better to choose an official way and give partition conditions in the where clause. But in some cases, we need to use “_partition_id”.  Now, let’s have a look at the query example of a partitioned table. Let’s assume our table (recoDB.opensky_partitioned) is partitioned for “lastseen” column. We can access specific partitions both using the partition key column and hidden “_partition_id” column. Also we can query top 10 partition with “_partition_id”.

SELECT count()
FROM recoDB.opensky_partitioned
WHERE toDate(lastseen) = '2019-02-25'

┌─count()─┐
│   69480 │
└─────────┘
#####################################

SELECT count()
FROM recoDB.opensky_partitioned
WHERE _partition_id = '20190225'

┌─count()─┐
│   69480 │
└─────────┘
SELECT
    _partition_id,
    count()
FROM recoDB.opensky_partitioned
GROUP BY _partition_id
ORDER BY 2 DESC
LIMIT 10
┌─_partition_id─┬─count()─┐
│ 20190524      │   90358 │
│ 20190531      │   87917 │
│ 20190523      │   87023 │
│ 20190530      │   86945 │
│ 20190425      │   85524 │
│ 20190516      │   85348 │
│ 20190515      │   85287 │
│ 20190522      │   85056 │
│ 20190517      │   84986 │
│ 20190510      │   84585 │
└───────────────┴─────────┘

Conclusion

Parts and partitions are the main components of the ClickHouse database. In general, parts are used for storing portions of the table’s data and are physical files on the disk. On the other hand, partitions are logical structures that are frequently used for performance and data management in tables. In the first part of this series, we talked about parts and partitions in ClickHouse. In the following parts of this series, we will talk about manipulations, merging, and mutations.

To learn more about ClickHouse Partitions, do consider giving the below articles a read

References

About Emrah Idman 11 Articles
Emrah Idman has considerable experience in relational and NoSQL databases. He has worked in a large-scale financial company for over 15 years. He has significant experience in team management, procurement and capacity planning, database administration and product testing for high-volume systems. He is working at ChistaDATA Inc. as senior database administrator.
Contact: Website