ClickHouse April 2024 Release – v24.4

Image by walpepers.com

Introduction

Every new release includes new features, enhancements, and numerous bug fixes, and the ChistaDATA team always stays on top of the latest releases. On April, 2024, ClickHouse version 24.4 was released, and this version contains the following;

  • 13 new features,
  • 16 performance optimizations,
  • 64 bug fixes.

For further details, please see the official ClickHouse docs here.

This article will look at the critical features of the ClickHouse 24.4 release.  

DROP Multiple Tables

Supports dropping multiple tables at the same time like DROP TABLE a, b, c;

DROP TABLE chista, data, consultant;

Unloading Primary Key

Utilize SYSTEM UNLOAD PRIMARY KEY to optimize memory allocation for the primary key of a table.

SYSTEM UNLOAD PRIMARY KEY db.table

Enabling SYSTEM UNLOAD PRIMARY KEY allows for freeing up memory allocated to the primary key of a table, especially beneficial for tables with infrequent usage. The primary key data will be loaded on demand during subsequent queries to the table.

New Virtual Columns in MergeTree

Introduced a new persistent virtual column called ‘_block_offset,’ which retains the original row number assigned during insertion. Enabling persistence for ‘_block_offset’ can be done through the MergeTree setting ‘enable_block_offset_column.’ Additionally, a new virtual column named ‘_part_data_version’ has been added, containing either the minimum block number or the mutation version of the partition. It’s worth noting that the persistent virtual column ‘_block_number’ is no longer in its experimental phase.

CREATE TABLE ... ENGINE = MergeTree
   SETTINGS enable_block_number_column = 1,
            enable_block_offset_column = 1;

Transform OUTER to INNER JOIN

When the filter applied after a JOIN consistently filters out default values, the OUTER JOIN is automatically optimized to an INNER JOIN. This optimization is governed by the setting ‘query_plan_convert_outer_join_to_inner_join,’ which is enabled by default.

SELECT *
FROM l LEFT JOIN r ON l.id = r.id
WHERE r.value != '' -- this allows the optimization
SELECT *
FROM l INNER JOIN r ON l.id = r.id
WHERE r.value != ''

Conclusion

In summary, these updates, implemented by the ClickHouse database, represent a substantial stride forward in optimizing performance, scalability, and resource efficiency. By focusing on improving parallelism, cache locality, and reducing memory usage, ClickHouse has demonstrated a commitment to enhancing the user experience and meeting the evolving demands of modern data management. The introduction of new modes for distributed DDL output handling further underscores ClickHouse’s dedication to providing flexibility and control to its users. These updates collectively reinforce ClickHouse’s position as a leading solution for high-performance analytical workloads.

These are the ClickHouse 24.4 features. To find out more details, please visit the official ClickHouse Docs.

About Can Sayn 41 Articles
Can Sayın is experienced Database Administrator in open source relational and NoSql databases, working in complicated infrastructures. Over 5 years industry experience, he gain managing database systems. He is working at ChistaDATA Inc. His areas of interest are generally on open source systems.
Contact: Website