ClickHouse July 2022 Release – v22.7

Introduction

ClickHouse version 22.7 (July 2022) was unveiled on 21st July 2022. This release has around 25 new features, 19 performance improvement changes,  40+ other improvements and 50+ bug fixes. Here is the official list of changes.

Source code – GitHub Link

Installation Instructions – Official Documentation

Release Webinar – Youtube Link

Let us look at the key features and improvements in this release.

Key features & improvements

1. Added support for expressions with window functions

Window functions use values from a window of rows (one or multiple rows in a table), compute the result over that window and return the value for each and every row. Let us use the cell towers dataset for this example on the ClickHouse window functions.

SELECT radio, lon, lat,
avg(area) OVER (PARTITION BY lon,lat) AS average_area, 
(average_area>1000 ) AS filtered_avg_area
FROM datasets.cell_towers
LIMIT 10;

In this example, we are selecting the radio, lat(latitude) and long(longitude) columns from the cell_towers table. We are partitioning the data by latitude and longitude columns ad we are finding the average area for each unique lat long value (and limiting the results to 10 rows).

Query id: 7b986063-87cb-4c1e-a055-e786060a764a

┌─radio─┬──────────────lon─┬──────────────lat─┬─average_area─┬─filtered_avg_area─┐
│ GSM   │  -179.9732208252 │ -16.779556274414 │            6 │                 0 │
│ GSM   │ -179.92515563965 │  -16.71501159668 │            3 │                 0 │
│ UMTS  │ -179.92515563965 │  -16.71501159668 │            3 │                 0 │
│ LTE   │       -179.88945 │       -16.696472 │            6 │                 0 │
│ GSM   │ -179.88121032715 │ -16.687545776367 │            1 │                 0 │
│ UMTS  │ -179.87983703613 │ -16.688919067383 │            5 │                 0 │
│ GSM   │ -179.71366882324 │ -16.750717163086 │            6 │                 0 │
│ GSM   │ -179.71366882324 │ -16.750717163086 │            6 │                 0 │
│ GSM   │ -179.14512634277 │  66.403427124023 │         8705 │                 1 │
│ GSM   │  -179.1300201416 │   66.40754699707 │         8705 │                 1 │
└───────┴──────────────────┴──────────────────┴──────────────┴───────────────────┘

10 rows in set. Elapsed: 0.344 sec. Processed 43.28 million rows, 822.25 MB (125.65 million rows/s., 2.39 GB/s.)

The 22.7 version allows the expressions for window functions. The above example can be re-written as follows.

WITH avg(area) OVER (PARTITION BY lon,lat) AS average_area
SELECT radio, lon, lat, average_area, (average_area>1000 ) AS filtered_avg_area
FROM datasets.cell_towers
LIMIT 10;
Query id: ca476dfd-5be0-4743-b8c6-c7390cdf549a

┌─radio─┬──────────────lon─┬──────────────lat─┬─average_area─┬─filtered_avg_area─┐
│ GSM   │  -179.9732208252 │ -16.779556274414 │            6 │                 0 │
│ GSM   │ -179.92515563965 │  -16.71501159668 │            3 │                 0 │
│ UMTS  │ -179.92515563965 │  -16.71501159668 │            3 │                 0 │
│ LTE   │       -179.88945 │       -16.696472 │            6 │                 0 │
│ GSM   │ -179.88121032715 │ -16.687545776367 │            1 │                 0 │
│ UMTS  │ -179.87983703613 │ -16.688919067383 │            5 │                 0 │
│ GSM   │ -179.71366882324 │ -16.750717163086 │            6 │                 0 │
│ GSM   │ -179.71366882324 │ -16.750717163086 │            6 │                 0 │
│ GSM   │ -179.14512634277 │  66.403427124023 │         8705 │                 1 │
│ GSM   │  -179.1300201416 │   66.40754699707 │         8705 │                 1 │
└───────┴──────────────────┴──────────────────┴──────────────┴───────────────────┘

10 rows in set. Elapsed: 0.353 sec. Processed 43.28 million rows, 822.25 MB (122.73 million rows/s., 2.33 GB/s.)

2. New output format – SQLInsert

SQLInsert output format generates an equivalent INSERT statement from the result set.

SELECT lon, lat, area FROM datasets.cell_towers ct LIMIT 10 FORMAT SQLInsert;
Query id: 72d5daf9-c32f-486c-ab01-44789a22b0fe

INSERT INTO table (`lon`, `lat`, `area`) VALUES (85.912436, 55.333682, 45148), (85.921616, 55.33408, 45148),
(85.916815, 55.333769, 45148), (85.899208, 55.332283, 45148), (65.613377, 57.101187, 37205), 
(65.568435, 57.174902, 37204), (65.565277, 57.177299, 37204), (65.60453, 57.101715, 37205),
(-83.027748, 42.299232, 10), (-83.027771, 42.299316, 10)                                                                      ;

10 rows in set. Elapsed: 0.005 sec. 

 

3. MongoDB table function

ClickHouse has introduced a new table function which can be used to execute SQL queries (Clickhouse dialect) directly on a MongoDB server. The below example is directly based on the example from the release webinar.

SELECT COUNT(SearchPhrase)
FROM mongodb('localhost:27017', database='test', collection='hits', user='root', password='', structure='string to search')

4. MeiliSearch integration

MeiliSearch is an open-source search engine which uses LMDB as its storage engine. We can integrate ClickHouse with MeiliSearch with help of the new feature. We can create a new ClickHouse table on top of MeiliSearch, insert data to be indexed and run a SELECT query on MeiliSearch from the ClickHouse server.

5. NATS Engine

NATS is an open-source messaging system written in Golang. Clickhouse has added a new table engine supporting NATS. It is now possible to publish or subscribe messages to the NATS system from ClickHouse.

Other notable features/enhancements

  1. New join algorithms (direct algorithm – for key-value tables, parallel hash algorithm and full sorting merge)
  2. Performance improvement in ORDER BY
  3. Performance improvements in MergeTree merge operation
  4. Speedup in queries with FINAL modifier (CollapsingMergeTree and ReplacingMergeTree)
  5. Improvements in GO drivers – Added HTTP support and the driver is based on high-performant ch-go implementation
  6. Official support for Apache Superset
  7. clickhouse-diagnostics tool
  8. ClickBench – Benchmark for ClickHouse (Click here for more details on ClickBench)

Conclusion

ClickHouse’s July 2022 release (version 22.7) introduces significant enhancements, including support for window function expressions, a new SQLInsert output format, and integrations with MeiliSearch and NATS Engine. With over 25 new features and performance improvements, ClickHouse remains a top choice for analytical database management.

To know more about ClickHouse releases, do visit the following articles:

References: