Aggregate Functions via ClickHouse


Functions that arrive at the resultant value by calculating on a set of values ​​are called aggregation functions. In ClickHouse, aggregate functions work in the normal way as expected by database expert.

In this article, we will give examples with Aggregate Functions, Parametric Aggregate Functions and Aggregate Function Combinators on ClickHouse.

ClickHouse Aggregate Functions

( To more information please visit ClickHouse docs here.)

Standard aggregate functions: ClickHouse-specific aggregate functions:
count anyHeavy groupBitAnd kurtSamp quantileTiming
min anyLast groupBitOr kurtPop quantileTimingWeighted
max argMin groupBitXor uniq quantileDeterministic
sum argMax groupBitmap uniqExact quantileTDigest
avg avgWeighted groupBitmapAnd uniqCombined quantileTDigestWeighted
any topK groupBitmapOr uniqCombined64 quantileBFloat16
stddevPop topKWeighted groupBitmapXor uniqHLL12 quantileBFloat16Weighted
stddevSamp groupArray sumWithOverflow quantile simpleLinearRegression
varPop groupUniqArray sumMap quantiles stochasticLinearRegression
varSamp groupArrayInsertAt minMap quantileExact stochasticLogisticRegression
covarPop groupArrayMovingAvg maxMap quantileExactLow categoricalInformationValue
covarSamp groupArrayMovingSum skewSamp quantileExactHigh  
    skewPop quantileExactWeighted  

Lets explain some of the aggregate functions then give examples with ClickHouse.


Count function, Counts the number of rows or not-NULL values.

ClickHouse supports the following syntaxes for count:

  • count(expr) or COUNT(DISTINCT expr).
  • count() or COUNT(*). The count() syntax is ClickHouse-specific.

Example 1

SELECT count() FROM table


│      47 │

Example 2

SELECT count(DISTINCT number) FROM table


│              8 │

This example shows that count(DISTINCT num) is performed by the uniqExact function according to the count_distinct_implementation setting value.


Aggregate function that calculates the minimum across a group of values.


SELECT min(salary) FROM employees;
SELECT department, min(salary) FROM employees GROUP BY department;


Aggregate function that calculates the maximum across a group of values.

SELECT max(salary) FROM employees;
SELECT department, max(salary) FROM employees GROUP BY department;


Calculates the sum. Only works for numbers.

SELECT sum(salary) FROM employees;


Calculates the arithmetic mean.

SELECT avg(salary) FROM employees;

Parametric Aggregate Functions

Some aggregate functions can accept not only argument columns (used for compression), but a set of parameters – constants for initialization. The syntax is two pairs of brackets instead of one. The first is for parameters, and the second is for arguments.

Parametric Aggregate Functions listed below:

histogram retention
sequenceMatch uniqUpTo
sequenceCount sumMapFiltered

Calculates an adaptive histogram. It does not guarantee precise results.


SELECT histogram(5)(number + 1)
    SELECT *
    FROM system.numbers
    LIMIT 20


┌─histogram(5)(plus(number, 1))───────────────────────────────────────────┐
│ [(1,4.5,4),(4.5,8.5,4),(8.5,12.75,4.125),(12.75,17,4.625),(17,20,3.25)] │

You can visualize a histogram with the bar function, for example:

WITH histogram(5)(rand() % 100) AS hist
    arrayJoin(hist).3 AS height,
    bar(height, 0, 6, 5) AS bar
    SELECT *
    FROM system.numbers
    LIMIT 20


│  2.125 │ █▋    │
│   3.25 │ ██▌   │
│  5.625 │ ████▏ │
│  5.625 │ ████▏ │
│  3.375 │ ██▌   │


Checks whether the sequence contains an event chain that matches the pattern.

sequenceMatch(pattern)(timestamp, cond1, cond2, ...)


SELECT sequenceMatch('(?1)(?2)')(time, number = 1, number = 2) FROM test


┌─sequenceMatch('(?1)(?2)')(time, equals(number, 1), equals(number, 2), equals(number, 3))─┐
│                                                                                        0 │


Counts the number of event chains that matched the pattern. The function searches event chains that do not overlap. It starts to search for the next chain after the current chain is matched.


Consider data in the test table:

│    1 │      1 │
│    2 │      3 │
│    3 │      2 │
│    4 │      1 │
│    5 │      3 │
│    6 │      2 │

Count how many times the number 2 occurs after the number 1 with any amount of other numbers between them:

SELECT sequenceCount('(?1).*(?2)')(time, number = 1, number = 2) FROM t


┌─sequenceCount('(?1).*(?2)')(time, equals(number, 1), equals(number, 2))─┐
│                                                                       2 │

Aggregate Function Combinators

The name of an aggregate function can have a suffix appended to it. This changes the way the aggregate function works.

Aggregate function combinators listed below;

If State
Array Merge
Map MergeState
SimpleState ForEach
Distinct OrDefault
OrNull Resample


The suffix -If can be appended to the name of any aggregate function. In this case, the aggregate function accepts an extra argument – a condition (Uint8 type). The aggregate function processes only the rows that trigger the condition. If the condition was not triggered even once, it returns a default value (usually zeros or empty strings).

Examples: sumIf(column, cond)countIf(cond)avgIf(x, cond)quantilesTimingIf(level1, level2)(x, cond)argMinIf(arg, val, cond) and so on.


    sumIf(value, value % 2 == 1) AS odd_sum,
FROM table;


Sum or arithmetic average, but only for the rows that satisfy the condition passed by the second argument.

Therefore, SUM_IF(value, condition) is a slightly shorter notation for SUM(IF(condition, value)), same for AVG. The argument’s data type expansion is similar to the same-name functions without a suffix.

    avgIf(value, value % 2 == 1) AS odd_avg,
FROM table;


In this article, we gave examples with Aggregate Functions, Parametric Aggregate Functions and Aggregate Function Combinators on ClickHouse. ClickHouse, is a database that has analyzed the deficiencies in the queries written to reach the data in the industry and found solutions to them. We have covered some of the examples of these solutions in this article. We can access the data in the format we want by adding some functions to the same query that we use to access the data and access this data.

To read more functions in ClickHouse, do consider reading the following articles

About Can Sayn 41 Articles
Can Sayın is experienced Database Administrator in open source relational and NoSql databases, working in complicated infrastructures. Over 5 years industry experience, he gain managing database systems. He is working at ChistaDATA Inc. His areas of interest are generally on open source systems.
Contact: Website