ClickHouse’s Solutions: Alternatives to Table-Valued Parameters for Dynamic Data Handling

 

Introduction

Have you ever wondered how SQL databases handle dynamic data manipulation and filtering? Table-Valued Parameters are the remarkable feature that empowers us to do just that. Allowing entire tables or arrays as parameters in functions or queries offers unparalleled flexibility and reusability within SQL code.

PostgreSQL’s Table-Valued Parameters – look at table-valued parameters with a practical example demonstrated in PostgreSQL.  Shiv Iyer has created an insightful blog that sheds light on how table-valued parameters work and their advantages to PostgreSQL databases.

ClickHouse’s Unique Approach: Beyond Table-Valued Parameters While PostgreSQL boasts native support for table-valued parameters, ClickHouse takes a distinctive path. Presently, ClickHouse does not natively support this feature. However, don’t let that deter you! ClickHouse is a powerhouse analytical database system with a wealth of capabilities that deliver unparalleled performance for data analysis.

Embracing Versatility Though ClickHouse doesn’t directly support user-defined scalar functions with custom parameters in SQL syntax, it presents an exciting alternative – user-defined aggregate functions (UDAFs). These functions open new doors for handling complex calculations and data transformations, adding immense versatility to your queries.

Innovation and Adaptation: ClickHouse’s Diverse Solutions While table-valued parameters may not be native to ClickHouse, the journey doesn’t end there. ClickHouse provides a myriad of ingenious methods to achieve similar goals. From leveraging arrays and the IN operator to efficiently filter data based on multiple values, ClickHouse’s flexibility empowers you to master dynamic data handling.

Ready to Explore the Possibilities? Join us to explore SQL’s table-valued parameters and ClickHouse’s diverse capabilities.

Temporary tables

In ClickHouse, you can use temporary tables to achieve similar functionality as table-valued parameters. Temporary tables allow you to store data temporarily during a session, and you can perform various operations on them just like regular tables.

Here’s an example using temporary tables to filter data based on multiple IDs:

  1. Create a temporary table to store the IDs:
clickhouse1 :) CREATE TEMPORARY TABLE tmp_ids ( id Int32 ) ENGINE = Memory;


CREATE TEMPORARY TABLE tmp_ids
(
    `id` Int32
)
ENGINE = Memory

Query id: ef120020-6fa2-48ee-af0a-7ce107fb45b8

Ok.

0 rows in set. Elapsed: 0.001 sec.
  1. Insert data (IDs) into the temporary table:
clickhouse1 :) INSERT INTO tmp_ids VALUES (1), (3), (5);


INSERT INTO tmp_ids FORMAT Values

Query id: 3e9c072e-d8bb-46d3-bf85-0bb4020a411f

Ok.

3 rows in set. Elapsed: 0.002 sec.
  1. Use the temporary table to filter data from the main table (my_table):
clickhouse1 :) SELECT * FROM my_table WHERE id IN (SELECT id FROM tmp_ids);


SELECT *
FROM my_table
WHERE id IN (
    SELECT id
    FROM tmp_ids
)

Query id: 532f4472-144a-4baa-928b-27a6a55289e6

┌─id─┬─name──┐
│  1 │ John  │
│  1 │ John  │
│  3 │ Alice │
│  3 │ Alice │
│  5 │ Eve   │
│  5 │ Eve   │
└────┴───────┘

6 rows in set. Elapsed: 0.005 sec.

In this example, we create a temporary table tmp_ids with a single column id of type Int32. We then insert the IDs 1, 3, and 5 into the temporary table. Finally, we use the temporary table to filter rows from the main table my_table based on the provided IDs.

Remember that temporary tables are session-specific and will be automatically dropped when the session ends. This allows you to temporarily store and manipulate them within a specific session.

Parameterized views

Parameterized views provide a powerful way to create reusable view definitions with dynamic filtering capabilities. Let’s take an example to illustrate how a parameterized view works:

Suppose we have a table called “Employee” with the following structure:

CREATE TABLE Employee (
    employee_id Int32,
    employee_name String,
    department_id Int32,
    salary Decimal(10, 2)
) ENGINE = MergeTree
ORDER BY employee_id;


INSERT INTO Employee (employee_id, employee_name, department_id, salary)
VALUES
    (1, 'John', 1, 50000.00),
    (2, 'Jane', 2, 60000.00),
    (3, 'Alice', 1, 55000.00),
    (4, 'Bob', 3, 70000.00),
    (5, 'Eve', 3, 75000.00);


CREATE TABLE Department (
    department_id Int32,
    department_name String,
    region_id Int32
) ENGINE = MergeTree
ORDER BY department_id;


INSERT INTO Department (department_id, department_name, region_id)
VALUES
    (1, 'IT', 1),
    (2, 'HR', 2),
    (3, 'Sales', 3);


CREATE TABLE Region (
    region_id Int32,
    region_name String
) ENGINE = MergeTree
ORDER BY region_id;

INSERT INTO Region (region_id, region_name)
VALUES
    (1, 'North America'),
    (2, 'Europe'),
    (3, 'Asia');

Now, we want to create a parameterized view that allows us to filter employees based on their department and salary range.

clickhouse1 :) CREATE VIEW param_view AS SELECT *
                                             FROM Employee
                                             WHERE department_id IN (
                                                 SELECT department_id
                                                 FROM Department
                                                 WHERE department_name = {deptname:String} )

CREATE VIEW param_view AS
SELECT *
FROM Employee
WHERE department_id IN (
    SELECT department_id
    FROM Department
    WHERE department_name = {deptname:String}
)

Query id: aed3f78c-52f9-46e6-8ae9-4f3759db4197

Ok.

0 rows in set. Elapsed: 0.007 sec.

Now, we can use this parameterized view as a table function by substituting the parameter values:

clickhouse1 :) CREATE VIEW param_view AS SELECT *
                                             FROM Employee
                                             WHERE department_id IN (
                                                 SELECT department_id
                                                 FROM Department
                                                 WHERE department_name = {deptname:String} )

CREATE VIEW param_view AS
SELECT *
FROM Employee
WHERE department_id IN (
    SELECT department_id
    FROM Department
    WHERE department_name = {deptname:String}
)

Query id: aed3f78c-52f9-46e6-8ae9-4f3759db4197

Ok.

0 rows in set. Elapsed: 0.007 sec.


clickhouse1 :) SELECT * FROM param_view(deptname='HR')

SELECT *
FROM param_view(deptname = 'HR')

Query id: e2105699-90c8-491d-accc-1df24dfb1fc4

┌─employee_id─┬─employee_name─┬─department_id─┬─salary─┐
│           2 │ Jane          │             2 │  60000 │
│           2 │ Jane          │             2 │  60000 │
└─────────────┴───────────────┴───────────────┴────────┘

2 rows in set. Elapsed: 0.007 sec.

clickhouse1 :)
clickhouse1 :)
clickhouse1 :) SELECT * FROM param_view(deptname='Sales')

SELECT *
FROM param_view(deptname = 'Sales')

Query id: 9e5ae16b-de57-40a5-aa9e-98c3353bca20

┌─employee_id─┬─employee_name─┬─department_id─┬─salary─┐
│           4 │ Bob           │             3 │  70000 │
│           4 │ Bob           │             3 │  70000 │
│           5 │ Eve           │             3 │  75000 │
│           5 │ Eve           │             3 │  75000 │
└─────────────┴───────────────┴───────────────┴────────┘

4 rows in set. Elapsed: 0.005 sec.

clickhouse1 :)

clickhouse1 :) SELECT * FROM param_view(deptname='IT')

SELECT *
FROM param_view(deptname = 'IT')

Query id: 3fef82ae-7e80-45ae-898f-eb31f0ba6305

┌─employee_id─┬─employee_name─┬─department_id─┬─salary─┐
│           1 │ John          │             1 │  50000 │
│           1 │ John          │             1 │  50000 │
│           3 │ Alice         │             1 │  55000 │
│           3 │ Alice         │             1 │  55000 │
└─────────────┴───────────────┴───────────────┴────────┘

4 rows in set. Elapsed: 0.005 sec.

Custom settings

You can also make use of Custom Settings in Views. Set custom settings in ClickHouse Config, and it needs DB restart.

root@clickhouse1:~# cat /etc/clickhouse-server/config.d/custom_settings_prefixes.xml
<?xml version="1.0" ?>
<yandex>
    <custom_settings_prefixes>custom_chistadata</custom_settings_prefixes>
</yandex>
root@clickhouse1:~# cat /etc/clickhouse-server/users.d/custom_settings_default.xml
<?xml version="1.0"?>
<yandex>
    <profiles>
        <default>
        <custom_chistadata_dept>'IT'</custom_chistadata_dept>
        <custom_chistadata_reg>'Asia'</custom_chistadata_reg>
        </default>
    </profiles>
</yandex>
root@clickhouse1:~#

-- Restart Database 

root@clickhouse1:~# service clickhouse-server restart


Query View using custom settings :

clickhouse1 :) SELECT
                   e.employee_id,
                   e.employee_name,
                   d.department_name,
                   r.region_name,
                   e.salary
               FROM Employee AS e
               INNER JOIN Department AS d ON e.department_id = d.department_id
               INNER JOIN Region AS r ON d.region_id = r.region_id
               WHERE d.department_name = getSetting('custom_chistadata_dept')

SELECT
    e.employee_id,
    e.employee_name,
    d.department_name,
    r.region_name,
    e.salary
FROM Employee AS e
INNER JOIN Department AS d ON e.department_id = d.department_id
INNER JOIN Region AS r ON d.region_id = r.region_id
WHERE d.department_name = getSetting('custom_chistadata_dept')

Query id: 939f22e4-05c9-4180-bda1-e7e9e6da96fd

┌─e.employee_id─┬─e.employee_name─┬─d.department_name─┬─r.region_name─┬─e.salary─┐
│             1 │ John            │ IT                │ North America │    50000 │
│             3 │ Alice           │ IT                │ North America │    55000 │
└───────────────┴─────────────────┴───────────────────┴───────────────┴──────────┘

2 rows in set. Elapsed: 0.009 sec.

clickhouse1 :)


clickhouse1 :)
clickhouse1 :) SELECT
                   e.employee_id,
                   e.employee_name,
                   d.department_name,
                   r.region_name,
                   e.salary
               FROM Employee AS e
               INNER JOIN Department AS d ON e.department_id = d.department_id
               INNER JOIN Region AS r ON d.region_id = r.region_id
               WHERE r.region_name = getSetting('custom_chistadata_reg')

SELECT
    e.employee_id,
    e.employee_name,
    d.department_name,
    r.region_name,
    e.salary
FROM Employee AS e
INNER JOIN Department AS d ON e.department_id = d.department_id
INNER JOIN Region AS r ON d.region_id = r.region_id
WHERE r.region_name = getSetting('custom_chistadata_reg')

Query id: a4b1fabf-8a39-4116-b748-a76ba5aa8a35

┌─e.employee_id─┬─e.employee_name─┬─d.department_name─┬─r.region_name─┬─e.salary─┐
│             4 │ Bob             │ Sales             │ Asia          │    70000 │
│             5 │ Eve             │ Sales             │ Asia          │    75000 │
└───────────────┴─────────────────┴───────────────────┴───────────────┴──────────┘

2 rows in set. Elapsed: 0.008 sec.


clickhouse1 :) set custom_chistadata_reg='Europe'

SET custom_chistadata_reg = 'Europe'

Query id: 2a93ef85-82e2-4187-87c6-2196a38c9e5f

Ok.

0 rows in set. Elapsed: 0.001 sec.

clickhouse1 :) SELECT
                   e.employee_id,
                   e.employee_name,
                   d.department_name,
                   r.region_name,
                   e.salary
               FROM Employee AS e
               INNER JOIN Department AS d ON e.department_id = d.department_id
               INNER JOIN Region AS r ON d.region_id = r.region_id
               WHERE r.region_name = getSetting('custom_chistadata_reg')

SELECT
    e.employee_id,
    e.employee_name,
    d.department_name,
    r.region_name,
    e.salary
FROM Employee AS e
INNER JOIN Department AS d ON e.department_id = d.department_id
INNER JOIN Region AS r ON d.region_id = r.region_id
WHERE r.region_name = getSetting('custom_chistadata_reg')

Query id: 974146e6-bc92-4030-ba71-a0bb5ab1872e

┌─e.employee_id─┬─e.employee_name─┬─d.department_name─┬─r.region_name─┬─e.salary─┐
│             2 │ Jane            │ HR                │ Europe        │    60000 │
│             2 │ Jane            │ HR                │ Europe        │    60000 │
└───────────────┴─────────────────┴───────────────────┴───────────────┴──────────┘

2 rows in set. Elapsed: 0.008 sec.

clickhouse1 :)

Pass custom Settings in Parameterised and Materialised Views

====== Parametrzied Views ====== 
clickhouse1 :) CREATE VIEW my_new_view AS SELECT *
               FROM Employee
               WHERE department_id IN (
                   SELECT department_id
                   FROM Department
                   WHERE department_name = getSetting('custom_chistadata_dept')
               )

CREATE VIEW my_new_view AS
SELECT *
FROM Employee
WHERE department_id IN (
    SELECT department_id
    FROM Department
    WHERE department_name = getSetting('custom_chistadata_dept')
)

Query id: c0a3c77c-5782-41e4-8579-f40610b01049

Ok.

0 rows in set. Elapsed: 0.006 sec.

clickhouse1 :)


clickhouse1 :) SELECT *
               FROM my_new_view

SELECT *
FROM my_new_view

Query id: ecde2fa7-3035-494b-a7d4-c4e1afaac55e

┌─employee_id─┬─employee_name─┬─department_id─┬─salary─┐
│           1 │ John          │             1 │  50000 │
│           3 │ Alice         │             1 │  55000 │
└─────────────┴───────────────┴───────────────┴────────┘

2 rows in set. Elapsed: 0.007 sec.

clickhouse1 :)
clickhouse1 :)
clickhouse1 :) SELECT *
               FROM my_new_view
               SETTINGS custom_chistadata_dept = 'IT'

SELECT *
FROM my_new_view
SETTINGS custom_chistadata_dept = 'IT'

Query id: 55524774-44df-42a6-bacf-661df9a79ee7

┌─employee_id─┬─employee_name─┬─department_id─┬─salary─┐
│           1 │ John          │             1 │  50000 │
│           3 │ Alice         │             1 │  55000 │
└─────────────┴───────────────┴───────────────┴────────┘

2 rows in set. Elapsed: 0.004 sec.


clickhouse1 :) SELECT *
               FROM my_new_view
               SETTINGS custom_chistadata_dept = 'HR'

SELECT *
FROM my_new_view
SETTINGS custom_chistadata_dept = 'HR'

Query id: 8a738189-585a-416d-b576-19cf2f7246d2

┌─employee_id─┬─employee_name─┬─department_id─┬─salary─┐
│           2 │ Jane          │             2 │  60000 │
└─────────────┴───────────────┴───────────────┴────────┘

1 row in set. Elapsed: 0.008 sec.

clickhouse1 :)
clickhouse1 :) SELECT *
               FROM my_new_view
               SETTINGS custom_chistadata_dept = 'Sales'

SELECT *
FROM my_new_view
SETTINGS custom_chistadata_dept = 'Sales'

Query id: 941a9c97-f857-49ea-bf33-9ff72f15ad00

┌─employee_id─┬─employee_name─┬─department_id─┬─salary─┐
│           4 │ Bob           │             3 │  70000 │
│           5 │ Eve           │             3 │  75000 │
└─────────────┴───────────────┴───────────────┴────────┘

2 rows in set. Elapsed: 0.005 sec.




====== MATERIALIZED VIEWS ====== 

clickhouse1 :) CREATE MATERIALIZED VIEW MAT_view_NEW
               ENGINE = MergeTree
               ORDER BY employee_id AS
               SELECT *
               FROM Employee
               WHERE department_id IN (
                   SELECT department_id
                   FROM Department
                   WHERE department_name = getSetting('custom_chistadata_dept')
               )


CREATE MATERIALIZED VIEW MAT_view_NEW
ENGINE = MergeTree
ORDER BY employee_id AS
SELECT *
FROM Employee
WHERE department_id IN (
    SELECT department_id
    FROM Department
    WHERE department_name = getSetting('custom_chistadata_dept')
)

Query id: bbf34852-2979-42a0-9153-23dd299fe98a

Ok.

0 rows in set. Elapsed: 0.014 sec.

clickhouse1 :)
clickhouse1 :) select * from MAT_view_NEW;

SELECT *
FROM MAT_view_NEW

Query id: 134403b3-ac5a-40ed-9fe1-53c9cdab9258

Ok.

0 rows in set. Elapsed: 0.002 sec.

clickhouse1 :)
clickhouse1 :) INSERT INTO Employee (employee_id, employee_name, department_id, salary)
               VALUES
                   (1, 'John', 1, 50000.00),
                   (2, 'Jane', 2, 60000.00),
                   (3, 'Alice', 1, 55000.00),
                   (4, 'Bob', 3, 70000.00),
                   (5, 'Eve', 3, 75000.00);

INSERT INTO Employee (employee_id, employee_name, department_id, salary) FORMAT Values

Query id: 01135403-e074-484d-8f72-086974d8a1eb

Ok.

5 rows in set. Elapsed: 0.011 sec.

clickhouse1 :)


clickhouse1 :) SELECT *
               FROM MAT_view_NEW

SELECT *
FROM MAT_view_NEW

Query id: 05ae9ac0-390e-4fa3-a3d6-43a12e009117

┌─employee_id─┬─employee_name─┬─department_id─┬─salary─┐
│           1 │ John          │             1 │  50000 │
│           3 │ Alice         │             1 │  55000 │
└─────────────┴───────────────┴───────────────┴────────┘

2 rows in set. Elapsed: 0.002 sec.

By using the parameterized view, we can easily change the filter conditions without modifying the view’s definition. This provides flexibility and reusability, making it easier to query data with different criteria without recreating views or writing complex queries each time.

Conclusion 

ClickHouse offers versatile solutions for dynamic data handling, such as temporary tables, parameterized views, and custom settings. These alternatives provide flexibility and efficiency in managing and querying data, even without native support for table-valued parameters.

To know more about materialized views in ClickHouse, do visit the following articles:

 

References :

https://github.com/ClickHouse/ClickHouse/issues/34478