Introduction
Have you ever wondered how SQL databases handle dynamic data manipulation and filtering? Table-Valued Parameters are the remarkable feature that empowers us to do just that. Allowing entire tables or arrays as parameters in functions or queries offers unparalleled flexibility and reusability within SQL code.
PostgreSQL’s Table-Valued Parameters – look at table-valued parameters with a practical example demonstrated in PostgreSQL. Shiv Iyer has created an insightful blog that sheds light on how table-valued parameters work and their advantages to PostgreSQL databases.
ClickHouse’s Unique Approach: Beyond Table-Valued Parameters While PostgreSQL boasts native support for table-valued parameters, ClickHouse takes a distinctive path. Presently, ClickHouse does not natively support this feature. However, don’t let that deter you! ClickHouse is a powerhouse analytical database system with a wealth of capabilities that deliver unparalleled performance for data analysis.
Embracing Versatility Though ClickHouse doesn’t directly support user-defined scalar functions with custom parameters in SQL syntax, it presents an exciting alternative – user-defined aggregate functions (UDAFs). These functions open new doors for handling complex calculations and data transformations, adding immense versatility to your queries.
Innovation and Adaptation: ClickHouse’s Diverse Solutions While table-valued parameters may not be native to ClickHouse, the journey doesn’t end there. ClickHouse provides a myriad of ingenious methods to achieve similar goals. From leveraging arrays and the IN
operator to efficiently filter data based on multiple values, ClickHouse’s flexibility empowers you to master dynamic data handling.
Ready to Explore the Possibilities? Join us to explore SQL’s table-valued parameters and ClickHouse’s diverse capabilities.
Temporary tables
In ClickHouse, you can use temporary tables to achieve similar functionality as table-valued parameters. Temporary tables allow you to store data temporarily during a session, and you can perform various operations on them just like regular tables.
Here’s an example using temporary tables to filter data based on multiple IDs:
- Create a temporary table to store the IDs:
clickhouse1 :) CREATE TEMPORARY TABLE tmp_ids ( id Int32 ) ENGINE = Memory; CREATE TEMPORARY TABLE tmp_ids ( `id` Int32 ) ENGINE = Memory Query id: ef120020-6fa2-48ee-af0a-7ce107fb45b8 Ok. 0 rows in set. Elapsed: 0.001 sec.
- Insert data (IDs) into the temporary table:
clickhouse1 :) INSERT INTO tmp_ids VALUES (1), (3), (5); INSERT INTO tmp_ids FORMAT Values Query id: 3e9c072e-d8bb-46d3-bf85-0bb4020a411f Ok. 3 rows in set. Elapsed: 0.002 sec.
- Use the temporary table to filter data from the main table (
my_table
):
clickhouse1 :) SELECT * FROM my_table WHERE id IN (SELECT id FROM tmp_ids); SELECT * FROM my_table WHERE id IN ( SELECT id FROM tmp_ids ) Query id: 532f4472-144a-4baa-928b-27a6a55289e6 ┌─id─┬─name──┐ │ 1 │ John │ │ 1 │ John │ │ 3 │ Alice │ │ 3 │ Alice │ │ 5 │ Eve │ │ 5 │ Eve │ └────┴───────┘ 6 rows in set. Elapsed: 0.005 sec.
In this example, we create a temporary table tmp_ids
with a single column id
of type Int32
. We then insert the IDs 1
, 3
, and 5
into the temporary table. Finally, we use the temporary table to filter rows from the main table my_table
based on the provided IDs.
Remember that temporary tables are session-specific and will be automatically dropped when the session ends. This allows you to temporarily store and manipulate them within a specific session.
Parameterized views
Parameterized views provide a powerful way to create reusable view definitions with dynamic filtering capabilities. Let’s take an example to illustrate how a parameterized view works:
Suppose we have a table called “Employee” with the following structure:
CREATE TABLE Employee ( employee_id Int32, employee_name String, department_id Int32, salary Decimal(10, 2) ) ENGINE = MergeTree ORDER BY employee_id; INSERT INTO Employee (employee_id, employee_name, department_id, salary) VALUES (1, 'John', 1, 50000.00), (2, 'Jane', 2, 60000.00), (3, 'Alice', 1, 55000.00), (4, 'Bob', 3, 70000.00), (5, 'Eve', 3, 75000.00); CREATE TABLE Department ( department_id Int32, department_name String, region_id Int32 ) ENGINE = MergeTree ORDER BY department_id; INSERT INTO Department (department_id, department_name, region_id) VALUES (1, 'IT', 1), (2, 'HR', 2), (3, 'Sales', 3); CREATE TABLE Region ( region_id Int32, region_name String ) ENGINE = MergeTree ORDER BY region_id; INSERT INTO Region (region_id, region_name) VALUES (1, 'North America'), (2, 'Europe'), (3, 'Asia');
Now, we want to create a parameterized view that allows us to filter employees based on their department and salary range.
clickhouse1 :) CREATE VIEW param_view AS SELECT * FROM Employee WHERE department_id IN ( SELECT department_id FROM Department WHERE department_name = {deptname:String} ) CREATE VIEW param_view AS SELECT * FROM Employee WHERE department_id IN ( SELECT department_id FROM Department WHERE department_name = {deptname:String} ) Query id: aed3f78c-52f9-46e6-8ae9-4f3759db4197 Ok. 0 rows in set. Elapsed: 0.007 sec.
Now, we can use this parameterized view as a table function by substituting the parameter values:
clickhouse1 :) CREATE VIEW param_view AS SELECT * FROM Employee WHERE department_id IN ( SELECT department_id FROM Department WHERE department_name = {deptname:String} ) CREATE VIEW param_view AS SELECT * FROM Employee WHERE department_id IN ( SELECT department_id FROM Department WHERE department_name = {deptname:String} ) Query id: aed3f78c-52f9-46e6-8ae9-4f3759db4197 Ok. 0 rows in set. Elapsed: 0.007 sec. clickhouse1 :) SELECT * FROM param_view(deptname='HR') SELECT * FROM param_view(deptname = 'HR') Query id: e2105699-90c8-491d-accc-1df24dfb1fc4 ┌─employee_id─┬─employee_name─┬─department_id─┬─salary─┐ │ 2 │ Jane │ 2 │ 60000 │ │ 2 │ Jane │ 2 │ 60000 │ └─────────────┴───────────────┴───────────────┴────────┘ 2 rows in set. Elapsed: 0.007 sec. clickhouse1 :) clickhouse1 :) clickhouse1 :) SELECT * FROM param_view(deptname='Sales') SELECT * FROM param_view(deptname = 'Sales') Query id: 9e5ae16b-de57-40a5-aa9e-98c3353bca20 ┌─employee_id─┬─employee_name─┬─department_id─┬─salary─┐ │ 4 │ Bob │ 3 │ 70000 │ │ 4 │ Bob │ 3 │ 70000 │ │ 5 │ Eve │ 3 │ 75000 │ │ 5 │ Eve │ 3 │ 75000 │ └─────────────┴───────────────┴───────────────┴────────┘ 4 rows in set. Elapsed: 0.005 sec. clickhouse1 :) clickhouse1 :) SELECT * FROM param_view(deptname='IT') SELECT * FROM param_view(deptname = 'IT') Query id: 3fef82ae-7e80-45ae-898f-eb31f0ba6305 ┌─employee_id─┬─employee_name─┬─department_id─┬─salary─┐ │ 1 │ John │ 1 │ 50000 │ │ 1 │ John │ 1 │ 50000 │ │ 3 │ Alice │ 1 │ 55000 │ │ 3 │ Alice │ 1 │ 55000 │ └─────────────┴───────────────┴───────────────┴────────┘ 4 rows in set. Elapsed: 0.005 sec.
Custom settings
You can also make use of Custom Settings in Views. Set custom settings in ClickHouse Config, and it needs DB restart.
root@clickhouse1:~# cat /etc/clickhouse-server/config.d/custom_settings_prefixes.xml <?xml version="1.0" ?> <yandex> <custom_settings_prefixes>custom_chistadata</custom_settings_prefixes> </yandex> root@clickhouse1:~# cat /etc/clickhouse-server/users.d/custom_settings_default.xml <?xml version="1.0"?> <yandex> <profiles> <default> <custom_chistadata_dept>'IT'</custom_chistadata_dept> <custom_chistadata_reg>'Asia'</custom_chistadata_reg> </default> </profiles> </yandex> root@clickhouse1:~# -- Restart Database root@clickhouse1:~# service clickhouse-server restart
Query View using custom settings :
clickhouse1 :) SELECT e.employee_id, e.employee_name, d.department_name, r.region_name, e.salary FROM Employee AS e INNER JOIN Department AS d ON e.department_id = d.department_id INNER JOIN Region AS r ON d.region_id = r.region_id WHERE d.department_name = getSetting('custom_chistadata_dept') SELECT e.employee_id, e.employee_name, d.department_name, r.region_name, e.salary FROM Employee AS e INNER JOIN Department AS d ON e.department_id = d.department_id INNER JOIN Region AS r ON d.region_id = r.region_id WHERE d.department_name = getSetting('custom_chistadata_dept') Query id: 939f22e4-05c9-4180-bda1-e7e9e6da96fd ┌─e.employee_id─┬─e.employee_name─┬─d.department_name─┬─r.region_name─┬─e.salary─┐ │ 1 │ John │ IT │ North America │ 50000 │ │ 3 │ Alice │ IT │ North America │ 55000 │ └───────────────┴─────────────────┴───────────────────┴───────────────┴──────────┘ 2 rows in set. Elapsed: 0.009 sec. clickhouse1 :) clickhouse1 :) clickhouse1 :) SELECT e.employee_id, e.employee_name, d.department_name, r.region_name, e.salary FROM Employee AS e INNER JOIN Department AS d ON e.department_id = d.department_id INNER JOIN Region AS r ON d.region_id = r.region_id WHERE r.region_name = getSetting('custom_chistadata_reg') SELECT e.employee_id, e.employee_name, d.department_name, r.region_name, e.salary FROM Employee AS e INNER JOIN Department AS d ON e.department_id = d.department_id INNER JOIN Region AS r ON d.region_id = r.region_id WHERE r.region_name = getSetting('custom_chistadata_reg') Query id: a4b1fabf-8a39-4116-b748-a76ba5aa8a35 ┌─e.employee_id─┬─e.employee_name─┬─d.department_name─┬─r.region_name─┬─e.salary─┐ │ 4 │ Bob │ Sales │ Asia │ 70000 │ │ 5 │ Eve │ Sales │ Asia │ 75000 │ └───────────────┴─────────────────┴───────────────────┴───────────────┴──────────┘ 2 rows in set. Elapsed: 0.008 sec. clickhouse1 :) set custom_chistadata_reg='Europe' SET custom_chistadata_reg = 'Europe' Query id: 2a93ef85-82e2-4187-87c6-2196a38c9e5f Ok. 0 rows in set. Elapsed: 0.001 sec. clickhouse1 :) SELECT e.employee_id, e.employee_name, d.department_name, r.region_name, e.salary FROM Employee AS e INNER JOIN Department AS d ON e.department_id = d.department_id INNER JOIN Region AS r ON d.region_id = r.region_id WHERE r.region_name = getSetting('custom_chistadata_reg') SELECT e.employee_id, e.employee_name, d.department_name, r.region_name, e.salary FROM Employee AS e INNER JOIN Department AS d ON e.department_id = d.department_id INNER JOIN Region AS r ON d.region_id = r.region_id WHERE r.region_name = getSetting('custom_chistadata_reg') Query id: 974146e6-bc92-4030-ba71-a0bb5ab1872e ┌─e.employee_id─┬─e.employee_name─┬─d.department_name─┬─r.region_name─┬─e.salary─┐ │ 2 │ Jane │ HR │ Europe │ 60000 │ │ 2 │ Jane │ HR │ Europe │ 60000 │ └───────────────┴─────────────────┴───────────────────┴───────────────┴──────────┘ 2 rows in set. Elapsed: 0.008 sec. clickhouse1 :)
Pass custom Settings in Parameterised and Materialised Views
====== Parametrzied Views ====== clickhouse1 :) CREATE VIEW my_new_view AS SELECT * FROM Employee WHERE department_id IN ( SELECT department_id FROM Department WHERE department_name = getSetting('custom_chistadata_dept') ) CREATE VIEW my_new_view AS SELECT * FROM Employee WHERE department_id IN ( SELECT department_id FROM Department WHERE department_name = getSetting('custom_chistadata_dept') ) Query id: c0a3c77c-5782-41e4-8579-f40610b01049 Ok. 0 rows in set. Elapsed: 0.006 sec. clickhouse1 :) clickhouse1 :) SELECT * FROM my_new_view SELECT * FROM my_new_view Query id: ecde2fa7-3035-494b-a7d4-c4e1afaac55e ┌─employee_id─┬─employee_name─┬─department_id─┬─salary─┐ │ 1 │ John │ 1 │ 50000 │ │ 3 │ Alice │ 1 │ 55000 │ └─────────────┴───────────────┴───────────────┴────────┘ 2 rows in set. Elapsed: 0.007 sec. clickhouse1 :) clickhouse1 :) clickhouse1 :) SELECT * FROM my_new_view SETTINGS custom_chistadata_dept = 'IT' SELECT * FROM my_new_view SETTINGS custom_chistadata_dept = 'IT' Query id: 55524774-44df-42a6-bacf-661df9a79ee7 ┌─employee_id─┬─employee_name─┬─department_id─┬─salary─┐ │ 1 │ John │ 1 │ 50000 │ │ 3 │ Alice │ 1 │ 55000 │ └─────────────┴───────────────┴───────────────┴────────┘ 2 rows in set. Elapsed: 0.004 sec. clickhouse1 :) SELECT * FROM my_new_view SETTINGS custom_chistadata_dept = 'HR' SELECT * FROM my_new_view SETTINGS custom_chistadata_dept = 'HR' Query id: 8a738189-585a-416d-b576-19cf2f7246d2 ┌─employee_id─┬─employee_name─┬─department_id─┬─salary─┐ │ 2 │ Jane │ 2 │ 60000 │ └─────────────┴───────────────┴───────────────┴────────┘ 1 row in set. Elapsed: 0.008 sec. clickhouse1 :) clickhouse1 :) SELECT * FROM my_new_view SETTINGS custom_chistadata_dept = 'Sales' SELECT * FROM my_new_view SETTINGS custom_chistadata_dept = 'Sales' Query id: 941a9c97-f857-49ea-bf33-9ff72f15ad00 ┌─employee_id─┬─employee_name─┬─department_id─┬─salary─┐ │ 4 │ Bob │ 3 │ 70000 │ │ 5 │ Eve │ 3 │ 75000 │ └─────────────┴───────────────┴───────────────┴────────┘ 2 rows in set. Elapsed: 0.005 sec. ====== MATERIALIZED VIEWS ====== clickhouse1 :) CREATE MATERIALIZED VIEW MAT_view_NEW ENGINE = MergeTree ORDER BY employee_id AS SELECT * FROM Employee WHERE department_id IN ( SELECT department_id FROM Department WHERE department_name = getSetting('custom_chistadata_dept') ) CREATE MATERIALIZED VIEW MAT_view_NEW ENGINE = MergeTree ORDER BY employee_id AS SELECT * FROM Employee WHERE department_id IN ( SELECT department_id FROM Department WHERE department_name = getSetting('custom_chistadata_dept') ) Query id: bbf34852-2979-42a0-9153-23dd299fe98a Ok. 0 rows in set. Elapsed: 0.014 sec. clickhouse1 :) clickhouse1 :) select * from MAT_view_NEW; SELECT * FROM MAT_view_NEW Query id: 134403b3-ac5a-40ed-9fe1-53c9cdab9258 Ok. 0 rows in set. Elapsed: 0.002 sec. clickhouse1 :) clickhouse1 :) INSERT INTO Employee (employee_id, employee_name, department_id, salary) VALUES (1, 'John', 1, 50000.00), (2, 'Jane', 2, 60000.00), (3, 'Alice', 1, 55000.00), (4, 'Bob', 3, 70000.00), (5, 'Eve', 3, 75000.00); INSERT INTO Employee (employee_id, employee_name, department_id, salary) FORMAT Values Query id: 01135403-e074-484d-8f72-086974d8a1eb Ok. 5 rows in set. Elapsed: 0.011 sec. clickhouse1 :) clickhouse1 :) SELECT * FROM MAT_view_NEW SELECT * FROM MAT_view_NEW Query id: 05ae9ac0-390e-4fa3-a3d6-43a12e009117 ┌─employee_id─┬─employee_name─┬─department_id─┬─salary─┐ │ 1 │ John │ 1 │ 50000 │ │ 3 │ Alice │ 1 │ 55000 │ └─────────────┴───────────────┴───────────────┴────────┘ 2 rows in set. Elapsed: 0.002 sec.
By using the parameterized view, we can easily change the filter conditions without modifying the view’s definition. This provides flexibility and reusability, making it easier to query data with different criteria without recreating views or writing complex queries each time.
Conclusion
ClickHouse offers versatile solutions for dynamic data handling, such as temporary tables, parameterized views, and custom settings. These alternatives provide flexibility and efficiency in managing and querying data, even without native support for table-valued parameters.
To know more about materialized views in ClickHouse, do visit the following articles:
- Introduction to Materialized Views in ClickHouse
- ClickHouse Materialized Views: Using Live Views to Compute Real-time Moving Averages
- Materialized Views in ClickHouse for Optimal Server Performance
References :
https://github.com/ClickHouse/ClickHouse/issues/34478