1. Home
  2. Knowledge Base
  3. ClickHouse
  4. ClickHouse User Defined Functions (UDFs) from an external script
  1. Home
  2. Knowledge Base
  3. ClickHouse DBA
  4. ClickHouse User Defined Functions (UDFs) from an external script

ClickHouse User Defined Functions (UDFs) from an external script

ClickHouse supports user-defined functions (UDF) that can either be created based on the built-in functions in ClickHouse or from an external script or program. ClickHouse uses STDIN and STDOUT data streams to communicate with the program/script. Let us focus on ClickHouse UDFs from an external Python program in this example. For the sake of simplicity, let us create a Python function that can calculate the volume of a cylinder.

Step 1 – Create the UDF Python script

The following Python script will be used to create the Clickhouse UDF which will calculate the volume of the cylinder.

#!/usr/bin/python3

import math
import sys

if __name__ == '__main__':
    for line in sys.stdin:
        radius, height = line.split('\t')
        radius = float(radius)
        height = float(height)
        res = str(radius * radius * height * math.pi)
        print(res, end='')
        sys.stdout.flush()

Save the script as volume_cylinder.py in /var/lib/clickhouse/user_scripts directory.

Both the height and the radius of the cylinder are received from STDIN. Once the file is ready, convert it as an executable.

chmod +x volume_cylinder.py

Step 2 – Add Function Configuration

Note down the setting value for user_defined_executable_functions_config in the config.xml file. You may have something similar to below.

<user_defined_executable_functions_config>*_function.*ml</user_defined_executable_functions_config>

This is the path where the UDF config XML is stored. For the above example setting, the config file will be in /etc/clickhouse-server/*_function.*ml.

Let us create an XML file called /etc/clickhouse-server/udf_cylinder_volume_function.xml with the below contents.

<functions>
    <function>
        <type>executable</type>
        <name>cylinder_volume</name>
        <return_type>String</return_type>
        <argument><type>String</type></argument>
        <argument><type>String</type></argument>
        <format>TabSeparated</format>
        <command>volume_cylinder.py</command>
        <execute_direct>1</execute_direct>
    </function>
</functions>

We are creating a ClickHouse UDF named cylinder_volume based on the volume_cylinder.py file. The descriptions for the configurations used are below.

  • type – executable or executable_pool 
  • name – the name of the UDF in ClickHouse
  • return_type – The data type of the returned Value
  • argument – Arguments and data type for the argument that will be passed via STDIN to the function script
  • format – One of the supported data formats in ClickHouse
  • command – Script name or command to execute
  • execute_direct – If true, the script is executed from the user_script folder, else the command is passed as an argument for bin/sh -c

Step 3 – Test the UDF

Once the configs and the scripts are in place, try calling the UDF from ClickHouse SQL.

SELECT cylinder_volume(2,3);

References

https://clickhouse.com/docs/en/sql-reference/functions/udf

https://stackoverflow.com/questions/71236415/how-to-send-multiple-arguments-to-executable-udf-in-clickhouse

Was this article helpful?

Related Articles

CHISTADATA IS COMMITTED TO OPEN SOURCE SOFTWARE AND BUILDING HIGH PERFORMANCE COLUMNSTORES

In the spirit of freedom, independence and innovation. ChistaDATA Corporation is not affiliated with ClickHouse Corporation 

Need Support?

Can't find the answer you're looking for?
Contact Support

ChistaDATA Inc. Knowledge base is licensed under the Apache License, Version 2.0 (the “License”)

Copyright 2022 ChistaDATA Inc

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.