ClickHouse hosts its data in file systems like most systems. Unauthorized individuals or organizations should be prevented from accessing this data.
You can strengthen your storage system with the following categories:
- Host-Level Security
- Volume Level Encryption
- Column Level Encryption
- Log File Protection
Host-Level Security
The files required to run ClickHouse should be restricted as much as possible.
- ClickHouse does not require
root
access to the file system and runs by default as the userclickhouse
. - The following directories should be restricted to the minimum number of users:
/etc/clickhouse-server: Used for ClickHouse settings and account credentials created by default.
/var/lib/clickhouse: Used for ClickHouse data and new credentials.
/var/log/clickhouse-server: Log files that may display privileged information through queries.
Volume Level Encryption
Encrypting data on the file system prevents unauthorized users who may have gained access to the file system that your ClickHouse database is stored on from being able to access the data itself. Depending on your environment, different encryption options may be required.
Cloud Storage
If your ClickHouse database is stored in a cloud service such as AWS or Azure, verify that the cloud supports encrypting the volume. For example, Amazon AWS provides a method to encrypt new Amazon EBS volumes by default.
Local Storage
LUKS is a recommended solution for organizations that host ClickHouse clusters on their own managed systems. Linux distributions, including Red Hat and Ubuntu are available. Check with your organization’s distribution for instructions on how to encrypt those volumes.
Kubernetes Encryption
If Kubernetes manage your ClickHouse cluster, the StorageClass used may be encrypted. For more information, please check the Kubernetes Storage Class documentation.
Column Level Encryption
Organizations running ClickHouse 20.11 or later can encrypt individual columns with AES functions. For complete information, check the official ClickHouse documentation.
Applications are responsible for their own keys. Before enabling column level encryption, test to verify that encryption does not negatively impact performance.
The following functions are available:
FunctionFunction | MySQL AES Compatible |
encrypt(mode, plaintext, key, [iv, aad]) | |
decrypt(mode, ciphertext, key, [iv, aad]) | |
aes_encrypt_mysql(mode, plaintext, key, [iv]) | * |
aes_decrypt_mysql(mode, ciphertext, key, [iv]) | * |
Encryption function arguments:
Argument | Description | Type |
mode | Encryption mode. | String |
plaintext | Text that needs to be encrypted. | String |
key | Encryption key. | String |
iv | Initialization vector. Required for -gcm modes, optional for others. | String |
aad | Additional authenticated data. It isn’t encrypted, but it affects decryption. Works only in -gcm modes, for others would throw an exception | String |
Column Encryption Examples
This example displays how to encrypt information using a hashed key.
- Takes a hex value, unhexes it and stores it as
key
. - Select the value and encrypt it with the
key
, then displays the encrypted value.
WITH unhex('658bb26de6f8a069a3520293a572078f') AS key SELECT hex(encrypt('aes-128-cbc', 'Hello Its ChistaDATA', key)) AS encrypted Query id: e4bc03aa-84c4-4d86-a370-004e4e0259df ┌─encrypted────────────────────────────────────────────────────────┐ │ 5D14622633D048749EB666A71594CE6BC913E4A73636CE232D11E16377196C22 │ └──────────────────────────────────────────────────────────────────┘ 1 row in set. Elapsed: 0.009 sec.
This shows how to decrypt encrypted data:
- Takes a hex value, unhexes it and stores it as
key
. - Decrypts the selected value with the
key
as text.
WITH unhex('658bb26de6f8a069a3520293a572078f') AS key SELECT decrypt('aes-128-cbc', unhex('5D14622633D048749EB666A71594CE6BC913E4A73636CE232D11E16377196C22'), key) AS plaintext Query id: 61dbe648-0ea9-47aa-aa48-90ae0233df91 ┌─plaintext────────────┐ │ Hello Its ChistaDATA │ └──────────────────────┘ 1 row in set. Elapsed: 0.020 sec.
Log File Protection
To prevent certain queries from appearing in log files or to hide sensitive information:
- Update the configuration files, located by default in
/etc/clickhouse-server/config.d
. - Add the element
query_masking_rules
. - Set each
rule
with the following:name
: The name of the rule.regexp
: The regular expression to search for.replace
: The replacement value that matches the rule’s regular expression.
For example, the following will hide encryption and decryption functions in the log file:
<query_masking_rules> <rule> <name>hide encrypt/decrypt arguments</name> <regexp> ((?:aes_)?(?:encrypt|decrypt)(?:_mysql)?)\s*\(\s*(?:'(?:\\'|.)+'|.*?)\s*\) </regexp> <!-- or more secure, but also more invasive: (aes_\w+)\s*\(.*\) --> <replace>\1(???)</replace> </rule> </query_masking_rules>