In the previous part of this series, we covered the AWS S3 configuration needed by ClickHouse. As you may remember from the first part, we created a folder under the S3 bucket and noted its URL, as well as the access key and secret access key of the IAM user.
Configuration Steps
Now it is time to configure ClickHouse to use S3 as a disk.
- To do that, we first need to create a storage.xml file under the ClickHouse configuration directory, which is “/etc/clickhouse-server/config.d/” by default. The XML file should look like this:
<clickhouse>
    <storage_configuration>
        <disks>
            <s3_disk>
                <type>s3</type>
                <endpoint>https://YOUR_S3_URL/</endpoint>
                <access_key_id>YOUR_ACCESS_KEY</access_key_id>
                <secret_access_key>YOUR_SECRET_KEY</secret_access_key>
                <metadata_path>/var/lib/clickhouse/disks/s3_disk/</metadata_path>
                <cache_enabled>true</cache_enabled>
                <data_cache_enabled>true</data_cache_enabled>
                <cache_path>/var/lib/clickhouse/disks/s3_disk/cache/</cache_path>
            </s3_disk>
        </disks>
        <policies>
            <s3_policy>
                <volumes>
                    <main>
                        <disk>s3_disk</disk>
                    </main>
                </volumes>
            </s3_policy>
        </policies>
    </storage_configuration>
</clickhouse>
Here, the endpoint, access_key_id and secret_access_key values come from the S3 configuration step in the previous part. The XML file must be readable by the “clickhouse” user.
- After creating this XML file, we need to restart the ClickHouse instance for the changes to take effect (see the note after the command for a restart-free alternative).
service clickhouse-server restart
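As a side note, recent ClickHouse versions can often pick up newly added disks and policies without a full restart by reloading the configuration from a client session. This is version-dependent behaviour, so treat it as an optional shortcut and fall back to a restart if the disk does not appear:

SYSTEM RELOAD CONFIG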
- To check if “s3_disk” was created successfully, we need to connect to ClickHouse and run the following query:
SELECT name, path
FROM system.disks
WHERE name = 's3_disk'

┌─name────┬─path───────────────────────────────┐
│ s3_disk │ /var/lib/clickhouse/disks/s3_disk/ │
└─────────┴────────────────────────────────────┘
- We also need to check the storage policy with the command below:
SELECT policy_name, volume_name, disks
FROM system.storage_policies

┌─policy_name─┬─volume_name─┬─disks───────┐
│ default     │ default     │ ['default'] │
│ s3_policy   │ main        │ ['s3_disk'] │
└─────────────┴─────────────┴─────────────┘
- Here you can see that “s3_disk” is attached to the “s3_policy” policy, which means we can create a table with this policy. The table creation script using “s3_policy” is as follows (a verification query is shown right after it):
CREATE TABLE myS3Table
(
    `id` UInt64,
    `name` String
)
ENGINE = MergeTree
ORDER BY tuple()
SETTINGS storage_policy = 's3_policy'
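Optionally, you can confirm that the table picked up the policy by querying system.tables (a minimal check, assuming your ClickHouse version exposes the storage_policy column there):

SELECT name, storage_policy
FROM system.tables
WHERE database = currentDatabase() AND name = 'myS3Table'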
- This table now stores its data in the S3 object store. Let’s insert a row and query it; a check of where the resulting data part landed follows below.
INSERT INTO myS3Table VALUES (1, 'ChistaDATA');

SELECT * FROM myS3Table

┌─id─┬─name───────┐
│  1 │ ChistaDATA │
└────┴────────────┘
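To verify that the inserted data actually landed on the S3-backed disk, you can inspect the table’s active parts in system.parts (a quick check; the disk_name column should report s3_disk for this table):

SELECT name, disk_name, rows
FROM system.parts
WHERE database = currentDatabase() AND table = 'myS3Table' AND active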
Now you can write data to and read it from S3 directly, with S3 acting as the table’s data storage.
With the help of this series of articles, you should be able to configure the S3 object store as a ClickHouse disk and use it for your tables.