Introduction: In this tutorial, we will walk you through integrating ClickHouse, a high-performance analytics database, with PostgreSQL, a powerful open-source relational database management system. This integration will enable you to replicate data from PostgreSQL to ClickHouse for efficient data analysis. We’ll provide detailed steps to set up both databases and transfer data seamlessly.
Step 1: Setting Up PostgreSQL
1.1. Launch a Docker Container for PostgreSQL: (Make sure port 15432 is open for incoming traffic)
$ docker run -d --name postgres-container -e POSTGRES_PASSWORD=your_password -p 15432:5432 postgres:latest
$ docker exec -it postgres-container bash
1.3. Connect to PostgreSQL:
$ psql -h 10.0.6.24 -p 15432 -U postgres
1.4. Create a New Database:
postgres=# CREATE DATABASE db_in_psg;
1.5. Connect to the New Database:
postgres=# \c db_in_psg
1.6. Create a Sample Table and Insert Data into the Table
db_in_psg=# CREATE TABLE table1 ( id integer primary key, column1 varchar(10) ); db_in_psg=# INSERT INTO table1 (id, column1) VALUES (1, 'abc'), (2, 'def');
1.7. Verify Data in the Table:
db_in_psg=# select * from table1; id | column1 ----+--------- 1 | abc 2 | def (2 rows)
Step 2: Setting Up ClickHouse
2.1. Access ClickHouse Client:
$ clickhouse-client
2.2. Connect to ClickHouse Server:
clickhouse1 :) show databases;
2.3. Create a New Database:
clickhouse1 :) CREATE DATABASE db_in_ch;
2.4. Create a Table in ClickHouse Linked to PostgreSQL:
clickhouse1 :) CREATE TABLE db_in_ch.table1 ( id UInt64, column1 String ) ENGINE = PostgreSQL('10.0.6.24:15432', 'db_in_psg', 'table1', 'postgres', 'your_password');
2.5. Query Data in the ClickHouse Table:
clickhouse1 :) select * from db_in_ch.table1;
Example 2: If you want to set up a complete environment using docker-compose for quick setup, here you go.
root@ip-10-0-6-24:~/ashwini# cat docker-compose.yaml version: '3.8' services: postgres: image: postgres:latest container_name: postgres-container-n environment: POSTGRES_PASSWORD: your_password ports: - "15432:5432" clickhouse: image: yandex/clickhouse-server container_name: clickhouse-container-n ports: - "18123:8123" - "19000:9000" root@ip-10-0-6-24:~/ashwini# docker-compose up -d Removing clickhouse-container-n postgres-container-n is up-to-date Recreating 527391c2b870_clickhouse-container ... done root@ip-10-0-6-24:~/ashwini# docker-compose ps -a Name Command State Ports ----------------------------------------------------------------------------------------------------------------------------------------------------------------- clickhouse-container-n /entrypoint.sh Up 0.0.0.0:18123->8123/tcp,:::18123->8123/tcp, 0.0.0.0:19000->9000/tcp,:::19000->9000/tcp, 9009/tcp postgres-container-n docker-entrypoint.sh postgres Up 0.0.0.0:15432->5432/tcp,:::15432->5432/tcp root@ip-10-0-6-24:~/ashwini# Check the IP -- root@ip-10-0-6-24:~/ashwini# docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' postgres-container-n 172.22.0.3 root@ip-10-0-6-24:~/ashwini# docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' clickhouse-container-n 172.22.0.2 root@ip-10-0-6-24:~/ashwini# docker exec -it postgres-container-n psql -U postgres psql (15.4 (Debian 15.4-1.pgdg120+1)) Type "help" for help. postgres=# postgres=# postgres=# postgres=# postgres=# CREATE DATABASE db_in_psg; CREATE DATABASE postgres=# \c db_in_psg You are now connected to database "db_in_psg" as user "postgres". db_in_psg=# CREATE TABLE table1 ( id integer primary key, column1 varchar(10) ); CREATE TABLE db_in_psg=# INSERT INTO table1 (id, column1) VALUES (1, 'abc'), (2, 'def'); INSERT 0 2 db_in_psg=# select * from table1; id | column1 ----+--------- 1 | abc 2 | def (2 rows) db_in_psg=# \q root@ip-10-0-6-24:~/ashwini# root@ip-10-0-6-24:~# docker exec -it clickhouse-container-n clickhouse-client ClickHouse client version 22.1.3.7 (official build). Connecting to localhost:9000 as user default. Connected to ClickHouse server version 22.1.3 revision 54455. 7fb980bc355f :) 7fb980bc355f :) CREATE TABLE db_in_ch.table1 ( `id` UInt64, `column1` String ) ENGINE = PostgreSQL('172.22.0.3:5432', 'db_in_psg', 'table1', 'postgres', 'your_password') CREATE TABLE db_in_ch.table1 ( `id` UInt64, `column1` String ) ENGINE = PostgreSQL('172.22.0.3:5432', 'db_in_psg', 'table1', 'postgres', 'your_password') Query id: 3a56a158-e0d1-4cf8-bb27-e5799f4f399a Ok. 0 rows in set. Elapsed: 0.007 sec. 7fb980bc355f :) select * from db_in_ch.table1; SELECT * FROM db_in_ch.table1 Query id: 0ca64a9c-3208-441c-b027-a8d06009fc29 ┌─id─┬─column1─┐ │ 1 │ abc │ │ 2 │ def │ └────┴─────────┘ 2 rows in set. Elapsed: 0.011 sec.
Conclusion: Following these steps, you’ve successfully integrated PostgreSQL with ClickHouse, allowing seamless data replication and analysis between these robust database systems. This integration can benefit real-time analytics and reporting, making your data analysis tasks more efficient and scalable.
For more information and advanced configurations, refer to the official ClickHouse documentation on PostgreSQL integration: ClickHouse PostgreSQL Integration.