The Buffer engine in ClickHouse stores data in RAM and periodically flushes it to another, permanent table once pre-configured thresholds are reached. Before asynchronous inserts were introduced in ClickHouse, Buffer tables served as temporary write buffers in front of destination tables. Even now, this engine is used where async inserts are not an option, or when different tables require different buffer flush settings. Inserts are performed against the buffer table, while reads return data from both the buffer and the destination table.
Syntax
CREATE TABLE buffer_table_name AS final_table_name Buffer(final_database_name, final_table_name, num_layers, min_time, max_time, min_rows, max_rows, min_bytes, max_bytes)
- final_database_name – database in which the destination table exists
- final_table_name – destination table into which the data is flushed
- num_layers – number of independent buffers (parallelism); each insert goes to one randomly chosen buffer
- min_time & max_time – lower and upper bounds (in seconds) on how long data stays in the buffer before being flushed
- min_rows & max_rows – lower and upper bounds on the number of rows held in the buffer before being flushed
- min_bytes & max_bytes – lower and upper bounds on the size (in bytes) of the data held in the buffer

Data is flushed when all of the min conditions are met, or when any single max condition is exceeded. If a single inserted block itself exceeds a max threshold, it is written directly to the destination table, bypassing the buffer.
Example
Create a Destination table
CREATE TABLE buffer_example_mergetree ( ID UInt8, Name String ) Engine = MergeTree ORDER BY ID;
Create the buffer table
CREATE TABLE buffer_engine_example AS buffer_example_mergetree Engine = Buffer(default, buffer_example_mergetree, 16, 10, 100, 16, 1024, 100000, 2000000);
- final_database_name – default
- final_table_name – buffer_example_mergetree
- num_layers – 16
- min_time & max_time – 10 & 100 seconds
- min_rows & max_rows – 16 & 1024
- min_bytes & max_bytes – 100000 & 2000000 bytes
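To illustrate the read and write paths with the two tables defined above: inserts target the buffer table, and selects against it return rows from both the in-memory buffer and the destination table.

```sql
-- The insert lands in one of the 16 in-memory buffers first,
-- not immediately in buffer_example_mergetree.
INSERT INTO buffer_engine_example VALUES (1, 'Alice');

-- Reading from the Buffer table unions the in-memory rows
-- with whatever has already been flushed to the destination.
SELECT * FROM buffer_engine_example WHERE ID = 1;

-- Reading the destination directly may miss rows that are
-- still sitting in the buffer.
SELECT * FROM buffer_example_mergetree WHERE ID = 1;
```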
Try it yourself
Execute the SQL statements below, then query both the buffer table and the destination table to see how the settings behave.
INSERT INTO buffer_engine_example SELECT * FROM generateRandom('ID UInt8, Name String') LIMIT 20;
For this statement, the data stays in the buffer for 100 seconds (max_time): 20 rows satisfies min_rows (16), but the data is far smaller than min_bytes (100000), so not all min conditions are met and the flush only happens when max_time is reached.
INSERT INTO buffer_engine_example SELECT * FROM generateRandom('ID UInt8, Name String') LIMIT 2000;
For this statement, the data does not wait in the buffer at all: 2000 rows exceeds max_rows (1024), so the inserted block is written directly to the destination table.
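To watch a flush happen, you can compare row counts in the two tables before and after the relevant threshold elapses (table names as defined above):

```sql
-- Immediately after an insert: rows are visible via the Buffer
-- table but may not yet be in the MergeTree table.
SELECT count() FROM buffer_engine_example;
SELECT count() FROM buffer_example_mergetree;

-- Re-run the second query after the flush window has passed
-- (10-100 seconds, depending on which thresholds were hit);
-- the two counts converge.
```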
Caveats
- Data in the buffer table is volatile: if the server crashes or is stopped abnormally, the data still in the buffer is lost (a normal shutdown, DROP TABLE, or DETACH TABLE flushes it first)
- If the destination table is replicated, some expected properties of replicated tables are lost when writing through a Buffer table (for example, inserted blocks are not deduplicated)
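When you need the buffered rows persisted on demand (for example, before a planned restart), detaching or dropping the Buffer table flushes its contents to the destination first:

```sql
-- DETACH flushes the in-memory buffer to the destination table,
-- then removes the Buffer table from the server's table list.
DETACH TABLE buffer_engine_example;

-- Re-attach it afterwards to resume buffered writes.
ATTACH TABLE buffer_engine_example;
```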
References
https://clickhouse.com/docs/en/engines/table-engines/special/buffer
https://clickhouse.com/docs/en/optimize/asynchronous-inserts
https://stackoverflow.com/questions/69147028/is-clickhouse-buffer-table-appropriate-for-realtime-ingestion-of-many-small-inse