Is It Possible to Delete Old Records from a ClickHouse Table?
Yes, It is possible. In ClickHouse we have multiple ways of freeing up the disk space by removing the old data. It’s just that we have to pick the appropriate method.
TTL
In ClickHouse you can automatically delete the records matching the conditions. These conditions are configured as individual expressions based on patterns of columns, This usually will be static offset for any timestamp column. The advantage of this method is the deletion of records is not governed by Database Triggers (Triggers are scary and super expensive from a performance perspective). TTL is configured as a static variable and data removal happens automatically in the background.
Note: TTL can be used to move data from /dev/ to different storage media devices( from HDD to SSD)
ALTER DELETE
ClickHouse is not an OLTP database so it does not support real-time deletes like PostgreSQL, MySQL or any other transaction processing Database Systems. In ClickHouse, We have something called mutations which are issued as ALTER… DELETE or ALTER … UPDATE queries to differentiate from normal DELETE or UPDATE operations, These are asynchronous batch operations (not real-time). Please remember, The mutations are extremely expensive from a performance perspective as they rewrite the entire schema parts even when there is only a single row to be deleted. This is also a common method to make your ClickHouse infrastructure GDPR-ready
DROP PARTITION
DROP PARTITION is not flexible as ALTER DELETE operation and it is purely orchestrated/biased towards partition methodology. Though operationally complex, It works most often.
TRUNCATE
This activity deletes entire ClickHouse table data but occasionally that works well to (purely based on situation)