The article explains different methods for handling updates and deletes in ClickHouse, including Lightweight Deletes and Mutations.
Below, I’ll give a comparison between the table formats for these operations with examples:
Lightweight Deletes vs. Mutation Deletes:
Aspect | Lightweight Deletes/Updates | Mutation Deletes/Updates |
---|---|---|
Syntax | DELETE FROM table WHERE condition
UPDATE table SET column = value WHERE condition |
ALTER TABLE table DELETE WHERE condition
ALTER TABLE table UPDATE column = value WHERE condition |
Concurrency | Asynchronous by default | Asynchronous by default |
Immediate Space Savings | Not immediate – data is removed during merge cycles | Immediate – space is freed up immediately |
Efficiency | Efficient for deleting a small number of rows | Less efficient, especially for large deletions |
Usage | Preferred for normal deletes | Used when immediate space savings are crucial or for compliance reasons |
An essential point to note –
- Both Lightweight Deletes/Updates and Mutation Deletes/Updates are asynchronous by default unless the setting mutations_sync is set to 1.
- Mutation method – all columns with rows are rewritten, and in the Lightweight method – Use _row_exists column for the mask.
- Mutation can cause considerable I/O and cluster overhead; keep monitoring using system.mutations table,
Example of Lightweight Delete:
-- Delete rows from table where col1 = 'Hi' and col2 = 2 DELETE FROM table WHERE col1 = 'Hi' AND col2 = 2;
Example of Mutation Delete:
-- Delete rows from table where col2 = 3 ALTER TABLE table DELETE WHERE col2 = 3;
Example of Lightweight Update:
-- Update col1 to 'Hi' where col2 = 2 UPDATE table SET col1 = 'Hi' WHERE col2 = 2;
Example of Mutation Update:
-- Update col1 to 'Hi' where col2 = 2 ALTER TABLE table UPDATE col1 = 'Hi' WHERE col2 = 2;
In summary, Lightweight Deletes and Updates are more efficient for normal operations and are asynchronous by default. At the same time, Mutation Deletes and Updates are used when immediate space or data changes are required or for compliance reasons, but they are less efficient for large-scale operations.