Introduction
In ClickHouse, memory management is handled by the MemoryTracker class, which is responsible for allocating and deallocating memory to different parts of the system. The MemoryTracker class uses a hierarchical approach to manage memory, with different levels of memory allocation, such as global, per thread, and per block.
- Global memory allocation: This is the highest level of memory allocation in ClickHouse. It is used for the memory-mapped files, the read and write buffers, and the compression and encoding buffers. The global memory allocation is done at the start of the process and is not released until the end of the process.
- Per-thread memory allocation: This level of memory allocation is used for the memory allocated by each thread. The per-thread memory allocation is done on demand, when a thread needs to perform a specific task, such as sorting or merging data.
- Per-block memory allocation: This is the lowest level of memory allocation in ClickHouse. It is used for the memory allocated by each block of data. The per-block memory allocation is done on demand, when a block of data is loaded into memory.
In addition to managing the memory allocation, MemoryTracker class also monitors the memory usage of the system, and takes action if the memory usage exceeds a certain threshold. If the memory usage exceeds the threshold, MemoryTracker will try to release memory by flushing the read and write buffers to disk, releasing the memory allocated by the threads, and releasing the memory allocated by the blocks of data.
Python script to monitor top ten ClickHouse queries by memory consumption
#import the necessary libraries import clickhouse_driver import pymysql #establish a connection to the clickhouse server conn = clickhouse_driver.connect(host='localhost', port=9000, user='user', password='password') #fetch the top 10 ClickHouse queries by memory consumption query = ''' select query, memory_usage from system.query_log order by memory_usage desc limit 10 ''' cur = conn.cursor() cur.execute(query) #fetch and print the top 10 results for row in cur: print(row) #close the connection cur.close() conn.close() #end of code
Conclusion
In summary, ClickHouse manages memory using the MemoryTracker class, which uses a hierarchical approach to manage memory allocation. Memory is allocated globally, per-thread, and per-block. MemoryTracker also monitors the memory usage of the system, and releases memory if the usage exceeds a certain threshold. This approach allows ClickHouse to efficiently manage memory and ensure that the system stays within the boundaries of the available physical memory.
To read more about Memory in ClickHouse, please do consider reading the below articles: