How is Query Cache implemented in Database Systems like ClickHouse?

Introduction

In database systems, a query cache (also known as a result cache) stores the result set of a query together with the query itself, so that when the same query is executed again the cached result can be returned instead of re-executing the query. The cache is typically held in memory for fast access, and it usually has a bounded size so that it does not consume too much memory. It can be implemented at the database engine level or at the application level using a query cache library. The cache is usually implemented as a hash table in which the key identifies the query (for example, the query text or a hash of it) and the value is the result set.
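As a minimal sketch of that idea, the class below maps a query string to its cached result set. The names QueryCache, lookup, store, Row and ResultSet are illustrative assumptions rather than the API of any particular database system; a real query cache would also normalize the query text, bound the cache size with one of the eviction algorithms described below, and invalidate entries when the underlying data changes.

#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical row and result-set types, used only for illustration.
using Row = std::vector<std::string>;
using ResultSet = std::vector<Row>;

class QueryCache {
public:
    // Returns true and fills 'result' if the query's result is cached.
    bool lookup(const std::string& query, ResultSet& result) const {
        auto it = cache_.find(query);
        if (it == cache_.end()) return false;
        result = it->second;
        return true;
    }

    // Stores the result of an executed query under its query text.
    void store(const std::string& query, const ResultSet& result) {
        cache_[query] = result;
    }

private:
    std::unordered_map<std::string, ResultSet> cache_;
};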

Algorithms for Query Cache Implementation

There are several algorithms that can be used to implement a query cache in a database system. Some of the most common algorithms include:
  1. Least Recently Used (LRU): This algorithm removes the least recently used items from the cache when the cache reaches its maximum size. It is simple and effective for workloads where recently accessed items are likely to be accessed again.
  2. Least Frequently Used (LFU): This algorithm removes the items that are used least frequently from the cache when the cache reaches its maximum size. It is useful when some items are accessed far more often than others, because frequently used items are retained even if they have not been accessed recently.
  3. First In, First Out (FIFO): This algorithm removes the oldest items from the cache when the cache reaches its maximum size. It is useful for maintaining a cache where the items have a fixed lifespan.
  4. Random replacement: This algorithm removes randomly chosen items from the cache when the cache reaches its maximum size. It is useful when there is no clear access pattern to exploit.
  5. Time-based eviction: This algorithm removes items from the cache after a set time limit has elapsed. It is useful for maintaining a cache where the items have a defined lifespan.
These are some of the common algorithms used to implement a query cache in database systems; different systems might use different algorithms based on their requirements.

Least Recently Used (LRU) algorithm

The Least Recently Used (LRU) algorithm is a cache replacement algorithm that removes the least recently used items from the cache when the cache reaches its maximum size.
The basic idea behind the LRU algorithm is that the items that are accessed most recently are the ones that are most likely to be accessed again in the future. So, when the cache is full and a new item needs to be added, the item that was accessed least recently is removed from the cache to make room for the new item.
The LRU algorithm is typically implemented using a doubly linked list and a hash table. The doubly linked list is used to store the items in the order in which they were accessed, with the most recently accessed item at the front of the list and the least recently accessed item at the back of the list. The hash table is used to store the keys of the items and their corresponding list nodes, allowing for fast access to the items.
When an item is accessed, it is moved to the front of the list, indicating that it is the most recently used item. When a new item needs to be added to the cache and the cache is full, the item at the back of the list, which is the least recently used item, is removed from the cache and the new item is added to the front of the list.

The LRU algorithm is widely used in cache management and is simple to implement, but it has the disadvantage that it does not take the frequency of access into account: an item that has been read many times can still be evicted if it has not been read recently.

 

Here is an example implementation of the Least Recently Used (LRU) algorithm in C++:

#include <unordered_map>
#include <list>

template <typename K, typename V>
class LRUCache {
public:
    LRUCache(size_t maxSize) : maxSize_(maxSize) {}

    // Returns the cached value for 'key', or a default-constructed V on a miss.
    V get(const K& key) {
        auto it = cache_.find(key);
        if (it == cache_.end()) {
            return V();
        }
        touch(it);
        return it->second.first;
    }

    // Inserts or updates a key; evicts the least recently used entry when full.
    void put(const K& key, const V& value) {
        auto it = cache_.find(key);
        if (it != cache_.end()) {
            touch(it);
        } else {
            if (cache_.size() == maxSize_) {
                cache_.erase(used_.back());
                used_.pop_back();
            }
            used_.push_front(key);
        }
        cache_[key] = { value, used_.begin() };
    }

private:
    typedef std::pair<V, typename std::list<K>::iterator> VItPair;
    std::unordered_map<K, VItPair> cache_;
    std::list<K> used_;
    size_t maxSize_;

    // Moves the entry's key to the front of used_, marking it most recently used.
    void touch(typename std::unordered_map<K, VItPair>::iterator it) {
        auto key = it->first;
        used_.erase(it->second.second);
        used_.push_front(key);
        it->second.second = used_.begin();
    }
};
This implementation uses a combination of an unordered_map and a list to implement the LRU cache. The unordered_map, cache_, is used to store the key-value pairs, while the list, used_, is used to store the keys in the order they were last accessed. The maxSize_ variable stores the maximum size of the cache.
The get() function is used to retrieve a value from the cache. If the key is found in the cache, the corresponding value is returned and the key is moved to the front of the used_ list. If the key is not found in the cache, an empty value is returned.
The put() function is used to add a key-value pair to the cache. If the key already exists in the cache, its value is updated and the key is moved to the front of the used_ list. If the key does not exist in the cache, and the cache is full, the least recently used key is removed from the cache and the new key-value pair is added to the front of the used_ list.
The touch() function is used to move a key to the front of the used_ list when it is accessed.
This is an example of how you can implement an LRU cache in C++. It is not an optimized version and can be improved based on the requirements of the system.
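For illustration, the cache above could be exercised like this (query strings as keys and result strings as values are assumptions made only for the example):

#include <string>
#include <iostream>

int main() {
    LRUCache<std::string, std::string> cache(2);
    cache.put("SELECT 1", "result-1");
    cache.put("SELECT 2", "result-2");
    cache.get("SELECT 1");                       // "SELECT 1" becomes the most recently used
    cache.put("SELECT 3", "result-3");           // evicts "SELECT 2", the least recently used
    std::cout << cache.get("SELECT 2") << "\n";  // cache miss: prints an empty line
    return 0;
}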

Least Frequently Used (LFU) algorithm

The Least Frequently Used (LFU) algorithm is a cache replacement algorithm that removes the items that are used least frequently from the cache when the cache reaches its maximum size.
The basic idea behind the LFU algorithm is that the items that are accessed more frequently are more likely to be accessed again in the future, while the items that are accessed less frequently are less likely to be accessed again. So, when the cache is full and a new item needs to be added, the item that is used least frequently is removed from the cache to make room for the new item.
The LFU algorithm is typically implemented using a hash table together with a structure ordered by frequency, such as a min-priority queue or a list of frequency buckets. The hash table stores the keys of the items and their corresponding frequency counts, allowing for fast access to the items, while the frequency-ordered structure keeps the least frequently used item at the front so it can be evicted quickly.
When an item is accessed, its frequency count is incremented and its position in the frequency-ordered structure is updated accordingly. When a new item needs to be added to the cache and the cache is full, the least frequently used item is removed from the cache and the new item is added with a frequency count of 1.
The LFU algorithm is more advanced than LRU because it takes the frequency of access into account, but it can be more complex to implement.
It is used in situations where the items in the cache have different access frequencies.

 

Here is an example implementation of the Least Frequently Used (LFU) algorithm in C++:

#include <unordered_map>
#include <list>
#include <iterator>

template <typename K, typename V>
class LFU_Cache {
    // A frequency bucket: every key stored in 'keys' has been accessed 'freq' times.
    // Buckets are kept in freq_list in ascending order of frequency, so the least
    // frequently used keys always live in the front bucket.
    struct Node {
        int freq;
        std::list<K> keys;
    };

public:
    explicit LFU_Cache(size_t cap) : capacity(cap) {}

    // Returns the cached value, or a default-constructed V on a cache miss.
    // A hit increments the key's frequency.
    V get(const K& key) {
        if (cache.count(key) == 0) return V();
        touch(key);
        return data[key];
    }

    void put(const K& key, const V& value) {
        if (capacity == 0) return;
        if (cache.count(key) > 0) {
            // Existing key: update the value and bump its frequency.
            data[key] = value;
            touch(key);
            return;
        }
        if (data.size() == capacity) {
            // Evict one key from the lowest-frequency bucket (the front of freq_list).
            Node& lowest = freq_list.front();
            K victim = lowest.keys.back();
            lowest.keys.pop_back();
            if (lowest.keys.empty()) freq_list.pop_front();
            cache.erase(victim);
            data.erase(victim);
        }
        // Insert the new key with a frequency of 1, creating the bucket if needed.
        if (freq_list.empty() || freq_list.front().freq != 1)
            freq_list.push_front({ 1, {} });
        freq_list.front().keys.push_front(key);
        data[key] = value;
        cache[key] = { 1, freq_list.begin() };
    }

private:
    // Moves 'key' from its current bucket to the bucket for (freq + 1).
    void touch(const K& key) {
        auto node_it = cache[key].second;
        int new_freq = node_it->freq + 1;

        // The bucket for new_freq, if it already exists, is immediately after the current one.
        auto next_it = std::next(node_it);
        if (next_it == freq_list.end() || next_it->freq != new_freq)
            next_it = freq_list.insert(next_it, { new_freq, {} });
        next_it->keys.push_front(key);

        // Remove the key from its old bucket and drop the bucket if it is now empty.
        node_it->keys.remove(key);
        if (node_it->keys.empty()) freq_list.erase(node_it);

        cache[key] = { new_freq, next_it };
    }

    size_t capacity;
    std::list<Node> freq_list;  // frequency buckets, ascending by frequency
    std::unordered_map<K, std::pair<int, typename std::list<Node>::iterator>> cache;
    std::unordered_map<K, V> data;
};
This implementation uses a combination of two unordered_maps and a list. The data map stores the key-value pairs, while the cache map records, for each key, its current access frequency and an iterator to the frequency bucket that contains it. The list, freq_list, holds the frequency buckets in ascending order of frequency, so the least frequently used keys are always in the front bucket.
The get() function is used to retrieve a value from the cache. If the key is found, its frequency is incremented, the key is moved to the next frequency bucket, and the corresponding value is returned. If the key is not found, an empty value is returned.
The put() function is used to add a key-value pair to the cache. If the key already exists, its value is updated and its frequency is incremented. If the key is new and the cache is full, a key is evicted from the lowest-frequency bucket before the new key is inserted with a frequency of 1.
This is an example of how you can implement an LFU cache in C++. It is not an optimized version and can be improved based on the requirements of the system.
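As with the LRU example, here is a short usage sketch (query strings as keys and result strings as values are assumptions made only for the example):

#include <string>

int main() {
    LFU_Cache<std::string, std::string> cache(2);
    cache.put("SELECT 1", "result-1");
    cache.put("SELECT 2", "result-2");
    cache.get("SELECT 1");              // "SELECT 1" now has a frequency of 2
    cache.put("SELECT 3", "result-3");  // evicts "SELECT 2", the least frequently used key
    return 0;
}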

First In First Out (FIFO) algorithm

The First In, First Out (FIFO) algorithm is a cache replacement algorithm that removes the oldest items from the cache when the cache reaches its maximum size.
The basic idea behind the FIFO algorithm is that items that were added to the cache first are removed first when the cache is full. The FIFO algorithm operates on the principle of “first in, first out”, similar to a queue data structure.
The FIFO algorithm is typically implemented using a queue data structure such as a linked list or a deque. A queue stores items in the order they are added, and the first item added is the first one to be removed. When a new item needs to be added to the cache and the cache is full, the oldest item is removed from one end of the queue and the new item is added at the other end (in the code below, new keys are pushed to the front of the list and the oldest key is evicted from the back).
The FIFO algorithm is simple to implement, but it does not take into account the frequency of access to the item. It can be useful in situations where the items in the cache have a fixed lifespan, for example, when items are added to a cache for a specific time period and then removed.

 

Here is an example implementation of the First In, First Out (FIFO) algorithm in C++:

#include <unordered_map>
#include <list>

template <typename K, typename V>
class FIFOCache {
public:
    FIFOCache(size_t maxSize) : maxSize_(maxSize) {}

    V get(const K& key) {
        auto it = cache_.find(key);
        if (it == cache_.end()) {
            return V();
        }
        return it->second;
    }

    void put(const K& key, const V& value) {
        auto it = cache_.find(key);
        if (it != cache_.end()) {
            // Key is already cached: update the value but keep its original position.
            it->second = value;
            return;
        }
        if (cache_.size() == maxSize_) {
            // Evict the oldest key, which sits at the back of the queue.
            auto oldest = queue_.back();
            queue_.pop_back();
            cache_.erase(oldest);
        }
        queue_.push_front(key);
        cache_[key] = value;
    }

private:
    std::unordered_map<K, V> cache_;
    std::list<K> queue_;
    size_t maxSize_;
};
This implementation uses a combination of an unordered_map and a list to implement the FIFO cache. The unordered_map, cache_, is used to store the key-value pairs, while the list, queue_, is used to store the keys in the order they were added to the cache. The maxSize_ variable stores the maximum size of the cache.
The get() function is used to retrieve a value from the cache. If the key is found in the cache, the corresponding value is returned. If the key is not found in the cache, an empty value is returned.
The put() function is used to add a key-value pair to the cache. If the key already exists, only its value is updated. If the key is new and the cache is full, the oldest key is removed from the cache before the new key is added to the front of the queue.
This is an example of how you can implement a FIFO cache in C++. It is not an optimized version and can be improved based on the requirements of the system.

Random Replacement algorithm

The Random replacement algorithm is a cache replacement algorithm that removes random items from the cache when the cache reaches its maximum size.
The basic idea behind the random replacement algorithm is that any item in the cache has an equal probability of being removed when the cache is full. This algorithm can be implemented by randomly selecting an item from the cache and removing it when a new item needs to be added to the cache and the cache is full.
The random replacement algorithm is typically implemented using a hash table or an array to store the items in the cache, and a random number generator to select an item to be removed from the cache.

The random replacement algorithm is simple to implement and does not require any additional data structure like a linked list or a queue. It is useful in situations where the items in the cache have no defined relationship or where it does not matter which item is removed from the cache. However, this algorithm does not take into account the frequency of access to the item.

 

Here is an example implementation of the Random replacement algorithm in C++:

#include <unordered_map>
#include <random>
#include <iterator>

template <typename K, typename V>
class RandomCache {
public:
    RandomCache(size_t maxSize) : maxSize_(maxSize) {}

    V get(const K& key) {
        auto it = cache_.find(key);
        if (it == cache_.end()) {
            return V();
        }
        return it->second;
    }

    void put(const K& key, const V& value) {
        // Only evict when inserting a genuinely new key into a full cache.
        if (cache_.find(key) == cache_.end() && cache_.size() == maxSize_) {
            // Pick a victim uniformly at random among the current entries.
            std::uniform_int_distribution<size_t> distribution(0, cache_.size() - 1);
            auto random_it = std::next(cache_.begin(), distribution(generator_));
            cache_.erase(random_it);
        }
        cache_[key] = value;
    }

private:
    std::unordered_map<K, V> cache_;
    std::mt19937 generator_;
    size_t maxSize_;
};
This implementation uses an unordered_map to store the key-value pairs, and a random number generator to randomly select an item to be removed from the cache when the cache is full.
The get() function is used to retrieve a value from the cache. If the key is found in the cache, the corresponding value is returned. If the key is not found in the cache, an empty value is returned.
The put() function is used to add a key-value pair to the cache. If the key is new and the cache is full, a randomly chosen item is evicted before the new key-value pair is added.
Please note that this is a basic example and in real-world scenarios, a more optimized version would need to be implemented based on the requirements of the system.

Time-based Eviction algorithm

The Time-based eviction algorithm is a cache replacement algorithm that removes items from the cache based on the time they were last accessed.
The basic idea behind the Time-based eviction algorithm is that items that were accessed a long time ago are less likely to be accessed again and can be safely removed from the cache. This algorithm can be implemented by keeping track of the time each item was last accessed and removing items that have not been accessed for a certain period of time, called the time-to-live (TTL) or time-to-idle (TTI).
The Time-based eviction algorithm is typically implemented using a hash table or an array to store the items in the cache, and a timestamp to store the time each item was last accessed.

The Time-based eviction algorithm is useful when the items in the cache have a defined lifespan or naturally become stale after a while, for example session data or temporary files.

 

Here is an example implementation of the Time-based eviction algorithm in C++:

#include <unordered_map>
#include <chrono>

template <typename K, typename V>
class TimeBasedCache {
public:
    TimeBasedCache(size_t maxSize, std::chrono::seconds ttl) : maxSize_(maxSize), ttl_(ttl) {}

    V get(const K& key) {
        auto it = cache_.find(key);
        if (it == cache_.end()) {
            return V();
        }
        //update the timestamp of the accessed key
        it->second.second = std::chrono::system_clock::now();
        return it->second.first;
    }

    void put(const K& key, const V& value) {
        if (cache_.size() == maxSize_) {
            // Remove expired entries; if nothing has expired yet, the cache
            // temporarily grows beyond maxSize_.
            evict();
        }
        //Insert the key with its timestamp
        cache_[key] = { value, std::chrono::system_clock::now() };
    }

    void evict() {
        auto now = std::chrono::system_clock::now();
        for (auto it = cache_.begin(); it != cache_.end(); ) {
            //if the time passed since last access is greater than ttl
            if (now - it->second.second >= ttl_) {
                it = cache_.erase(it);
            } else {
                ++it;
            }
        }
    }

private:
    std::unordered_map<K, std::pair<V, std::chrono::system_clock::time_point>> cache_;
    size_t maxSize_;
    std::chrono::seconds ttl_;
};
This implementation uses an unordered_map to store the key-value pairs along with a timestamp of the last access time.
The get() function is used to retrieve a value from the cache. If the key is found in the cache, the corresponding value is returned and the timestamp is updated. If the key is not found in the cache, an empty value is returned.
The put() function is used to add a key-value pair to the cache. If the cache is full, the function calls the evict() function to remove expired items from the cache before adding the new item.

The evict() function is used to remove items from the cache that have not been accessed for longer than the specified time-to-live (TTL). It iterates over the cache and removes any key-value pair whose last-access timestamp is older than the TTL.
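A short usage sketch follows; the query-string keys are assumptions made only for the example, and std::this_thread::sleep_for (from <thread>) is used purely to simulate time passing beyond the TTL:

#include <string>
#include <thread>

int main() {
    TimeBasedCache<std::string, std::string> cache(100, std::chrono::seconds(2));
    cache.put("SELECT 1", "result-1");
    std::this_thread::sleep_for(std::chrono::seconds(3));
    cache.evict();                   // "SELECT 1" has exceeded its 2-second TTL and is removed
    auto v = cache.get("SELECT 1");  // cache miss: returns an empty value
    return 0;
}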

Conclusion

Query caching can significantly improve the performance of a database system by reducing the number of times the same query needs to be executed and the amount of time spent executing those queries.
When a query is executed, the results of the query are stored in the query cache along with the query itself. The next time the same query is executed, the results can be retrieved from the query cache rather than executing the query again. This can significantly reduce the time spent executing queries, especially for queries that are executed frequently or take a long time to execute.
Additionally, query caching can also reduce the load on the database server by reducing the number of queries that need to be executed. This can lead to improved scalability and increased capacity for handling a larger number of concurrent users.
However, it’s important to keep in mind that query caching can also hurt performance if the cache fills up and the eviction algorithm starts removing results that are still useful, and it can return stale results if the cache is not properly invalidated when the underlying data changes.
