Boosting Materialized View Performance

Aggregation States: Boosting Materialized View Performance by 10x in ClickHouse



ClickHouse has long been celebrated for its lightning-fast analytical query performance, especially when handling massive datasets in real time. One of the most powerful features enabling this performance is the materialized view—a precomputed structure that transforms and aggregates data as it’s ingested. However, traditional materialized views can suffer from data explosion, high storage costs, and performance degradation. The solution? Aggregation states.

By leveraging ClickHouse’s AggregateFunction data types, engineers can store intermediate aggregation states instead of final scalar values. This approach dramatically reduces storage footprint, minimizes I/O, and accelerates query performance—often by up to 10x.

Why Aggregation States Matter

Standard materialized views in ClickHouse compute and store final values (e.g., sum, count, average) for each group. While effective for simple metrics, this approach becomes unwieldy when dealing with high-cardinality dimensions or complex aggregations like quantiles, unique user counts (uniq), or top-k values. Each new combination generates a new row, leading to data explosion and bloated storage.

Aggregation states solve this by storing serialized intermediate representations of ongoing computations. Instead of storing the final count, ClickHouse stores the state of the uniqCombined or quantileTDigest function, which can later be merged with other states or finalized into a result.

For example:

  • Instead of storing count = 1000 per group, ClickHouse stores the state of uniqCombinedState(user_id).
  • These states can be combined across partitions or time windows, enabling efficient rollups.

Materialized views in ClickHouse are updated in real time as data flows into the base table, functioning more like continuously updating indexes 2. By storing aggregation states, these views avoid recomputing everything from raw data and instead merge compact state objects incrementally.

Use Cases and Performance Gains

Storing aggregation states is particularly effective for:

  • Time-series rollups: Pre-aggregating metrics over time while preserving the ability to compute percentiles.
  • High-cardinality unique counting: Tracking distinct users without exploding group-by combinations.
  • Chained materialized views: One view stores raw events, while another stores aggregation states for downstream processing 3.

This method significantly reduces storage requirements. In one documented case, a materialized view that expanded 20GB of raw data into 190GB of precomputed results was optimized using state-based aggregation, cutting both storage and query latency 1. The result? Faster inserts, smaller tables, and sub-second analytical queries even on petabyte-scale datasets.

Querying Final Results

To retrieve final values from aggregation states, use the -Merge or -MergeState suffixes:

SELECT quantileTDigestMerge(0.95)(value_state) 
FROM daily_quantiles 
WHERE date = '2025-11-20'

Or finalize unique counts:

SELECT uniqCombinedMerge(user_state) 
FROM user_aggregates

These operations efficiently combine all partial states and produce accurate results—without scanning raw data.

Best Practices

  • Use AggregateFunction types (e.g., uniqCombinedState, sumMapState) in materialized view destinations.
  • Merge states periodically to avoid holding millions of them in memory 4.
  • Monitor insert performance using system.query_log to detect CPU spikes or timeouts 1.
  • Combine with TTLs and partitioning to manage lifecycle and performance.

Conclusion

ClickHouse’s support for aggregation states transforms materialized views from static summaries into dynamic, space-efficient, and scalable analytics engines. By storing computation states instead of final values, organizations can achieve up to 10x performance gains, reduce storage costs, and maintain high query responsiveness. As data volumes continue to grow, leveraging state-based aggregation isn’t just an optimization—it’s a necessity for real-time analytics at scale.

References

  1. Lessons – materialized views | ClickHouse Docs
  2. Incremental materialized view | ClickHouse Docs
  3. Chaining Materialized Views in ClickHouse

Further Reading



☛ ClickHouse Enterprise Support (24*7)

You get access to our seasoned ClickHouse support team 24*7 for an fraction of cost to hiring a full-time Sr. level ClickHouse consultant . We will help you in building an planet-scale data analytics platform using ClickHouse which is optimal, scalable and highly available.

  • Enterprise-Class ClickHouse Support
    • Technical Account Manager to clearly understand your business goals and orchestrate our support operations.
    • 30 Minute Response Time on Severity 1 (Urgent) Issues.
    • 10 Named Customer Contacts.
    • Support channels – Phone, Email, Slack, and Ticketing Systems.
    • Technical support — 30 minute response time (S1)
      • Support -levels – We have very well defined support infrastructure operations function:
        • Severity 1– Immediate attention needed, The customer’s business is severely impacted and database infrastructure is unavailable.
          • Response time (SLA) – 30 minutes.
        • Severity 2– Customer database infrastructure is available (up and running) but performance / scalability issues are directly impacting business.
          • Response time (SLA) – 12 hours. 
        • Severity 3– Low impact situation, Customer business and production infrastructure is functioning normally, but the problem is impacting the development ecosystems, also causing delay in production deployment.
          • Response time (SLA) – 24 hours.
        • Severity 4– Low to no impact situation, It is more about knowing the features and capability of components before considering the adoption.
          • Response time (SLA) – 48 hours. 
  • ClickHouse DBA Consultative Support
    • Recommendations for database architecture and design.
    • Recommendations for optimal SQL engineering.
    • Recommendations for ClickHouse Performance optimization and tuning.
    • Recommendation for index design, optimization and usage.
    • Recommendations for ClickHouse backup and disaster recovery.
    • Recommendations for ClickHouse high availability and auto failover.
    • Recommendations for ClickHouse data archiving and partitioning.
    • Recommendations for ClickHouse maintenance operations.

ChistaDATA ClickHouse Enterprise SupportRate
( plus GST / Goods and Services Tax where relevant )
Unlimited ClickHouse InstancesUS $75,000 / Year

You might also like:

About ChistaDATA Inc. 180 Articles
We are an full-stack ClickHouse infrastructure operations Consulting, Support and Managed Services provider with core expertise in performance, scalability and data SRE. Based out of California, Our consulting and support engineering team operates out of San Francisco, Vancouver, London, Germany, Russia, Ukraine, Australia, Singapore and India to deliver 24*7 enterprise-class consultative support and managed services. We operate very closely with some of the largest and planet-scale internet properties like PayPal, Garmin, Honda cars IoT project, Viacom, National Geographic, Nike, Morgan Stanley, American Express Travel, VISA, Netflix, PRADA, Blue Dart, Carlsberg, Sony, Unilever etc