Why ClickHouse works better than MapReduce?

Why we used ClickHouse for Real-Time Analytics and not something like MapReduce 


Systems like MapReduce are distributed computing systems in which the reduce operation is based on distributed sorting. Distributed sorting is not an ideal way to perform a reduce operation when the result and any intermediate results fit in the RAM of a single server, which is usually the case for online queries. In that situation a hash table is a more efficient way to perform the reduce step, and it avoids the performance bottleneck of sorting data across the cluster. Most MapReduce implementations let you execute arbitrary code on a cluster, whereas OLAP systems are optimised to run a declarative query language, which is the most compelling reason for using ClickHouse in real-time analytics. The following are strong reasons for using ClickHouse over MapReduce:

  • ClickHouse stores data in columns and processes it in chunks of columns (known as vectorized query execution). This makes CPU cache utilization more efficient and allows the use of SIMD CPU instructions.
  • ClickHouse architecture is built for scale: a single query can use all available CPU cores and disks.
  • ClickHouse keeps index data structures in memory, which allows it to read only the columns a query uses, and only the necessary row ranges of those columns, making efficient use of available system resources.
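
The difference between a sort-based reduce and a hash-table reduce can be sketched in a few lines of Python. This is only an illustration of the two techniques on a single machine, not ClickHouse's or any MapReduce framework's actual implementation. The table, its column names, and both function names are hypothetical and chosen for the example. The data is held as one list per column, so the query touches only the two columns it needs.

```python
from collections import defaultdict
from itertools import groupby

# Hypothetical columnar table: one Python list per column.
columns = {
    "country": ["DE", "US", "DE", "FR", "US", "DE"],
    "latency_ms": [120, 80, 100, 95, 70, 110],
    "user_agent": ["a", "b", "c", "d", "e", "f"],  # never read by the query below
}


def aggregate_by_sorting(keys, values):
    """Sort-based reduce: order all rows by key, then sum each run of equal keys."""
    rows = sorted(zip(keys, values), key=lambda row: row[0])
    return {
        key: sum(value for _, value in run)
        for key, run in groupby(rows, key=lambda row: row[0])
    }


def aggregate_by_hashing(keys, values):
    """Hash-based reduce: a single pass over the rows, with no sort step."""
    totals = defaultdict(int)
    for key, value in zip(keys, values):
        totals[key] += value
    return dict(totals)


# Roughly equivalent to: SELECT country, sum(latency_ms) FROM t GROUP BY country
sorted_result = aggregate_by_sorting(columns["country"], columns["latency_ms"])
hashed_result = aggregate_by_hashing(columns["country"], columns["latency_ms"])
```

Both functions return the same totals per country. The sort-based version has to order every row before it can combine anything, while the hash-based version updates a running total per key as it scans, which is why it suits results that fit in the memory of one server.
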
CHISTADATA IS COMMITTED TO OPEN SOURCE SOFTWARE AND BUILDING HIGH PERFORMANCE COLUMNSTORES

In the spirit of freedom, independence and innovation. ChistaDATA Corporation is not affiliated with ClickHouse Corporation 

ChistaDATA Inc. Knowledge base is licensed under the Apache License, Version 2.0 (the “License”)

Copyright 2022 ChistaDATA Inc

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.