Introduction
ChistaDATA Anansi is a log analysis tool built for ClickHouse. It is a CLI tool written in Go. The reports generated by the profiler shed light on different aspects of queries, such as execution time, memory usage, bytes read, and other essential information.
ChistaDATA Anansi Architecture Overview
The profiler has two stages: a query parsing stage and a report generation stage. Reading the log files and extracting information from them happens in the query parsing stage; the report is then generated from the extracted information in the second stage.
Query parsing stage
Every file is read line by line, and the information in each line is extracted using regular expressions. This builds a list of query information. The information that can be extracted is as follows:
- Timestamp
- Query Id
- Client Host
- Client Port
- User
- Query
- Completed
- Peak Memory Usage
- No Rows Read
- No Bytes Read
- Duration
- Initial Query Id
- Databases
- Tables
- ThreadIds
- Error Stacktrace
- Complete Error Text
- Error Code
- Error Message
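To illustrate this stage, here is a minimal sketch in Go of extracting a few of these fields from an executeQuery log line with a regular expression. The struct, the expression, and the field set are simplified examples for illustration, not the profiler's actual implementation:

```go
package main

import (
	"fmt"
	"regexp"
)

// QueryInfo holds a small subset of the fields listed above.
// This is an illustrative structure, not the profiler's actual type.
type QueryInfo struct {
	Timestamp  string
	QueryID    string
	ClientHost string
	ClientPort string
	Query      string
}

// queryLineRe is a simplified expression for a clickhouse-server
// "executeQuery" line; the real profiler uses a larger set of patterns.
var queryLineRe = regexp.MustCompile(
	`^(?P<ts>\d{4}\.\d{2}\.\d{2} \d{2}:\d{2}:\d{2}\.\d+) \[ \d+ \] \{(?P<id>[^}]*)\} <\w+> executeQuery: \(from \[?(?P<host>[^\]]+)\]?:(?P<port>\d+)\) (?P<query>.*)$`)

// parseLogLine extracts query information from one line, returning nil
// when the line is not an executeQuery entry.
func parseLogLine(line string) *QueryInfo {
	m := queryLineRe.FindStringSubmatch(line)
	if m == nil {
		return nil
	}
	return &QueryInfo{Timestamp: m[1], QueryID: m[2], ClientHost: m[3], ClientPort: m[4], Query: m[5]}
}

func main() {
	line := `2022.10.03 18:42:08.324308 [ 250 ] {a1b2c3} <Debug> executeQuery: (from [::1]:54324) SELECT COUNT(*) FROM salary`
	fmt.Printf("%+v\n", parseLogLine(line))
}
```

Lines that do not match any known pattern are simply skipped, which is what makes line-by-line parsing robust against unrelated log entries.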
Report generation stage
In this stage, the list of queries is processed to produce the report data. Similar queries are grouped together, and CREATE and INSERT statements are discarded. After grouping and filtering, the data is more refined, and we use it to generate the reports.
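The grouping step can be sketched as normalizing each query into a fingerprint so that calls differing only in literal values land in the same bucket. The normalization rules below are a simplified illustration of the idea, not the profiler's actual logic:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Illustrative normalization rules: replace string and numeric literals
// with a placeholder and collapse whitespace.
var (
	stringRe = regexp.MustCompile(`'[^']*'`)
	numberRe = regexp.MustCompile(`\b\d+(\.\d+)?\b`)
	spaceRe  = regexp.MustCompile(`\s+`)
)

// fingerprint normalizes a query so that calls differing only in
// literal values are grouped together.
func fingerprint(query string) string {
	q := strings.ToLower(query)
	q = stringRe.ReplaceAllString(q, "?")
	q = numberRe.ReplaceAllString(q, "?")
	return strings.TrimSpace(spaceRe.ReplaceAllString(q, " "))
}

// groupQueries buckets queries by fingerprint and counts calls,
// skipping CREATE and INSERT statements as the report stage does.
func groupQueries(queries []string) map[string]int {
	groups := map[string]int{}
	for _, q := range queries {
		upper := strings.ToUpper(strings.TrimSpace(q))
		if strings.HasPrefix(upper, "CREATE") || strings.HasPrefix(upper, "INSERT") {
			continue
		}
		groups[fingerprint(q)]++
	}
	return groups
}

func main() {
	qs := []string{
		"SELECT * FROM salary WHERE amount > 100",
		"SELECT * FROM salary WHERE amount > 250",
		"INSERT INTO salary VALUES (1, 'a')",
	}
	fmt.Println(groupQueries(qs)) // the two SELECTs share one group
}
```

Grouping by fingerprint is what lets the report show call counts and aggregate statistics per unique query rather than per raw log line.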
Features
The profiler comes with the following features:
- Multiple log file support
- Report customization using CLI arguments
- Multiple report formats are available (text, markdown)
- Can be compiled to binaries (multiple OS and Architecture compatible)
- Support for Huge files
Upcoming Features:
- Compressed file support
- S3 Support
Installation
You can get an already built binary from here.
If you want to build it yourself, follow these steps.
Prerequisites:
- Go 1.19 (might work on other versions, but this is untested)
- Git
- make
Build Steps:
- Clone the repo
git clone https://github.com/ChistaDATA/ChistaDATA-Anansi-Profiler.git
- Make ChistaDATA-Anansi-Profiler your working directory.
cd ChistaDATA-Anansi-Profiler
- Build
make build
Your executable anansi-profiler will be generated in
the working directory.
Supported log files
The profiler expects the main log of clickhouse-server as input. The file is usually located at /var/log/clickhouse-server/clickhouse-server.log . If you can’t find the log file there, please check your
server configuration to determine its location. If you haven’t enabled logging, please refer to this blog for instructions on how to set it up. In order for the query profiler to work, the query log must be enabled. You can find more on the query log here.
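For reference, the server log and the query log are controlled from the ClickHouse server configuration (typically config.xml). The snippet below shows the standard elements involved; the paths and values are examples, so check your own configuration before copying them:

```xml
<clickhouse>
    <!-- Main server text log: this is the file the profiler reads -->
    <logger>
        <level>trace</level>
        <log>/var/log/clickhouse-server/clickhouse-server.log</log>
        <errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
    </logger>
    <!-- Query log settings -->
    <query_log>
        <database>system</database>
        <table>query_log</table>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_log>
</clickhouse>
```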
Usage
anansi-profiler <file-paths> ...

Arguments:
  <file-paths> ...    Paths of log files

Flags:
  -h, --help                          Show context-sensitive help.
  -n, --top-query-count=10            Count of queries for top x table
  -r, --report-type="text"            Report type to be generated, types: md, text
  -c, --minimum-query-call-count=1    Minimum no of query calls needed
      --database-name="clickhouse"    Database type
      --database-version="0"          Database version
      --sort-field="ExecTime"         Sort queries by the given field, possible values: ExecTime, RowsRead, BytesRead, PeakMemory, QPS, QueryCount
      --sort-field-operation="max"    Sort queries by the given operation on field, possible values: sum, min, max, avg, per95, stdDev, median
      --sort-order="desc"             Sort order, possible values: asc, desc
      --log-level="error"             Log level, possible values: panic, fatal, error, warn, info, debug, trace
Examples
Let’s use the profiler to generate a sample report. I am using a sample log file with 1 GB of data.

./anansi-profiler -n 2 sample.log

The profiler will generate a report with three parts:
- Overall statistics
- Top n queries ranked
- Details of the ranked queries
Output:
# Current date: 2022-12-22 17:38:03.212385 +0530 IST m=+5.157956709
# Hostname: MacBook-Pro.local
# Files: * sample.log
# Overall: 148398, Unique: 17, QPS: 9.07
# Time range: 2022-10-03 18:42:08.324308 +0000 UTC to 2022-10-03 23:14:49.970276 +0000 UTC
# Attribute      total     min     max     avg     95%  stddev  median
# ============ ======= ======= ======= ======= ======= ======= =======
# Exec time     8.35ks   0.00s   0.93s   0.06s   0.72s   0.18s   0.00s
# Rows read     32.20G    0.00 520.69k 216.96k 513.73k 161.34k 254.06k
# Bytes read    0.86TB   0.00B 25.41MB  6.09MB 24.73MB  7.65MB  2.22MB
# Peak Memory            0.00B  1.67GB  0.12GB  1.54GB  0.39GB  4.00MB
# Profile
# Rank Response time    Calls R/Call Query
# ==== ================ ===== ====== =====
#    1 0.02s  0.06%     10314 0.00s  SELECT COUNT(*) FROM salary
#    2 0.04s  0.21%     10366 0.00s  SELECT AVG(amount) FROM salary
# Query 1 : 1.236 QPS
# Time range: From 2022-10-03 18:42:08.801737 +0000 UTC to 2022-10-03 23:14:47.425628 +0000 UTC
# ====================================================================
# Attribute      total     min     max     avg     95%  stddev  median
# ============ ======= ======= ======= ======= ======= ======= =======
# Count         10.31k
# Exec time      4.81s   0.00s   0.02s   0.00s   0.00s   0.00s   0.00s
# Rows read     73.99k    1.00    8.00    7.17    8.00    0.48    7.00
# Bytes read   40.81MB  4.01KB  4.06KB  4.05KB  4.06KB   4.44B  4.05KB
# Peak Memory            0.00B   0.00B   0.00B   0.00B   0.00B   0.00B
# ====================================================================
# Databases    default (10314/10314)
# Hosts        [::1] (10314/10314)
# Users        default (10314/10314)
# Completion   10314/10314
# Query_time distribution
# ====================================================================
#   1us
#  10us
# 100us ###########################################################
#   1ms
#  10ms
# 100ms
#    1s
#  10s+
# ====================================================================
# Query SELECT COUNT(*) FROM salary
# Query 2 : 1.242 QPS
# Time range: From 2022-10-03 18:42:09.511146 +0000 UTC to 2022-10-03 23:14:49.293234 +0000 UTC
# ====================================================================
# Attribute      total     min     max     avg     95%  stddev  median
# ============ ======= ======= ======= ======= ======= ======= =======
# Count         10.37k
# Exec time     17.62s   0.00s   0.04s   0.00s   0.00s   0.00s   0.00s
# Rows read      2.64G   1.00k 260.35k 254.91k 259.86k  10.36k 255.41k
# Bytes read   19.69GB  7.81KB  1.99MB  1.94MB  1.98MB 81.00KB  1.95MB
# Peak Memory          29.85KB  1.54MB  1.34MB  1.35MB 53.38KB  1.34MB
# ====================================================================
# Databases    default (10366/10366)
# Hosts        [::1] (10366/10366)
# Users        default (10366/10366)
# Completion   10366/10366
# Query_time distribution
# ====================================================================
#   1us
#  10us
# 100us
#   1ms ###########################################################
#  10ms
# 100ms
#    1s
#  10s+
# ====================================================================
# Query SELECT AVG(amount) FROM salary
To keep things concise, the following outputs omit the overall statistics and the detailed per-query sections.

The number of queries listed can be controlled using the -n argument. If you need the top 5 queries, run the following:

./anansi-profiler -n 5 sample.log
Output:
# Profile
# Rank Response time    Calls R/Call Query
# ==== ================ ===== ====== =====
#    1 0.93s 90.37%     10419 0.72s  select distinct(city) from salary s1 LEFT JOIN salary s2 ON s1.event_id = s2.eve
#    2 0.13s  0.31%     10340 0.00s  select min(amount) from salary where rand > 10000
#    3 0.11s  2.49%     10357 0.02s  select name, min(amount), max(amount), count(*) from salary GROUP BY name
#    4 0.10s  0.33%     10596 0.00s  SELECT COUNT(*) FROM salary where timestamp < now64() - 60
#    5 0.08s  2.96%     10398 0.02s  select AVG(amount) FROM salary s1 INNER JOIN salary2 s2 on s1.name = s2.city
The previous report ranked two queries; this one ranks five.
The sorting of queries is based on the following parameters:
- sort-field
- sort-field-operation
- sort-order
By default, queries are sorted by maximum execution time in descending order.
We can configure these values to generate the report we need.
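The ranking driven by these three parameters can be sketched in Go as an aggregation over each query group followed by a sort. This is a simplified illustration under assumed type and function names; only a few of the supported operations are shown:

```go
package main

import (
	"fmt"
	"sort"
)

// QueryStats holds per-group execution times; fields such as rows read
// or bytes read would be handled the same way. Illustrative type only.
type QueryStats struct {
	Query     string
	ExecTimes []float64 // seconds
}

// aggregate applies a sort-field operation to the samples. Only max, sum
// and avg are sketched here; the profiler also supports min, per95,
// stdDev and median.
func aggregate(samples []float64, op string) float64 {
	if len(samples) == 0 {
		return 0
	}
	var sum, max float64
	for _, v := range samples {
		sum += v
		if v > max {
			max = v
		}
	}
	switch op {
	case "sum":
		return sum
	case "avg":
		return sum / float64(len(samples))
	default: // "max"
		return max
	}
}

// rank sorts query groups by the chosen operation and order, mirroring
// the --sort-field-operation and --sort-order flags.
func rank(stats []QueryStats, op, order string) {
	sort.Slice(stats, func(i, j int) bool {
		a, b := aggregate(stats[i].ExecTimes, op), aggregate(stats[j].ExecTimes, op)
		if order == "asc" {
			return a < b
		}
		return a > b
	})
}

func main() {
	stats := []QueryStats{
		{"SELECT COUNT(*) FROM salary", []float64{0.01, 0.02}},
		{"SELECT AVG(amount) FROM salary", []float64{0.90, 0.04}},
	}
	rank(stats, "max", "desc")
	fmt.Println(stats[0].Query) // highest max exec time first
}
```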
Queries with minimum response time:

./anansi-profiler -n 5 --sort-order=asc sample.log
Output:
# Profile
# Rank Response time    Calls R/Call Query
# ==== ================ ===== ====== =====
#    1 0.02s  0.06%     10314 0.00s  SELECT COUNT(*) FROM salary
#    2 0.04s  0.21%     10366 0.00s  SELECT AVG(amount) FROM salary
#    3 0.04s  0.28%     10397 0.00s  select AVG(amount) FROM salary WHERE good = 1
#    4 0.04s  0.30%     10400 0.00s  select * from salary where amount > rand order by timestamp limit 10
#    5 0.06s  1.30%     10470 0.01s  select * from salary where amount > rand order by event_id DESC limit 10
Queries with maximum cumulative time:

./anansi-profiler -n 5 --sort-field-operation=sum sample.log
Output:
# Profile
# Rank Response time    Calls R/Call Query
# ==== ================ ===== ====== =====
#    1 0.93s 90.37%     10419 0.72s  select distinct(city) from salary s1 LEFT JOIN salary s2 ON s1.event_id = s2.eve
#    2 0.08s  2.96%     10398 0.02s  select AVG(amount) FROM salary s1 INNER JOIN salary2 s2 on s1.name = s2.city
#    3 0.11s  2.49%     10357 0.02s  select name, min(amount), max(amount), count(*) from salary GROUP BY name
#    4 0.06s  1.30%     10470 0.01s  select * from salary where amount > rand order by event_id DESC limit 10
#    5 0.08s  0.68%     10510 0.01s  select distinct(name) from salary where ilike(name, '%cross%') or ilike(name, '%
Queries ordered by the number of calls:

./anansi-profiler -n 5 --sort-field=QueryCount sample.log
Output:
# Profile
# Rank Response time    Calls R/Call Query
# ==== ================ ===== ====== =====
#    1 0.10s  0.33%     10596 0.00s  SELECT COUNT(*) FROM salary where timestamp < now64() - 60
#    2 0.08s  0.68%     10510 0.01s  select distinct(name) from salary where ilike(name, '%cross%') or ilike(name, '%
#    3 0.06s  1.30%     10470 0.01s  select * from salary where amount > rand order by event_id DESC limit 10
#    4 0.93s 90.37%     10419 0.72s  select distinct(city) from salary s1 LEFT JOIN salary s2 ON s1.event_id = s2.eve
#    5 0.04s  0.30%     10400 0.00s  select * from salary where amount > rand order by timestamp limit 10
To get information on all possible configuration options, refer to the Usage section above.
Performance
A minimal performance analysis was done using files with different sizes.
Size (MB) | Time (s)
--------- | --------
1         | 0.02
10        | 0.08
50        | 0.37
100       | 0.79
256       | 1.91
512       | 3.93
1024      | 7.87
5120      | 40.5
The tests were run on a system with the following specs:
- Chip: Apple M1 Pro
- Memory: 16 GB
Conclusion
The profiler can be used to extract crucial information from the ClickHouse log file, and it can process huge log files with millions of lines quickly. Overall, it is a simple yet powerful profiler.
To learn more about the ClickHouse query profiler, consider reading the following articles:
- ClickHouse Query Profiler for Performance Troubleshooting
- Configuring ClickHouse Query Profiler to Monitor Performance
- ChistaDATA’s Anansi Query Profiler for ClickHouse Performance: Part 2
Contribution
At ChistaDATA, we value your input and always look for ways to improve our products and services. If you have any feedback or ideas for new features, we encourage you to create a GitHub issue with an RFC (Request for Comment). This will allow us to review and consider your request as we continue to develop and enhance our offerings. Thank you for your support!