Successful ClickHouse® deployment plan
Stage 0. Build POC
- Install a single-node ClickHouse server.
- Start by creating a single table (the biggest one) with the MergeTree engine. Create ‘some’ schema (it will most probably be far from optimal). Prefer a denormalized approach for all immutable dimensions; for mutable dimensions, consider dictionaries (see the schema sketch after this list).
- Load some amount of data (at least 5 GB and 10 mln rows), preferably real data or as close to real as possible. Usually the simplest options are loading CSV / TSV files or pulling from an existing database, e.g. `INSERT INTO clickhouse_table SELECT * FROM mysql(...) WHERE ...` (see the loading sketch after this list).
- Create several representative queries.
- Check the cardinality of the columns and pick appropriate types; use the minimal type that fits (cardinality query below).
- Review the PARTITION BY and ORDER BY keys: https://kb.altinity.com/engines/mergetree-table-engine-family/pick-keys/
- Create the schema(s) with better / more promising ORDER BY / partitioning, load the data in, and pick the best schema.
- Consider improvements for particular columns (codecs / better data types, etc.): https://kb.altinity.com/altinity-kb-schema-design/codecs/altinity-kb-how-to-test-different-compression-codecs/ (codec sketch below).
- If the performance of certain queries is not sufficient, consider using PREWHERE / data-skipping indexes (sketch below).
- Repeat the steps above for the next big table(s). Avoid scenarios where you need to join big tables.
- Pick the client library for your programming language (the most mature are Python / Golang / Java / C++) and build the pipeline: inserts (low QPS, many rows per single insert, check acknowledgements & retry the same block on failures), ETLs if needed, and some reporting layer (https://kb.altinity.com/altinity-kb-integrations/bi-tools/).
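
A minimal schema sketch for the first POC table. The `events` table, its columns, the monthly partitioning and the ORDER BY are assumptions to adapt to your data, not recommendations:

```sql
-- Hypothetical POC table: immutable dimensions are denormalized inline,
-- partitioning is by month, ORDER BY starts with low-cardinality columns
-- commonly used for filtering.
CREATE TABLE events
(
    event_date  Date,
    event_time  DateTime,
    user_id     UInt64,
    event_type  LowCardinality(String),
    country     LowCardinality(String),  -- denormalized immutable dimension
    amount      Decimal(18, 2)
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_type, country, event_time);
```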
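
Two common ways to load the sample data, assuming the hypothetical table above; host, credentials and column list are placeholders:

```sql
-- Option 1: load a local CSV file through clickhouse-client:
--   clickhouse-client --query "INSERT INTO events FORMAT CSVWithNames" < events.csv

-- Option 2: pull rows straight from an existing MySQL table
-- via the mysql() table function.
INSERT INTO events
SELECT event_date, event_time, user_id, event_type, country, amount
FROM mysql('mysql-host:3306', 'source_db', 'events', 'mysql_user', 'mysql_password')
WHERE event_date >= '2024-01-01';
```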
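
A sketch of the cardinality check on the loaded sample; uniq() is approximate, which is enough for choosing types, and the columns are the hypothetical ones from the schema sketch:

```sql
SELECT
    uniq(event_type) AS event_type_cardinality,
    uniq(country)    AS country_cardinality,
    uniq(user_id)    AS user_id_cardinality,
    min(amount)      AS min_amount,  -- does the value range fit a smaller type?
    max(amount)      AS max_amount
FROM events;
```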
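
A sketch of how a codec experiment can be evaluated. The Delta + ZSTD choice is only an illustration; measure before committing to it:

```sql
-- Try a different codec on one column (applies to newly written parts;
-- existing parts keep the old encoding until they are rewritten).
ALTER TABLE events
    MODIFY COLUMN event_time DateTime CODEC(Delta, ZSTD(1));

-- Compare compressed vs. uncompressed size per column.
SELECT
    name,
    type,
    formatReadableSize(data_compressed_bytes)   AS compressed,
    formatReadableSize(data_uncompressed_bytes) AS uncompressed
FROM system.columns
WHERE database = currentDatabase() AND table = 'events'
ORDER BY data_compressed_bytes DESC;
```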
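
A hedged sketch of a data-skipping index plus PREWHERE; index name, type and granularity are illustrative, and whether either helps depends on how selective the filter is, so verify with real timings:

```sql
-- Add and materialize a skip index on a selective column.
ALTER TABLE events ADD INDEX idx_country country TYPE set(100) GRANULARITY 4;
ALTER TABLE events MATERIALIZE INDEX idx_country;

-- Move the cheap, selective condition into PREWHERE.
SELECT count()
FROM events
PREWHERE country = 'DE'
WHERE event_time >= now() - INTERVAL 7 DAY;
```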
Stage 1. Planning the production setup
- Collect more data / estimate the insert speed, and estimate the column sizes per day / month (see the sizing query after this list).
- Measure the speed of queries (query-log sketch below).
- Consider improvements using materialized views / projections / dictionaries (materialized-view sketch below).
- Collect requirements (HA / number of simultaneous queries / insert pressure / ‘exactly once’ semantics, etc.)
- Do a cluster sizing estimation and plan the hardware.
- Plan the network; if needed, consider using load balancers, etc.
- If you need sharding, consider different sharding approaches (see the Distributed table sketch after this list).
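
Per-day / per-month storage can be extrapolated from what the POC data already occupies on disk; the table name is the hypothetical one used above:

```sql
-- Compressed size and row count per month for the loaded data;
-- extrapolate from here to the expected ingest rate.
SELECT
    partition,
    formatReadableSize(sum(data_compressed_bytes)) AS compressed,
    sum(rows)                                      AS rows
FROM system.parts
WHERE table = 'events' AND active
GROUP BY partition
ORDER BY partition;
```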
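
Query timings can be pulled from system.query_log instead of timing by hand; this sketch assumes a recent ClickHouse version with query logging enabled (the default):

```sql
-- Slowest SELECTs over the last day, with rows and bytes read.
SELECT
    query_duration_ms,
    read_rows,
    formatReadableSize(read_bytes) AS read,
    substring(query, 1, 120)       AS query_head
FROM system.query_log
WHERE type = 'QueryFinish'
  AND query_kind = 'Select'
  AND event_time > now() - INTERVAL 1 DAY
ORDER BY query_duration_ms DESC
LIMIT 20;
```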
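
A minimal materialized-view sketch for pre-aggregation; the target table and the aggregation itself are hypothetical:

```sql
-- Pre-aggregated daily counts per country, maintained on every insert
-- into the source table.
CREATE TABLE events_daily
(
    event_date Date,
    country    LowCardinality(String),
    cnt        UInt64
)
ENGINE = SummingMergeTree
ORDER BY (event_date, country);

CREATE MATERIALIZED VIEW events_daily_mv TO events_daily AS
SELECT
    event_date,
    country,
    count() AS cnt
FROM events
GROUP BY event_date, country;
```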
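
One common sharding approach is a Distributed table on top of per-shard local tables; the cluster name and sharding key below are placeholders that must match your remote_servers configuration:

```sql
-- The local 'events' table exists on every shard; the Distributed table
-- routes inserts and queries using the sharding key.
CREATE TABLE events_distributed AS events
ENGINE = Distributed('my_cluster', currentDatabase(), 'events', cityHash64(user_id));
```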
Stage 2. Preprod setup & development
- Install ClickHouse in a cluster: several nodes / VMs + ZooKeeper (replication sketch after this list).
- Create a good config & automate config / OS / restarts (Ansible / Puppet, etc.)
- Set up monitoring / log processing / alerts etc.
- Set up users (example below).
- Think of schema management. Deploy the schema.
- Design backup / failover strategies (backup sketch after this list).
- Develop pipelines / queries, create test suite, CI/CD
- Do benchmark / stress tests
- Test configuration changes / server restarts / failovers / version upgrades
- Review the security topics (TLS, limits / restrictions, network, passwords).
- Document the solution for operations
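
On the cluster, tables become Replicated*. A hedged sketch, assuming a cluster named `my_cluster` is defined in remote_servers and the {shard}/{replica} macros are set on every node:

```sql
-- Replicated version of the POC table, created on every node of the cluster.
CREATE TABLE events ON CLUSTER my_cluster
(
    event_date  Date,
    event_time  DateTime,
    user_id     UInt64,
    event_type  LowCardinality(String),
    country     LowCardinality(String),
    amount      Decimal(18, 2)
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/events', '{replica}')
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_type, country, event_time);
```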
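
Users can be managed with SQL (access_management must be enabled for the administrative user); the user name, password, settings and grants here are illustrative only:

```sql
-- A read-only reporting user with a per-query memory limit.
CREATE USER reporting IDENTIFIED WITH sha256_password BY 'change_me'
    SETTINGS max_memory_usage = 10000000000, readonly = 1;
GRANT SELECT ON analytics.* TO reporting;
```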
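
For backups, typical starting points are the built-in BACKUP command (ClickHouse 22.8+, requires a configured backups disk or endpoint) or the external clickhouse-backup tool; the disk name and file name below are assumptions:

```sql
-- Built-in backup of one table to a pre-configured 'backups' disk.
BACKUP TABLE events TO Disk('backups', 'events_2024_06_01.zip');

-- ...and the matching restore.
RESTORE TABLE events FROM Disk('backups', 'events_2024_06_01.zip');
```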
Stage 3. Production setup
- Deploy the production setup (consider also canary / blue-green deployments, etc.)
- Schedule ClickHouse upgrades every 6 to 12 months (if possible)