Successful ClickHouse deployment plan

Successful ClickHouse deployment plan.

Successful ClickHouse deployment plan

Stage 0. Build POC

  1. Install single node clickhouse
  2. Start with creating a single table (the biggest one), use MergeTree engine. Create ‘some’ schema (most probably it will be far from optimal). Prefer denormalized approach for all immutable dimensions, for mutable dimensions - consider dictionaries.
  3. Load some amount of data (at least 5 Gb, and 10 mln rows) - preferable the real one, or as close to real as possible. Usully the simplest options are either through CSV / TSV files (or insert into clickhouse_table select * FROM mysql(...) where ...)
  4. Create several representative queries.
  5. Check the columns cardinality, and appropriate types, use minimal needed type
  6. Review the partition by and order by.
  7. Create the schema(s) with better/promising order by / partitioning, load data in. Pick the best schema.
  8. consider different improvements of particular columns (codecs / better data types etc.)
  9. If the performance of certain queries is not enough - consider using PREWHERE / skipping indexes
  10. Repeat 2-9 for next big table(s). Avoid scenarios when you need to join big tables.
  11. Pick the clients library for you programming language (the most mature are python / golang / java / c++), build some pipeline - for inserts (low QPS, lot of rows in singe insert, check acknowledgements & retry the same block on failures), ETLs if needed, some reporting layer (

Stage 1. Planning the production setup

  1. Collect more data / estimate insert speed, estimate the column sizes per day / month.
  2. Measure the speed of queries
  3. Consider improvement using materialized views / projections / dictionaries.
  4. Collect requirements (ha / number of simultaneous queries / insert pressure / ’exactly once’ etc)
  5. Do a cluster sizing estimation, plan the hardware 
  6. plan the network, if needed - consider using LoadBalancers etc.
  7. If you need sharding - consider different sharding approaches.

Stage 2. Preprod setup & developement

  1. Install clickhouse in cluster - several nodes / VMs + zookeeper
  2. Create good config & automate config / os / restarts (ansible / puppet etc)
  3. Set up monitoring / log processing / alerts etc.
  4. Set up users.
  5. Think of schema management. Deploy the schema.
  6. Design backup / failover strategies:
  7. Develop pipelines / queries, create test suite, CI/CD
  8. Do benchmark / stress tests
  9. Test configuration changes / server restarts / failovers / version upgrades
  10. Review the security topics (tls, limits / restrictions, network, passwords)
  11. Document the solution for operations

Stage 3. Production setup

  1. Deploy the production setup (consider also canary / blue-greed deployments etc)
  2. Schedule ClickHouse upgrades every 6 to 12 months (if possible)