Can detached parts be dropped?
Here is what the different statuses mean:
- Parts are renamed to ‘ignored’ if they were found during ATTACH together with other, bigger parts that cover the same blocks of data, i.e. they were already merged into something else.
- Parts are renamed to ‘broken’ if ClickHouse was not able to load data from them. There could be different reasons: some files are lost, checksums are not correct, etc.
- Parts are renamed to ‘unexpected’ if they are present locally but are not found in ZooKeeper, for example when an insert was not completed properly. The part is detached only if it is old enough (5 minutes); otherwise ClickHouse registers it in ZooKeeper as a new part.
- Parts are renamed to ‘cloned’ if ClickHouse already had some parts on the local disk while repairing a lost replica; those already existing parts are renamed and put in the detached directory. Controlled by setting
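To see which of these statuses are actually present on a server, a quick summary of system.detached_parts can help. The query below is a minimal sketch; it assumes your ClickHouse version exposes the `reason` column in that table and only counts parts, so it changes nothing.

```sql
-- Count detached parts per table and per detach reason (the name prefix).
SELECT
    database,
    table,
    reason,
    count() AS parts
FROM system.detached_parts
GROUP BY database, table, reason
ORDER BY database, table, reason;
```

Tables that only show ‘ignored’ parts are the easy case; the other reasons call for a closer look before anything is dropped.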
‘Ignored’ parts are safe to delete. ‘Unexpected’ and ‘broken’ parts should be investigated, although that may not be easy, especially for older parts. If the system.part_log table is enabled, you can find some information there. Otherwise, you will need to look in clickhouse-server.log for what happened when the parts were detached.
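As a starting point for the system.part_log lookup, the following sketch lists the recorded events for one part. The database, table and part name are hypothetical placeholders. It also assumes that part_log stores the part name without the detached prefix, so strip ‘broken_’ or ‘unexpected_’ from the directory name first. The log will likely show how the part was created, merged or removed rather than the detach itself.

```sql
-- Placeholders (hypothetical): 'db', 'tbl' and '202301_1_5_2'.
-- Use the part name without the detached prefix.
SELECT
    event_time,
    event_type,
    part_name,
    rows,
    error,
    exception
FROM system.part_log
WHERE database = 'db'
  AND table = 'tbl'
  AND part_name = '202301_1_5_2'
ORDER BY event_time;
```

An empty result usually means the events are older than the part_log retention, in which case clickhouse-server.log is the remaining source.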
If you can confirm in some other way that there is no data loss in the affected tables, you can simply delete all detached parts.
Here is a query that can help with investigations. It looks for active parts that contain the same data blocks as the detached parts:
SELECT
    *,
    concat('alter table ', database, '.', table, ' drop detached part ''', a.name, ''' settings allow_drop_detached=1;') AS drop
FROM system.detached_parts a
ALL LEFT JOIN
(
    SELECT database, table, partition_id, name, active, min_block_number, max_block_number
    FROM system.parts
    WHERE active
) b USING (database, table, partition_id)
WHERE a.min_block_number >= b.min_block_number
  AND a.max_block_number <= b.max_block_number