How to change ORDER BY
Create a new table and copy data through an intermediate table. Step by step procedure.
We want to add column3 to the ORDER BY in this table:
CREATE TABLE example_table
(
date Date,
column1 String,
column2 String,
column3 String,
column4 String
)
ENGINE = ReplicatedMergeTree('/clickhouse/{cluster}/tables/{shard}/default/example_table', '{replica}')
PARTITION BY toYYYYMM(date)
ORDER BY (column1, column2)
Stop publishing/INSERT into
example_table.Rename table example_table to example_table_oldCreate the new table with the old name. This will preserve all dependencies like materialized views.
CREATE TABLE example_table as example_table_old
ENGINE = ReplicatedMergeTree('/clickhouse/{cluster}/tables/{shard}/default/example_table_new', '{replica}')
PARTITION BY toYYYYMM(date)
ORDER BY (column1, column2, column3)
Copy data from
example_table_oldintoexample_table_tempa. Use this query to generate a list of INSERT statements
-- old Clickhouse versions before a support of `where _partition_id` select concat('insert into example_table_temp select * from example_table_old where toYYYYMM(date)=',partition) as cmd, database, table, partition, sum(rows), sum(bytes_on_disk), count() from system.parts where database='default' and table='example_table_old' group by database, table, partition order by partition -- newer Clickhouse versions with a support of `where _partition_id` select concat('insert into example_table_temp select * from ', table,' where _partition_id = \'',partition_id, '\';') as cmd, database, table, partition, sum(rows), sum(bytes_on_disk), count() from system.parts where database='default' and table='example_table_old' group by database, table, partition_id, partition order by partition_idb. Create an intermediate table
CREATE TABLE example_table_temp as example_table_old ENGINE = MergeTree PARTITION BY toYYYYMM(date) ORDER BY (column1, column2, column3)c. Run the queries one by one
After each query compare the number of rows in both tables. If the INSERT statement was interrupted and failed to copy data, drop the partition in
example_tableand repeat the INSERT. If a partition was copied successfully, proceed to the next partition.Here is a query to compare the tables:
select database, table, partition, sum(rows), sum(bytes_on_disk), count() from system.parts where database='default' and table like 'example_table%' group by database, table, partition order by partitionAttach data from the intermediate table to
example_tablea. Use this query to generate a list of ATTACH statements
select concat('alter table example_table attach partition id ''',partition,''' from example_table_temp') as cmd, database, table, partition, sum(rows), sum(bytes_on_disk), count() from system.parts where database='default' and table='example_table_temp' group by database, table, partition order by partitionb. Run the queries one by one
Here is a query to compare the tables:
select hostName(), database, table, partition, sum(rows), sum(bytes_on_disk), count() from clusterAllReplicas('my-cluster',system.parts) where database='default' and table like 'example_table%' group by hostName(), database, table, partition order by partitionDrop
example_table_oldandexample_table_temp