How to achieve data read consistency in ClickHouse?
Question
I'm writing data into ClickHouse Cloud and need a guarantee that every read returns the latest complete data.
Answer
Talking to the same node
If you use the native protocol, or a session, for both the write and the read, you should stay connected to the same replica. In this scenario you are reading directly from the node you wrote to, so your read will always be consistent.
Talking to a random node
If you can't guarantee you're talking to the same node (for example, when sending HTTPS calls that a load balancer distributes across replicas), you can either:
A) sync the replica before reading:
- write your data
- connect to a new replica
- run SYSTEM SYNC REPLICA db.table_name LIGHTWEIGHT
- read the latest data
See SYSTEM commands reference
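The steps in option A can be sketched as the following sequence. This is a minimal illustration, not a definitive recipe: db.table_name is the placeholder used above, while the columns id and value and the inserted row are hypothetical. It also assumes that the SYSTEM SYNC REPLICA statement and the SELECT that follows it are sent over the same connection, so that the replica that syncs is the one that serves the read.

```sql
-- Step 1: write the data (this request may land on any replica).
-- The columns `id` and `value` are hypothetical, for illustration only.
INSERT INTO db.table_name (id, value) VALUES (1, 'example');

-- Step 2: on the replica you are about to read from, wait until it has
-- caught up with the replication log for this table.
SYSTEM SYNC REPLICA db.table_name LIGHTWEIGHT;

-- Step 3: read from that same replica; the row written in step 1
-- should now be visible.
SELECT id, value
FROM db.table_name
WHERE id = 1;
```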
OR
B) read at any time with sequential consistency, by enabling the select_sequential_consistency setting on the read
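As a sketch of option B, the select_sequential_consistency setting can be applied to a single query through a SETTINGS clause. The table db.table_name is the placeholder used earlier, and the columns id and value are hypothetical.

```sql
-- Request sequential consistency for this read only.
-- The columns `id` and `value` are hypothetical, for illustration only.
SELECT id, value
FROM db.table_name
WHERE id = 1
SETTINGS select_sequential_consistency = 1;
```

Setting it per query keeps the extra cost limited to the reads that actually need the guarantee, rather than applying it to every query in the session.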
Note that when using ClickHouse Cloud and its default SharedMergeTree table engine, insert_quorum_parallel is not required: all inserts to SharedMergeTree are quorum inserts by design.
Using SYSTEM SYNC REPLICA or select_sequential_consistency increases the load on ClickHouse Keeper and may be slower, depending on the load on the service.
The recommended approach is to do the write/read using the same session or the native protocol (sticky connection).
