ClickHouse: a column-oriented database that scans billions of rows in seconds

Install on Debian / Ubuntu

sudo apt-get install -y apt-transport-https ca-certificates curl gnupg
curl -fsSL https://packages.clickhouse.com/rpm/lts/repodata/repomd.xml.key \
    | sudo gpg --dearmor -o /usr/share/keyrings/clickhouse-keyring.gpg

echo "deb [signed-by=/usr/share/keyrings/clickhouse-keyring.gpg] \
      https://packages.clickhouse.com/deb stable main" \
    | sudo tee /etc/apt/sources.list.d/clickhouse.list

sudo apt-get update
sudo apt-get install -y clickhouse-server clickhouse-client

The install asks for a default-user password during setup. Start the service:

sudo systemctl enable --now clickhouse-server
clickhouse-client --password         # interactive prompt

Create a MergeTree table

MergeTree is the engine. Variants include ReplacingMergeTree (deduplicates by sorting key on background merges), SummingMergeTree (sums numeric columns), AggregatingMergeTree (stores incremental aggregate states). Plain MergeTree first:

CREATE TABLE events (
    ts         DateTime64(3) CODEC(DoubleDelta, LZ4),
    user_id    UInt64,
    event      LowCardinality(String),
    url        String,
    duration_ms UInt32,
    country    LowCardinality(FixedString(2))
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(ts)
ORDER BY (event, user_id, ts)
SETTINGS index_granularity = 8192;

Two concepts that pay off immediately:

LowCardinality(...) — for columns with a small number of distinct values (event names, country codes), ClickHouse builds a dictionary and stores integer references. Reduces storage and accelerates GROUP BY.
CODEC(DoubleDelta, LZ4) — per-column compression. Timestamps compress 50–100× with delta encoding. Numeric series benefit from Gorilla / FPC codecs. Strings get LZ4 by default.

ORDER BY sets the primary key (sparse, every 8192 rows by default) and the sort order on disk. Queries that filter on the leftmost prefix of ORDER BY use index seeks; others fall back to full scans of relevant partitions.

Insert

INSERT INTO events VALUES
    ('2026-04-10 14:32:17.123', 1001, 'signup', '/welcome', 1240, 'CA'),
    ('2026-04-10 14:32:19.456', 1002, 'pageview', '/products', 320, 'US');

ClickHouse loves big batches. Inserting one row at a time is fine for testing; for real ingest, batch a few thousand to a few hundred thousand rows per INSERT. Background merges combine the resulting "parts" into larger ones over time — the merge in MergeTree.

Query

-- Top events of last 7 days
SELECT event, count() AS c
FROM events
WHERE ts >= now() - INTERVAL 7 DAY
GROUP BY event
ORDER BY c DESC
LIMIT 10;

-- Page-view distribution by country
SELECT country, count() AS pv
FROM events
WHERE event = 'pageview'
  AND ts >= now() - INTERVAL 30 DAY
GROUP BY country
ORDER BY pv DESC;

-- Per-day uniques
SELECT toDate(ts) AS day,
       uniq(user_id) AS unique_users
FROM events
WHERE event = 'pageview'
  AND ts >= now() - INTERVAL 90 DAY
GROUP BY day
ORDER BY day;

Even on a single laptop these queries return in milliseconds against tens of millions of rows. uniq() is HyperLogLog-based; for exact counts use uniqExact().

Materialized views for incremental aggregation

For repeating queries, materialized views pre-aggregate on the way in. They are incremental — the view is updated on every insert to the source table:

CREATE TABLE events_daily (
    day            Date,
    event          LowCardinality(String),
    pv             UInt64,
    unique_users   AggregateFunction(uniq, UInt64)
)
ENGINE = AggregatingMergeTree
PARTITION BY toYYYYMM(day)
ORDER BY (event, day);

CREATE MATERIALIZED VIEW events_daily_mv TO events_daily AS
SELECT
    toDate(ts)             AS day,
    event,
    count()                AS pv,
    uniqState(user_id)     AS unique_users
FROM events
GROUP BY day, event;

Now SELECT ... FROM events_daily reads from the pre-aggregated table. Use uniqMerge() to combine the partial AggregateFunction states across rows:

SELECT day,
       event,
       sum(pv) AS pv,
       uniqMerge(unique_users) AS unique_users
FROM events_daily
GROUP BY day, event
ORDER BY day DESC;

Streaming ingest from Kafka

CREATE TABLE kafka_source (
    ts DateTime64(3), user_id UInt64, event String, url String,
    duration_ms UInt32, country FixedString(2)
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka:9092',
         kafka_topic_list  = 'events',
         kafka_group_name  = 'clickhouse',
         kafka_format      = 'JSONEachRow',
         kafka_num_consumers = 4;

CREATE MATERIALIZED VIEW kafka_to_events TO events AS
SELECT * FROM kafka_source;

Now every message in the Kafka topic events is consumed by the kafka_source engine, then the materialized view writes it into the regular events table. End-to-end latency: a few seconds.

Reading directly from S3 / files

-- One-shot query against Parquet on S3
SELECT count() FROM s3(
    'https://my-bucket.s3.us-east-1.amazonaws.com/logs/2025/*.parquet',
    'AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY',
    'Parquet'
);

-- Same against a local file
SELECT * FROM file('/tmp/events.csv', 'CSV', 'id Int64, name String, ts DateTime');

The same dialect works against object storage, local files, HTTP URLs, MySQL / Postgres tables (federated), and other ClickHouse instances. INSERT INTO events SELECT ... FROM s3(...) is the canonical bulk-load.

Cluster topology

For one node, what's above is everything. For multi-node:

Replication — ReplicatedMergeTree engine with ZooKeeper or ClickHouse Keeper for consensus. Two replicas in two AZs covers most needs.
Sharding — Distributed engine in front of N shards, each of which is replicated. Queries fan out across shards and merge results.

Real distributed ClickHouse is heavier; for "I want fast analytics on data that fits on one big box," single-node ClickHouse is the sweet spot.

When ClickHouse is the wrong tool

OLTP workloads — many small concurrent writes, per-row updates, transactions. ClickHouse handles updates via ALTER TABLE ... UPDATE as background mutations — not interactive.
Highly mutable data — "update this row right now" is a slow operation. ReplacingMergeTree is the closest thing, but updates are eventually-consistent on merge.
Operational simplicity over scan speed — if Postgres + TimescaleDB (see that tutorial) handles the query latency, the cost of a second database engine isn't worth it. Add ClickHouse when the analytics queries actually outgrow Postgres.

The query log is the secret weapon

Every query is logged to a system table that ClickHouse itself indexes:

SELECT query_duration_ms,
       read_rows,
       memory_usage,
       query
FROM system.query_log
WHERE event_time >= now() - INTERVAL 1 HOUR
  AND type = 'QueryFinish'
ORDER BY query_duration_ms DESC
LIMIT 20;

"Show me the slowest queries in the last hour, with rows read and memory consumed" is one SQL query. The query log is the entry point for any kind of database-level investigation.