Ceph at homelab scale: distributed storage you can actually run

Hardware: 3 nodes, 3+ disks each, 10 GbE

The minimum for production Ceph is 3 nodes: the monitors need a majority quorum, and the default 3-way replication places each copy on a different host. Per node:

A boot drive (separate from OSD drives).
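2+ disks dedicated to OSDs (object storage daemons), one OSD per disk. You can mix HDD and SSD; Ceph tags each OSD with a device class, so a pool can be pinned to one or the other.
At least 16 GB RAM (more is better; ~4 GB per OSD is the rule of thumb).
10 GbE recommended; 1 GbE works for tiny homelab loads but will be the bottleneck, especially while the cluster is recovering from a failure.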
Time sync (chrony; see that tutorial) — Ceph is unhappy with skewed clocks.
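
To make the RAM rule concrete, here is a small sizing sketch. The 4 GB per OSD comes from the rule of thumb above; the extra 4 GB base allowance for the OS and the mon/mgr daemons is an assumption of this example, not an official figure, and ceph_node_ram_gb is a made-up helper name.

```shell
# Estimate RAM (in GB) for one Ceph node from its OSD count.
# Usage: ceph_node_ram_gb <osd-count> [gb-per-osd] [base-gb]
#   gb-per-osd defaults to 4 (the rule of thumb above)
#   base-gb    defaults to 4 (assumed headroom for OS + mon/mgr)
ceph_node_ram_gb() {
    osds="$1"
    per_osd_gb="${2:-4}"
    base_gb="${3:-4}"
    echo $(( osds * per_osd_gb + base_gb ))
}

# Example: ceph_node_ram_gb 3    prints 16
```

With 3 OSDs per node this lands on the 16 GB minimum; a node with 6 OSDs would want 28 GB under the same assumptions.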

Install cephadm on the first node

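# Option 1: the distro package (may lag behind upstream)
sudo apt install cephadm

# Option 2: fetch a specific release directly.
# The el9 path is correct even on Debian/Ubuntu: the cephadm binary is distro-independent.
CEPH_RELEASE=19.2.0
curl --silent --remote-name --location \
    https://download.ceph.com/rpm-${CEPH_RELEASE}/el9/noarch/cephadm
sudo install -m 755 cephadm /usr/local/sbin/

# Add the Ceph apt repo so further packages come from there (squid = 19.x),
# then install the ceph CLI that the rest of this guide runs from the host
sudo cephadm add-repo --release squid
sudo cephadm install ceph-common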

Bootstrap

# IP of the first manager / monitor
sudo cephadm bootstrap --mon-ip 10.0.6.10 \
    --initial-dashboard-user admin \
    --initial-dashboard-password "<long-random>"

This pulls the Ceph daemons as container images (Podman if present, otherwise Docker), starts the first mon and mgr on this node, writes the cluster config and admin keyring to /etc/ceph, and prints the dashboard URL and login. It takes about 5 minutes.
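
For the "<long-random>" placeholder, one option is to generate the password on the spot. This is a minimal sketch assuming a Linux host with /dev/urandom and the coreutils base64 tool; gen_dashboard_password is a made-up helper name.

```shell
# Generate a random dashboard password:
# 24 random bytes, base64-encoded, which yields 32 printable characters.
gen_dashboard_password() {
    head -c 24 /dev/urandom | base64 | tr -d '\n'
}

# Example: keep the value in a variable so you can note it down,
# then pass "$DASH_PW" to --initial-dashboard-password.
#   DASH_PW="$(gen_dashboard_password)"
#   echo "$DASH_PW"
```

Store the result in a password manager before bootstrapping; the value is random on every call and cannot be regenerated.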

Add the other 2 nodes

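# From the first node: export the cluster's SSH public key
# (the .pub suffix matters: ssh-copy-id appends it if missing)
sudo ceph cephadm get-pub-key > /tmp/ceph.pub

# Copy that public key to root's authorized_keys on the other nodes
# (requires root SSH login to be allowed on node-2 and node-3)
ssh-copy-id -f -i /tmp/ceph.pub root@node-2
ssh-copy-id -f -i /tmp/ceph.pub root@node-3

# Then add them
sudo ceph orch host add node-2 10.0.6.11
sudo ceph orch host add node-3 10.0.6.12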

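Within a few minutes, the orchestrator places a monitor on each of the three nodes and starts a standby manager (cephadm runs 2 mgr daemons by default). Verify:

sudo ceph -s
# cluster:
#     id:     ...
#     health: HEALTH_WARN   (expected until OSDs exist: OSD count 0 < osd_pool_default_size 3)
#   services:
#     mon: 3 daemons, quorum node-1,node-2,node-3
#     mgr: node-1(active), standbys: node-2
#     osd: 0 osds: 0 up, 0 in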

Add OSDs (one per data disk)

# Auto-discover and add all available disks.
# Note: this stays active as a service, so any empty disk added later is consumed too.
sudo ceph orch apply osd --all-available-devices

# Or per-disk
sudo ceph orch daemon add osd node-1:/dev/sdb
sudo ceph orch daemon add osd node-1:/dev/sdc
sudo ceph orch daemon add osd node-2:/dev/sdb
# ... etc

# Watch
sudo ceph osd tree

Each OSD takes ~30 seconds to come up. After all are running:

sudo ceph -s
# osd: 9 osds: 9 up, 9 in (assuming 3 nodes × 3 disks)
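
Rather than re-running ceph -s by hand, a script can poll until the cluster settles. The sketch below relies only on `ceph health` printing the status as its first word; wait_for_health_ok is a made-up helper name, and the timeout and interval defaults are arbitrary choices for this example.

```shell
# Poll `ceph health` until it reports HEALTH_OK or the timeout expires.
# Usage: wait_for_health_ok [timeout-seconds] [poll-interval-seconds]
# Run as a user that can reach the cluster (for example from a root shell).
wait_for_health_ok() {
    wfh_timeout="${1:-300}"
    wfh_interval="${2:-5}"
    wfh_elapsed=0
    wfh_status=""
    while [ "$wfh_elapsed" -lt "$wfh_timeout" ]; do
        # The first word of the output is the health status.
        wfh_status="$(ceph health 2>/dev/null | awk '{print $1; exit}')"
        if [ "$wfh_status" = "HEALTH_OK" ]; then
            echo "cluster healthy after ${wfh_elapsed}s"
            return 0
        fi
        sleep "$wfh_interval"
        wfh_elapsed=$(( wfh_elapsed + wfh_interval ))
    done
    echo "timed out after ${wfh_timeout}s (last status: ${wfh_status:-unknown})" >&2
    return 1
}

# Example: wait_for_health_ok 600 10 && echo "safe to create pools"
```

The non-zero return on timeout makes it usable as a gate in a provisioning script.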

The dashboard

Browse to https://<node-1>:8443 and log in with the admin user and password from bootstrap. The dashboard shows cluster health, per-OSD status, pool usage, and performance graphs. Most operational tasks (creating pools, managing CephFS and RGW) can be done from here.

Create a pool + RBD block device

For VM disk images / Kubernetes PVs:

sudo ceph osd pool create rbd-pool 64 64
sudo ceph osd pool application enable rbd-pool rbd
sudo rbd pool init rbd-pool

# Create a 100 GiB volume
sudo rbd create --size 100G rbd-pool/my-vm-disk

# Map to a Linux block device (needs the rbd kernel module on this host)
sudo rbd map rbd-pool/my-vm-disk
# /dev/rbd0

# Format and use like any block device
sudo mkfs.xfs /dev/rbd0
sudo mkdir -p /mnt/vm-data
sudo mount /dev/rbd0 /mnt/vm-data
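
The two 64s in the pool create command are placement-group counts. A common rule of thumb budgets about 100 PGs per OSD, divided by the replica count and rounded up to a power of two, with that total shared across all pools. The sketch below just does that arithmetic; pg_budget is a made-up helper name, and on recent releases the PG autoscaler may adjust the numbers anyway, so treat the result as a starting point.

```shell
# Rule-of-thumb PG budget for the whole cluster:
#   (OSDs * 100) / replicas, rounded up to the next power of two.
# Usage: pg_budget <osd-count> [replicas]
pg_budget() {
    osds="$1"
    replicas="${2:-3}"
    target=$(( osds * 100 / replicas ))
    pgs=1
    while [ "$pgs" -lt "$target" ]; do
        pgs=$(( pgs * 2 ))
    done
    echo "$pgs"
}

# Example: pg_budget 9 3    prints 512
```

For the 9-OSD cluster in this guide the budget is 512, so a 64-PG RBD pool leaves plenty of room for the CephFS and RGW pools created later.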

For Kubernetes, the ceph-csi driver plus a StorageClass pointed at this pool gives you dynamic PVC provisioning.
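
If the nodes mix HDD and SSD as the hardware list allows, a pool can be pinned to one device class with a CRUSH rule. This sketch wraps the two commands involved; pin_pool_to_class and the rule name replicated-<class> are made up for illustration, and it assumes the default CRUSH root (named default) with host as the failure domain.

```shell
# Pin an existing replicated pool to one device class (hdd, ssd, or nvme).
# Usage: pin_pool_to_class <pool> <device-class>
pin_pool_to_class() {
    pool="$1"
    class="$2"
    rule="replicated-${class}"    # hypothetical rule name
    # Create a replicated rule restricted to the device class,
    # then point the pool at it. Data migrates in the background.
    ceph osd crush rule create-replicated "$rule" default host "$class" &&
        ceph osd pool set "$pool" crush_rule "$rule"
}

# Example: pin_pool_to_class rbd-pool ssd
```

Expect a burst of backfill traffic after switching rules on a pool that already holds data.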

CephFS: POSIX filesystem

# Create the metadata + data pools
sudo ceph osd pool create cephfs_data 32
sudo ceph osd pool create cephfs_meta 32

# Create the filesystem
sudo ceph fs new myfs cephfs_meta cephfs_data

# Deploy MDS daemons (metadata servers)
sudo ceph orch apply mds myfs --placement=3

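# Create a dedicated client key. client.fsuser is a placeholder name; client.admin
# already exists as the cluster admin, and overwriting its keyring would lock you out.
sudo ceph fs authorize myfs client.fsuser / rw | sudo tee /etc/ceph/ceph.client.fsuser.keyring > /dev/null
# The kernel client reads the bare secret from its own file
sudo ceph auth get-key client.fsuser | sudo tee /etc/ceph/fsuser.secret > /dev/null

# Mount with the kernel client
sudo mkdir -p /mnt/ceph
sudo mount -t ceph node-1.lab:6789,node-2.lab:6789,node-3.lab:6789:/ /mnt/ceph \
    -o name=fsuser,secretfile=/etc/ceph/fsuser.secret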

CephFS is a POSIX-compliant network filesystem: ls, cat, hard/soft links, ownership, permissions all work. Multiple clients can mount concurrently with consistent semantics.
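
To make the mount survive reboots, the same parameters can go into /etc/fstab. The helper below only prints the line so you can review it before appending; cephfs_fstab_line is a made-up name, and the noatime and _netdev options are this example's choices (the latter delays mounting until the network is up).

```shell
# Print an /etc/fstab line for a kernel CephFS mount.
# Usage: cephfs_fstab_line <client-name> <secretfile> <mountpoint> <mon-host>...
cephfs_fstab_line() {
    name="$1"
    secretfile="$2"
    mountpoint="$3"
    shift 3
    mons=""
    # Join the monitor hosts as host:6789,host:6789,...
    for host in "$@"; do
        mons="${mons:+${mons},}${host}:6789"
    done
    printf '%s:/ %s ceph name=%s,secretfile=%s,noatime,_netdev 0 0\n' \
        "$mons" "$mountpoint" "$name" "$secretfile"
}

# Example (use the client name and secret file from your manual mount):
#   cephfs_fstab_line <client> <secretfile> /mnt/ceph node-1.lab node-2.lab node-3.lab \
#       | sudo tee -a /etc/fstab
```

Run sudo mount -a afterwards to confirm the entry works before relying on it at boot.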

S3-compatible Object Gateway (RGW)

# Deploy RGW
sudo ceph orch apply rgw default --placement=3

# An S3 endpoint now listens on port 80 of each node
# Create an S3 user
sudo radosgw-admin user create --uid=amir --display-name='Amir Eslampanah'
# Output includes access_key and secret_key

# Use with any S3 client (aws-cli, rclone, restic):
aws s3 --endpoint-url=http://node-1.lab ls

Now you have an S3-compatible store backed by the same Ceph cluster that holds your RBD and CephFS data (RGW keeps its objects in its own pools). Pair with restic (see that tutorial) for backup storage hosted on your own Ceph cluster.
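
Typing the endpoint on every call gets old quickly. A thin wrapper is one way around it; s3lab and the S3_ENDPOINT variable are made-up names, and the sketch assumes the access_key and secret_key from radosgw-admin are already exported as AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY (or stored with aws configure).

```shell
# Endpoint of the RGW service; override by exporting S3_ENDPOINT first.
S3_ENDPOINT="${S3_ENDPOINT:-http://node-1.lab}"

# Run any `aws s3` subcommand against the homelab gateway.
# Usage: s3lab <s3-subcommand> [args...]
s3lab() {
    aws --endpoint-url="$S3_ENDPOINT" s3 "$@"
}

# Examples:
#   s3lab mb s3://backups
#   s3lab cp ./archive.tar s3://backups/
#   s3lab ls s3://backups
```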

Replication, healing, redundancy

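Default pool replication is 3-way: each object lives on 3 OSDs on different hosts. Lose a disk: Ceph marks the OSD down and, once it is also marked out (after 10 minutes by default), re-replicates the affected objects to other healthy OSDs to restore 3 copies. Lose a whole node in a 3-node cluster: I/O continues from the 2 surviving copies, but there is no fourth host to hold a third copy, so the cluster stays degraded until the node returns or is replaced. Automatic healing back to 3 copies after a node loss needs 4 or more nodes.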

# See replication policy on a pool
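sudo ceph osd pool get rbd-pool size               # size: 3 (replicas)
sudo ceph osd pool get rbd-pool min_size           # min_size: 2 (reads and writes continue with 2 replicas; with fewer, I/O pauses)

# For larger setups, switch to erasure coding for capacity efficiency.
# k=4 m=2 needs at least 6 hosts with the default host failure domain.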
sudo ceph osd erasure-code-profile set ec42 k=4 m=2 plugin=jerasure technique=reed_sol_van
sudo ceph osd pool create archive 32 32 erasure ec42
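
The capacity trade-off is easy to put numbers on: a replicated pool stores raw/size, while an erasure-coded pool stores raw × k/(k+m). A rough sketch of that arithmetic follows; the function names are made up, and the figures ignore filesystem overhead and the free space Ceph needs to stay healthy.

```shell
# Usable capacity of a replicated pool: raw / replica count.
# Usage: replicated_usable <raw-tb> [size]
replicated_usable() {
    awk -v raw="$1" -v size="${2:-3}" 'BEGIN { printf "%.1f\n", raw / size }'
}

# Usable capacity of an erasure-coded pool: raw * k / (k + m).
# Usage: ec_usable <raw-tb> <k> <m>
ec_usable() {
    awk -v raw="$1" -v k="$2" -v m="$3" 'BEGIN { printf "%.1f\n", raw * k / (k + m) }'
}

# Examples with 36 TB raw (say 9 disks of 4 TB):
#   replicated_usable 36 3    prints 12.0
#   ec_usable 36 4 2          prints 24.0
```

The same raw capacity yields twice the usable space under the ec42 profile, at the cost of needing more hosts and more CPU per write.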

Backups + disaster recovery

Ceph protects against disk and node failures within a cluster but doesn't replace off-cluster backups. For DR:

RBD mirroring — replicate block volumes to a second Ceph cluster.
CephFS snapshots, copied off-cluster with rsync (to another host) or rclone/restic (to S3).
RGW multi-site replication — one cluster replicates to another asynchronously.
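
For the CephFS option, a snapshot is created by making a directory inside the hidden .snap directory of any CephFS path. The sketch below assumes the filesystem is mounted at /mnt/ceph and that snapshots are enabled; a client key restricted with fs authorize may also need the snapshot flag (rws rather than rw). cephfs_snapshot and the backup- prefix are made up for this example.

```shell
# Create a timestamped CephFS snapshot and print its path.
# Usage: cephfs_snapshot [mountpoint]
cephfs_snapshot() {
    mnt="${1:-/mnt/ceph}"
    snap="${mnt}/.snap/backup-$(date +%Y%m%d-%H%M%S)"
    # On CephFS, .snap is a virtual directory; mkdir inside it takes the snapshot.
    mkdir "$snap" && echo "$snap"
}

# Example: back up from the frozen view instead of the live tree.
#   SNAP="$(cephfs_snapshot /mnt/ceph)"
#   then point your backup tool at "$SNAP" and rmdir "$SNAP" when finished
```

Backing up from the snapshot path gives the backup tool a consistent point-in-time view while clients keep writing to the live filesystem.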

When Ceph is overkill for a homelab

For "I need NFS for my media library" — just run NFS on a single host. Replicating across 3 boxes for a watch-once media collection is overkill.
For "I need cheap S3 storage" — MinIO (see that tutorial) on one box is simpler.
For "I want shared storage between 2-3 VMs," ZFS (see that tutorial) on one box + NFS export covers it.

When Ceph earns its complexity

You want one storage pool serving block + file + object simultaneously.
You're running Kubernetes (k3s; see that tutorial) at multi-node scale and want native StorageClasses with replication.
You want storage that survives the loss of a whole node automatically.
You expect to grow past roughly 10 TB of storage.

For homelabs that hit those bars, Ceph is the most production-credible self-hosted option. For everything below that, simpler alternatives win.