ZFS is a combined filesystem and volume manager originally built by Sun Microsystems for Solaris and now developed as OpenZFS. It is widely used on Linux servers, NAS appliances, and homelabs because it combines RAID, checksumming, snapshots, compression, and caching in one coherent system.
| Core Concept | What It Means |
|---|---|
| Pool (zpool) | A collection of drives providing total storage. Carve it into datasets. |
| vdev | A group of drives — single drive, mirror pair, or RAID-Z group. |
| Dataset | A filesystem-like namespace with its own properties (compression, quota, recordsize). |
| Copy-on-Write | Data is never overwritten in place. Enables instant snapshots. |
| Checksumming | Every block has a checksum. Catches silent bit rot that most other filesystems miss. |
# Single drive (no redundancy)
zpool create tank /dev/sda
# Mirror (2 drives, 1 can fail)
zpool create tank mirror /dev/sda /dev/sdb
# RAID-Z1 (3+ drives, 1 parity)
zpool create tank raidz /dev/sda /dev/sdb /dev/sdc
# RAID-Z2 (4+ drives, 2 parity)
zpool create tank raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd
# RAID-Z3 (5+ drives, 3 parity)
zpool create tank raidz3 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde
# Stripe of mirrors (best performance)
zpool create tank mirror /dev/sda /dev/sdb mirror /dev/sdc /dev/sdd
| Layout | Min Drives | Efficiency | IOPS | Best For |
|---|---|---|---|---|
| Single | 1 | 100% | 1x | Scratch, cache, logs |
| Mirror | 2 | 50% | 2x read | VMs, databases, Docker |
| RAID-Z1 | 3 | (N-1)/N | 1x | Media storage, 3-5 drives |
| RAID-Z2 | 4 | (N-2)/N | 1x | Important data, 5-8 drives |
| RAID-Z3 | 5 | (N-3)/N | 1x | Enterprise, 8+ drives |
| Stripe of mirrors | 4+ | 50% | 2x+ | High-performance, HA |
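
To make the Efficiency column concrete, here is a minimal sketch that estimates the usable capacity of a single vdev. The helper name `usable_tb` and its arguments are made up for illustration and are not part of ZFS. It works in whole terabytes and ignores metadata, padding, and the free space ZFS needs for itself, so treat the result as an upper bound.

```shell
# usable_tb LAYOUT DRIVES SIZE_TB
# Rough usable capacity of one vdev in whole TB.
# (Hypothetical helper for illustration; not a ZFS command.)
usable_tb() {
    layout=$1
    drives=$2
    size=$3
    case $layout in
        single)        parity=0 ;;
        mirror)        echo "$size"; return 0 ;;   # every drive holds the same data
        raidz|raidz1)  parity=1 ;;
        raidz2)        parity=2 ;;
        raidz3)        parity=3 ;;
        *) echo "unknown layout: $layout" >&2; return 1 ;;
    esac
    # (N - parity) drives' worth of space holds data
    echo $(( ($drives - $parity) * $size ))
}

usable_tb raidz2 6 4    # 6 x 4 TB in RAID-Z2 -> 16
usable_tb mirror 2 4    # 2 x 4 TB mirror     -> 4
```

For a stripe of mirrors, add up the result for each mirror vdev.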
Use ashift=12 at pool creation for 4K sector alignment: zpool create -o ashift=12 tank mirror /dev/sda /dev/sdb
zpool list # List pools
zpool status # Detailed status
zpool status -v # + corrupt files
zpool iostat -v 2 # Live IO stats
zpool add tank mirror /dev/sde /dev/sdf # Add vdev
zpool replace tank /dev/sdb /dev/sdg # Replace drive
zpool online tank /dev/sdg # Bring an offlined drive back online
zpool detach tank /dev/sdb # Remove from mirror
zpool scrub tank # Verify all data
zpool export tank # Export pool
zpool import tank # Import pool
zpool destroy tank # ⚠ Destroy pool
zpool get all tank # All properties
# Monthly scrub via cron:
0 2 1 * * zpool scrub tank
# Check for errors
zpool status tank | grep -E "(READ|WRITE|CKSUM)"
# Check capacity
zpool list -H -o capacity,name
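The capacity check above can feed a simple alert. This is a sketch, not a monitoring solution: the function name `pools_over_threshold` is hypothetical, the 80% limit is an assumed rule of thumb, and the sample input stands in for real `zpool list -H -o capacity,name` output.

```shell
# pools_over_threshold LIMIT
# Reads `zpool list -H -o capacity,name` output on stdin (for example
# "83%<TAB>tank") and prints the pools at or above LIMIT percent.
pools_over_threshold() {
    awk -v limit="$1" '{
        pct = $1
        sub(/%$/, "", pct)            # strip the trailing percent sign
        if (pct + 0 >= limit + 0) print $2 " is " $1 " full"
    }'
}

# Made-up sample input in place of real zpool output:
printf '83%%\ttank\n41%%\tbackup-tank\n' | pools_over_threshold 80
```

On a real system the call would be `zpool list -H -o capacity,name | pools_over_threshold 80`.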
zfs create tank/docker
zfs create tank/media
zfs create tank/backups
zfs create -o compression=lz4 -o atime=off tank/docker
zfs list
zfs list -r tank
zfs list -t snapshot
zfs get all tank/docker
zfs get compression,atime tank/docker
zfs set compression=lz4 tank/media
zfs set atime=off tank/docker
zfs set recordsize=1M tank/media
zfs set recordsize=16K tank/docker
zfs set quota=500G tank/docker
zfs set mountpoint=/var/lib/docker tank/docker
zfs destroy tank/old-stuff
| Property | Common Values | Use Case |
|---|---|---|
| compression | lz4, zstd-3 | lz4 everywhere (near-zero overhead). zstd for better media compression. |
| atime | on, off | off for Docker/databases (saves writes). |
| recordsize | 16K, 64K, 1M | 16K for databases. 1M for media. 128K default for general. |
| quota | 500G, 1T | Hard limit including snapshots. |
| refquota | 100G | Hard limit excluding snapshots (safer). |
| logbias | latency, throughput | latency for databases. |
| snapdir | hidden, visible | visible makes .zfs/snapshot browsable. |
zfs get compressratio,logicalused,used tank/media
# Example:
# tank/media compressratio 1.45x
# tank/media logicalused 1.2T
# tank/media used 830G
zfs snapshot tank/docker@2026-07-02
zfs list -t snapshot
zfs list -t snapshot -o name,creation,used
zfs rollback tank/docker@2026-07-02 # Works only if this is the most recent snapshot
zfs rollback -r tank/docker@2026-07-02 # Also destroys any snapshots newer than this one
zfs clone tank/docker@2026-07-02 tank/docker-test
zfs promote tank/docker-test
zfs destroy tank/docker@2026-07-01
# Keep the 30 newest snapshots, destroy the rest
# (zfs destroy takes one target per call, hence -n 1)
zfs list -H -o name -t snapshot -S creation tank/docker \
| awk 'NR>30' | xargs -r -n 1 zfs destroy -v
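The pipeline above selects snapshots by position in the list (`NR>30`), not by age. If you want a true age cutoff, one option is to filter on the creation timestamp that `zfs list -p` prints as epoch seconds. The sketch below only selects names and destroys nothing; the function name `snapshots_older_than` and the sample data are illustrative assumptions.

```shell
# snapshots_older_than DAYS NOW_EPOCH
# Reads "name<TAB>creation-epoch" lines on stdin, as printed by
# `zfs list -Hp -o name,creation -t snapshot`, and prints the names of
# snapshots created more than DAYS days before NOW_EPOCH.
snapshots_older_than() {
    awk -v days="$1" -v now="$2" '$2 + 0 < now - days * 86400 { print $1 }'
}

# Made-up sample data; "now" is an arbitrary fixed timestamp so the
# example is reproducible.
now=1782950400
printf 'tank/docker@old\t%s\ntank/docker@new\t%s\n' \
    $((now - 40 * 86400)) $((now - 5 * 86400)) \
    | snapshots_older_than 30 "$now"
```

On a real pool, feed it `zfs list -Hp -o name,creation -t snapshot tank/docker`, pass `"$(date +%s)"` as the second argument, and review the output (or pipe it to `xargs -r -n 1 zfs destroy -nv` for a dry run) before destroying anything.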
# Full send
zfs send tank/docker@2026-07-02 | zfs recv backup-tank/docker
# Incremental (changes since last snapshot)
zfs send -i tank/docker@2026-07-01 tank/docker@2026-07-02 \
| zfs recv backup-tank/docker
# Over SSH
zfs send tank/docker@2026-07-02 \
| ssh backup-server zfs recv backup-tank/docker
# Compressed
zfs send -c tank/docker@2026-07-02 \
| ssh backup-server zfs recv backup-tank/docker
# With progress (needs pv installed)
zfs send tank/docker@2026-07-02 | pv \
| ssh backup-server zfs recv backup-tank/docker
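An incremental send needs a base snapshot that exists on both sides. A minimal sketch for finding it, assuming you have saved each side's snapshot suffixes (the part after `@`) to a file, oldest first. The function name `latest_common_snapshot` and the file layout are illustrative, not ZFS features.

```shell
# latest_common_snapshot SRC_FILE DST_FILE
# Each file lists snapshot suffixes, one per line, oldest first.
# Prints the newest suffix present in both files (nothing if none match).
latest_common_snapshot() {
    grep -Fx -f "$2" "$1" | tail -n 1
}

# Made-up sample lists in place of real `zfs list` output:
tmp=$(mktemp -d)
printf '%s\n' 2026-06-30 2026-07-01 2026-07-02 > "$tmp/src"
printf '%s\n' 2026-06-30 2026-07-01 > "$tmp/dst"
latest_common_snapshot "$tmp/src" "$tmp/dst"   # prints 2026-07-01
rm -r "$tmp"
```

The lists could come from `zfs list -H -o name -t snapshot -s creation tank/docker | sed 's/.*@//'` on each host. The printed suffix then becomes the `-i` base, for example `zfs send -i tank/docker@2026-07-01 tank/docker@2026-07-02`.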
The ARC (Adaptive Replacement Cache) is ZFS's in-RAM read cache. On Linux it can use up to 50% of system RAM by default. Check and limit:
cat /proc/spl/kstat/zfs/arcstats | grep -E "^(size|c_max|c_min)"
# Limit to 4GB — add to /etc/modprobe.d/zfs.conf:
options zfs zfs_arc_max=4294967296
# Apply without reboot:
echo 4294967296 > /sys/module/zfs/parameters/zfs_arc_max
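The value 4294967296 is simply 4 GiB expressed in bytes. A small sketch for producing the number and the matching modprobe line for other sizes; `gib_to_bytes` is a hypothetical helper name, not part of ZFS.

```shell
# gib_to_bytes N
# Convert whole GiB to bytes, the unit zfs_arc_max expects.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

gib_to_bytes 4    # 4294967296
gib_to_bytes 8    # 8589934592

# Print the line to place in /etc/modprobe.d/zfs.conf for an 8 GiB limit:
printf 'options zfs zfs_arc_max=%s\n' "$(gib_to_bytes 8)"
```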
L2ARC is a second-level read cache: it keeps frequently read data from HDDs on a fast SSD.
zpool add tank cache /dev/nvme1n1p1
zpool remove tank /dev/nvme1n1p1 # To remove
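Note: each L2ARC entry costs ~70 bytes of ARC, so the overhead depends on record size. A 1TB L2ARC full of 1M records needs ~70MB of ARC; full of 128K records it needs roughly 560MB. Skip L2ARC if RAM-constrained.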
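The RAM cost of an L2ARC scales with the number of cached records, so it is worth working out for your own record size. The sketch below assumes the ~70 bytes per entry quoted above and a cache completely filled with full-size records; `l2arc_header_mib` is a hypothetical helper name.

```shell
# l2arc_header_mib L2ARC_GIB RECORD_KIB [HEADER_BYTES]
# Estimate ARC memory (MiB) consumed by L2ARC headers.
# HEADER_BYTES defaults to 70, the per-entry figure assumed here.
l2arc_header_mib() {
    awk -v gib="$1" -v rec="$2" -v hdr="${3:-70}" 'BEGIN {
        entries = gib * 1024 * 1024 / rec        # cache size in KiB / record size in KiB
        printf "%.0f\n", entries * hdr / 1024 / 1024
    }'
}

l2arc_header_mib 1024 1024   # 1 TiB of 1M records   -> 70
l2arc_header_mib 1024 128    # 1 TiB of 128K records -> 560
```

Smaller records (for example 16K database datasets) push the overhead up further, which is why L2ARC suits RAM-rich systems.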
A SLOG (separate log device) speeds up sync writes (databases, NFS). A 10–20GB partition is plenty.
zpool add tank log /dev/nvme1n1p2
zpool add tank log mirror /dev/nvme1n1p2 /dev/nvme2n1p1
Critical: SLOG device must have power-loss protection (Optane, enterprise NVMe). Consumer SSDs without PLP can corrupt data during power loss.
A special vdev puts metadata and small files on fast NVMe, paired with slow HDDs for bulk data. Always mirror it: losing the special vdev loses the pool.
zpool add tank special mirror /dev/nvme1n1p3 /dev/nvme2n1p2
zfs set special_small_blocks=64K tank
| Problem | Likely Cause | Fix |
|---|---|---|
| pool DEGRADED | A drive failed | zpool status → zpool replace |
| pool FAULTED | Too many drives failed | Replace drives, zpool export + import, zpool clear |
| CKSUM errors | Corruption or bad cable | zpool scrub. If errors persist, replace drive. All drives CKSUM = bad backplane/HBA. |
| No space despite free pool | Dataset quota hit | zfs get quota,refquota → zfs set quota=none |
| dataset is full | Pool ran out of space | Delete snapshots: zfs list -t snapshot -o name,used |
| Deleting snapshot doesn't free space | Blocks shared with other snapshots | Space freed when no snapshot references the block. Nothing to fix — it's by design. |
| Slow writes | No SLOG or slow SLOG | Add a fast SLOG (Optane/enterprise NVMe). zfs set sync=disabled is an option only for non-critical data: it can lose the last few seconds of writes on a crash or power loss. |
| Pool not found after reboot | Drive letters changed | Use /dev/disk/by-id/ paths. zpool import -d /dev/disk/by-id/ to recover. |
| ARC using too much RAM | Default 50% | echo 4294967296 > /sys/module/zfs/parameters/zfs_arc_max |
| Mountpoint not empty | Leftover files | mv /var/lib/docker /var/lib/docker.old && zfs mount tank/docker |
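
For the DEGRADED and CKSUM rows, it helps to pull out just the devices with non-zero error counters. This is a rough sketch that parses the device table of `zpool status`; the function name `devices_with_errors` is hypothetical, the sample output is made up, and the parsing assumes the usual NAME/STATE/READ/WRITE/CKSUM column layout.

```shell
# devices_with_errors
# Reads `zpool status` output on stdin and prints devices whose READ,
# WRITE, or CKSUM counter is non-zero.
devices_with_errors() {
    awk '
        $1 == "NAME" && $3 == "READ" { in_config = 1; next }  # header row of the device table
        in_config && NF == 0         { in_config = 0 }        # blank line ends the table
        in_config && NF >= 5 && ($3 != "0" || $4 != "0" || $5 != "0") {
            print $1 ": read=" $3 " write=" $4 " cksum=" $5
        }'
}

# Made-up sample in place of real `zpool status tank` output:
devices_with_errors <<'EOF'
  pool: tank
 state: ONLINE
config:

    NAME        STATE     READ WRITE CKSUM
    tank        ONLINE       0     0     0
      mirror-0  ONLINE       0     0     0
        sda     ONLINE       0     0     0
        sdb     ONLINE       0     0     3

errors: No known data errors
EOF
```

On a live system: `zpool status tank | devices_with_errors`. Errors on a single device point at that drive or its cable; errors across every device point at the backplane or HBA.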
- Use ashift=12 at pool creation. Modern drives use 4K sectors internally, and wrong alignment kills performance.
- Run zfs snapshot tank/data@before-update before upgrades or migrations.
- Use /dev/disk/by-id/ paths, not /dev/sdX. Drive letters can change between reboots.
- Save the pool configuration for reference: zpool get all tank > zpool-config.txt