
CB‑SQL Backup and Restore: Logical and Physical Methods

This article explains CB‑SQL's two backup approaches, logical (DUMP and IMPORT) and physical (BACKUP and RESTORE). It covers their mechanisms, supported formats, storage options, and performance characteristics, and how they support reliable data recovery for large‑scale distributed databases.

JD Retail Technology

CB‑SQL uses Raft consensus to guarantee strong consistency across replicas, but production environments still need regular backups as a final safeguard against catastrophic failures. Backups make it possible to recover a cluster when automatic restoration is impossible.

DUMP is the logical backup command. It captures a global snapshot timestamp, reads table schemas and all data (excluding indexes) via SELECT, and writes them in row format to a local file, much like MySQL’s dump. If the dump runs longer than the data TTL (default 25 hours), data loss can occur, since data older than the TTL becomes eligible for garbage collection before the dump has read it. Logical backups are also slow for large tables and can only be stored locally.
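The TTL constraint can be made concrete with a small check. This is a minimal sketch, not CB‑SQL code: the function name and the use of epoch seconds are assumptions made for illustration, and only the 25-hour default comes from the text above.

```python
from datetime import timedelta

# Default data TTL cited in the article (25 hours).
DEFAULT_TTL = timedelta(hours=25)


def snapshot_still_readable(snapshot_ts: float, now: float,
                            ttl: timedelta = DEFAULT_TTL) -> bool:
    """Return True while the snapshot taken at snapshot_ts is younger than ttl.

    Both timestamps are seconds since the epoch (an assumption of this sketch).
    Once the snapshot is as old as the TTL, the versions it reads may have been
    garbage-collected, so a dump that is still running is no longer safe.
    """
    return (now - snapshot_ts) < ttl.total_seconds()
```

A dump that started 24 hours ago would still pass this check, while one that started 26 hours ago would not, which is why long-running logical dumps of large tables are risky.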

IMPORT is the logical restore command. It accepts CSV/TSV, Postgres, MySQL, and CockroachDB dump files, which may reside on Amazon S3, Azure, Google Cloud, HTTP, NFS/local storage, or S3‑compatible services, and it supports gzip and bzip compression. IMPORT bypasses the SQL layer: it converts files directly into KV pairs and generates RocksDB SST files for fast ingestion, and it can run imports in parallel across multiple nodes. Its main limitation is that it cannot import into an existing table.
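The idea of converting input files straight into sorted KV pairs, without going through SQL INSERT statements, might be sketched as follows. The key layout "/<table_id>/<primary key>" and the function name are hypothetical and chosen only for readability. The real CB‑SQL key encoding and SST writer are not described in this article.

```python
import csv
import io


def csv_to_kv_pairs(csv_text: str, table_id: int, pk_column: str):
    """Turn CSV rows into (key, value) byte pairs sorted by key.

    The key layout "/<table_id>/<primary key>" is a hypothetical encoding
    used only for this sketch.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    pairs = []
    for row in reader:
        key = f"/{table_id}/{row[pk_column]}".encode()
        # Store the remaining columns as the value, in column order.
        value = ",".join(v for k, v in row.items() if k != pk_column).encode()
        pairs.append((key, value))
    # SST files hold keys in sorted order, so sort before writing them out.
    pairs.sort(key=lambda kv: kv[0])
    return pairs
```

Because each input file can be converted and sorted independently, this step lends itself to being spread across nodes, which is consistent with the parallel import described above.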

BACKUP is the physical backup command; it exports data as SSTable files. It provides range‑level concurrent export, remote storage write‑back, support for single or multiple databases/tables, point‑in‑time snapshots, and both full and incremental backups. Because CB‑SQL can store up to 4 EB, backups are typically written to remote media such as NFS, Amazon S3, Azure, Google Cloud, HTTP, or S3‑compatible services. The backup process filters out obsolete MVCC versions, which reduces data size, and writes compressed SST files that RocksDB can read directly.
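The filtering of obsolete MVCC versions and the difference between full and incremental backups can be illustrated with a toy model. This is a sketch under stated assumptions rather than CB‑SQL's implementation: the Version type, the integer timestamps, and the rule that an incremental backup keeps changes in the window (prev_ts, snapshot_ts] are all assumptions made for illustration.

```python
from typing import NamedTuple, Optional


class Version(NamedTuple):
    key: str
    ts: int                 # MVCC timestamp of this version
    value: Optional[str]    # None marks a deletion (tombstone)


def _newest_per_key(versions, low_ts, high_ts):
    """Newest version of each key with low_ts < ts <= high_ts, sorted by key."""
    latest = {}
    for v in versions:
        if low_ts < v.ts <= high_ts:
            if v.key not in latest or v.ts > latest[v.key].ts:
                latest[v.key] = v
    return sorted(latest.values(), key=lambda v: v.key)


def full_backup(versions, snapshot_ts):
    """Export one live version per key as of snapshot_ts; drop deleted keys."""
    newest = _newest_per_key(versions, float("-inf"), snapshot_ts)
    return [v for v in newest if v.value is not None]


def incremental_backup(versions, prev_ts, snapshot_ts):
    """Export only changes made after prev_ts, up to and including snapshot_ts.

    Tombstones are kept so that a restore can replay deletions.
    """
    return _newest_per_key(versions, prev_ts, snapshot_ts)
```

In this model a key that was overwritten several times contributes a single version to the backup, which is where the size reduction comes from.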

RESTORE is the physical restore command. It loads the full backup files, creates the corresponding tables on the target cluster (which must not already contain those tables), and re‑encodes the TableIDs in the SST keys to match the new cluster’s IDs. After re‑encoding, the SST files are imported via IMPORT, benefiting from the same parallel import mechanism for rapid data recovery.
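The TableID re‑encoding step amounts to rewriting a key prefix. The sketch below assumes a hypothetical textual key layout b"/<table_id>/<rest>" and hypothetical function names. The actual CB‑SQL key encoding is binary and is not reproduced here.

```python
def rewrite_table_id(key: bytes, id_map: dict) -> bytes:
    """Replace the table ID prefix of a key using id_map (old ID -> new ID).

    Assumes the hypothetical layout b"/<table_id>/<rest>".
    """
    _, old_id, rest = key.split(b"/", 2)
    new_id = id_map[int(old_id.decode())]
    return b"/" + str(new_id).encode() + b"/" + rest


def rewrite_pairs(pairs, id_map):
    """Rewrite every key, then re-sort so the output is valid SST input."""
    rewritten = [(rewrite_table_id(k, id_map), v) for k, v in pairs]
    # A new ID can change the relative order of keys across tables.
    rewritten.sort(key=lambda kv: kv[0])
    return rewritten
```

For example, if table 53 in the backup is created as table 78 on the target cluster, `rewrite_table_id(b"/53/1", {53: 78})` yields `b"/78/1"`. Values are left untouched; only the key prefix changes.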

CB‑SQL offers both logical and physical backup/restore paths. Together they satisfy external data import/export needs and provide high‑performance, horizontally scalable recovery for massive datasets, making full use of RocksDB’s SST import/export capabilities and the system’s parallel processing framework.
