Understanding NetApp SnapDiff and SnapVault Backup Technologies

The article explains how NetApp's SnapDiff engine and SnapVault solution enable efficient, agent‑less incremental backups by comparing snapshot differences directly on storage, reducing backup windows and improving performance for Windows, Linux, UNIX, and heterogeneous environments.

Architects' Tech Alliance

Mar 24, 2017

Amid the cloud‑computing wave, backup products tend to emphasize openness, standardized interfaces, and generic tooling to lower management and operations costs. That standardization can cost execution efficiency and performance, so practical deployments have to balance the two. We previously discussed an open, agent‑less backup solution; this article introduces a less common, proprietary backup technology.

SnapDiff is NetApp’s internal Data ONTAP engine that quickly identifies file and directory differences between two snapshot copies. By locating changes between snapshots, SnapDiff eliminates the need for traditional backup software to scan the entire file system for incremental changes, offloading part of the backup workload to the NetApp storage and reducing the time required to process new and changed data.

When SnapDiff is used for incremental backups, the backup software communicates with NetApp’s SnapDiff engine via its programming interface to discover new, modified, and deleted files between two snapshots. The engine uses namespace and namespace mirroring to generate lists of these files, allowing the backup software to copy only the identified data deltas.
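To make the idea concrete, here is a minimal Python sketch of what a snapshot comparison produces. It is a conceptual model only, not the SnapDiff programming interface: the `FileMeta` type, the `snapshot_diff` function, and the sample paths are all hypothetical, and each snapshot is assumed to be available as a simple mapping of file path to metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMeta:
    """Minimal per-file metadata in a snapshot listing (hypothetical type)."""
    size: int
    mtime: float


def snapshot_diff(base, diff):
    """Return (created, modified, deleted) paths between two snapshot listings.

    Both arguments map a file path to its FileMeta. This only models the
    result of a snapshot comparison; it is not the NetApp API.
    """
    created = sorted(p for p in diff if p not in base)
    deleted = sorted(p for p in base if p not in diff)
    modified = sorted(p for p in diff if p in base and diff[p] != base[p])
    return created, modified, deleted


# Sample listings (made-up paths and values for illustration).
base = {
    "/vol/data/a.txt": FileMeta(size=100, mtime=1.0),
    "/vol/data/b.txt": FileMeta(size=200, mtime=1.0),
    "/vol/data/c.txt": FileMeta(size=300, mtime=1.0),
}
diff = {
    "/vol/data/a.txt": FileMeta(size=100, mtime=1.0),  # unchanged
    "/vol/data/b.txt": FileMeta(size=250, mtime=2.0),  # modified
    "/vol/data/d.txt": FileMeta(size=50, mtime=2.0),   # new
}

created, modified, deleted = snapshot_diff(base, diff)
print("created:", created)
print("modified:", modified)
print("deleted:", deleted)
```

The backup software only has to copy the files in the created and modified lists and expire the ones in the deleted list; it never walks the full file system itself.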

SnapDiff must be enabled before backup software can use it to write data to backup media, and only third‑party backup products that explicitly support SnapDiff can take advantage of it.

Currently IBM’s Tivoli Storage Manager (TSM) supports the SnapDiff option. By pairing the SnapDiff flag with the incremental command, TSM simplifies the incremental backup process: NetApp reports the changed file deltas instead of TSM scanning entire volumes, and the solution supports Windows, Linux, and UNIX systems.

During the first incremental backup using the SnapDiff option, the backup software creates a base snapshot and uses it as the source for a traditional incremental backup; the snapshot name is recorded in the TSM database.

On the second run, a newer snapshot is created (or an existing one is used); this second snapshot is called the diffsnapshot. TSM retrieves from NetApp the list of files that changed between the base snapshot and the diffsnapshot and backs them up incrementally. The base snapshot name from the previous run remains registered on the TSM server.
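The two‑run sequence can be sketched as a small piece of bookkeeping. The Python below is a toy model under stated assumptions, not TSM or NetApp code: the `SnapDiffClient` class and its method are hypothetical, a snapshot is modeled as a frozen copy of a path‑to‑version mapping, and the sketch assumes the snapshot used in one run becomes the base for the next.

```python
class SnapDiffClient:
    """Toy model of base/diff snapshot bookkeeping (hypothetical class)."""

    def __init__(self):
        self.base_name = None     # snapshot name recorded on the backup server
        self.base_listing = None  # file listing frozen with that snapshot
        self._counter = 0

    def run_incremental(self, live_files):
        """Take a snapshot of live_files (path -> version) and return what is sent."""
        self._counter += 1
        snap_name = f"snap_{self._counter}"
        listing = dict(live_files)  # a snapshot is a frozen copy of the listing

        if self.base_name is None:
            # First run: no base snapshot exists yet, so every file is backed up.
            sent = {"backup": sorted(listing), "expire": []}
        else:
            # Later runs: only differences against the base snapshot are sent.
            changed = sorted(
                p for p, v in listing.items() if self.base_listing.get(p) != v
            )
            removed = sorted(p for p in self.base_listing if p not in listing)
            sent = {"backup": changed, "expire": removed}

        # Assumption: the snapshot just used becomes the base for the next run.
        self.base_name, self.base_listing = snap_name, listing
        return sent


client = SnapDiffClient()
first = client.run_incremental({"a": 1, "b": 1})
second = client.run_incremental({"a": 1, "b": 2, "c": 1})
print(first)
print(second)
```

In this model the first run sends every file, while the second run sends only `b` (modified) and `c` (new).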

In practice, IBM TSM combined with NetApp SnapDiff can scan small files at a rate of at least 25 MB per second, and the standard SnapDiff API allows third‑party developers to build custom integrations that dramatically shorten backup windows.

SnapDiff differs from traditional backup in that the incremental comparison runs on the NetApp NAS itself, so performance depends on the speed of the NAS storage. Conventional hardware‑snapshot backups instead mount the snapshot on a CommVault (CV) host and compare files there, which is slower.

NetApp’s SnapDiff can be combined with SnapMirror (remote replication) to back up source or target volumes.

Users configure a backup‑archive client to protect source files and data. On the TSM backup‑archive client, NFS‑exported shares can be used to access the NetApp source file system.

On Windows, snapshot directories appear under ~snapshot; on AIX and Linux, they appear under .snapshot. If a snapshot is not created by TSM, TSM will not delete it.

When a SnapDiff incremental backup starts, NetApp creates a new diffsnapshot on the volume to be backed up and compares it with the base snapshot, whose name was registered on the TSM server when the previous backup completed.

SnapVault is another NetApp disk‑based backup feature. It allows read‑only snapshot copies from multiple systems to be efficiently backed up to a central auxiliary system using block‑level incremental replication, providing reliable, low‑overhead disk‑to‑disk (D2D) backup. Only data blocks that changed since the last backup are copied.
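The block‑level incremental idea can be illustrated with a short Python sketch. This is not how SnapVault is implemented: the sketch detects changed blocks by comparing hashes, which is a stand‑in chosen for simplicity, and the 4096‑byte block size and all function names are assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative block size (an assumption)


def split_blocks(data, block_size=BLOCK_SIZE):
    """Split a byte string into fixed-size blocks (the last one may be short)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Return {block_index: block_bytes} for blocks of new that differ from old."""
    old_hashes = [hashlib.sha256(b).digest() for b in split_blocks(old, block_size)]
    delta = {}
    for index, block in enumerate(split_blocks(new, block_size)):
        is_new = index >= len(old_hashes)
        if is_new or hashlib.sha256(block).digest() != old_hashes[index]:
            delta[index] = block
    return delta


def apply_delta(old, delta, new_length, block_size=BLOCK_SIZE):
    """Rebuild the new image on the secondary from its old copy plus the delta."""
    blocks = split_blocks(old, block_size)
    for index, block in sorted(delta.items()):
        if index < len(blocks):
            blocks[index] = block
        else:
            blocks.append(block)
    return b"".join(blocks)[:new_length]


# Primary volume image before and after a one-byte change in block 1.
primary_v1 = bytes(4 * BLOCK_SIZE)
changed = bytearray(primary_v1)
changed[BLOCK_SIZE + 10] = 0xFF
primary_v2 = bytes(changed)

delta = changed_blocks(primary_v1, primary_v2)
secondary = apply_delta(primary_v1, delta, len(primary_v2))
print("blocks transferred:", sorted(delta), "of", len(split_blocks(primary_v2)))
```

Only one of the four blocks crosses the wire here, yet the secondary ends up with a complete copy of the new image.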

SnapVault consists of a primary system (the source storage, which can be a NetApp device or a third‑party open system) and a secondary system (the backup target, which must be a NetApp storage device running Data ONTAP).

SnapVault supports both QTree and non‑QTree (LUN) configurations. Before starting SnapVault backup, administrators must plan the primary system’s QTree or directory structure and the corresponding secondary system QTree, schedule backup windows, and estimate initial backup duration.

For third‑party platforms (non‑NetApp primary storage), NetApp offers Open Systems SnapVault to provide a unified heterogeneous backup solution.

Open Systems SnapVault (OSSV) supports platforms such as IBM AIX, HP‑UX, IRIX, Linux, Solaris, and Windows. In a NetApp filer environment, SnapVault transfers changed blocks, whereas OSSV transfers changed files; in either case, the secondary system stores only the changed blocks.

In summary, NetApp’s snapshot technology is based on redirect‑on‑write (ROW). SnapRestore uses it for rapid data recovery, and FlexClone uses it to clone data by copying only the root node pointer. NetApp’s collaboration with CommVault on snapshot technology allows SnapVault and NDMP backups to implement fine‑grained policy management and NDMP server capabilities without deploying additional CommVault (CV) or NetBackup (NBU) backup servers.
