
Mastering MongoDB Schema: Using Variety for Validation and Analysis

This guide explains how to leverage MongoDB's flexible document model, introduces the open‑source Variety tool for schema analysis, demonstrates practical commands for sampling, depth control, filtering, sorting and result persistence, and covers MongoDB 3.2+ document validation features and their limitations.

dbaplus Community

Benefits of the MongoDB Document Model

MongoDB stores data as JSON-like documents, which gives developers a natural way to persist objects without a predefined schema. Embedding or denormalizing related data can also improve read and write performance, because a single document read replaces the joins and scattered I/O that a normalized relational design often requires.

Variety – A Schema Analyzer for MongoDB

Variety is an open‑source utility that scans a collection, reports field types and their distribution, and generates a concise report that highlights potential schema issues.
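The idea behind this kind of report can be sketched in a few lines of plain JavaScript. This is an illustration only, not Variety's actual implementation: the helper names typeOf and analyze and the sample documents are hypothetical, and the sketch looks at top-level keys only. It runs under Node.js.

```javascript
// Hypothetical helper (not part of Variety): name the type of a value.
function typeOf(value) {
  if (value === null) return 'null';
  if (Array.isArray(value)) return 'Array';
  if (value instanceof Date) return 'Date';
  var t = typeof value;
  if (t === 'object') return 'Object';
  return t.charAt(0).toUpperCase() + t.slice(1); // 'string' -> 'String'
}

// Tally, for every top-level key, which types occur and in how many documents.
function analyze(docs) {
  var stats = {};
  docs.forEach(function (doc) {
    Object.keys(doc).forEach(function (key) {
      var entry = stats[key] || (stats[key] = { types: {}, occurrences: 0 });
      var type = typeOf(doc[key]);
      entry.types[type] = (entry.types[type] || 0) + 1;
      entry.occurrences += 1;
    });
  });
  // Turn the tallies into report rows, most common keys first.
  return Object.keys(stats).map(function (key) {
    return {
      key: key,
      types: Object.keys(stats[key].types).sort(),
      occurrences: stats[key].occurrences,
      percents: (stats[key].occurrences / docs.length) * 100
    };
  }).sort(function (a, b) { return b.occurrences - a.occurrences; });
}

// Made-up sample data: phone is missing in two documents and has mixed types.
var sampleDocs = [
  { name: 'Tom', phone: '555-0100' },
  { name: 'Ann', phone: 5550101 },
  { name: 'Lee' },
  { name: 'Kim' }
];

var report = analyze(sampleDocs);
console.log(report);
```

In this sample the phone row reports two types and a 50 percent occurrence rate, which is exactly the kind of inconsistency a schema report is meant to surface.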

Using Variety: Commands and Options

After creating a collection, run Variety through the mongo shell, passing the database name and the collection to analyze:

$ mongo test --eval "var collection = 'users'" variety.js

A typical result lists each key with the types seen for it, the number of documents that contain it, and the corresponding percentage.

For large collections, limit the number of documents analyzed to avoid long scans:

$ mongo test --eval "var collection = 'users', limit = 1000" variety.js

Control nesting depth with maxDepth to ignore overly deep embedded documents:

$ mongo test --eval "var collection = 'users', maxDepth = 3" variety.js
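To make the effect of a depth limit concrete, here is a minimal sketch of depth-limited key collection in plain JavaScript. It is one plausible reading of what a depth limit means, not Variety's own code; the function collectKeys and the sample document are hypothetical. It runs under Node.js.

```javascript
// Hypothetical helper: list dotted key paths, descending at most maxDepth levels.
function collectKeys(doc, maxDepth, prefix, depth) {
  prefix = prefix || '';
  depth = depth || 1;
  var keys = [];
  Object.keys(doc).forEach(function (name) {
    var path = prefix + name;
    keys.push(path);
    var value = doc[name];
    var isPlainObject = value !== null && typeof value === 'object' &&
      !Array.isArray(value) && !(value instanceof Date);
    // Only descend while the next level is still within the limit.
    if (isPlainObject && depth < maxDepth) {
      keys = keys.concat(collectKeys(value, maxDepth, path + '.', depth + 1));
    }
  });
  return keys;
}

// Made-up document with three levels of nesting.
var user = {
  name: 'Tom',
  address: { city: 'Oslo', geo: { lat: 59.9, lon: 10.7 } }
};

var shallowKeys = collectKeys(user, 2);
var deepKeys = collectKeys(user, 3);
console.log(shallowKeys);
console.log(deepKeys);
```

With a limit of 2 the geo sub-document is reported as a single key; raising the limit to 3 also reports its lat and lon fields.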

Filter by a condition, for example only documents where caredAbout is true:

$ mongo test --eval "var collection = 'users', query = {'caredAbout':true}" variety.js

Sort results by a field:

$ mongo test --eval "var collection = 'users', sort = { updated_at : -1 }" variety.js

Choose the output format (an ASCII table by default, or JSON for further processing):

$ mongo test --quiet --eval "var collection = 'users', outputFormat='json'" variety.js

Run the analysis on a secondary, setting slaveOk to permit secondary reads, to avoid adding load to the primary:

$ mongo secondary.replicaset.member:31337/somedb --eval "var collection = 'users', slaveOk = true" variety.js

Persist the analysis results back into MongoDB:

$ mongo test --quiet --eval "var collection = 'users', persistResults=true" variety.js

Additional parameters let you specify the destination database, collection, and authentication details:

resultsDatabase – target database name, or a host[:port]/database URL for a remote instance

resultsCollection – target collection name

resultsUser – username for the target instance

resultsPass – password for the target instance

$ mongo test --quiet --eval "var collection = 'users', persistResults=true, resultsDatabase='db.example.com/variety'" variety.js

Why Use Variety?

Even though MongoDB is schema‑free, inconsistent field types can cause data quality issues, query errors, and missing information. Variety quickly reveals type mismatches, missing fields, and unexpected structures, helping teams enforce uniformity before problems surface.

Document Validation in MongoDB 3.2+

MongoDB 3.2 introduced Document Validation, allowing administrators to define rules that enforce data integrity while preserving the flexibility of a schema‑free system.

Example: a contacts collection where phone must be a string, email must end with @mongodb.com, and status must be either "Unknown" or "Incomplete".
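A validator for this example might look like the sketch below. It assumes that all three conditions must hold at once, so they are written as a single query document (an implicit AND); if any one condition should be enough, wrap them in $or instead. The variable name contactsValidator is illustrative. The db global exists only inside the mongo shell, so the guard lets the snippet load elsewhere without error.

```javascript
// Validation rules written as an ordinary MongoDB query expression.
// Assumption: every condition must hold (implicit AND).
var contactsValidator = {
  phone: { $type: 'string' },                   // phone must be a string
  email: { $regex: /@mongodb\.com$/ },          // email must end with @mongodb.com
  status: { $in: ['Unknown', 'Incomplete'] }    // status limited to two values
};

// Create the collection only when running inside the mongo shell.
if (typeof db !== 'undefined') {
  db.createCollection('contacts', { validator: contactsValidator });
}
```

With the default settings, an insert such as { phone: 12345, email: 'a@example.com', status: 'Active' } would then be rejected.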

For existing collections, validation rules can be added with the collMod command. Documents already in the collection are not checked until they are next modified.

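The validationLevel parameter controls which operations are checked against the rules:

strict – applies the rules to all inserts and all updates (default).

moderate – applies the rules to inserts and to updates of existing documents that already satisfy them; updates to existing documents that do not satisfy the rules are not checked.

off – disables validation.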

The validationAction parameter defines the response to violations:

error – rejects offending inserts/updates (default).

warn – logs a warning but allows the operation.
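Putting collMod together with both parameters, a rollout on an existing contacts collection could be sketched as follows. The choice of moderate and warn is only an example of a cautious first step, and the variable name collModCommand is illustrative. The command is sent only when the snippet runs inside the mongo shell, where db and printjson are available.

```javascript
// Command document for adding validation to an existing collection.
var collModCommand = {
  collMod: 'contacts',
  validator: {
    phone: { $type: 'string' },
    email: { $regex: /@mongodb\.com$/ },
    status: { $in: ['Unknown', 'Incomplete'] }
  },
  validationLevel: 'moderate',  // leave already-invalid documents updatable
  validationAction: 'warn'      // log violations instead of rejecting writes
};

// Send the command only when a mongo shell database handle is present.
if (typeof db !== 'undefined') {
  printjson(db.runCommand(collModCommand));
}
```

Once the logged warnings show that the existing data is clean, the same command can be reissued with validationAction set to error.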

Validation Limitations

Cannot be applied to collections in the admin, local, or config databases.

System collections (e.g., system.*) are excluded.

The article focuses on practical usage of Variety and MongoDB's built‑in validation to maintain data quality in production environments.
