Mastering NoSQL and SQL: Key Differences, MongoDB Features, and MySQL Essentials
This comprehensive guide explains what NoSQL is, contrasts non‑relational and relational databases, outlines the main representatives of each, dives deep into MongoDB architecture, features, use‑cases and limitations, and also covers MySQL fundamentals, replication, backup, and common monitoring tools like Prometheus and Zabbix.
1. What is NoSQL?
NoSQL is usually read as "Not Only SQL". It refers to non‑relational databases that do not rely on SQL as their primary query language and do not require a fixed table schema.
2. Differences between NoSQL (non‑relational) and SQL (relational) databases
Storage method
Relational databases store data in tables with rows and columns, making relationships easy to query.
NoSQL databases store data in other models, such as documents, key‑value pairs, wide columns, or graphs.
Storage structure
Relational databases require a predefined schema, which brings reliability but makes schema changes difficult.
NoSQL databases use dynamic schemas, allowing flexible data types and structures.
Normalization
Relational databases normalize data to avoid redundancy and save space.
NoSQL databases often store denormalized data, embedding related values in one record; this can speed up reads and writes at the cost of redundancy.
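The contrast between normalized and denormalized storage can be sketched with the standard library alone. This is a minimal illustration, not how any particular database stores data: the `authors` and `posts` tables, the sample rows, and the `post_documents` list are all made up for the example, and plain Python dictionaries stand in for documents.

```python
import sqlite3

# Normalized relational form: authors and posts live in separate tables
# linked by a foreign key, so the author's name is stored only once.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES authors(id),
        title TEXT NOT NULL
    );
    INSERT INTO authors VALUES (1, 'Ada');
    INSERT INTO posts VALUES (10, 1, 'Hello'), (11, 1, 'World');
""")

# Reading a post together with its author requires a join.
rows = conn.execute("""
    SELECT posts.title, authors.name
    FROM posts JOIN authors ON authors.id = posts.author_id
    ORDER BY posts.id
""").fetchall()

# Denormalized document form: each post embeds a copy of its author,
# so one lookup returns everything, but the name is stored repeatedly.
post_documents = [
    {"_id": 10, "title": "Hello", "author": {"id": 1, "name": "Ada"}},
    {"_id": 11, "title": "World", "author": {"id": 1, "name": "Ada"}},
]
doc_rows = [(d["title"], d["author"]["name"]) for d in post_documents]

print(rows)
print(doc_rows)
```

Both forms answer the same question; the trade-off is that renaming the author touches one row in the normalized form and every embedded copy in the denormalized one.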
Scalability
Relational databases typically scale vertically (adding more resources to a single server).
NoSQL databases are designed for horizontal scaling across many commodity servers.
Query method
Relational databases use SQL, a powerful, standardized language.
NoSQL databases each use their own query language or API (for example, MongoDB's query API or Redis commands); proposals for a shared language such as UnQL never became a common standard.
Transactions
Relational databases follow ACID properties (Atomicity, Consistency, Isolation, Durability).
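Many NoSQL databases follow BASE principles (Basically Available, Soft state, Eventual consistency) and trade strong transaction guarantees for availability and scale. Some now narrow the gap: MongoDB, for example, has supported multi‑document ACID transactions since version 4.0.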
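Atomicity, the "A" in ACID, is easy to demonstrate with SQLite from the Python standard library. The sketch below is an assumption-laden toy: the `accounts` table, the `transfer` and `balances` helpers, and the amounts are invented for illustration. A transfer that would overdraw an account fails a CHECK constraint, and the credit that was already applied in the same transaction is rolled back with it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit; the credit was undone too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice
print(ok, failed, balances(conn))
```

The credit is deliberately issued before the debit so that the rollback has something to undo; without the transaction, bob would keep money that alice never paid.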
3. Main representatives of SQL and NoSQL
SQL : MariaDB, MySQL, SQLite, SQL Server, Oracle, PostgreSQL.
NoSQL : Redis, MongoDB, Memcached, HBase.
4. MongoDB overview and key characteristics
MongoDB is an open‑source, distributed, document‑oriented NoSQL database and one of the most feature‑rich NoSQL systems.
Rich query language : Supports a powerful, object‑like query syntax and indexing.
Document model : Stores data as BSON documents (key‑value pairs).
Schema‑free : Each document can have a different structure.
High availability : Replica sets provide automatic failover and recovery.
Horizontal scaling : Sharding enables parallel processing and scale‑out.
Driver support : Official drivers for C/C++, C#, Java, Node.js, Perl, PHP, Python, Ruby, Scala, etc.
5. Advantages of MongoDB
Document‑oriented storage using JSON‑like format.
Every field can be indexed.
Replication and high scalability.
Automatic sharding.
Rich query capabilities.
Fast, real‑time updates.
6. Suitable and unsuitable scenarios for MongoDB
Ideal scenarios
Real‑time website data (high‑frequency inserts/updates).
Cache layer for persistent data.
Large‑scale deployments with dozens or hundreds of servers.
Storing JSON or object‑like data.
Unsuitable scenarios
Highly transactional systems (e.g., banking) that need strong ACID guarantees.
Traditional BI workloads that benefit from columnar storage and complex SQL queries.
Applications requiring complex multi‑table joins.
7. MongoDB terminology: database, collection, document
Database : A logical container; a MongoDB instance can host multiple databases.
Collection : Analogous to a table; a group of documents without a fixed schema.
Document : A BSON object, similar to a row, consisting of key‑value pairs.
8. Common MongoDB data types
String, Integer, Boolean, Array, Date, Binary Data, Code (JavaScript), Regular Expression.
9. MongoDB indexes and their purpose
Indexes dramatically improve query performance by avoiding full collection scans; without them, queries may take seconds or minutes on large datasets.
10. Typical MongoDB index types
Single‑field indexes
Compound indexes
Multikey indexes
Text indexes
Hashed indexes
Wildcard indexes
11‑13. MongoDB replication basics
A replica set has one primary, which handles client writes, and one or more secondaries that replicate it; three members is the usual minimum for automatic failover. The primary records every write in its oplog; secondaries continuously fetch new oplog entries and apply them to keep their data in sync.
Replication steps:
Secondaries read the latest timestamp from their local oplog.
They request newer entries from the primary’s oplog.
Fetched entries are inserted into the secondary’s oplog and applied.
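The three steps above can be simulated in a few lines. This is a toy model under stated assumptions, not MongoDB's implementation: the `Node` class, the `write` and `sync` helpers, and the use of a plain integer counter as the oplog timestamp are all hypothetical simplifications.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A replica set member reduced to its data and its oplog."""
    data: dict = field(default_factory=dict)
    oplog: list = field(default_factory=list)  # entries are (ts, key, value)

    def last_ts(self):
        return self.oplog[-1][0] if self.oplog else 0


def write(primary, key, value):
    """Apply a client write on the primary and record it in the oplog."""
    ts = primary.last_ts() + 1
    primary.data[key] = value
    primary.oplog.append((ts, key, value))


def sync(secondary, primary):
    """Run one replication round and return the number of entries applied."""
    since = secondary.last_ts()                         # step 1: latest local timestamp
    newer = [e for e in primary.oplog if e[0] > since]  # step 2: request newer entries
    for ts, key, value in newer:                        # step 3: store and apply them
        secondary.oplog.append((ts, key, value))
        secondary.data[key] = value
    return len(newer)


primary, secondary = Node(), Node()
write(primary, "a", 1)
write(primary, "b", 2)
first = sync(secondary, primary)
write(primary, "a", 3)
second = sync(secondary, primary)
print(first, second, secondary.data)
```

Because each round starts from the secondary's own last timestamp, a secondary that was offline for a while simply fetches a longer batch on its next round.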
14. Special secondary member types
Priority 0 – never becomes primary, useful for multi‑data‑center setups.
Hidden – invisible to clients, often used for backups.
Delayed – applies the oplog with a fixed lag, leaving a window to recover from accidental deletes or bad writes.
Arbiter (voting only) – holds no data and participates only in elections.
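As a rough sketch, these roles correspond to per-member fields in a replica set configuration document. The field names below (`priority`, `hidden`, `secondaryDelaySecs`, `arbiterOnly`) are the ones MongoDB uses, with `secondaryDelaySecs` replacing the older `slaveDelay` from version 5.0 on; the host names, the one-hour delay, and the `electable` helper are assumptions made for illustration. The configuration is written as a Python list of dictionaries so it can be inspected without a running server.

```python
# Hypothetical member list: four data-bearing nodes plus a voting-only arbiter.
members = [
    {"_id": 0, "host": "db1.example.net:27017", "priority": 1},
    # Priority 0: replicates data but can never be elected primary.
    {"_id": 1, "host": "db2.example.net:27017", "priority": 0},
    # Hidden: also priority 0, and invisible to client applications.
    {"_id": 2, "host": "db3.example.net:27017", "priority": 0, "hidden": True},
    # Delayed: hidden, priority 0, and one hour behind the primary.
    {"_id": 3, "host": "db4.example.net:27017", "priority": 0, "hidden": True,
     "secondaryDelaySecs": 3600},
    # Voting only: holds no data and just takes part in elections.
    {"_id": 4, "host": "arb.example.net:27017", "arbiterOnly": True},
]


def electable(members):
    """Hosts that could become primary: data-bearing members with priority > 0."""
    return [
        m["host"]
        for m in members
        if not m.get("arbiterOnly", False) and m.get("priority", 1) > 0
    ]


print(electable(members))
```

With this layout only `db1` can become primary, which is a deliberately extreme setup; a real deployment would normally keep at least two electable members.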
15‑18. MongoDB sharded clusters
Sharding distributes data across multiple shards for horizontal scalability. Core components:
Shard : Stores a subset of data; often a replica set.
Config server : Holds cluster metadata, including chunk information.
Query router (mongos) : Front‑end that presents the cluster as a single database to clients.
Sharding strategies:
Range‑based : Divides data by key ranges; efficient for range queries but can lead to uneven distribution.
Hash‑based : Uses a hash of the shard key to distribute data evenly; sacrifices range‑query efficiency.
Tag‑aware (called zones in current MongoDB versions) : Pins ranges of the shard key to specific shards using custom tags, giving manual, fine‑grained control over placement.
Balancing processes split oversized chunks and migrate them to under‑utilized shards to maintain even data distribution.
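The difference between range-based and hash-based routing can be sketched as follows. This is a simplified model, not MongoDB's router: the shard names, the `RANGE_BOUNDS` boundaries, and both helper functions are hypothetical, and the modulo step is a shortcut (MongoDB splits the hashed key space into chunks rather than taking a modulo).

```python
import bisect
import hashlib

SHARDS = ["shard0", "shard1", "shard2"]

# Range-based: shard0 holds keys < 1000, shard1 holds 1000..1999, shard2 the rest.
RANGE_BOUNDS = [1000, 2000]


def range_shard(key):
    """Route by key range; neighbouring keys land on the same shard."""
    return SHARDS[bisect.bisect_right(RANGE_BOUNDS, key)]


def hash_shard(key):
    """Route by a hash of the key; neighbouring keys scatter across shards."""
    digest = hashlib.md5(str(key).encode()).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


recent_keys = range(2000, 2010)  # e.g. monotonically increasing ids
print({range_shard(k) for k in recent_keys})  # all on one shard
print({hash_shard(k) for k in range(1000)})   # spread across shards
```

A range query over `recent_keys` touches a single shard under range-based routing, which is efficient to read but concentrates new writes there; hashing spreads the writes and forces the same range query to ask several shards.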
19‑22. MongoDB backup and restore methods
File‑system snapshots (requires journaling).
Copying data files (requires locking the database).
Using mongodump and mongorestore utilities.
23. MongoDB aggregation
Aggregation pipelines process multiple documents and return computed results, similar to SQL GROUP BY and COUNT(*), using the aggregate() method.
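To make the GROUP BY comparison concrete, the sketch below shows a pipeline in the shape that would be passed to aggregate(), followed by a pure-Python emulation of the same two stages. The `$match` and `$group` stages and the `$sum` accumulator are real pipeline operators; the `orders` data, the field names, and the `run_match_group_count` helper are made up so the example runs without a MongoDB server.

```python
from collections import Counter

orders = [
    {"status": "shipped", "amount": 20},
    {"status": "shipped", "amount": 5},
    {"status": "pending", "amount": 15},
    {"status": "shipped", "amount": 30},
    {"status": "pending", "amount": 8},
]

# Roughly: SELECT status, COUNT(*) FROM orders WHERE amount >= 10 GROUP BY status
pipeline = [
    {"$match": {"amount": {"$gte": 10}}},
    {"$group": {"_id": "$status", "count": {"$sum": 1}}},
]


def run_match_group_count(docs, min_amount):
    """Emulate the two stages above: filter, then count documents per status."""
    matched = (d for d in docs if d["amount"] >= min_amount)
    counts = Counter(d["status"] for d in matched)
    return sorted(
        ({"_id": status, "count": n} for status, n in counts.items()),
        key=lambda row: row["_id"],
    )


result = run_match_group_count(orders, 10)
print(result)
```

Each stage consumes the output of the previous one, which is why the filter is placed first: it shrinks the input before grouping.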
24. GridFS
GridFS stores large files by splitting them into smaller chunks, each saved as its own document within MongoDB, which works around the 16 MB BSON document size limit.
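The chunking idea can be sketched without a database. The 255 KiB chunk size matches the GridFS default, and `files_id`, `n`, and `data` are the field names GridFS uses for chunk documents; the `split_into_chunks` and `reassemble` helpers and the sample payload are hypothetical stand-ins for what a driver does internally.

```python
CHUNK_SIZE = 255 * 1024  # GridFS default chunk size: 255 KiB


def split_into_chunks(files_id, data, chunk_size=CHUNK_SIZE):
    """Cut a byte string into numbered chunk documents."""
    return [
        {"files_id": files_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]


def reassemble(chunks):
    """Rebuild the original bytes by concatenating chunks in order of n."""
    return b"".join(c["data"] for c in sorted(chunks, key=lambda c: c["n"]))


payload = bytes(600 * 1024)  # a 600 KiB file of zero bytes
chunks = split_into_chunks("file-1", payload)
print(len(chunks), [len(c["data"]) for c in chunks])
```

Only the last chunk is shorter than the chunk size, and because every chunk carries its sequence number the file can be read back in order or in part.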
25‑26. MongoDB query optimization and write durability
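Enable the profiler with db.setProfilingLevel(level, options), where the options can set a slowms threshold, to identify slow queries, then add appropriate indexes. Writes are not flushed to the data files immediately: by default they are synced at intervals of up to 60 seconds, tunable via the storage.syncPeriodSecs setting, while the journal, which is committed much more often, allows recovery of writes made since the last sync.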
27‑35. MySQL fundamentals
Indexes (e.g., B‑tree, hash) accelerate queries. Transactions follow ACID. Isolation levels include Read Uncommitted, Read Committed, Repeatable Read, and Serializable. Locks – shared (read) and exclusive (write) – prevent data inconsistency.
Primary keys ensure row uniqueness and improve performance.
Common storage engines: InnoDB (transactional, row‑level locking) and MyISAM (non‑transactional, table‑level locking).
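Replication involves a master (called the source in current MySQL versions) recording changes in its binary log, while each slave (replica) copies those events into a local relay log and replays them to stay in sync.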
High‑availability solutions include master‑slave replication, dual‑master setups, and keepalived or Heartbeat for failover.
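Optimization techniques: use EXPLAIN to inspect query plans, add indexes, prefer ENUM over VARCHAR for columns with a small fixed set of values, apply vertical partitioning, and choose the appropriate storage engine. The query cache helps only on older versions; it was deprecated in MySQL 5.7 and removed in MySQL 8.0.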
Backup options: mysqldump, system‑level snapshots (tar, LVM), and third‑party tools like Percona XtraBackup.
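The advice to check plans with EXPLAIN and then add indexes can be tried without a MySQL server. The sketch below uses SQLite's `EXPLAIN QUERY PLAN` from the Python standard library purely as a stand-in: MySQL's EXPLAIN output has a different format, and the `users` table, the index name, and the `plan` helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

QUERY = "SELECT name FROM users WHERE email = ?"


def plan(conn):
    """Return the planner's description of how QUERY would be executed."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN " + QUERY, ("user42@example.com",)
    ).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail text


plan_before = plan(conn)  # full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = plan(conn)   # lookup through the new index

print(plan_before)
print(plan_after)
```

The workflow carries over to MySQL: run EXPLAIN on the slow statement, look for a full scan, add an index on the filtered column, and confirm that the plan now uses it.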
36‑38. Monitoring tools overview
Common monitoring software includes Cacti, Zabbix, Open‑Falcon, and Prometheus.
Prometheus
Prometheus is a CNCF‑graduated open‑source monitoring system and time‑series database. Key features:
Multi‑dimensional data model with metric names and key‑value labels.
Powerful query language (PromQL) for complex analysis.
Pull‑based data collection via HTTP, with optional push gateway.
Components: Prometheus server, exporters, push gateway, Alertmanager, and a built‑in web UI (dashboards are often built with Grafana).
Supports various metric types: Counter, Gauge, Histogram, Summary.
Architecture: server scrapes targets, stores data locally, evaluates alerts, and forwards them to Alertmanager.
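The label-based data model and the pull approach meet in the plain-text format that a scrape target serves over HTTP. The sketch below renders a counter in that format. The `CounterMetric` class is a hypothetical, stripped-down stand-in written for this example and is not the official client library; the metric name and label values are also made up.

```python
class CounterMetric:
    """A monotonically increasing metric with one value per label set."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self.values = {}  # maps a sorted tuple of (label, value) pairs to a number

    def inc(self, amount=1, **labels):
        if amount < 0:
            raise ValueError("counters can only increase")
        key = tuple(sorted(labels.items()))
        self.values[key] = self.values.get(key, 0) + amount

    def render(self):
        """Render the metric in the text exposition format."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for key, value in sorted(self.values.items()):
            label_str = ",".join(f'{k}="{v}"' for k, v in key)
            lines.append(f"{self.name}{{{label_str}}} {value}")
        return "\n".join(lines) + "\n"


requests_total = CounterMetric("http_requests_total", "Total HTTP requests.")
requests_total.inc(method="GET", code="200")
requests_total.inc(method="GET", code="200")
requests_total.inc(method="POST", code="500")
text = requests_total.render()
print(text)
```

Each distinct combination of labels becomes its own time series, which is what lets PromQL filter and aggregate by `method` or `code` later. A Gauge would differ only in allowing the value to go down.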
Zabbix
Zabbix is an enterprise‑grade open‑source monitoring solution offering distributed monitoring, agent‑based data collection, SNMP, IPMI, and proxy support for scaling.
Core components:
Zabbix Server – the central process that receives collected data, evaluates triggers, and sends notifications.
Database storage – holds configuration and collected data.
Web interface – GUI for configuration and visualization.
Proxy – optional component to aggregate data from remote agents.
Agent – runs on monitored hosts to collect metrics.
Supported monitoring methods include agent checks (active/passive), SNMP, IPMI, and traps. Proxies enable distributed monitoring and reduce load on the central server.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
