Databases 14 min read

MongoDB vs MySQL: Performance Benchmark for Large-Scale Market Data Queries

This article evaluates MongoDB and MySQL for storing and querying large‑scale precious‑metal market data, detailing hardware setup, version configurations, schema designs, and extensive performance benchmarks across million‑ to billion‑record datasets, and concludes with analysis of memory architectures, data models, and suitable application scenarios.

dbaplus Community
dbaplus Community
dbaplus Community
MongoDB vs MySQL: Performance Benchmark for Large-Scale Market Data Queries

Background and Motivation

The precious‑metal market system currently retrieves data via Redis, which stores only about 8,000 minute‑level records. Anticipating queries over months or years, a more scalable solution is needed. Redis is suitable for small in‑memory datasets, while traditional relational databases face performance bottlenecks at larger scales, prompting an investigation of alternative NoSQL databases.

Why MongoDB?

According to the DB‑Engines ranking (Nov 2017), MongoDB ranks 5th overall and 1st among NoSQL databases. Its advantages include an active community, high query efficiency with full indexing when memory is sufficient, and a sharding mechanism that supports massive data storage and horizontal scaling. The main drawbacks are lack of transactions and limited support for joins and complex queries, which do not affect the target use case.

Performance Test Setup

Hardware Environment

Server specifications are listed in the accompanying image.

Database Versions

MongoDB server version: 3.4.5

MongoDB client driver: mongo-java-driver-2.14.3

MySQL server version: 5.6.34

MySQL connector: MySQL-connector-java-6.0.6

MongoDB uses the WiredTiger storage engine; MySQL uses InnoDB.

Schema and Indexes

MongoDB collection has a unique index on dateTime. Example document structure is shown below:

MySQL table uses a composite unique index on DATETIME, PARTNER_ID, GOODS_ID, SCOPE. Example row image:

All connection pools were configured with a maximum of 200 connections, and SQL statements were tuned for optimal performance.

Benchmark Results

Million‑ and Ten‑Million‑Record Tests

Billion‑Record Tests

Result Analysis

When each query retrieves 500 rows, both MySQL and MongoDB deliver comparable performance (average response < 500 ms, TPS ≈ 230) even with 100 concurrent threads, regardless of total data size (10 M or 1 B rows).

For 5,000‑row queries on a 10 M dataset, MongoDB’s performance drops to less than half of MySQL’s under 50 threads, and degrades further with 100 threads. On a 1 B dataset, both databases perform poorly when concurrency exceeds 50 threads.

The tests reveal that, for simple range queries on a modest data model, MongoDB does not outperform MySQL under high concurrency. Moreover, performance is more sensitive to the number of rows requested per query than to the total dataset size.

Memory Architecture Comparison

InnoDB (MySQL)

The InnoDB buffer pool (20 GB in this case) uses an LRU algorithm to manage pages, with separate FREE and LRU lists. Important parameters include innodb_buffer_pool_size and innodb_read_ahead_threshold, which control cache size and pre‑fetch behavior.

WiredTiger (MongoDB)

The WiredTiger cache is organized as a B‑tree of pages, with configurable cache size via storage.wiredTiger.engineConfig.cacheSizeGB. It caches working‑set data such as indexes, namespaces, and uncommitted writes.

Data Modeling: Embedded vs. Referenced

MongoDB’s schema‑less documents support both embedded (denormalized) and referenced (normalized) models.

Embedded Model Example

{ "_id": ..., "name": "Wilber", "contact": { "phone": "12345678", "email": "[email protected]" } }

Embedding related data in a single document reduces the number of queries and enables atomic updates, but can lead to large documents and potential fragmentation.

Referenced Model Example

{ "_id": ..., "name": "Wilber", "contact": { "phone": "12345678", "email": "[email protected]" }, "posts": [ { "title": "Indexes in MongoDB", "created": "12/01/2014", "link": "www.linuxidc.com" }, { "title": "Replication in MongoDB", "created": "12/02/2014", "link": "www.linuxidc.com" }, { "title": "Sharding in MongoDB", "created": "12/03/2014", "link": "www.linuxidc.com" } ] }

References are preferable when data redundancy would be high, when many‑to‑many relationships are needed, or when complex hierarchical queries are required.

Application Scenario Analysis

When MongoDB Is Suitable

Schema is undefined or evolves rapidly; e.g., content management, comment systems.

High write throughput with low transactional requirements, such as logging.

Very large or rapidly growing datasets; MongoDB’s built‑in sharding facilitates horizontal scaling.

Need for high availability via replica sets.

When MongoDB Is Not Recommended

Applications requiring multi‑document ACID transactions.

Workloads that depend heavily on joins or complex queries.

Relational vs. Non‑Relational Use Cases

Relational databases excel at structured data, transactional integrity, and predictable growth.

NoSQL databases excel at unstructured or semi‑structured data, massive scale, and key‑value access patterns, but lack strong join capabilities.

In summary, the choice between MySQL and MongoDB should be driven by data characteristics, query patterns, and scalability requirements rather than raw performance numbers alone.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

performance benchmarkmysqlMongoDBdatabase comparisonNoSQLLarge Data
dbaplus Community
Written by

dbaplus Community

Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.