Inside Yiche’s SQL Server: Memory DB, HA, Cloud and Performance Secrets
In this technical interview, Yiche Group’s senior DBA Gao Jiwei shares how the company designs its database platform, leverages SQL Server’s in‑memory OLTP, implements high‑availability solutions, builds a private cloud with open‑source tools, and optimizes performance during massive traffic events.
Database Service Architecture
Yiche Group’s data platform combines several layers:
Relational storage based on Microsoft SQL Server.
ElasticSearch for multi‑dimensional vehicle queries and price look‑ups.
Caching layer to reduce direct storage access.
Monitoring system that generates alerts for CPU, memory, disk usage and abnormal SQL activity.
High‑Availability Options
SQL Server provides multiple HA mechanisms that can be selected after defining business‑level RPO and RTO:
Failover Cluster Instances (FCI)
AlwaysOn Availability Groups
Database Mirroring
Log Shipping
Replication
Service Broker
Choosing the appropriate technology balances cost and recovery objectives.
Private‑Cloud Implementation
Yiche’s private‑cloud platform is built on open‑source components:
Storage : GlusterFS provides distributed, replicated file systems.
Compute : KVM hypervisor hosts virtual machines; Microsoft virtualization solutions are used where required.
Load‑Balancing Strategy
A scalable read‑write separation architecture is employed. SQL Server replication and Service Broker are customized to match traffic patterns, enabling horizontal scaling of read workloads while writes are directed to primary nodes.
In‑Memory OLTP Comparison
Three major in‑memory solutions are discussed:
SQL Server In‑memory OLTP (Hekaton) : Eliminates latches, uses optimistic concurrency with timestamps, and can compile queries to native machine code. Reported throughput improvements range from 10× to 100× (e.g., Bwin increased from 15 000 to 250 000 requests/second).
Oracle TimesTen : Designed for high‑load OLTP but retains traditional locking mechanisms.
SAP HANA : Column‑store engine focused on analytical workloads rather than pure OLTP.
Yiche prefers SQL Server because its optimizer, tooling, and ecosystem better fit complex automotive data services.
Performance‑Tuning Fundamentals
Effective tuning requires deep knowledge of:
Physical storage structures and I/O patterns.
Engine behavior (buffer pool, latch usage, pre‑read).
Query optimizer mechanics (costing, plan selection).
Common issues such as key lookups, missing covering indexes, random I/O, and inappropriate parallelism.
Double‑11 (Singles’ Day) Case Study
During a high‑traffic event, a single SQL Server instance exhibited excessive CPU and memory usage. DMV analysis identified a Sort operator consuming >600 MB per query.
Because Sort is CPU‑intensive and allocates memory proportional to the result set (often >200 % of the data size), the Sort operator caused buffer‑pool instability and degraded throughput.
Switching to a parallel execution plan that introduced a Bitmap filter reduced:
Query response time to 0.3 s
CPU time by roughly 1 s
Memory consumption to ≈300 MB
To enforce parallelism, three mechanisms were evaluated:
Plan Guides – rejected because the workload contained many distinct query hashes.
Trace Flag 8649 – technically viable but required code changes and coordination with developers.
Global cost threshold for parallelism adjustment – chosen for its immediate effect.
The threshold was tuned by sampling high‑cost queries, calculating their subtree costs, and selecting a value that minimized impact. The adjustment was applied in a single step, clearing the plan cache once but yielding an immediate CPU drop while traffic continued to rise.
-- Example of changing the parallelism threshold
EXEC sp_configure 'show advanced options', 1; RECONFIGURE;
EXEC sp_configure 'cost threshold for parallelism', 30; RECONFIGURE;Result: CPU usage fell sharply, memory pressure eased, and the buffer pool remained stable throughout the event.
DBA Role in Cloud Environments
Automation reduces routine tasks such as backup scheduling, but performance monitoring and optimization remain critical. Inefficient schema design, query patterns, or application architecture increase cloud resource consumption, directly affecting total cost of ownership (TCO). DBAs must therefore focus on:
Proactive performance alerts.
Capacity planning for virtualized resources.
Continuous tuning of queries and indexes.
Adapting to cloud‑native technologies (virtualization, distributed storage, containerization) is essential for maintaining database reliability and cost efficiency.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
