Build a Simple NoSQL Database in Python: Hands‑On Tutorial
This article walks through the concepts behind NoSQL databases, contrasts them with traditional SQL relational models, and provides a step‑by‑step Python implementation of a minimalist key‑value store, complete with command parsing, TCP/IP messaging, and code examples.
Introduction
The term NoSQL loosely covers databases that do not use the relational table model; many of them store data without a fixed schema and are accessed by key rather than through SQL queries. A minimal illustrative implementation in pure Python is available at
https://github.com/liuchengxu/hands-on-learning/blob/master/nosql.py.
Relational Foundations (OldSQL)
SQL (Structured Query Language) is used to query relational database management systems (RDBMS) such as MySQL, SQL Server, and Oracle. Data are organized in tables, each table having a schema that defines column names and types. A primary key uniquely identifies each row. Example Car table schema:
Make – string
Model – string
Year – four‑digit number
Color – string
VIN – string (primary key)
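As a rough illustration, the schema above could be declared with Python's built-in sqlite3 module. SQLite stands in here for the RDBMSs named above, and the VIN values and the helper name insert_duplicate_vin are made up for this sketch.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Car (
        Make  TEXT,
        Model TEXT,
        Year  INTEGER,
        Color TEXT,
        VIN   TEXT PRIMARY KEY
    )
    """
)

# Sample row; the VIN is invented for illustration.
conn.execute(
    "INSERT INTO Car VALUES (?, ?, ?, ?, ?)",
    ("Lexus", "RX350", 2013, "Black", "VIN0001"),
)


def insert_duplicate_vin():
    """Return True if the primary key rejects a second row with the same VIN."""
    try:
        conn.execute(
            "INSERT INTO Car VALUES (?, ?, ?, ?, ?)",
            ("Honda", "Civic", 1994, "Red", "VIN0001"),
        )
    except sqlite3.IntegrityError:
        return True
    return False
```

Calling insert_duplicate_vin() shows the role of the primary key: the second insert is refused because a row with that VIN already exists.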
Typical queries:
SELECT Make, Model FROM Car;
SELECT Color FROM Car WHERE Year = 1994;
Normalization and Joins
To avoid redundant storage, vehicle attributes can be placed in a Vehicle table while service records reside in a ServiceHistory table. The VIN appears in both tables to link a service record to its vehicle. Example join:
SELECT Vehicle.Model, Vehicle.Year
FROM Vehicle, ServiceHistory
WHERE Vehicle.VIN = ServiceHistory.VIN
AND ServiceHistory.Price > 75.00;
Indexes
Without an index on the filtered column, a query requires a full table scan, which becomes slow as the table grows. Adding an index on a column (e.g., Price) lets the engine locate matching rows directly, at the cost of extra storage for the index structure and extra work on every write to keep it up to date.
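The join and the effect of an index can be tried out with sqlite3. This is only a sketch: the sample rows, the Description column, the index name idx_service_price, and the helper query_plan are assumptions, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Vehicle (VIN TEXT PRIMARY KEY, Model TEXT, Year INTEGER);
    CREATE TABLE ServiceHistory (VIN TEXT, Description TEXT, Price REAL);
    INSERT INTO Vehicle VALUES ('VIN0001', 'RX350', 2013);
    INSERT INTO Vehicle VALUES ('VIN0002', 'Civic', 1994);
    INSERT INTO ServiceHistory VALUES ('VIN0001', 'Brake pads', 120.00);
    INSERT INTO ServiceHistory VALUES ('VIN0002', 'Oil change', 40.00);
    """
)

# The join from the text: vehicles with a service record above 75.00.
JOIN_QUERY = """
    SELECT Vehicle.Model, Vehicle.Year
    FROM Vehicle, ServiceHistory
    WHERE Vehicle.VIN = ServiceHistory.VIN
      AND ServiceHistory.Price > 75.00
"""
expensive = conn.execute(JOIN_QUERY).fetchall()

PRICE_QUERY = "SELECT VIN FROM ServiceHistory WHERE Price > 75.00"


def query_plan(sql):
    """Return SQLite's plan description for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(PRICE_QUERY)  # no index on Price yet
conn.execute("CREATE INDEX idx_service_price ON ServiceHistory (Price)")
plan_after = query_plan(PRICE_QUERY)   # planner can now use the index
```

Printing plan_before and plan_after typically shows the change from a scan of ServiceHistory to a search that uses idx_service_price.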
Key‑Value Stores and NoSQL Design
Before the NoSQL label, key/value stores such as memcached used hash tables without any schema. The toy NoSQL database mirrors this idea: a Python dict holds all data, supporting only string keys and values of type integer, string, or list.
Design goals:
Use a Python dict as primary storage.
Support only string keys.
Store integers, strings, and lists.
Communicate via a plain ASCII TCP/IP server.
Provide commands: PUT, GET, PUTLIST, APPEND, INCREMENT, DELETE, STATS.
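The plain ASCII TCP/IP goal can be sketched with the standard socket module. This is an assumption-laden outline rather than the referenced implementation: the port 50505, the one-request-per-connection model, and the placeholder echo_handler are all hypothetical.

```python
import socket


def echo_handler(message):
    """Placeholder for the real command dispatcher (hypothetical)."""
    return "True; received [{}]".format(message)


def handle_connection(conn, handler=echo_handler):
    """Read one ASCII request from conn and send back the handler's reply."""
    request = conn.recv(4096).decode("ascii").strip()
    reply = handler(request)
    conn.sendall(reply.encode("ascii"))


def serve(host="localhost", port=50505):
    """Accept clients one at a time; each connection carries one request."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server.bind((host, port))
        server.listen(1)
        while True:  # runs until the process is interrupted
            conn, _address = server.accept()
            with conn:
                handle_connection(conn)
```

Calling serve() starts the loop; it is deliberately not invoked here so the sketch can be imported without blocking.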
Supported Commands
PUT : PUT; key; value; INT|STRING|LIST – insert a new entry.
GET : GET; key;; – retrieve a stored value.
PUTLIST : PUTLIST; key; a,b,c ; LIST – store a list.
APPEND : APPEND; key; value; STRING – add an element to an existing list.
INCREMENT : INCREMENT; key;; – increase an integer value.
DELETE : DELETE; key;; – remove an entry.
STATS : STATS; ;; – return success/failure counts for each command.
Message Formats
Request format (fields separated by semicolons):
COMMAND; [KEY]; [VALUE]; [VALUE TYPE]
Fields after COMMAND are optional depending on the command.
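One way to split a request into its four fields is sketched below. The function name parse_message and the choice to represent empty optional fields as None are assumptions made for this example.

```python
def parse_message(data):
    """Split 'COMMAND; KEY; VALUE; VALUE TYPE' into four stripped fields."""
    fields = [field.strip() for field in data.strip().split(";")]
    # Pad so that short messages such as 'STATS' still yield four fields.
    fields += [""] * (4 - len(fields))
    command, key, value, value_type = fields[:4]
    # Empty optional fields become None; the command is upper-cased.
    return command.upper(), key or None, value or None, value_type or None
```

For example, parse_message("GET; foo;;") yields the command "GET", the key "foo", and None for the two unused fields.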
Response format:
True|False; payload
Examples:
True; Key [foo] set to [1]
True; 1
True; ['a', 'b', 'c', 'd']
Implementation Overview
The server imports a few standard modules, creates a global DATA dictionary for storage, and builds a COMMAND_HANDLERS lookup table that maps each command name to its handling function.
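The DATA dictionary and the COMMAND_HANDLERS lookup table might look roughly like this. Only three commands are shown, and the handler names, signatures, and error messages are assumptions rather than the referenced code.

```python
DATA = {}  # global in-memory storage: string key -> int, str, or list


def handle_put(key, value):
    """Store value under key and report what was set."""
    DATA[key] = value
    return True, "Key [{}] set to [{}]".format(key, value)


def handle_get(key, value=None):
    """Return the stored value, or a failure if the key is missing."""
    if key not in DATA:
        return False, "ERROR: Key [{}] not found".format(key)
    return True, DATA[key]


def handle_delete(key, value=None):
    """Remove key from the store if it exists."""
    if key not in DATA:
        return False, "ERROR: Key [{}] not found".format(key)
    del DATA[key]
    return True, "Key [{}] deleted".format(key)


# Lookup table mapping each command name to its handling function.
COMMAND_HANDLERS = {
    "PUT": handle_put,
    "GET": handle_get,
    "DELETE": handle_delete,
}


def dispatch(command, key, value=None):
    """Find the handler for command and call it; unknown commands fail."""
    handler = COMMAND_HANDLERS.get(command)
    if handler is None:
        return False, "ERROR: Unknown command [{}]".format(command)
    return handler(key, value)
```

A dictionary of functions keeps dispatch to a single lookup, so adding a command means writing one handler and registering it in the table.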
Message parsing splits the raw ASCII line on semicolons, performs type conversion (int() for integers, str.split(',') for lists, str() for strings), and dispatches to the appropriate handler.
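The type conversion step could be written as follows. The function name convert_value is hypothetical, and the sketch assumes the VALUE field has already been stripped of surrounding whitespace; rejecting unknown type names is a choice made here, not something the source specifies.

```python
def convert_value(value, value_type):
    """Convert the raw VALUE field according to the VALUE TYPE field."""
    if value_type == "INT":
        return int(value)          # raises ValueError for non-numeric text
    if value_type == "LIST":
        return value.split(",")    # 'a,b,c' -> ['a', 'b', 'c']
    if value_type == "STRING":
        return str(value)
    raise ValueError("Unknown value type [{}]".format(value_type))
```

A caller would typically catch ValueError and turn it into a False response instead of letting the server crash.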
Each handler contains straightforward logic with error checking. Multiple assignment is used to capture return values and keep the code concise.
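A sketch of two handlers with error checking, plus the multiple assignment that unpacks their result, is shown below. The handler bodies, error messages, and the helper format_response are assumptions chosen to reproduce the response examples given earlier.

```python
DATA = {}


def handle_append(key, value):
    """Append value to the list stored under key."""
    if key not in DATA:
        return False, "ERROR: Key [{}] not found".format(key)
    if not isinstance(DATA[key], list):
        return False, "ERROR: Key [{}] contains non-list value".format(key)
    DATA[key].append(value)
    return True, DATA[key]


def handle_increment(key):
    """Add one to the integer stored under key."""
    if key not in DATA:
        return False, "ERROR: Key [{}] not found".format(key)
    if not isinstance(DATA[key], int):
        return False, "ERROR: Key [{}] contains non-int value".format(key)
    DATA[key] += 1
    return True, DATA[key]


def format_response(success, payload):
    """Build the 'True|False; payload' response line."""
    return "{}; {}".format(success, payload)


DATA["letters"] = ["a", "b", "c"]
DATA["foo"] = 0

# Multiple assignment unpacks the (success, payload) tuple in one step.
success, payload = handle_append("letters", "d")
append_response = format_response(success, payload)

success, payload = handle_increment("foo")
increment_response = format_response(success, payload)
```

Returning a (success, payload) pair from every handler lets the caller format all responses the same way, whether the command succeeded or failed.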
Why This Is a NoSQL Database
The program qualifies as a NoSQL database in the loose sense of the term: it stores data without a predefined schema and offers only key/value access. It also demonstrates the trade‑off between simplicity and query capability.
Using a VIN as the key and a list as the value (e.g., ['Lexus','RX350',2013,'Black']) means queries such as “find all cars from 1994” require scanning every entry, illustrating the limitations of naïve key/value designs.
Querying Limitations
Because the store only supports direct key lookup, any attribute‑based query must iterate over the entire DATA dictionary, checking the appropriate list index for each record. This is analogous to a full table scan and is inefficient for large datasets.
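The scan described above can be sketched in a few lines. The sample records, the constant YEAR_INDEX, and the function find_by_year are made up for illustration; the list layout follows the ['Lexus','RX350',2013,'Black'] example.

```python
# Each value is a list: [make, model, year, color], keyed by VIN.
DATA = {
    "VIN0001": ["Lexus", "RX350", 2013, "Black"],
    "VIN0002": ["Honda", "Civic", 1994, "Red"],
    "VIN0003": ["Ford", "Taurus", 1994, "Green"],
}

YEAR_INDEX = 2  # position of the year inside each record list


def find_by_year(year):
    """Return the VINs of all cars from the given year, in sorted order."""
    # Every entry is visited, so the cost grows linearly with len(DATA).
    return sorted(
        vin for vin, record in DATA.items() if record[YEAR_INDEX] == year
    )
```

Note that the caller has to know that the year lives at index 2; nothing in the store records that fact, which is the price of having no schema.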
Summary
The article explains the meaning of NoSQL, reviews basic relational concepts, and walks through a toy key‑value store implemented in Python. It highlights the simplicity of schema‑less storage, the limited queryability of pure key/value designs, and common techniques (indexes, namespaces, structured formats) that production NoSQL systems use to mitigate these issues.
dbaplus Community
