Overview of Lightweight Python Databases: PickleDB, TinyDB, ZODB, Durus, Buzhug, Gadfly, and PyTables
This article introduces several lightweight Python databases—including PickleDB, TinyDB, ZODB, Durus, Buzhug, Gadfly, and PyTables—detailing their main features, typical use cases, limitations, and providing basic code examples to help developers choose suitable storage solutions for small projects or learning purposes.
Python offers several lightweight, pure‑Python database libraries suitable for small projects, learning, or rapid prototyping. The following sections briefly introduce each library, its main features, typical use cases, cautions, and a minimal code example.
PickleDB
PickleDB is a tiny key‑value store written in Python that persists data to a JSON file. It provides a dictionary‑like API and requires no external dependencies.
Main Features
Very lightweight; not intended for large datasets or high‑concurrency workloads.
Simple dict‑style API.
Data stored as JSON, easy to read and modify.
Automatic persistence to disk.
No external dependencies.
Typical Use Cases
Configuration storage.
Small scripts or prototypes.
Teaching basic database operations.
Temporary data during development.
Considerations
Performance degrades with large data or concurrent access.
Lacks encryption and access control; not suitable for sensitive data.
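To make the idea behind such a store concrete, here is a minimal sketch of a JSON-backed key‑value store using only the standard library. This is not PickleDB's actual implementation; the `TinyKV` class and the `example_kv.json` filename are illustrative assumptions, but the shape of the API mirrors the one shown below.

```python
import json
import os

class TinyKV:
    """A minimal JSON-backed key-value store in the spirit of PickleDB (illustration only)."""

    def __init__(self, path, auto_dump=True):
        self.path = path
        self.auto_dump = auto_dump
        # Load existing data if the file is already present
        if os.path.exists(path):
            with open(path) as f:
                self.data = json.load(f)
        else:
            self.data = {}

    def set(self, key, value):
        self.data[key] = value
        if self.auto_dump:
            self.dump()

    def get(self, key):
        return self.data.get(key)

    def exists(self, key):
        return key in self.data

    def rem(self, key):
        self.data.pop(key, None)
        if self.auto_dump:
            self.dump()

    def dump(self):
        # Write atomically: dump to a temp file, then replace the original
        tmp = self.path + '.tmp'
        with open(tmp, 'w') as f:
            json.dump(self.data, f)
        os.replace(tmp, self.path)

db = TinyKV('example_kv.json')
db.set('key1', 'value1')
print(db.get('key1'))  # value1
```

The single in-memory dict plus whole-file rewrite is exactly why this pattern degrades with large data or concurrent writers.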
Basic Usage
import pickledb
# Create or open a database
db = pickledb.load('example.db', auto_dump=True)
# Insert data
db.set('key1', 'value1')
# Retrieve data
value = db.get('key1')
print(value) # output: value1
# Check key existence
exists = db.exists('key1')
print(exists) # output: True
# Delete data
db.rem('key1')
# Get all keys
keys = db.getall()
print(keys)
# Force dump to disk
db.dump()
TinyDB
TinyDB is a document‑oriented NoSQL database written in Python. It stores data in JSON files, requires no external server, and offers a simple API with a powerful query language.
Main Features
Pure‑Python, zero‑dependency, embeddable.
Document storage; each document is a Python dict.
Simple CRUD API similar to native data structures.
Rich query language supporting complex conditions and regex.
Plugin system for extensibility.
Optional caching middleware; no full ACID transaction guarantees.
Typical Use Cases
Small applications needing quick data persistence.
Embedded or desktop applications.
Prototyping before moving to a larger database.
Configuration storage.
Considerations
Performance and scalability limited to small datasets.
File‑based storage may become slower as the JSON file grows.
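Conceptually, a TinyDB-style "rich query" is just a composable predicate evaluated against each document. The following stdlib-only sketch (the `search` helper and sample documents are illustrative, not TinyDB's code) shows how conditions and regular expressions combine:

```python
import re

# Documents are plain dicts, as in TinyDB
docs = [
    {'name': 'John', 'age': 22},
    {'name': 'Jane', 'age': 25},
    {'name': 'Joan', 'age': 31},
]

def search(documents, predicate):
    """Return all documents for which the predicate holds."""
    return [d for d in documents if predicate(d)]

# A complex query is a predicate combining a comparison with a regex match
adults_named_jn = search(docs, lambda d: d['age'] > 24 and re.match(r'J.*n', d['name']))
print(adults_named_jn)
```

TinyDB's `Query` objects build such predicates for you and let you combine them with `&`, `|`, and `~`.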
Basic Usage
from tinydb import TinyDB, Query
# Create or open a database
db = TinyDB('db.json')
# Insert data
db.insert({'name': 'John', 'age': 22})
db.insert({'name': 'Jane', 'age': 25})
# Query data
User = Query()
result = db.search(User.name == 'John')
print(result) # output: [{'name': 'John', 'age': 22}]
# Update data
db.update({'age': 23}, User.name == 'John')
# Delete data
db.remove(User.name == 'Jane')
# Retrieve all data
all_data = db.all()
print(all_data)
db.close()
ZODB
ZODB (Zope Object Database) is an object‑oriented database for Python that stores Python objects directly, bypassing the relational model.
Main Features
Object‑oriented persistence; stores complex Python objects.
Transparent persistence; serialization handled automatically.
ACID transaction support.
Versioning and history tracking.
Extensible storage back‑ends (file, memory, network).
Schema‑less design.
Typical Use Cases
Persisting complex data structures (e.g., CMS, scientific apps).
Python‑centric applications needing tight integration.
Any app that benefits from built‑in transaction safety.
Considerations
Not ideal for very large datasets or high‑performance scenarios.
Learning curve for developers accustomed to relational databases.
Smaller community and ecosystem.
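ZODB's transparent persistence builds on Python's pickle protocol: object state is serialized to bytes and restored on load. A stdlib sketch of that underlying idea (this is plain `pickle`, not ZODB's actual storage machinery):

```python
import pickle

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

# Serialize an object to bytes, roughly as ZODB does per object under the hood
original = Person('John Doe', 30)
blob = pickle.dumps(original)

# Deserialize: a new object comes back with its state intact
restored = pickle.loads(blob)
print(restored.name, restored.age)  # John Doe 30
```

What ZODB adds on top of this is object identity across references, transactional commits, and lazy loading of objects from its storage back end.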
Basic Usage
import transaction
from ZODB import FileStorage, DB
import persistent
# Define a persistent class
class Person(persistent.Persistent):
    def __init__(self, name, age):
        self.name = name
        self.age = age
# Set up storage and database
storage = FileStorage.FileStorage('mydata.fs')
db = DB(storage)
# Open a connection
connection = db.open()
root = connection.root()
# Add an object
root['person'] = Person('John Doe', 30)
# Commit transaction
transaction.commit()
# Retrieve the object
person = root['person']
print(person.name, person.age)
# Clean up
connection.close()
db.close()
storage.close()
Durus
Durus is a lightweight object‑oriented persistence system written in Python, similar to ZODB but with a simpler design.
Main Features
Object‑oriented storage; direct persistence of Python objects.
File‑based persistence.
Basic transaction support.
Simple, easy‑to‑learn API.
Very lightweight; no complex configuration.
Typical Use Cases
Small projects needing simple data persistence.
Python applications that want native object storage.
Rapid prototyping before switching to a more feature‑rich database.
Considerations
Performance and scalability limited to small datasets.
Limited query capabilities; no advanced indexing.
Small community support.
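For a standard-library taste of the kind of simple object persistence Durus (and ZODB) offer, Python's `shelve` module provides a persistent dictionary of pickled objects. A minimal sketch (the `people_shelf` filename is an illustrative assumption):

```python
import shelve

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

# A shelf behaves like a persistent dictionary of pickled objects
with shelve.open('people_shelf') as root:
    root['person'] = Person('John Doe', 30)

# Reopen later: the object is restored from disk
with shelve.open('people_shelf') as root:
    person = root['person']
    print(person.name, person.age)  # John Doe 30
```

Unlike Durus, `shelve` has no transactions or conflict handling; each assignment is written independently.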
Basic Usage
from durus.persistent import Persistent
from durus.connection import Connection
from durus.file_storage import FileStorage
# Define a persistent class
class Person(Persistent):
    def __init__(self, name, age):
        self.name = name
        self.age = age
# Create storage and connection
storage = FileStorage('mydata.durus')
connection = Connection(storage)
# Get the root object
root = connection.get_root()
# Add an object
root['person'] = Person('John Doe', 30)
# Commit transaction
connection.commit()
# Retrieve the object
person = root['person']
print(person.name, person.age)
# Clean up
connection.close()
storage.close()
Buzhug
Buzhug is a pure‑Python lightweight database that offers a SQL‑like query language while remaining simple and schema‑less.
Main Features
Pure Python implementation; no external dependencies.
SQL‑like query syntax.
Very lightweight; ideal for learning and small projects.
Intuitive API for beginners.
Schema‑less flexibility.
Typical Use Cases
Small applications requiring simple data storage.
Educational purposes and learning basic database concepts.
Rapid prototyping before moving to a more robust system.
Considerations
Performance and feature set limited; not suited for large datasets.
Small community and ecosystem.
Lacks advanced transaction and indexing capabilities.
Basic Usage
from buzhug import Base
# Create or open a database
db = Base('people').create(('name', str), ('age', int), mode='open')  # reuse the files if they already exist
# Insert data
db.insert(name='John Doe', age=30)
db.insert(name='Jane Doe', age=25)
# Query data
for person in db.select():
    print(person.name, person.age)
# Update data: select the record first, then update its fields
john = db.select(name='John Doe')[0]
db.update(john, age=31)
# Delete data: delete takes the selected records
db.delete(db.select(name='Jane Doe'))
db.close()
Gadfly
Gadfly is a pure‑Python lightweight relational database that implements a subset of SQL, suitable for teaching and small‑scale projects.
Main Features
Pure Python; runs anywhere Python is available.
Supports standard SQL queries.
Lightweight and embeddable; no server required.
Typical Use Cases
Learning SQL and basic database concepts.
Small applications needing simple relational storage.
Rapid development and prototyping.
Considerations
Performance and functionality limited to small datasets.
Small community; limited ongoing development.
Compatibility adjustments may be needed for modern Python versions.
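Since Gadfly sees little ongoing development, the same embedded-relational role is usually filled today by the standard library's `sqlite3` module. The equivalent workflow, as a sketch (using an in-memory database for brevity):

```python
import sqlite3

# An embedded relational database from the standard library; no server required
connection = sqlite3.connect(':memory:')
cursor = connection.cursor()

# Create a table and insert rows with parameterized queries
cursor.execute("CREATE TABLE people (name TEXT, age INTEGER)")
cursor.execute("INSERT INTO people (name, age) VALUES (?, ?)", ('John Doe', 30))
cursor.execute("INSERT INTO people (name, age) VALUES (?, ?)", ('Jane Doe', 25))

# Query with a WHERE clause
cursor.execute("SELECT name, age FROM people WHERE age > ?", (26,))
rows = cursor.fetchall()
print(rows)  # [('John Doe', 30)]
connection.close()
```

The `?` placeholders avoid the string-interpolated SQL shown in the Gadfly example, which is vulnerable to injection when values come from user input.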
Basic Usage
from gadfly import gadfly
# Create a new database (use gadfly('mydb', 'mydb_directory') to reopen an existing one)
connection = gadfly()
connection.startup('mydb', 'mydb_directory')
# Get a cursor
cursor = connection.cursor()
# Create a table
cursor.execute("CREATE TABLE people (name VARCHAR, age INTEGER)")
# Insert data
cursor.execute("INSERT INTO people (name, age) VALUES ('John Doe', 30)")
cursor.execute("INSERT INTO people (name, age) VALUES ('Jane Doe', 25)")
# Query data
cursor.execute("SELECT * FROM people")
for row in cursor.fetchall():
    print(row)
# Update data
cursor.execute("UPDATE people SET age = 31 WHERE name = 'John Doe'")
# Delete data
cursor.execute("DELETE FROM people WHERE name = 'Jane Doe'")
# Commit and close
connection.commit()
connection.close()
PyTables
PyTables is an open‑source library for managing large scientific datasets using the HDF5 file format. It provides efficient storage, compression, hierarchical organization, and powerful querying, tightly integrated with NumPy.
Main Features
Built on HDF5, a mature format for massive data.
Supports multiple compression algorithms.
Hierarchical data organization (groups and tables).
Handles datasets larger than memory with partial I/O.
Rich data‑type support, including NumPy arrays.
Powerful query capabilities.
Seamless NumPy integration.
Typical Use Cases
Scientific computing and data analysis (e.g., climate, genomics, physics).
Managing very large datasets that cannot fit into RAM.
Data archiving and sharing using the portable HDF5 format.
Considerations
Performance may require tuning of data layout and compression.
Depends on the external HDF5 library; proper installation is required.
Cross‑platform compatibility depends on HDF5 version.
Basic Usage
import numpy as np
import tables
# Create an HDF5 file
with tables.open_file('example.h5', mode='w') as h5file:
    # Create a group
    group = h5file.create_group('/', 'data_group', 'Data Group')
    # Create a structured array
    data = np.array([(1, b'Hello'), (2, b'World')],
                    dtype=[('number', 'i4'), ('word', 'S10')])
    table = h5file.create_table(group, 'example_table',
                                description=data.dtype, title='Example Table')
    # Insert data
    row = table.row
    for item in data:
        row['number'] = item['number']
        row['word'] = item['word']
        row.append()
    table.flush()
    # Query data
    for r in table.where('number > 1'):
        print(r['number'], r['word'].decode('utf-8'))
    # Read the entire table into a NumPy array
    np_data = table.read()
    print(np_data)
Each of these libraries targets a niche where simplicity, minimal dependencies, and ease of use outweigh the need for high performance or advanced features. Developers should select the one that best matches their project size, data complexity, and required functionality.