How to Use Vektor: A Zero‑RAM Native PHP Vector Database for Fast ANN Search

Vektor is a pure-PHP vector database that keeps its data on disk and uses the HNSW algorithm for approximate nearest-neighbor search. It avoids loading the dataset into memory and can be used either as an embeddable library or as a standalone HTTP API server, with only a few installation and configuration steps.

Open Source Tech Hub

Overview

Vektor is a high-performance, file-based, embedded vector database written entirely in native PHP. It follows a Zero-RAM Overhead design: the dataset is never fully loaded into memory, and records are read directly from disk as they are needed. Approximate nearest-neighbor (ANN) queries are powered by the Hierarchical Navigable Small World (HNSW) graph algorithm.
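To give a feel for how a graph index narrows a search, the sketch below walks a single graph layer greedily: starting from an entry node, it keeps hopping to whichever neighbour is closest to the query and stops when no neighbour is closer. It only illustrates the navigation idea behind HNSW and is not Vektor's code. The in-memory arrays, the function names cosineDistance and greedySearch, and the four-node example graph are all assumptions made for the example; a full HNSW index adds multiple layers and a candidate list, which are left out here.

```php
<?php

/** Cosine distance (1 - cosine similarity) between two equal-length vectors. */
function cosineDistance(array $a, array $b): float
{
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $value) {
        $dot   += $value * $b[$i];
        $normA += $value * $value;
        $normB += $b[$i] * $b[$i];
    }
    if ($normA == 0.0 || $normB == 0.0) {
        return 1.0; // convention chosen for this sketch: zero vectors match nothing
    }
    return 1.0 - $dot / sqrt($normA * $normB);
}

/**
 * Greedy walk on one graph layer (hypothetical in-memory layout).
 *
 * @param array<string, float[]>  $vectors   node id => vector
 * @param array<string, string[]> $neighbors node id => ids of connected nodes
 */
function greedySearch(array $vectors, array $neighbors, string $entry, array $query): string
{
    $current = $entry;
    $currentDist = cosineDistance($vectors[$current], $query);

    while (true) {
        $best = $current;
        $bestDist = $currentDist;
        // Look for the neighbour of the current node that is closest to the query.
        foreach ($neighbors[$current] ?? [] as $candidate) {
            $dist = cosineDistance($vectors[$candidate], $query);
            if ($dist < $bestDist) {
                $best = $candidate;
                $bestDist = $dist;
            }
        }
        if ($best === $current) {
            return $current; // no neighbour is closer: local minimum reached
        }
        $current = $best;
        $currentDist = $bestDist;
    }
}

// Tiny example graph: a - b - c - d
$vectors = [
    'a' => [1.0, 0.0],
    'b' => [0.8, 0.6],
    'c' => [0.0, 1.0],
    'd' => [-1.0, 0.0],
];
$neighbors = [
    'a' => ['b'],
    'b' => ['a', 'c'],
    'c' => ['b', 'd'],
    'd' => ['c'],
];

echo greedySearch($vectors, $neighbors, 'a', [0.1, 1.0]), PHP_EOL; // prints "c"
```

The walk only computes distances for nodes it actually visits, which is why a graph index can answer a query without scanning every stored vector. The result is approximate, because a greedy walk can stop at a local minimum.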

Features

Pure PHP implementation: No external dependencies or C extensions; works on any PHP 8.2+ runtime.

Zero memory overhead: Data is read from binary files; memory usage stays constant regardless of dataset size.

HNSW index: Graph-based structure provides fast ANN search.

Binary storage: Vectors, graph connections and metadata are stored in compact binary files.

Embedded or server mode: Use as a PHP library or run as an independent HTTP API server.

Safe concurrent access: File locking (flock) coordinates concurrent reads and writes.

Cosine similarity: The similarity metric used to compare high-dimensional embeddings (1536 dimensions by default, configurable).
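For readers unfamiliar with the metric named in the last item, here is a minimal sketch of cosine similarity in plain PHP. The function name cosineSimilarity and the choice to return 0.0 for zero-length vectors are assumptions of this example, not something taken from Vektor's source.

```php
<?php

/**
 * Cosine similarity: dot(a, b) / (|a| * |b|), a value in the range [-1, 1].
 * 1 means same direction, 0 orthogonal, -1 opposite direction.
 */
function cosineSimilarity(array $a, array $b): float
{
    if (count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must have the same number of dimensions.');
    }

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $value) {
        $dot   += $value * $b[$i];
        $normA += $value * $value;
        $normB += $b[$i] * $b[$i];
    }

    if ($normA == 0.0 || $normB == 0.0) {
        return 0.0; // undefined for zero vectors; 0.0 is a convention chosen here
    }

    return $dot / sqrt($normA * $normB);
}

echo cosineSimilarity([1, 0], [2, 0]), PHP_EOL;   // 1  (same direction)
echo cosineSimilarity([1, 0], [0, 3]), PHP_EOL;   // 0  (orthogonal)
echo cosineSimilarity([1, 2], [-1, -2]), PHP_EOL; // -1 (opposite direction)
```

Because the result depends only on direction and not on magnitude, two embeddings of different lengths that point the same way score as identical.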

System Requirements

PHP 8.2 or higher

Composer for dependency management

Installation

Composer (recommended for existing projects):

composer require centamiv/vektor

Standalone API server:

git clone https://github.com/centamiv/vektor.git
cd vektor
composer install --no-dev
mkdir -p data
chmod -R 775 data

Configuration

Server mode reads settings from a .env file.

Copy the example file:

cp .env.example .env

Edit .env to set the API token and vector dimensions:

# .env
VEKTOR_API_TOKEN=your_secure_random_string_here
VEKTOR_DIMENSIONS=1536

VEKTOR_API_TOKEN : When set, all endpoints except /up require an Authorization: Bearer <token> header. Leave empty for a public API.

VEKTOR_DIMENSIONS : Defines the vector dimensionality (default 1536). Changing this value requires clearing the data/ directory and re‑initialising.
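The article only says the token should be a secure random string. One way to produce such a value is PHP's built-in random_bytes(); the helper name generateApiToken and the 32-byte length are arbitrary choices for this sketch, not requirements stated by Vektor.

```php
<?php

/** Returns a hex-encoded token built from cryptographically secure random bytes. */
function generateApiToken(int $bytes = 32): string
{
    return bin2hex(random_bytes($bytes)); // 32 bytes become 64 hex characters
}

// Print a line that can be pasted into .env
echo 'VEKTOR_API_TOKEN=' . generateApiToken() . PHP_EOL;
```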

Usage Options

1️⃣ HTTP API Server

Start the server (development or testing):

php -S 0.0.0.0:8000 -t public

If an API token is configured, include it in the request header:

Authorization: Bearer your_secure_random_string_here

API Endpoints

GET /up – Health check (no authentication). Returns {"status": "up"}.

GET /info – Returns database statistics such as file size, record count and dimensions.

POST /insert – Inserts a vector. Example payload:

{
  "id": "my-doc-id",
  "vector": [0.1, 0.2, 0.3, ...],
  "metadata": {
    "source": "docs/intro.md",
    "chunk": 3
  }
}

POST /search – Performs a vector search. Example payload:

{
  "vector": [0.1, 0.2, 0.3, ...],
  "k": 5,
  "include_metadata": true,
  "include_vector": false
}

Optional parameters: include_vector, include_metadata.

POST /delete – Deletes a vector by its ID.

POST /optimize – Triggers database optimisation (cleans soft‑deleted space and rebuilds the index).
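The endpoints above can be called from any HTTP client. The following sketch uses only PHP's stream functions to send a JSON POST with an optional bearer token. The helper names buildJsonPost and vektorPost, the Content-Type and Accept headers, the 5-second timeout, and the base URL http://localhost:8000 (inferred from the development server command) are assumptions of this example. The response format is not described in the article, so the decoded body is returned as-is.

```php
<?php

/**
 * Builds stream-context options for a JSON POST request.
 * The Authorization header is added only when a non-empty token is given.
 */
function buildJsonPost(array $payload, ?string $token = null): array
{
    $headers = ['Content-Type: application/json', 'Accept: application/json'];
    if ($token !== null && $token !== '') {
        $headers[] = 'Authorization: Bearer ' . $token;
    }

    return [
        'http' => [
            'method'        => 'POST',
            'header'        => implode("\r\n", $headers),
            'content'       => json_encode($payload, JSON_THROW_ON_ERROR),
            'ignore_errors' => true, // still return the body on 4xx/5xx responses
            'timeout'       => 5,
        ],
    ];
}

/** Sends $payload to $baseUrl . $path and returns the decoded JSON response. */
function vektorPost(string $baseUrl, string $path, array $payload, ?string $token = null): mixed
{
    $context = stream_context_create(buildJsonPost($payload, $token));
    $body = file_get_contents(rtrim($baseUrl, '/') . $path, false, $context);
    if ($body === false) {
        throw new RuntimeException("Request to {$path} failed.");
    }

    return json_decode($body, true, 512, JSON_THROW_ON_ERROR);
}

// Example call (requires a running server, so it is left commented out):
// $results = vektorPost('http://localhost:8000', '/search', [
//     'vector'           => $queryVector, // array of floats with the configured dimensions
//     'k'                => 5,
//     'include_metadata' => true,
// ], 'your_secure_random_string_here');
```

Keeping the request construction in its own function makes it possible to inspect the headers and body without a server running.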

2️⃣ Embedded Library (Direct PHP Usage)

use Centamiv\Vektor\Services\Indexer;
use Centamiv\Vektor\Services\Searcher;
use Centamiv\Vektor\Services\Optimizer;
use Centamiv\Vektor\Core\Config;

// Optional: customise data directory and dimensions (must be set before initialisation)
Config::setDataDir(__DIR__ . '/my-data');
Config::setDimensions(768);

// Insert a vector
$indexer = new Indexer();
$indexer->insert(
    "article-001",
    [0.021, -0.534, 0.789, /* ... length must match the configured dimensions (768 here) */],
    ["category" => "php", "lang" => "zh"]
);

// Search ($queryVector must be an array of floats with the configured number of dimensions)
$searcher = new Searcher();
$results = $searcher->search($queryVector, 10, includeMetadata: true);

// Optimise (clean up space)
$optimizer = new Optimizer();
$optimizer->run();
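Since the dimension count is fixed per database, it can help to check a vector before passing it to the indexer or searcher. The guard below is a hypothetical helper (assertVectorShape is not part of Vektor), and whether the library performs a similar check itself is not stated in the article.

```php
<?php

/**
 * Throws unless $vector holds exactly $dimensions finite numbers.
 */
function assertVectorShape(array $vector, int $dimensions): void
{
    if (count($vector) !== $dimensions) {
        throw new InvalidArgumentException(
            sprintf('Expected %d dimensions, got %d.', $dimensions, count($vector))
        );
    }

    foreach ($vector as $i => $value) {
        $isNumber = is_int($value) || is_float($value);
        if (!$isNumber || !is_finite((float) $value)) {
            throw new InvalidArgumentException("Component {$i} is not a finite number.");
        }
    }
}

// Passes silently: three finite numbers for a three-dimensional database.
assertVectorShape([0.021, -0.534, 0.789], 3);
```

Calling it with the value passed to Config::setDimensions() right before an insert or search turns a dimension mismatch into an immediate, readable error.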

Database File Structure (Core Design)

vector.bin – Append-only storage of raw floating-point vectors.

meta.bin – Disk-based binary search tree mapping IDs to file offsets.

payload.bin – Append-only storage of serialized metadata (JSON).

graph.bin – HNSW graph structure used for fast navigation and ANN search.

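All files use a strict binary layout, so individual records can be located and read on disk without loading a whole file into memory, and file locking coordinates concurrent access. Together these are what make the zero-RAM-overhead design possible.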
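The idea of a fixed binary layout combined with file locking can be sketched as follows: vectors are appended as fixed-width records, and a reader seeks directly to record N instead of loading the file. This is an illustration only. The real on-disk format of vector.bin is not documented in this article, so the record layout (32-bit little-endian floats with no header) and the names appendVector and readVector are assumptions.

```php
<?php

const FLOAT_BYTES = 4; // assumed: each component is a 32-bit little-endian float

/** Appends one vector under an exclusive lock and returns its record index. */
function appendVector(string $path, array $vector): int
{
    if ($vector === []) {
        throw new InvalidArgumentException('Vector must not be empty.');
    }
    $handle = fopen($path, 'c+b'); // create if missing, never truncate
    if ($handle === false) {
        throw new RuntimeException("Cannot open {$path}.");
    }
    try {
        flock($handle, LOCK_EX); // one writer at a time
        fseek($handle, 0, SEEK_END);
        $offset = ftell($handle);
        fwrite($handle, pack('g*', ...$vector));
        fflush($handle);
        return intdiv($offset, count($vector) * FLOAT_BYTES);
    } finally {
        flock($handle, LOCK_UN);
        fclose($handle);
    }
}

/** Reads record $index by seeking straight to its byte offset. */
function readVector(string $path, int $index, int $dimensions): array
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open {$path}.");
    }
    try {
        flock($handle, LOCK_SH); // many readers may hold a shared lock
        $recordSize = $dimensions * FLOAT_BYTES;
        fseek($handle, $index * $recordSize);
        $bytes = fread($handle, $recordSize);
        if ($bytes === false || strlen($bytes) !== $recordSize) {
            throw new OutOfRangeException("No complete record at index {$index}.");
        }
        return array_values(unpack('g*', $bytes)); // unpack() keys start at 1
    } finally {
        flock($handle, LOCK_UN);
        fclose($handle);
    }
}
```

Only one record's worth of bytes is held in memory per read, whatever the file size. Note that flock() is advisory: it protects only against other code that also takes the lock.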

Written by

Open Source Tech Hub
