Understanding Graph Databases: Concepts, Models, and Dgraph Implementation
This article introduces graph databases as a NoSQL solution, explains property‑graph modeling, compares relational and document stores, evaluates several graph products, and details Dgraph’s architecture, indexing, query language, and real‑world business applications such as knowledge graphs and equity‑relationship analysis.
What Is a Graph Database and Its Application Scope?
A graph database is a type of NoSQL system that stores entities as vertices and relationships as edges, enabling efficient storage, retrieval, and traversal of highly connected data such as social networks, web graphs, transportation networks, and corporate equity structures.
Property Graph
A property‑graph model extends basic graph theory by attaching key‑value attributes to vertices and edges. Each vertex contains a unique identifier, outgoing and incoming edge sets, and a property set; each edge contains a unique identifier, source and target vertices, a label, and its own property set.
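The vertex and edge structure described above can be sketched with two small Python classes. This is only an illustrative in-memory model; the names `Vertex`, `Edge`, and `connect` are assumptions made for this example and do not come from any graph database API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Edge:
    edge_id: int
    source: int   # id of the vertex the edge leaves
    target: int   # id of the vertex the edge enters
    label: str
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Vertex:
    vertex_id: int
    properties: Dict[str, Any] = field(default_factory=dict)
    out_edges: List[Edge] = field(default_factory=list)
    in_edges: List[Edge] = field(default_factory=list)


def connect(edge_id, source, target, label, **properties):
    """Create an edge and register it on both of its endpoints."""
    edge = Edge(edge_id, source.vertex_id, target.vertex_id, label, properties)
    source.out_edges.append(edge)
    target.in_edges.append(edge)
    return edge


alice = Vertex(1, {"name": "Alice"})
bob = Vertex(2, {"name": "Bob"})
friendship = connect(100, alice, bob, "friend_of", since=2019)
```

Keeping both the outgoing and incoming edge sets on each vertex is what lets a traversal walk a relationship in either direction without scanning every edge.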
Typical Use Cases
Social networks – people as vertices, friendships as edges.
Web graph – webpages as vertices, hyperlinks as edges.
Road/railway networks – intersections as vertices, routes as edges.
Corporate equity – companies as vertices, shareholding relations as edges.
Database Relationships and Selection
Data models drive database choice: relational models suit tabular data whose one-to-many relationships are expressed through joins, document models suit self-contained hierarchical records, and graph models excel at many-to-many, highly connected data. A graph database stores both the data and its topology, which makes modeling intuitive and multi-hop queries efficient.
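To make "multi-hop query" concrete, here is a minimal sketch of a hop-limited breadth-first traversal over an adjacency list. The function name `within_hops` and the sample `follows` data are invented for illustration. A graph database performs this kind of expansion natively, whereas a relational store would typically need one self-join per hop.

```python
from collections import deque


def within_hops(adjacency, start, max_hops):
    """Return {vertex: hop_count} for every vertex reachable from start
    in at most max_hops edges (breadth-first, so counts are minimal)."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        if hops[vertex] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in adjacency.get(vertex, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[vertex] + 1
                queue.append(neighbor)
    return hops


follows = {
    "ann": ["bob", "cid"],
    "bob": ["dee"],
    "cid": ["dee"],
    "dee": ["eve"],
}
reachable = within_hops(follows, "ann", 2)
```

With a limit of two hops, `eve` is not returned because she is three edges away from `ann`.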
Types of NoSQL Databases
Document databases (e.g., MongoDB, Elasticsearch)
Key‑value stores (e.g., Redis)
Wide-column stores (e.g., HBase, Cassandra)
Graph databases (e.g., NebulaGraph, Neo4j, OrientDB, Dgraph)
Graph Database Types and Selection Criteria
Selection focuses on four criteria: an open-source license, a distributed architecture, millisecond-level latency for multi-hop queries, and support for billions of vertices and edges. A comparison table evaluates JanusGraph, NebulaGraph, Dgraph, and Neo4j across openness, deployment cost, learning curve, community, and distributed support.
Dgraph Overview
1. Architecture
Dgraph consists of three components: Zero (cluster management), Alpha (data storage and query processing), and Ratel (web UI). Zero assigns Alpha nodes to groups and rebalances data across those groups; the nodes within a group stay consistent through Raft consensus.
2. Scaling, Replication, and Sharding
A high-availability deployment replicates each role, typically running three Zero and three Alpha instances so that each Raft group can tolerate the loss of one node.
Sharding splits data when a group exceeds ~1 TB, distributing shards across many Alpha nodes.
Self‑healing is achieved via Kubernetes in the cloud offering.
3. Data Storage
Data is stored as triples (subject-predicate-object), and every node has a globally unique 64-bit UID. Triples sharing the same subject and predicate are grouped into a posting list keyed by <subject, predicate>, whose value is the sorted list of object UIDs. Posting lists are compressed using delta encoding, achieving ~10× compression.
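The idea behind posting lists and delta encoding can be sketched in a few lines of Python. This is a simplified illustration, not Dgraph's actual on-disk format: the function names and sample triples are assumptions, and the sketch does not reproduce the ~10× figure. It only shows why sorted UIDs compress well, since the gaps between neighbors are much smaller than the UIDs themselves.

```python
from collections import defaultdict


def build_posting_lists(triples):
    """Group object UIDs by (subject, predicate) and keep each list sorted."""
    lists = defaultdict(set)
    for subject, predicate, obj in triples:
        lists[(subject, predicate)].add(obj)
    return {key: sorted(uids) for key, uids in lists.items()}


def delta_encode(sorted_uids):
    """Store the first UID as-is, then only the gap to the previous UID."""
    deltas, previous = [], 0
    for uid in sorted_uids:
        deltas.append(uid - previous)
        previous = uid
    return deltas


def delta_decode(deltas):
    """Rebuild the original UIDs with a running sum."""
    uids, total = [], 0
    for delta in deltas:
        total += delta
        uids.append(total)
    return uids


triples = [
    (0x1, "follows", 0x3E9),
    (0x1, "follows", 0x3E8),
    (0x1, "follows", 0x3EC),
    (0x2, "follows", 0x1),
]
posting_lists = build_posting_lists(triples)
encoded = delta_encode(posting_lists[(0x1, "follows")])
```

After the first entry, the encoded list holds only small gaps, which a variable-length integer encoding can store in far fewer bytes than full 64-bit UIDs.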
4. Indexing
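All scalar types can be indexed. The int, float, bool, and geo types each have a single default tokenizer, while strings support the hash, exact, term, fulltext, and trigram tokenizers (trigram is the one that enables regular-expression matching). Index keys are <predicate, token>, and each mutation updates the posting lists of the affected tokens accordingly.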
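A toy version of a token index keyed by <predicate, token> might look like the following. The `TokenIndex` class and its whitespace-splitting tokenizer are assumptions for illustration; Dgraph's real term and full-text tokenizers are more sophisticated (full-text, for example, applies stemming and stop-word removal).

```python
from collections import defaultdict


def term_tokens(text):
    """Toy 'term' tokenizer: lowercase and split on whitespace."""
    return set(text.lower().split())


class TokenIndex:
    def __init__(self):
        # (predicate, token) -> set of UIDs whose value produced that token
        self.postings = defaultdict(set)

    def add(self, uid, predicate, text):
        for token in term_tokens(text):
            self.postings[(predicate, token)].add(uid)

    def remove(self, uid, predicate, text):
        for token in term_tokens(text):
            self.postings[(predicate, token)].discard(uid)

    def update(self, uid, predicate, old_text, new_text):
        """A value change removes the old tokens, then adds the new ones."""
        self.remove(uid, predicate, old_text)
        self.add(uid, predicate, new_text)

    def all_of_terms(self, predicate, text):
        """UIDs whose value contains every term in text."""
        sets = [self.postings.get((predicate, t), set()) for t in term_tokens(text)]
        return sorted(set.intersection(*sets)) if sets else []


index = TokenIndex()
index.add(0x1, "name", "The Barking Dog")
index.add(0x2, "name", "The Quiet Dog")
hits = index.all_of_terms("name", "dog barking")
index.update(0x1, "name", "The Barking Dog", "The Sleeping Cat")
hits_after = index.all_of_terms("name", "dog")
```

A lookup intersects one posting list per query token, which is why index-backed functions can produce the starting UID list for a query without scanning every node.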
5. Querying
Queries start from a UID list and traverse edges. Example traversal query:
    {
      movies(func: uid(0xb5849, 0x394c)) {
        uid
        m_name
        code
        star { s_name }
      }
    }

Functions allow lookup by name, for example with a full-text match:
    {
      movie(func: alloftext(name@en, "the dog which barks")) {
        name@en
      }
    }

Filters narrow results by applying functions to predicates, and recursive queries follow the same edges repeatedly to reach deep hops:
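    {
      # A_UID is a placeholder for the UID of the starting node.
      # Inside @recurse only one level of predicates is listed; the
      # edge predicate (follows) is expanded again at every hop.
      find_follower(func: uid(A_UID)) @recurse(depth: 4) {
        name
        age
        follows
      }
    }

Graph Database Business Practices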
1. Business Background
Knowledge Graph: Handles 5-10 million entities (potentially more than 100 million) with complex query conditions. Relational databases could not meet the low-latency requirements, so Dgraph was adopted.
Equity Relationship Visualization: Requires multi-hop traversal over ~30,000 companies and ~18 million edges. Dgraph's optimized graph traversal satisfies the performance requirements.
2. Solution
A unified architecture connects data pipelines (Kafka → Dgraph Java client) to ingest both historical and real‑time data, batching writes for high throughput. An SDK abstracts Dgraph queries, allowing callers to provide parameters without writing raw DQL.
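The two ideas in this solution, batched writes and a parameter-only query SDK, can be sketched as follows. The article's pipeline uses the Dgraph Java client; Python is used here only to keep the sketch short. `BatchWriter`, `flush_fn`, and `movie_by_name` are hypothetical names, and `flush_fn` stands in for whatever call would send one mutation to Dgraph. The query text uses DQL query variables (`$name`) so that callers pass values rather than assembling query strings.

```python
class BatchWriter:
    """Buffer records and hand them to flush_fn in fixed-size batches."""

    def __init__(self, flush_fn, batch_size):
        self.flush_fn = flush_fn  # hypothetical: would wrap one Dgraph mutation
        self.batch_size = batch_size
        self.buffer = []

    def write(self, record):
        self.buffer.append(record)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            batch, self.buffer = self.buffer, []
            self.flush_fn(batch)


MOVIE_BY_NAME = """
query movie($name: string) {
  movie(func: alloftext(name@en, $name)) {
    name@en
  }
}
"""


def movie_by_name(name):
    """Hypothetical SDK helper: return (query_text, variables) so the
    caller supplies only a parameter and never writes raw DQL."""
    return MOVIE_BY_NAME, {"$name": name}


sent = []  # stands in for the database; each element is one flushed batch
writer = BatchWriter(sent.append, batch_size=2)
for record in [{"name": "A"}, {"name": "B"}, {"name": "C"}]:
    writer.write(record)
writer.flush()  # push the final partial batch

query_text, variables = movie_by_name("the dog which barks")
```

Sending one mutation per batch rather than per record reduces round trips, which is the usual reason batching raises ingest throughput. A production writer would also flush on a timer so that a slow trickle of real-time records is not held in the buffer indefinitely.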
Conclusion and Outlook
The article covered graph database fundamentals, data modeling reasons for choosing graph stores, detailed Dgraph features, and practical deployments that solve large‑scale, low‑latency query problems. Market forecasts predict the graph‑database market will grow from ~USD 0.8 billion in 2019 to USD 3‑4 billion within six years, representing 5‑10 % of the overall database market.
ZCY Technology (政采云技术)
ZCY Technology Team (Zero), based in Hangzhou, is a growth-oriented team passionate about technology and craftsmanship. With around 500 members, we are building comprehensive engineering, project management, and talent development systems. We are committed to innovation and creating a cloud service ecosystem for government and enterprise procurement. We look forward to your joining us.