Understanding Graph Databases: Concepts, Models, and Dgraph Implementation

This article introduces graph databases as a NoSQL solution, explains property‑graph modeling, compares relational and document stores, evaluates several graph products, and details Dgraph’s architecture, indexing, query language, and real‑world business applications such as knowledge graphs and equity‑relationship analysis.

政采云技术

Jun 30, 2023

What Is a Graph Database and Its Application Scope?

A graph database is a type of NoSQL system that stores entities as vertices and relationships as edges, enabling efficient storage, retrieval, and traversal of highly connected data such as social networks, web graphs, transportation networks, and corporate equity structures.

Property Graph

A property‑graph model extends basic graph theory by attaching key‑value attributes to vertices and edges. Each vertex contains a unique identifier, outgoing and incoming edge sets, and a property set; each edge contains a unique identifier, source and target vertices, a label, and its own property set.
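The vertex and edge fields listed above can be sketched as two small Python classes. This is only an illustration of the model, not how any particular graph database stores it; the names `Vertex`, `Edge`, `connect`, and the sample companies are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


# eq=False keeps identity comparison; vertices and edges reference each
# other, so field-by-field equality would recurse without end.
@dataclass(eq=False)
class Vertex:
    vid: str                                                  # unique identifier
    properties: Dict[str, Any] = field(default_factory=dict)  # key-value attributes
    out_edges: List["Edge"] = field(default_factory=list)     # edges leaving this vertex
    in_edges: List["Edge"] = field(default_factory=list)      # edges arriving at this vertex


@dataclass(eq=False)
class Edge:
    eid: str            # unique identifier
    source: Vertex      # vertex the edge starts from
    target: Vertex      # vertex the edge points to
    label: str          # kind of relationship
    properties: Dict[str, Any] = field(default_factory=dict)


def connect(eid: str, source: Vertex, target: Vertex, label: str, **properties: Any) -> Edge:
    """Create an edge and register it on both endpoint vertices."""
    edge = Edge(eid, source, target, label, dict(properties))
    source.out_edges.append(edge)
    target.in_edges.append(edge)
    return edge


# Hypothetical equity example: one company holds 60% of another.
holding = Vertex("c1", {"name": "Holding Co"})
subsidiary = Vertex("c2", {"name": "Subsidiary Co"})
stake = connect("e1", holding, subsidiary, "holds_shares", ratio=0.6)
```

Keeping both `out_edges` and `in_edges` on each vertex lets a traversal follow a relationship in either direction without scanning all edges.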

Typical Use Cases

Social networks – people as vertices, friendships as edges.

Web graph – webpages as vertices, hyperlinks as edges.

Road/railway networks – intersections as vertices, routes as edges.

Corporate equity – companies as vertices, shareholding relations as edges.

Database Relationships and Selection

Data models drive database choice: relational models suit one‑to‑many data, document models suit hierarchical data, while graph models excel at many‑to‑many, highly connected scenarios. Graph databases store both data and topology, offering intuitive modeling and efficient multi‑hop queries.

Types of NoSQL Databases

Document databases (e.g., MongoDB, Elasticsearch)

Key‑value stores (e.g., Redis)

Wide-column stores (e.g., HBase, Cassandra)

Graph databases (e.g., NebulaGraph, Neo4j, OrientDB, Dgraph)

Graph Database Types and Selection Criteria

Selection focuses on open-source licensing, distributed architecture, millisecond-level multi-hop latency, and support for billions of vertices and edges. The original article compares JanusGraph, NebulaGraph, Dgraph, and Neo4j in a table covering openness, deployment cost, learning curve, community, and distributed support.

Dgraph Overview

1. Architecture

Dgraph consists of three components: Zero (cluster management), Alpha (data storage and query processing), and Ratel (web UI). Zero nodes track cluster membership and rebalance data across groups of Alpha nodes; the replicas within a group are kept consistent through Raft consensus.

2. Scaling, Replication, and Sharding

High‑availability replication runs three Zero and three Alpha instances.

Sharding splits data when a group exceeds ~1 TB, distributing shards across many Alpha nodes.

Self‑healing is achieved via Kubernetes in the cloud offering.
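The three-instance recommendation follows from how Raft works: a group stays available only while a majority of its replicas can reach each other. The arithmetic can be sketched as follows; the function names are illustrative and not part of Dgraph.

```python
def quorum(replicas: int) -> int:
    """Smallest majority of a Raft group with the given number of replicas."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """Number of replicas that can be lost while a majority still remains."""
    return replicas - quorum(replicas)


for n in (1, 3, 5):
    print(f"replicas={n} quorum={quorum(n)} tolerated_failures={tolerated_failures(n)}")
```

With three replicas the group survives the loss of one node, which is why three is the usual minimum for high availability; a fourth replica raises the quorum to three without tolerating any more failures.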

3. Data Storage

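Data is stored as triples (subject-predicate-object), and every node is identified by a globally unique 64-bit UID. Triples that share the same subject and predicate are grouped into one posting list, keyed by <subject, predicate>, which holds the sorted object UIDs. Posting lists are delta-encoded, so each entry stores only the gap from the previous UID; the article reports roughly 10× compression from this.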
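A minimal sketch of the two ideas in this section, grouping triples into posting lists and delta-encoding the sorted UIDs, might look like this. It illustrates the technique only and is not Dgraph's on-disk format; the function names and sample triples are assumptions.

```python
from collections import defaultdict


def build_posting_lists(triples):
    """Group object UIDs by (subject, predicate) and keep each list sorted."""
    lists = defaultdict(set)
    for subject, predicate, obj in triples:
        lists[(subject, predicate)].add(obj)
    return {key: sorted(uids) for key, uids in lists.items()}


def delta_encode(sorted_uids):
    """Store the first UID as-is, then only the gap to each following UID."""
    deltas, previous = [], 0
    for uid in sorted_uids:
        deltas.append(uid - previous)
        previous = uid
    return deltas


def delta_decode(deltas):
    """Rebuild the original UIDs with a running sum over the gaps."""
    uids, total = [], 0
    for delta in deltas:
        total += delta
        uids.append(total)
    return uids


# Hypothetical data: node 0x1 follows three nodes, node 0x2 follows one.
triples = [
    (0x1, "follows", 1010),
    (0x1, "follows", 1001),
    (0x1, "follows", 1002),
    (0x2, "follows", 0x1),
]
postings = build_posting_lists(triples)
encoded = delta_encode(postings[(0x1, "follows")])
decoded = delta_decode(encoded)
```

The saving comes from the gaps being much smaller than the UIDs themselves, so a variable-length integer encoding (not shown here) can store most entries in one or two bytes instead of eight.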

4. Indexing

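Indexes are declared per predicate in the schema. Non-string scalar types each have a default tokenizer, while strings support the hash, exact, term, fulltext, and trigram tokenizers (trigram backs regular-expression matching). Index keys are <predicate, token>; each key maps to a posting list of matching UIDs, and a write that changes a value updates the posting lists of the affected tokens.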
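The <predicate, token> layout can be sketched as an inverted index. The tokenizer below (lowercase, split on whitespace) is a simplified stand-in for a term tokenizer, and `TermIndex` with its methods is a hypothetical name, not a Dgraph API.

```python
from collections import defaultdict


def term_tokens(text):
    """Simplified stand-in for a term tokenizer: lowercase, split on whitespace."""
    return set(text.lower().split())


class TermIndex:
    def __init__(self):
        # (predicate, token) -> set of UIDs whose value contains that token
        self._postings = defaultdict(set)

    def add(self, uid, predicate, text):
        for token in term_tokens(text):
            self._postings[(predicate, token)].add(uid)

    def remove(self, uid, predicate, text):
        for token in term_tokens(text):
            self._postings[(predicate, token)].discard(uid)

    def update(self, uid, predicate, old_text, new_text):
        """Changing a value touches the posting lists of old and new tokens."""
        self.remove(uid, predicate, old_text)
        self.add(uid, predicate, new_text)

    def all_of_terms(self, predicate, text):
        """Sorted UIDs whose value contains every token of `text`."""
        lists = [self._postings.get((predicate, t), set()) for t in term_tokens(text)]
        return sorted(set.intersection(*lists)) if lists else []


index = TermIndex()
index.add(1, "name", "The Big Dog")
index.add(2, "name", "The Small Cat")
before = index.all_of_terms("name", "the dog")
index.update(2, "name", "The Small Cat", "The Small Dog")
after = index.all_of_terms("name", "the dog")
```

A lookup intersects one posting list per token, so the cost depends on the length of those lists rather than on the total number of nodes.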

5. Querying

Queries start from a UID list and traverse edges. Example traversal query:

{
  movies(func: uid(0xb5849, 0x394c)) {
    uid
    m_name
    code
    star { s_name }
  }
}

Functions allow name‑based lookup, e.g.,

{
  movie(func: alloftext(name@en, "the dog which barks")) {
    name@en
  }
}

Filters combine functions with predicates, and recursive queries enable deep‑hop traversals:

{
  find_follower(func: uid(A_UID)) @recurse(depth: 4) {
    name
    age
    follows
  }
}
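What a depth-limited recursive traversal does can be sketched in plain Python as a breadth-first expansion that stops after a fixed number of hops. This is a conceptual model only: the hop counting and the rule of skipping already-visited nodes are this sketch's own conventions and may differ from how Dgraph evaluates @recurse. The `follows` graph is made-up sample data.

```python
def recurse(graph, start, depth):
    """Return {uid: hop} for every node reachable within `depth` hops of `start`."""
    hops = {start: 0}
    frontier = [start]
    for hop in range(1, depth + 1):
        next_frontier = []
        for uid in frontier:
            for neighbor in graph.get(uid, []):
                if neighbor not in hops:  # skip nodes already reached on a shorter path
                    hops[neighbor] = hop
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return hops


# Hypothetical follower graph containing a cycle (e follows a).
follows = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": ["a"]}
two_hops = recurse(follows, "a", 2)
four_hops = recurse(follows, "a", 4)
```

Each round expands only the frontier found in the previous round, so the work grows with the number of edges actually touched, and the visited check keeps the cycle from being followed forever.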

Graph Database Business Practices

1. Business Background

Knowledge Graph: Handles 5-10 million entities today (potentially more than 100 million) with complex query conditions. Relational databases could not meet the low-latency requirements, so Dgraph was adopted.

Equity Relationship Visualization: Requires multi-hop traversal over roughly 30,000 companies and 18 million edges; Dgraph's optimized graph traversal meets the performance requirements.

2. Solution

A unified architecture connects data pipelines (Kafka → Dgraph Java client) to ingest both historical and real‑time data, batching writes for high throughput. An SDK abstracts Dgraph queries, allowing callers to provide parameters without writing raw DQL.
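The two ideas in this paragraph, batching writes and hiding raw DQL behind parameters, can be sketched without any client library. Everything here is hypothetical: `chunked`, `follower_query`, and the template are illustrative names modeled on the recursive query shown earlier, not the team's actual SDK or the Dgraph client API.

```python
from itertools import islice


def chunked(records, size):
    """Yield lists of at most `size` records so each write carries a whole batch."""
    if size <= 0:
        raise ValueError("size must be positive")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# DQL text lives in the SDK; callers only supply validated parameters.
FOLLOWER_TEMPLATE = """{
  find_follower(func: uid(%s)) @recurse(depth: %d) {
    name
    age
    follows
  }
}"""


def follower_query(uid, depth):
    """Build the follower query from a numeric UID and a hop limit."""
    if not isinstance(uid, int) or uid <= 0:
        raise ValueError("uid must be a positive integer")
    if not isinstance(depth, int) or depth <= 0:
        raise ValueError("depth must be a positive integer")
    return FOLLOWER_TEMPLATE % (hex(uid), depth)


batches = list(chunked(range(5), 2))
query = follower_query(0xB5849, 4)
```

Accepting only integers for `uid` and `depth` means caller input is never spliced into the query as free text, which is one reason to keep query construction inside an SDK.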

Conclusion and Outlook

The article covered graph database fundamentals, the data-modeling reasons for choosing a graph store, Dgraph's main features, and two production deployments that address large-scale, low-latency query problems. The market forecasts cited in the article predict that the graph-database market will grow from about USD 0.8 billion in 2019 to USD 3-4 billion within six years, or 5-10% of the overall database market.
