Interview with JanusGraph PMC Members on Graph Database Landscape, Neo4j Comparison, and Deployment Best Practices
In this interview, JanusGraph PMC members Florian Hockmann and Jason Plurad discuss the project's origins, compare JanusGraph with Neo4j, share advice for production deployments, outline future expectations for JanusGraph and TinkerPop, and provide practical tips for graph modeling and community contribution.
Introduction
In the third part of our deep dive into databases, we spoke with JanusGraph PMC members Florian Hockmann (G DATA) and Jason Plurad (IBM) for their view of the broader graph ecosystem.
JanusGraph is a scalable graph database capable of storing billions of vertices and edges across a multi‑machine cluster. It originated from the Titan project and has been under Linux Foundation governance since 2017.
Background of the Interviewees
Florian Hockmann is an R&D engineer at G DATA, where he uses a graph database to link millions of malware samples for threat analysis.
Jason Plurad is an open‑source developer and IBM Cognitive Applications advocate, active in the JanusGraph and Apache TinkerPop communities, helping product teams and customers adopt graph and other open‑source data technologies.
How They Work with JanusGraph
IBM is a founding member of JanusGraph; its teams had used Titan (JanusGraph’s predecessor) in several products because of its open‑source license and flexibility.
After Titan’s original company was acquired, the community created JanusGraph under the Linux Foundation, and IBM contributed heavily to its development.
Florian contributed to Apache TinkerPop’s Gremlin‑Net implementation and assists the community via mailing lists and StackOverflow.
Choosing Between Neo4j and JanusGraph
Both JanusGraph and Neo4j support the Apache TinkerPop graph framework, so the same Gremlin traversals can run against either of them, as well as against other TinkerPop‑enabled graph databases such as Amazon Neptune, Azure Cosmos DB, and DataStax Enterprise Graph.
License considerations are important: JanusGraph uses the permissive Apache License, while Neo4j Community Edition uses GPL, and the enterprise edition requires a commercial subscription.
Key technical differences: Neo4j is a self‑contained system with its own storage engine, index, server, protocol, and query language, whereas JanusGraph relies on external components (e.g., Elasticsearch or Solr for indexing, Cassandra or HBase for storage), offering greater flexibility but more complexity.
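To make the "external components" point concrete, here is a minimal sketch of how a JanusGraph properties file ties a storage backend to an index backend. The helper function `render_janusgraph_properties` is hypothetical and exists only for illustration. The option names (`storage.backend`, `storage.hostname`, `index.search.backend`, `index.search.hostname`) and the backend values are given as best recalled from the JanusGraph configuration reference and should be checked against the version in use. Real deployments usually need further backend‑specific options that are omitted here.

```python
def render_janusgraph_properties(storage_backend, storage_host,
                                 index_backend=None, index_host=None,
                                 index_name="search"):
    """Render a minimal JanusGraph-style properties file as text.

    Hypothetical helper: it only formats key=value lines and does not
    validate that the chosen backends exist or are reachable.
    """
    lines = [
        f"storage.backend={storage_backend}",
        f"storage.hostname={storage_host}",
    ]
    # The index backend is optional; without it only the storage keys are emitted.
    if index_backend is not None:
        lines.append(f"index.{index_name}.backend={index_backend}")
        lines.append(f"index.{index_name}.hostname={index_host or storage_host}")
    return "\n".join(lines) + "\n"


# Assumed example values: Cassandra (via the CQL adapter) plus Elasticsearch,
# both on localhost.
cassandra_es = render_janusgraph_properties(
    "cql", "127.0.0.1", "elasticsearch", "127.0.0.1"
)

# HBase for storage with no external index backend configured.
hbase_only = render_janusgraph_properties("hbase", "127.0.0.1")

print(cassandra_es)
```

Swapping the storage or index system is a matter of changing these values, which is the flexibility described above. The cost is that each named system has to be installed, tuned, and operated separately.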
Because JanusGraph builds on TinkerPop, it exposes a standard graph API, much as SQL provides a standard interface to relational databases, while Neo4j primarily promotes its own Cypher language.
Advice for Deploying JanusGraph in Production
Start with a small, simple deployment and scale gradually; JanusGraph’s documentation includes a “deployment scenarios” chapter to guide this process.
Familiarize yourself with TinkerPop and Gremlin; resources include TinkerPop tutorials and the free e‑book “Practical Gremlin.”
Contribute to the open‑source project, engage with the community, and develop operational expertise for the external storage backends (Cassandra, HBase, etc.) that JanusGraph depends on.
Future Expectations for JanusGraph and TinkerPop
The community hopes for significant backend improvements, such as a performant in‑memory backend suitable for production use, and anticipates the upcoming TinkerPop 4 release, which is expected to broaden the range of Gremlin execution engines (single‑threaded, Spark‑based, etc.).
Efforts are also underway to create a more abstract data model that could extend TinkerPop beyond graph databases.
Performance Modeling Tips
Evaluate new or changed schemas with realistic data and representative queries before production.
Identify and mitigate “super‑node” issues early, and decide whether an entity should be modeled as a vertex or as a property based on query patterns.
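One way to spot super‑node candidates before loading data is to count vertex degrees on a representative sample of edges. The sketch below does this in plain Python. The function `find_super_nodes`, the threshold, and the toy malware‑style data are all hypothetical, and what counts as "too many edges" depends on the queries and the backend.

```python
from collections import Counter


def find_super_nodes(edges, threshold):
    """Return {vertex: degree} for vertices whose total degree
    (incoming plus outgoing edges) is at least `threshold`."""
    degree = Counter()
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
    return {v: d for v, d in degree.items() if d >= threshold}


# Hypothetical toy data: 1000 samples all link to one shared "packer" vertex,
# while a domain vertex is shared by only two samples.
edges = [(f"sample-{i}", "packer-upx") for i in range(1000)]
edges += [("sample-1", "domain-a"), ("sample-2", "domain-a")]

print(find_super_nodes(edges, threshold=100))
# prints {'packer-upx': 1000}
```

A vertex flagged this way is a candidate for remodeling, for example by storing the value as a property on each sample vertex instead, if queries rarely need to traverse through it.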
Iteratively refine the model, monitor branching factors, and consider denormalization to keep traversals from fanning out explosively in their first steps.
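The effect of the branching factor can be estimated offline on sample data. The following sketch counts how many traversers exist after each hop when every outgoing edge is followed without deduplication, which is roughly what a chain of Gremlin `out()` steps with no `dedup()` would produce. The function `traversers_per_hop` and the adjacency data are hypothetical.

```python
def traversers_per_hop(adjacency, start, hops):
    """Count traversers after each hop from `start`, following every
    outgoing edge and keeping duplicates (no deduplication)."""
    frontier = {start: 1}  # vertex -> number of traversers currently on it
    counts = []
    for _ in range(hops):
        next_frontier = {}
        for vertex, n in frontier.items():
            for neighbor in adjacency.get(vertex, ()):
                next_frontier[neighbor] = next_frontier.get(neighbor, 0) + n
        frontier = next_frontier
        counts.append(sum(frontier.values()))
    return counts


# Hypothetical sample graph as an adjacency list of outgoing edges.
adjacency = {
    "a": ["b", "c", "d"],
    "b": ["e", "f"],
    "c": ["e", "f"],
    "d": ["e", "f"],
    "e": ["g"],
    "f": ["g"],
}

print(traversers_per_hop(adjacency, "a", hops=3))
# prints [3, 6, 6]
```

Here the traverser count triples on the first hop and doubles on the second. On real data, a steep jump in the first one or two hops suggests starting the traversal from a more selective vertex or denormalizing so that fewer hops are needed.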
Getting Involved with JanusGraph
Contributions can be code, documentation, or community support; start by browsing open issues on GitHub, opening new ones, and submitting pull requests.
The project welcomes contributors across many modules (storage adapters, indexing, client libraries, etc.) and encourages collaboration through forums, documentation updates, example projects, and conference talks.
Architects Research Society
A daily treasure trove for architects, expanding your view and depth. We share enterprise, business, application, data, technology, and security architecture, discuss frameworks, planning, governance, standards, and implementation, and explore emerging styles such as microservices, event‑driven, micro‑frontend, big data, data warehousing, IoT, and AI architecture.