TinkerPop, the open source software development group that works in the graph database area, has released its latest collection of technologies. TinkerPop 2.4 includes graph database frameworks, algorithm packages and servers, and has been given the name “Gremlin Without a Cause”.
The name is a play on Gremlin, the graph database query language that was one of the primary reasons TinkerPop was formed, and it has also given the group the opportunity to design a logo depicting James Dean dressed as a Gremlin.
If you’re not familiar with graph databases, a common way of describing graph data is the Resource Description Framework, RDF. In RDF, all expressions are collections of triples, each consisting of a subject, an object, and a predicate or property that denotes the relationship between the subject and object.
If you wanted to represent the triple:
Fred lives in Denver
you could have a node called Fred, another node called Denver, and an edge between them carrying the predicate, which could be ‘lives in’ or ‘City’ (or maybe ‘Address’ for more detailed data).
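As a rough illustration of the triple idea, here is a minimal sketch in plain Python. It does not use any RDF library or TinkerPop API; the `objects_of` helper and the extra facts about Alice and Boulder are made up for the example.

```python
# Illustrative only: each fact is stored as a (subject, predicate, object) triple.
# Apart from "Fred lives in Denver", the facts below are invented for the example.
triples = [
    ("Fred", "lives in", "Denver"),
    ("Fred", "knows", "Alice"),
    ("Alice", "lives in", "Boulder"),
]


def objects_of(subject, predicate):
    """Return every object linked to `subject` by `predicate` (hypothetical helper)."""
    return [o for s, p, o in triples if s == subject and p == predicate]


fred_cities = objects_of("Fred", "lives in")
print(fred_cities)
```

Answering a question about Fred then amounts to matching on the subject and predicate and reading off the object.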
You need a special query language for working with graph databases because a property graph can be traversed in many ways, starting with straightforward paths and then adding operations such as filtering on edges, retracing steps, and updating counters.
Gremlin is a language that can concisely express routes through the graph including these and other operations. It’s a Domain Specific Language (DSL) written in Groovy that compiles down to Pipes.
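To give a feel for what such a traversal involves, the following sketch walks a tiny property graph in plain Python. It is not Gremlin syntax and does not use Pipes; the `Edge` tuple, the `out` function, and the sample people and dates are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical edge record: endpoints, a label, and a dict of edge properties.
Edge = namedtuple("Edge", ["source", "label", "target", "props"])

# Made-up sample data.
edges = [
    Edge("Fred", "knows", "Alice", {"since": 2009}),
    Edge("Fred", "knows", "Bob", {"since": 2012}),
    Edge("Alice", "lives_in", "Boulder", {}),
    Edge("Bob", "lives_in", "Denver", {}),
]


def out(vertices, label, edge_filter=lambda e: True):
    """Follow outgoing edges with `label` from each vertex,
    keeping only the edges that pass `edge_filter`."""
    return [
        e.target
        for v in vertices
        for e in edges
        if e.source == v and e.label == label and edge_filter(e)
    ]


# Step 1: people Fred has known since before 2010 (a filter on edges).
old_friends = out(["Fred"], "knows", lambda e: e.props["since"] < 2010)
# Step 2: continue the path to where those people live.
cities = out(old_friends, "lives_in")
print(old_friends, cities)
```

Each step takes the vertices produced by the previous one, which is the kind of chained route a graph query language aims to express concisely.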
The core of the TinkerPop stack is the Blueprints framework, which Marko Rodriguez, the main developer behind TinkerPop, describes as “the JDBC of Graph Databases”. Blueprints provides a collection of generic interfaces that let you develop graph-based applications without introducing explicit dependencies on concrete graph database implementations. It also provides concrete bindings for the Neo4j, OrientDB and Dex graph databases. Gremlin sits on top of this framework.
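The “JDBC” idea of coding against an interface rather than a particular database can be sketched as follows. This is plain Python, not the Blueprints API: the `Graph` interface, its method names, and `InMemoryGraph` are hypothetical stand-ins chosen for the example.

```python
from abc import ABC, abstractmethod


class Graph(ABC):
    """Hypothetical generic interface; application code depends only on this."""

    @abstractmethod
    def add_vertex(self, vertex_id): ...

    @abstractmethod
    def add_edge(self, source, label, target): ...

    @abstractmethod
    def neighbours(self, vertex_id, label): ...


class InMemoryGraph(Graph):
    """One concrete binding; a binding for a real database would go alongside it."""

    def __init__(self):
        self._vertices = set()
        self._edges = []

    def add_vertex(self, vertex_id):
        self._vertices.add(vertex_id)

    def add_edge(self, source, label, target):
        self._edges.append((source, label, target))

    def neighbours(self, vertex_id, label):
        return [t for s, l, t in self._edges if s == vertex_id and l == label]


def load_sample(graph: Graph) -> None:
    """Application code written against Graph, with no knowledge of the backend."""
    graph.add_vertex("Fred")
    graph.add_vertex("Denver")
    graph.add_edge("Fred", "lives_in", "Denver")


g = InMemoryGraph()
load_sample(g)
print(g.neighbours("Fred", "lives_in"))
```

Swapping the backend would mean passing a different `Graph` implementation to `load_sample`, leaving the application code unchanged.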
The main change to Blueprints in this release is that the VertexQuery and GraphQuery APIs now make use of general predicates, which adds support for push-down predicates. GraphFactory has also been improved with support for dynamic determination of graph constructors.
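The benefit of pushing a predicate down to the store, rather than filtering after the data has been fetched, can be shown with a small sketch. The `Store` class, its `query` method, and the sample ages are assumptions for illustration only and do not reflect the actual VertexQuery or GraphQuery signatures.

```python
import operator


class Store:
    """Hypothetical vertex store that counts how many records it hands back."""

    def __init__(self, vertices):
        self._vertices = vertices
        self.returned = 0

    def all_vertices(self):
        # No push-down: every record leaves the store; the caller filters.
        for v in self._vertices:
            self.returned += 1
            yield v

    def query(self, key, predicate, value):
        # Push-down: the store applies the predicate itself and
        # returns only the matching records.
        for v in self._vertices:
            if predicate(v[key], value):
                self.returned += 1
                yield v


# Made-up sample data.
people = [
    {"name": "Fred", "age": 41},
    {"name": "Alice", "age": 29},
    {"name": "Bob", "age": 35},
]

client_side = Store(people)
over_40_client = [v for v in client_side.all_vertices() if v["age"] > 40]

pushed_down = Store(people)
over_40_pushed = list(pushed_down.query("age", operator.gt, 40))

print(client_side.returned, pushed_down.returned)
```

Both approaches find the same vertex, but with the pushed-down predicate only the matching record has to leave the store.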
Pipes is the TinkerPop dataflow framework that uses process graphs. A process graph is composed of Pipe vertices connected by communication edges. In this new version, Pipes has full support for the predicate operator in Blueprints.
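The dataflow idea, in which each pipe consumes a stream and emits a new one, can be approximated with Python generators. This is only an analogy under stated assumptions: `pipeline`, `filter_pipe` and `map_pipe` are hypothetical names, not classes from the Pipes library, and the sketch covers a linear chain rather than a general process graph.

```python
def filter_pipe(predicate):
    """Build a pipe that passes on only the items satisfying `predicate`."""
    return lambda stream: (x for x in stream if predicate(x))


def map_pipe(fn):
    """Build a pipe that transforms each item with `fn`."""
    return lambda stream: (fn(x) for x in stream)


def pipeline(source, *pipes):
    """Connect pipes in order; the output of each feeds the next."""
    stream = iter(source)
    for pipe in pipes:
        stream = pipe(stream)
    return stream


names = ["Fred", "Alice", "Bob"]
result = list(
    pipeline(
        names,
        filter_pipe(lambda n: len(n) > 3),  # drops "Bob"
        map_pipe(str.upper),
    )
)
print(result)
```

Because generators are lazy, items flow through the chain one at a time instead of each stage building a full intermediate list.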
Gremlin itself has benefited from query optimization that makes Gremlin compilation more efficient, and it now has branch factor support for more operators. The list of improvements provides links to the full release notes for the new versions of the components.