Monday, 11 May 2026

Gremlin Traversals: The Foundation of Graph Queries

  

In Apache TinkerPop, a graph query is expressed as a traversal. The term is deliberate: rather than issuing a declarative query that describes what data is required, a Gremlin traversal describes how to move through the graph to reach the desired result.

 

At a conceptual level, a traversal starts at one or more elements in the graph (vertices or edges) and progresses step by step until it reaches a final outcome. Each step represents a well-defined operation, and traversals are built by chaining these steps together in a fluent, readable manner.

 

A traversal may consist of a single step, but in practice it is almost always composed of multiple steps that progressively narrow, transform, or project the result set.

 

The Traversal Source

All Gremlin traversals begin with a traversal source, commonly named g. The traversal source acts as the entry point into the graph and provides the initial steps used to start a traversal.

graph = TinkerGraph.open()
g = graph.traversal()

gremlin> graph = TinkerGraph.open()
==>tinkergraph[vertices:0 edges:0]
gremlin> 
gremlin> g = graph.traversal()
==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]

The second statement creates a GraphTraversalSource instance backed by the underlying graph. From this point onward, all traversals are expressed by invoking methods on g.
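As a minimal sketch of what "invoking methods on g" looks like, the snippet below reads from and writes to a fresh in-memory graph. It assumes the Gremlin Console (the import is redundant there but harmless); the variable names before and after are illustrative only.

```groovy
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph

graph = TinkerGraph.open()
g = graph.traversal()

// A new TinkerGraph is empty, so counting its vertices yields 0.
before = g.V().count().next()

// Mutations also start from g. iterate() runs the traversal without
// returning results; the console would otherwise do this automatically.
g.addV('employee').property('name', 'Asha').iterate()

// The same source now sees the newly added vertex.
after = g.V().count().next()
```

Both reads and writes start from the same g, which is why the rest of this post never touches the graph variable directly.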

 

Starting Points: V() and E()

Most traversals begin with either:

 

·      g.V(): to start from vertices

·      g.E(): to start from edges

 

The V() step returns a traversal over vertices, while the E() step returns a traversal over edges. These steps can be used both at the beginning of a traversal and, in certain cases, in the middle of a traversal to switch context. Their advanced usage will be explored later in this tutorial series.

 

At this introductory stage, it is helpful to think of:

 

·      g.V() as "looking at all vertices in the graph"

·      g.E() as "looking at all edges in the graph"

 

Additional steps are then applied to filter, navigate, or reshape these elements. Both V() and E() also accept parameters that restrict the traversal to a specific subset of vertices or edges, such as those with known identifiers. For now, we will focus on their parameterless forms.
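The difference between the parameterless and parameterized forms can be sketched as follows. This is an illustrative example on a throwaway graph, not part of the sample data used later; the variable names are assumptions.

```groovy
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph

graph = TinkerGraph.open()
g = graph.traversal()

// next() returns the newly created vertex so its id can be reused below.
asha = g.addV('employee').property('name', 'Asha').next()
g.addV('employee').property('name', 'Ravi').iterate()

// Parameterless forms: every vertex and every edge in the graph.
vertexCount = g.V().count().next()
edgeCount = g.E().count().next()   // this graph has no edges yet

// Parameterized form: only the vertex with the given id.
// The id is passed exactly as the graph reported it, which avoids
// type mismatches (TinkerGraph generates Long ids by default).
name = g.V(asha.id()).values('name').next()
```

Reusing the id object returned by the graph, rather than typing a literal, sidesteps the question of whether an id is an Integer or a Long.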

 

Filtering Traversals with has and hasLabel

To refine the set of vertices or edges under consideration, Gremlin provides filtering steps. Two of the most frequently used filters are:

 

·      hasLabel(label): filters elements by label

·      has(key, value): filters elements by property key and value

 

These steps act as predicates, allowing only elements that satisfy the specified condition to pass through the traversal. While Gremlin offers many variations of the has step, including predicates and composite conditions, these two forms are sufficient to begin writing meaningful traversals.

 

Creating Sample Data: Employee Vertices

Consider the following example graph containing employee data. Each employee is represented as a vertex with the label employee and a set of properties.

 

g.addV('employee').
  property('name', 'Asha').
  property('department', 'Engineering').
  property('location', 'India')

g.addV('employee').
  property('name', 'Ravi').
  property('department', 'Engineering').
  property('location', 'India')

g.addV('employee').
  property('name', 'Kiran').
  property('department', 'Engineering').
  property('location', 'USA')

g.addV('employee').
  property('name', 'Meera').
  property('department', 'HR').
  property('location', 'India')

g.addV('employee').
  property('name', 'Anita').
  property('department', 'HR').
  property('location', 'USA')

gremlin> g.addV('employee').
......1>   property('name', 'Asha').
......2>   property('department', 'Engineering').
......3>   property('location', 'India')
==>v[0]
gremlin> 
gremlin> g.addV('employee').
......1>   property('name', 'Ravi').
......2>   property('department', 'Engineering').
......3>   property('location', 'India')
==>v[4]
gremlin> 
gremlin> g.addV('employee').
......1>   property('name', 'Kiran').
......2>   property('department', 'Engineering').
......3>   property('location', 'USA')
==>v[8]
gremlin> 
gremlin> g.addV('employee').
......1>   property('name', 'Meera').
......2>   property('department', 'HR').
......3>   property('location', 'India')
==>v[12]
gremlin> 
gremlin> g.addV('employee').
......1>   property('name', 'Anita').
......2>   property('department', 'HR').
......3>   property('location', 'USA')
==>v[16]
gremlin>

Querying by Label

The following traversal returns all vertices that have the label employee:

g.V().hasLabel('employee')

This traversal:

·      Starts with all vertices in the graph

·      Filters them to retain only those whose label is employee

 

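At this stage, the traversal returns a stream of vertices rather than concrete values. In the console session below, valueMap(true) is appended so that each vertex is displayed with its id, label, and properties: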

gremlin> g.V().
......1>   hasLabel('employee').
......2>   valueMap(true)
==>[id:0,label:employee,name:[Asha],location:[India],department:[Engineering]]
==>[id:16,label:employee,name:[Anita],location:[USA],department:[HR]]
==>[id:4,label:employee,name:[Ravi],location:[India],department:[Engineering]]
==>[id:8,label:employee,name:[Kiran],location:[USA],department:[Engineering]]
==>[id:12,label:employee,name:[Meera],location:[India],department:[HR]]
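When only a single property is needed, a values step can turn the stream of vertices into plain values. The following is a small sketch on a reduced data set (three of the employees above, names only); it assumes the Gremlin Console and makes no claim about result order, since TinkerGraph does not guarantee one.

```groovy
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph

graph = TinkerGraph.open()
g = graph.traversal()

// Reduced sample data: three employee vertices with a name property.
['Asha', 'Ravi', 'Meera'].each { n ->
  g.addV('employee').property('name', n).iterate()
}

// hasLabel keeps the employee vertices; values('name') replaces each
// vertex with the value of its name property.
names = g.V().hasLabel('employee').values('name').toList()
```

Here names is an ordinary list of strings rather than a list of Vertex objects.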

Querying by Property

To locate a specific vertex by name, the has step can be used:

g.V().has('name', 'Asha')

This traversal returns the vertex representing the employee named Asha. Note that this query does not explicitly check the label. If multiple vertex types share a name property, the result set may include unexpected elements.

gremlin> g.V().
......1>   has('name', 'Asha').
......2>   valueMap(true)
==>[id:0,label:employee,name:[Asha],location:[India],department:[Engineering]]
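The risk of omitting the label can be made concrete with a small sketch. The project label below is hypothetical and is not part of the sample graph; it exists only to show a second vertex type sharing the name property.

```groovy
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph

graph = TinkerGraph.open()
g = graph.traversal()

g.addV('employee').property('name', 'Asha').iterate()
// Hypothetical second vertex type that also carries a name property.
g.addV('project').property('name', 'Asha').iterate()

// Without a label check, both vertices match.
unfiltered = g.V().has('name', 'Asha').count().next()

// With a label check, only the employee remains.
employeesOnly = g.V().hasLabel('employee').has('name', 'Asha').count().next()
```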

   

Combining Label and Property Filters

In practice, it is common to combine label and property filters to make queries more precise. The following two traversals are equivalent.

 

·      g.V().hasLabel('employee').has('name', 'Asha')

·      g.V().has('employee', 'name', 'Asha')

 

The second form uses an overloaded version of the has step that accepts the label as its first argument. Both traversals ensure that:

 

·      The element is an employee

·      The name property has the value Asha

 

Choosing between these forms is largely a matter of style and readability.

 

 

gremlin> g.V().
......1>   hasLabel('employee').
......2>   has('name', 'Asha').
......3>   valueMap(true)
==>[id:0,label:employee,name:[Asha],location:[India],department:[Engineering]]
gremlin> 
gremlin> g.V().
......1>   has('employee', 'name', 'Asha').
......2>   valueMap(true)
==>[id:0,label:employee,name:[Asha],location:[India],department:[Engineering]]
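Filters can be chained further, with each additional has step narrowing the result. The sketch below rebuilds the five sample employees in a fresh graph (using a loop purely for brevity), checks that the two forms above select the same vertex, and then combines two property filters.

```groovy
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph

graph = TinkerGraph.open()
g = graph.traversal()

// The five sample employees: name, department, location.
[['Asha', 'Engineering', 'India'],
 ['Ravi', 'Engineering', 'India'],
 ['Kiran', 'Engineering', 'USA'],
 ['Meera', 'HR', 'India'],
 ['Anita', 'HR', 'USA']].each { n, d, l ->
  g.addV('employee').
    property('name', n).
    property('department', d).
    property('location', l).iterate()
}

// Both forms select the same vertex.
a = g.V().hasLabel('employee').has('name', 'Asha').next()
b = g.V().has('employee', 'name', 'Asha').next()

// Chained has steps must all match: Engineering employees located in India.
indiaEngineers = g.V().
  has('employee', 'department', 'Engineering').
  has('location', 'India').
  values('name').toList()
```

Chaining behaves like a logical AND, so indiaEngineers contains only Asha and Ravi (in no guaranteed order).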

Understanding the Traversal Result

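When a traversal that ends in a filter step is executed in the Gremlin Console, each result is printed as a vertex reference. Looking up the employee named Ravi, for example, prints v[4]: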

gremlin> g.V().has('employee', 'name', 'Ravi')
==>v[4]

   

This output indicates that the traversal returned a vertex with an internal identifier of 4. What is returned is not a simple value, but an instance of the TinkerPop Vertex data structure.

 

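The console iterates the traversal automatically in order to print this result. In application code no such automatic iteration takes place, and the traversal yields nothing until it is iterated. To obtain a concrete object suitable for further manipulation, a terminal step must be applied.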

 

Terminal Steps and Materializing Results

Terminal steps conclude a traversal and convert its result into a concrete object. One commonly used terminal step is next(), which retrieves the next element from the traversal.

 

v = g.V().hasLabel('employee').has('name', 'Asha').next()

   

Here, the variable v holds a Vertex instance. Although the Gremlin Console uses Groovy as its execution environment, the underlying objects are Java-based. This allows standard Java introspection methods to be applied.

 

For example, the following expression reveals the runtime type of the returned object:

 

gremlin> v.getClass()
==>class org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerVertex

   

This confirms that the result is a TinkerPop Vertex implementation.
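Once materialized, the vertex can be inspected through the methods of the Vertex interface. The following sketch recreates a single employee in a fresh graph and reads it back; the variable names are illustrative.

```groovy
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph

graph = TinkerGraph.open()
g = graph.traversal()
g.addV('employee').
  property('name', 'Asha').
  property('department', 'Engineering').iterate()

// next() hands back a Vertex object rather than a traversal.
v = g.V().hasLabel('employee').has('name', 'Asha').next()

label = v.label()              // the vertex label
name = v.value('name')         // the value of a single property
keys = v.keys()                // the set of property keys on this vertex
```

From this point on, v is an ordinary object in application code and no longer part of a traversal.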

 

Terminal steps play an important role when Gremlin is embedded in applications, as they mark the boundary between traversal construction and application-level logic. Additional terminal steps, such as toList, hasNext, and iterate, will be introduced in later posts as we move from console-based exploration to production usage.
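As a brief preview of those steps, and only as a sketch on a two-vertex graph, the three terminal steps named above behave as follows:

```groovy
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph

graph = TinkerGraph.open()
g = graph.traversal()

// iterate(): run the traversal for its side effects and return no results.
g.addV('employee').property('name', 'Asha').iterate()
g.addV('employee').property('name', 'Ravi').iterate()

// toList(): collect every result into a java.util.List.
employees = g.V().hasLabel('employee').toList()

// hasNext(): report whether at least one result exists.
kiranExists = g.V().has('employee', 'name', 'Kiran').hasNext()
ashaExists = g.V().has('employee', 'name', 'Asha').hasNext()
```

Checking hasNext() before calling next() is a common guard, because next() throws a NoSuchElementException when the traversal has no results.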

 
