Programming for beginners: Selective Path Traversals in Gremlin using from() and to()

When working with Apache TinkerPop and Gremlin, the path() step is one of the most powerful tools for understanding how a traversal reached a particular result. It allows us to inspect the full journey of a traversal — every vertex, edge, or value encountered along the way. This is invaluable for debugging, analysis, and building features such as route discovery, lineage tracking, and graph explainability.

However, returning the entire path is not always desirable. In many real-world scenarios, parts of the traversal are merely transitional and do not add value to the final output. Including them can make results noisy, harder to interpret, and more expensive in terms of memory and CPU usage — especially as traversals grow deeper or more complex.

Starting with Apache TinkerPop 3.2.5, Gremlin introduced the ability to selectively control which portions of a path are returned using the from() and to() modulators on the path() step. These modulators allow us to label meaningful points in a traversal and then extract only the relevant sub-path, rather than the full traversal history.

In this post, we’ll explore how from() and to() can be used to:

· Exclude unimportant starting points from path results

· Focus on specific segments of a traversal

· Reduce redundant or repetitive path output

· Improve the clarity and usefulness of path-based queries

By the end of this post, you should have a solid understanding of when and how to use from() and to() to make your Gremlin path queries more precise, readable, and efficient, and why these modulators are essential tools in any serious TinkerPop user’s toolkit.

Explaining from() and to()

In Gremlin, the path() step captures every object visited during a traversal. While this is powerful, it often returns more information than needed. The from() and to() modulators (introduced in TinkerPop 3.2.5) allow you to extract only a specific segment of the traversal path, based on labels defined using the as() step.

· from(label): Starts the returned path from the step labeled with label

· to(label): Ends the returned path at the step labeled with label

Together, they let you slice a path, returning only the meaningful portion instead of the entire traversal history.
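As a quick illustration of the slicing described above, here is a minimal sketch for the Gremlin Console. It assumes TinkerGraph's bundled "modern" toy graph (created with TinkerFactory.createModern()) rather than the demo graph built later in this post, and the labels 'a' and 'b' are arbitrary names chosen for the example.

```groovy
// Assumption: Gremlin Console with the TinkerGraph plugin active (the default).
graph = TinkerFactory.createModern()
g = graph.traversal()

// Full path: marko -> a person marko knows -> software that person created.
g.V().has('name', 'marko').as('a').
  out('knows').as('b').
  out('created').
  path().by('name')

// from('b'): the returned path starts at the step labeled 'b'.
g.V().has('name', 'marko').as('a').
  out('knows').as('b').
  out('created').
  path().by('name').from('b')

// to('b'): the returned path ends at the step labeled 'b'.
g.V().has('name', 'marko').as('a').
  out('knows').as('b').
  out('created').
  path().by('name').to('b')
```

On that toy graph, the first query should return three-element paths such as marko, josh, lop. The second should keep only the last two elements of each path, and the third only the first two.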

Demo Graph: Company Reporting Hierarchy

We’ll model a simple organizational structure using a reports_to relationship.

Engineer → Manager → Director → Senior Director → VP

Engineers
  ├─ Alice
  ├─ Bob
  ├─ Carol
      ↓
Managers
  ├─ Mallory
  ├─ Trent
      ↓
Directors
  ├─ Diana
  ├─ Victor
      ↓
Senior Director
  └─ Sophia
      ↓
VP
  └─ Vincent

Step 1: Create Employees

// Engineers
g.addV('employee').property('name', 'Alice').property('role', 'Engineer').as('e1').
  addV('employee').property('name', 'Bob').property('role', 'Engineer').as('e2').
  addV('employee').property('name', 'Carol').property('role', 'Engineer').as('e3')

// Managers
g.addV('employee').property('name', 'Mallory').property('role', 'Manager').as('m1').
  addV('employee').property('name', 'Trent').property('role', 'Manager').as('m2')

// Directors
g.addV('employee').property('name', 'Diana').property('role', 'Director').as('d1').
  addV('employee').property('name', 'Victor').property('role', 'Director').as('d2')

// Senior Director & VP
g.addV('employee').property('name', 'Sophia').property('role', 'Senior Director').as('sd').
  addV('employee').property('name', 'Vincent').property('role', 'VP').as('vp')

Step 2: Create Reporting Relationships

// Engineers → Managers
g.V().has('employee','name','Alice').
  addE('reports_to').
  to(__.V().has('employee','name','Mallory'))

g.V().has('employee','name','Bob').
  addE('reports_to').to(__.V().has('employee','name','Mallory'))

g.V().has('employee','name','Carol').
  addE('reports_to').to(__.V().has('employee','name','Trent'))

// Managers → Directors
g.V().has('employee','name','Mallory').
  addE('reports_to').to(__.V().has('employee','name','Diana'))

g.V().has('employee','name','Trent').
  addE('reports_to').to(__.V().has('employee','name','Victor'))

// Directors → Senior Director
g.V().has('employee','name','Diana').
  addE('reports_to').to(__.V().has('employee','name','Sophia'))

g.V().has('employee','name','Victor').
  addE('reports_to').to(__.V().has('employee','name','Sophia'))

// Senior Director → VP
g.V().has('employee','name','Sophia').
  addE('reports_to').to(__.V().has('employee','name','Vincent'))
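Before running path queries, it can help to confirm that the graph looks as intended. The following is a small sanity check, assuming the Step 1 and Step 2 statements were run once against a graph that was empty beforehand; the project() keys 'employee' and 'reports_to' are just names chosen for this sketch.

```groovy
// Assumption: the graph contained no data before Step 1.
g.V().hasLabel('employee').count()      // 9 employees are expected
g.E().hasLabel('reports_to').count()    // 8 reporting edges are expected

// List each reporting pair by name to spot a missing or duplicated edge.
g.E().hasLabel('reports_to').
  project('employee', 'reports_to').
    by(outV().values('name')).
    by(inV().values('name'))
```

If either count is higher than expected, the most likely cause is that one of the creation statements was executed more than once.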

Step 3: Full Reporting Chain (All Paths)

g.V().has('employee', 'role', 'Engineer').
  repeat(out('reports_to')).until(has('role','VP')).
  path().by('name')

gremlin> g.V().has('employee', 'role', 'Engineer').
......1>   repeat(out('reports_to')).until(has('role','VP')).
......2>   path().by('name')
==>[Alice,Mallory,Diana,Sophia,Vincent]
==>[Bob,Mallory,Diana,Sophia,Vincent]
==>[Carol,Trent,Victor,Sophia,Vincent]

Let's find the reporting hierarchy for the Engineer Alice.

g.V().
  has('employee', 'name', 'Alice').
  out('reports_to').
  out('reports_to').
  out('reports_to').
  out('reports_to').
  path().
  by('name')

gremlin> g.V().
......1>   has('employee', 'name', 'Alice').
......2>   out('reports_to').
......3>   out('reports_to').
......4>   out('reports_to').
......5>   out('reports_to').
......6>   path().
......7>   by('name')
==>[Alice,Mallory,Diana,Sophia,Vincent]

Now I want Alice's reporting hierarchy only up to the Director level, even though the traversal itself continues all the way to the VP. Two out('reports_to') steps from Alice lead to her Director, so that is where the end label goes.

g.V().
  has('employee', 'name', 'Alice').
  as('engineer').
  out('reports_to').
  out('reports_to').
  as('director').
  out('reports_to').
  out('reports_to').
  path().
  by('name').
  from('engineer').
  to('director')

gremlin> g.V().
......1>   has('employee', 'name', 'Alice').
......2>   as('engineer').
......3>   out('reports_to').
......4>   out('reports_to').
......5>   as('director').
......6>   out('reports_to').
......7>   out('reports_to').
......8>   path().
......9>   by('name').
.....10>   from('engineer').
.....11>   to('director')
==>[Alice,Mallory,Diana]

Let's say I want only the VP that Alice ultimately reports to.

g.V().
  has('employee', 'name', 'Alice').
  out('reports_to').
  out('reports_to').
  out('reports_to').
  out('reports_to').
  as('VP').
  path().
  by('name').
  from('VP')

gremlin> g.V().
......1>   has('employee', 'name', 'Alice').
......2>   out('reports_to').
......3>   out('reports_to').
......4>   out('reports_to').
......5>   out('reports_to').
......6>   as('VP').
......7>   path().
......8>   by('name').
......9>   from('VP')
==>[Vincent]

I want to know the Director, Senior Director and VP that Alice reports to.

g.V().
  has('employee', 'name', 'Alice').
  out('reports_to').
  out('reports_to').
  as('Director').
  out('reports_to').
  out('reports_to').
  path().
  by('name').
  from('Director')

gremlin> g.V().
......1>   has('employee', 'name', 'Alice').
......2>   out('reports_to').
......3>   out('reports_to').
......4>   as('Director').
......5>   out('reports_to').
......6>   out('reports_to').
......7>   path().
......8>   by('name').
......9>   from('Director')
==>[Diana,Sophia,Vincent]

I want to know the Director and Senior Director that Alice reports to.

g.V().
  has('employee', 'name', 'Alice').
  out('reports_to').
  out('reports_to').
  as('Director').
  out('reports_to').
  as('SeniorDirector').
  out('reports_to').
  path().
  by('name').
  from('Director').
  to('SeniorDirector')

gremlin> g.V().
......1>   has('employee', 'name', 'Alice').
......2>   out('reports_to').
......3>   out('reports_to').
......4>   as('Director').
......5>   out('reports_to').
......6>   as('SeniorDirector').
......7>   out('reports_to').
......8>   path().
......9>   by('name').
.....10>   from('Director').
.....11>   to('SeniorDirector')
==>[Diana,Sophia]
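The same slicing can be applied to all engineers at once, which is where trimmed paths start to repeat. The sketch below assumes the demo graph from Steps 1 and 2 and uses a hypothetical label 'director'. Because Alice and Bob share the same chain above their manager, their sliced paths are identical, and a trailing dedup() can collapse them.

```groovy
// Slice every engineer's chain so it starts at the Director level.
g.V().has('employee', 'role', 'Engineer').
  out('reports_to').
  out('reports_to').as('director').
  out('reports_to').
  out('reports_to').
  path().by('name').from('director')

// Same query, with dedup() removing identical sliced paths.
g.V().has('employee', 'role', 'Engineer').
  out('reports_to').
  out('reports_to').as('director').
  out('reports_to').
  out('reports_to').
  path().by('name').from('director').
  dedup()
```

On the demo graph, the first query should return one path per engineer, with the Diana, Sophia, Vincent chain appearing twice. The second should leave only the two distinct chains, one through Diana and one through Victor.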

In summary, the from() and to() modulators give you precise control over what portion of a traversal path is returned when using the path() step in Gremlin. Rather than being forced to return the entire traversal history, these modulators allow you to extract only the meaningful segment of a path based on labeled steps.

At a high level:

· path() captures everything the traversal touches

· as() marks important waypoints

· from(label) defines where the returned path begins

· to(label) defines where the returned path ends

Crucially, these modulators do not affect traversal execution; they only influence how the path is presented. You can traverse deeper, apply filters, or continue beyond the selected segment while returning a clean, focused sub-path.
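To see that steps outside the selected segment still take effect, here is a small sketch against the demo graph from Steps 1 and 2. The labels 'engineer' and 'manager' are names chosen for this example. The filter on the Director sits beyond the to() label, so it decides which results survive even though the Director never appears in the output.

```groovy
// Only engineers whose Director is Victor pass the filter,
// but the returned path stops at the manager.
g.V().has('employee', 'role', 'Engineer').as('engineer').
  out('reports_to').as('manager').
  out('reports_to').has('name', 'Victor').
  path().by('name').to('manager')
// ==>[Carol,Trent]
```

Alice and Bob are filtered out because their Director is Diana. Carol remains, and her path is cut off after Trent. This example also shows to() used on its own, without a matching from().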

Key Points to Remember

· Use from() to skip irrelevant starting points

· Use to() to cap paths at a meaningful boundary

· Use both together to slice paths with precision

· Remember that the traversal still runs past the to() label; only the returned path is limited to the segment between from() and to()

Sunday, 17 May 2026
