Tuesday, 2 June 2026

Working with Vertex and Edge Labels in Gremlin

  

When designing a graph data model, labels play a crucial role in organizing and querying your data effectively. In Apache TinkerPop’s Gremlin, both vertices and edges are assigned labels that act as logical types within your graph.

 

In this post, we’ll explore:

 

·      What vertex and edge labels are

·      Why labels matter in graph design

·      How to query using labels in Gremlin

·      Different ways to filter using hasLabel(), label(), and has()

·      Working with multiple labels

·      Performance considerations and indexing strategies

 

We’ll also discuss:

 

·      When to rely on labels vs. properties

·      How label indexing impacts performance

·      Best practices for large-scale graphs

·      When to use a property as a surrogate for labels

 

By the end of this article, you’ll understand how to design cleaner graph schemas and write more optimized Gremlin traversals using labels effectively.

 

1. Demo Graph: Company Management System

Let's build a simple "Company Management System" graph to use in the examples that follow.

 

Vertex Labels

·      employee

·      department

·      project

·      location

 

Edge Labels

·      works_in

·      manages

·      assigned_to

·      located_at

 

Step 1: Create a new in-memory graph

graph = TinkerGraph.open()
g = graph.traversal()

   

Step 2: Create Departments

 

eng = g.addV('department').property('name','Engineering').next()
hr  = g.addV('department').property('name','HR').next()

   

Step 3: Create Locations

 

blr = g.addV('location').property('city','Bangalore').next()
hyd = g.addV('location').property('city','Hyderabad').next()

Step 4: Create Employees

alice = g.addV('employee').
            property('name','Alice').
            property('role','Manager').
            next()

bob = g.addV('employee').
          property('name','Bob').
          property('role','Developer').
          next()

carol = g.addV('employee').
            property('name','Carol').
            property('role','HR Executive').
            next()

Step 5: Create Projects

p1 = g.addV('project').property('name','AI Platform').next()
p2 = g.addV('project').property('name','HR Automation').next()

Step 6: Create Relationships

g.addE('works_in').from(alice).to(eng).iterate()
g.addE('works_in').from(bob).to(eng).iterate()
g.addE('works_in').from(carol).to(hr).iterate()

g.addE('manages').from(alice).to(bob).iterate()

g.addE('assigned_to').from(bob).to(p1).iterate()
g.addE('assigned_to').from(carol).to(p2).iterate()

g.addE('located_at').from(eng).to(blr).iterate()
g.addE('located_at').from(hr).to(hyd).iterate()

Verify the vertices and edges that were created:

gremlin> g.V().valueMap(true)
==>[id:0,label:department,name:[Engineering]]
==>[id:17,label:project,name:[AI Platform]]
==>[id:2,label:department,name:[HR]]
==>[id:19,label:project,name:[HR Automation]]
==>[id:4,label:location,city:[Bangalore]]
==>[id:6,label:location,city:[Hyderabad]]
==>[id:8,label:employee,role:[Manager],name:[Alice]]
==>[id:11,label:employee,role:[Developer],name:[Bob]]
==>[id:14,label:employee,role:[HR Executive],name:[Carol]]
gremlin> 
gremlin> 
gremlin> g.E().valueMap(true)
==>[id:21,label:works_in]
==>[id:22,label:works_in]
==>[id:23,label:works_in]
==>[id:24,label:manages]
==>[id:25,label:assigned_to]
==>[id:26,label:assigned_to]
==>[id:27,label:located_at]
==>[id:28,label:located_at]
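
In recent TinkerPop releases the boolean form valueMap(true) is marked as deprecated. Below is a minimal sketch of two alternatives. It assumes the demo graph built in the steps above is still loaded as g in the same console session, and that your TinkerPop version provides elementMap(); check your provider's documentation before relying on either form.

```groovy
// Assumes g is the traversal source for the demo graph built in the steps above.

// elementMap() returns id, label and properties, with one value per property key.
g.V().hasLabel('employee').elementMap()

// valueMap() with WithOptions.tokens keeps the list-wrapped values and adds id and label.
g.V().hasLabel('employee').valueMap().with(WithOptions.tokens)
```

Both traversals should list the same three employee vertices shown in the output above; only the shape of each map differs.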

   

2. What Are Vertex and Edge Labels?

In Apache TinkerPop Gremlin, every vertex and edge has a label. Think of a label as the type of the graph element.

 

For example, when we created the vertices, we passed employee, department, project and location as arguments to the addV() step.

 

g.addV('employee')
g.addV('department')
g.addV('project')
g.addV('location')

   

So when we run the following statement, it returns the label of the vertex that represents Alice.

 

g.V().
  has('name','Alice').
  label()

gremlin> g.V().
......1>   has('name','Alice').
......2>   label()
==>employee

   

Edge Labels

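Edges work the same way: the g.addE('works_in') statement creates an edge with the label 'works_in'. The following traversal returns the labels of Alice's outgoing edges.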

 

g.V().
  has('name','Alice').
  outE().
  label()

gremlin> g.V().
......1>   has('name','Alice').
......2>   outE().
......3>   label()
==>works_in
==>manages

   

Edge labels describe how vertices are connected.
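
Because every element carries exactly one label here, a quick way to see the "types" present in a graph is to count elements per label. The sketch below assumes the demo graph from section 1 is still loaded as g in the same console session.

```groovy
// Assumes g is the traversal source for the demo graph built in section 1.

// Count vertices per label.
g.V().groupCount().by(label)

// Count edges per label.
g.E().groupCount().by(label)
```

On the demo graph, the first traversal should report three employee vertices and two each for department, project and location; the second should report three works_in edges, two assigned_to, two located_at and one manages. The order of the map entries is not guaranteed.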

 

3. Why Labels Matter in Graph Design

Labels are not just names; they define structure and meaning. We can select vertices and edges by label, as the following traversal does for employees.

 

g.V().
  hasLabel('employee').
  valueMap(true)

gremlin> g.V().
......1>   hasLabel('employee').
......2>   valueMap(true)
==>[id:8,label:employee,role:[Manager],name:[Alice]]
==>[id:11,label:employee,role:[Developer],name:[Bob]]
==>[id:14,label:employee,role:[HR Executive],name:[Carol]]

   

4. How to Query Using Labels in Gremlin

The following statement gets all the employees.

 

g.V().
  hasLabel('employee').
  valueMap(true)

gremlin> g.V().
......1>   hasLabel('employee').
......2>   valueMap(true)
==>[id:8,label:employee,role:[Manager],name:[Alice]]
==>[id:11,label:employee,role:[Developer],name:[Bob]]
==>[id:14,label:employee,role:[HR Executive],name:[Carol]]

To return only the employee names instead of the full property map, use values('name'):

g.V().
  hasLabel('employee').
  values('name')

   

Similarly, the following statement prints all the departments.

 

g.V().
  hasLabel('department').
  values('name')

gremlin> g.V().
......1>   hasLabel('department').
......2>   values('name')
==>Engineering
==>HR

   

Get All Employees Working in Engineering

 

g.V().
  hasLabel('employee').
  as('emp').
  out('works_in').
  hasLabel('department').
  has('name','Engineering').
  select('emp').
  values('name')

gremlin> g.V().
......1>   hasLabel('employee').
......2>   as('emp').
......3>   out('works_in').
......4>   hasLabel('department').
......5>   has('name','Engineering').
......6>   select('emp').
......7>   values('name')
==>Alice
==>Bob

   

The same query can also be written by starting from the department and following the works_in edges in the reverse direction.

 

g.V().
  hasLabel('department').
  has('name', 'Engineering').
  in('works_in').
  values('name')

gremlin> g.V().
......1>   hasLabel('department').
......2>   has('name', 'Engineering').
......3>   in('works_in').
......4>   values('name')
==>Alice
==>Bob
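
Vertex and edge labels can also be combined to answer the question for every department at once. The following is a minimal sketch, assuming the demo graph from section 1 is still loaded as g and that every employee has exactly one works_in edge (which holds for the demo data).

```groovy
// Assumes g is the traversal source for the demo graph built in section 1.

// Group employee names by the name of the department they work in.
g.V().hasLabel('employee').
  group().
    by(out('works_in').values('name')).
    by(values('name').fold())
```

On the demo graph this should place Alice and Bob under Engineering and Carol under HR.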

Filter Elements Using the label() Step

The label() step emits an element's label, so it can be combined with where() and is() to filter:

g.V().
  where(label().is(eq('employee'))).
  values('name')

gremlin> g.V().
......1>   where(label().is(eq('employee'))).
......2>   values('name')
==>Alice
==>Bob
==>Carol

   

Using has(label, value)

Here label is the T.label token, which the Gremlin Console imports statically.

 

g.V().
  has(label, 'employee').
  values('name')

gremlin> g.V().
......1>   has(label, 'employee').
......2>   values('name')
==>Alice
==>Bob
==>Carol

   

Using the three-parameter has()

The has(label, key, value) form checks the label and a property value in a single step.

 

g.V().
  has('employee','name','Bob').
  values('name')

gremlin> g.V().
......1>   has('employee','name','Bob').
......2>   values('name')
==>Bob
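
These forms return the same results, but a provider may not optimize them the same way. One way to check is explain(), which prints the traversal before and after the provider's strategies are applied. The sketch below assumes the demo graph from section 1 is still loaded as g; the exact output varies by provider and version, so none is shown here.

```groovy
// Assumes g is the traversal source for the demo graph built in section 1.

// Compare how each filtering style is compiled by the provider.
g.V().hasLabel('employee').has('name','Bob').explain()
g.V().has('employee','name','Bob').explain()
g.V().where(label().is(eq('employee'))).has('name','Bob').explain()
```

In the final section of each explanation, look at whether the label check ends up inside the initial graph step or remains a separate filter step.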

5. Working with Multiple Labels

The hasLabel() step accepts multiple labels and matches any element whose label equals one of them.

 

Get All Employees and Departments

g.V().
  hasLabel('employee', 'department').
  values('name')

gremlin> g.V().
......1>   hasLabel('employee', 'department').
......2>   values('name')
==>Engineering
==>HR
==>Alice
==>Bob
==>Carol

   

For Edges

 

g.E().
  hasLabel('works_in','assigned_to').
  valueMap(true)

gremlin> g.E().
......1>   hasLabel('works_in','assigned_to').
......2>   valueMap(true)
==>[id:21,label:works_in]
==>[id:22,label:works_in]
==>[id:23,label:works_in]
==>[id:25,label:assigned_to]
==>[id:26,label:assigned_to]

   

6. Performance Considerations and Indexing Strategies

Not all graph databases index labels: some do, some don’t, and some do so only partially.

 

If labels are not indexed, the following statement scans the entire vertex set.

 

g.V().hasLabel('employee')

If your graph engine does not index labels, one workaround is to store the type in a property, as shown below.

 

g.addV('entity').property('type','employee')

   

Then index the 'type' property and query on it as shown below, so that the lookup is index-backed.

 

g.V().has('type','employee')
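
How an index is declared depends entirely on the provider. As one illustration, TinkerGraph (the in-memory graph used in this post) exposes createIndex() for property keys. The sketch below uses the hypothetical variable names indexed and ig for a separate throwaway graph, so the demo graph is left untouched. Other engines have their own index mechanisms, which this sketch does not describe.

```groovy
// Throwaway TinkerGraph; 'indexed' and 'ig' are illustrative names only.
indexed = TinkerGraph.open()

// Declare an index on the vertex property key 'type' (TinkerGraph-specific API).
indexed.createIndex('type', Vertex.class)

ig = indexed.traversal()
ig.addV('entity').property('type','employee').property('name','Alice').iterate()
ig.addV('entity').property('type','department').property('name','Engineering').iterate()

// Lookup on the surrogate 'type' property instead of the label.
ig.V().has('type','employee').values('name')

// List the vertex property keys that currently have an index.
indexed.getIndexedKeys(Vertex.class)
```

The trade-off is that the type now lives in a property that the application must keep consistent, so this is worth doing only when the engine cannot serve label lookups efficiently.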

  
