Monday, 1 June 2026

Understanding toList(), toSet(), toBulkSet() and fill() in Gremlin

  

A Gremlin traversal produces its results lazily, one element at a time, as it is iterated. In real-world applications, however, we often need to collect those results into containers such as lists or sets for further processing, reporting, or reuse. This is where Gremlin’s terminal steps toList(), toSet(), toBulkSet(), and fill() come in.

 

The toList() step collects traversal results into a list, preserving duplicates and iteration order. This is ideal when every occurrence matters, such as when counting frequency or maintaining sequence. The toSet() step, by contrast, discards duplicates and returns only unique values, which is useful when distinct results are required without additional filtering logic. The result is a hash-based set, so do not rely on its iteration order.

 

For more advanced use cases, toBulkSet() returns a weighted collection known as a BulkSet. Unlike a normal set, a BulkSet stores each distinct value once together with its occurrence count (its bulk). This makes it particularly useful when you need both uniqueness and frequency information without running a separate aggregation query.

 

Finally, fill() offers a different approach by storing traversal results into a pre-existing collection. This allows greater flexibility, especially when integrating Gremlin queries into application code where variables may already be defined or reused.

 

Understanding the differences between these steps helps you control how traversal results are materialized, manage duplicates efficiently, and choose the right data structure for your use case. Whether you are analyzing graph data, building APIs, or debugging traversals, mastering these collection steps is essential for writing clean and efficient Gremlin queries.

 

Let’s create a small graph of students in the Gremlin Console. Each student vertex has a name and a grade, and some grades are deliberately repeated.

graph = TinkerGraph.open()
g = graph.traversal()

// Add Students
g.addV('student').property('name','Alice').property('grade',90).iterate()
g.addV('student').property('name','Bob').property('grade',85).iterate()
g.addV('student').property('name','Charlie').property('grade',90).iterate()
g.addV('student').property('name','David').property('grade',75).iterate()
g.addV('student').property('name','Eva').property('grade',85).iterate()

Let’s list the vertices with their ids, labels, and properties.

gremlin> g.V().valueMap(true)
==>[id:0,label:student,grade:[90],name:[Alice]]
==>[id:3,label:student,grade:[85],name:[Bob]]
==>[id:6,label:student,grade:[90],name:[Charlie]]
==>[id:9,label:student,grade:[75],name:[David]]
==>[id:12,label:student,grade:[85],name:[Eva]]

Example 1: Using toList() (Keeps Duplicates)

g.V().
  values('grade').
  toList().
  join(',')

gremlin> g.V().
......1>   values('grade').
......2>   toList().
......3>   join(',')
==>90,85,90,75,85

   

As the output shows, grades 85 and 90 each appear twice.
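
Because the list keeps every occurrence, frequencies can be derived from it on the client side. The following is a minimal sketch for the Gremlin Console, assuming g is the traversal source for the student graph built above. Note that countBy is a Groovy collection method, not a Gremlin step.

```groovy
// Assumes `g` is the traversal source for the student graph built above.
grades = g.V().values('grade').toList()

// countBy is plain Groovy: it groups equal values and counts each group.
counts = grades.countBy { it }
println counts
```

With the five grades above, this should print [90:2, 85:2, 75:1].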

 

Example 2: Using toSet() (Removes Duplicates)

 

g.V().
  values('grade').
  toSet().
  join(',')

gremlin> g.V().
......1>   values('grade').
......2>   toSet().
......3>   join(',')
==>85,90,75
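
The order 85,90,75 above comes from the set's internal hashing and should not be relied on. If a predictable order matters, one option is to deduplicate and sort inside the traversal, or to sort the set on the client. The sketch below assumes the same g as above. The dedup() and order() steps are standard Gremlin steps, while sort() is a Groovy collection method.

```groovy
// Assumes `g` is the traversal source for the student graph built above.

// Option 1: deduplicate and sort within the traversal, then collect a list.
sortedInTraversal = g.V().values('grade').dedup().order().toList()

// Option 2: collect a set first, then sort it with Groovy (returns a list).
sortedOnClient = g.V().values('grade').toSet().sort()

println sortedInTraversal
println sortedOnClient
```

Both lines should print [75, 85, 90] for this graph.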

   

Example 3: Using toBulkSet() (Weighted Set)

 

g.V().
  values('grade').
  toBulkSet().
  asBulk()

gremlin> g.V().
......1>   values('grade').
......2>   toBulkSet().
......3>   asBulk()
==>90=2
==>85=2
==>75=1
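
Besides asBulk(), a BulkSet can be queried directly. The following sketch assumes the same g as above and uses the get(), size(), and uniqueSize() methods of BulkSet. The grade 100 is a hypothetical value chosen only to show the result for a missing entry.

```groovy
// Assumes `g` is the traversal source for the student graph built above.
bulk = g.V().values('grade').toBulkSet()

count90  = bulk.get(90)       // occurrences of grade 90
count100 = bulk.get(100)      // hypothetical grade that never occurs, so 0
total    = bulk.size()        // all values, repeats included
distinct = bulk.uniqueSize()  // distinct values only

println "90 occurs ${count90} times, ${distinct} distinct grades out of ${total}"
```

For this graph the counts are 2 for grade 90, 0 for grade 100, 5 values in total, and 3 distinct values.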

Example 4: Using fill() to store the data in a list

myList = []

g.V().
  values('grade').
  fill(myList)

gremlin> g.V().
......1>   values('grade').
......2>   fill(myList)
==>90
==>85
==>90
==>75
==>85
gremlin> 
gremlin> println myList
[90, 85, 90, 75, 85]
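
The ==> lines in the transcript appear because fill() returns the collection it was given, and the console prints that result. It also means fill() appends to whatever the collection already holds. A small sketch, assuming the same g as above, with a hypothetical pre-existing entry of 100:

```groovy
// Assumes `g` is the traversal source for the student graph built above.
grades = [100]   // hypothetical entry that is already in the list

result = g.V().values('grade').fill(grades)

// fill() hands back the very same instance, now with five more elements.
assert result.is(grades)
assert grades.size() == 6
assert grades.first() == 100
```

If a fresh result is wanted on every run, pass a new empty collection each time instead of reusing one.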

Example 5: Using fill() to store the data in a set

mySet = [] as Set

g.V().
  values('grade').
  fill(mySet)

gremlin> mySet = [] as Set
gremlin> 
gremlin> g.V().
......1>   values('grade').
......2>   fill(mySet)
==>90
==>85
==>75
gremlin> 
gremlin> println mySet
[90, 85, 75]
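
Since fill() accepts any java.util.Collection, the collection you pass decides how duplicates and ordering are handled. As an illustration, and assuming the same g as above, a java.util.TreeSet yields unique grades in ascending order:

```groovy
// Assumes `g` is the traversal source for the student graph built above.
// TreeSet is a standard Java sorted set; Groovy imports java.util by default.
sortedGrades = g.V().values('grade').fill(new TreeSet())

println sortedGrades          // unique grades, ascending
println sortedGrades.first()  // lowest grade
println sortedGrades.last()   // highest grade
```

For this graph the set is [75, 85, 90], so the lowest grade is 75 and the highest is 90.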

 

 
