When working with graph databases, especially large or highly connected graphs, unrestricted traversals can quickly return enormous result sets or run longer than expected. Apache TinkerPop provides several traversal steps that allow you to control how much data is returned and how long a traversal is allowed to run.
In this section, we explore Gremlin’s mechanisms for limiting traversal output, including limit, tail, and range, which restrict the number of elements returned, as well as timeLimit, which bounds execution by elapsed time rather than result count. These steps are essential tools for writing safe, performant, and production-ready Gremlin queries.
We will see how the placement of these steps in a traversal can significantly affect performance, why limiting earlier in the traversal is usually more efficient, and how time-bounded traversals can be used to explore large graphs incrementally, particularly in path-finding scenarios.
1. Create a Sample Graph (Indian Literature Domain)
Let's use a graph of authors, books, and genres. Follow the step-by-step procedure below to build the graph.
Step 1: Get the Gremlin traversal instance.
graph = TinkerGraph.open()
g = graph.traversal()
Step 2: Create Author, Book and Genre vertices
// ---------- Authors ----------
g.addV('author').property('name', 'Rabindranath Tagore')
g.addV('author').property('name', 'R.K. Narayan')
g.addV('author').property('name', 'Arundhati Roy')
g.addV('author').property('name', 'Chetan Bhagat')
g.addV('author').property('name', 'Amartya Sen')

// ---------- Books ----------
g.addV('book').property('title', 'Gitanjali').property('year', 1910)
g.addV('book').property('title', 'The Guide').property('year', 1958)
g.addV('book').property('title', 'Malgudi Days').property('year', 1943)
g.addV('book').property('title', 'The God of Small Things').property('year', 1997)
g.addV('book').property('title', 'Five Point Someone').property('year', 2004)
g.addV('book').property('title', 'Development as Freedom').property('year', 1999)

// ---------- Genres ----------
g.addV('genre').property('name', 'Poetry')
g.addV('genre').property('name', 'Fiction')
g.addV('genre').property('name', 'Non-Fiction')
Step 3: Create Author to Book relationships.
g.V().
  has('author','name','Rabindranath Tagore').
  addE('wrote').
  to(__.V().has('book','title','Gitanjali'))

g.V().
  has('author','name','R.K. Narayan').
  addE('wrote').
  to(__.V().has('book','title','The Guide'))

g.V().
  has('author','name','R.K. Narayan').
  addE('wrote').
  to(__.V().has('book','title','Malgudi Days'))

g.V().
  has('author','name','Arundhati Roy').
  addE('wrote').
  to(__.V().has('book','title','The God of Small Things'))

g.V().
  has('author','name','Chetan Bhagat').
  addE('wrote').
  to(__.V().has('book','title','Five Point Someone'))

g.V().
  has('author','name','Amartya Sen').
  addE('wrote').
  to(__.V().has('book','title','Development as Freedom'))
Step 4: Create Book to Genre relationships.
// ---------- Book -> Genre ----------
g.V().
  has('book','title','Gitanjali').
  addE('belongsTo').
  to(__.V().has('genre','name','Poetry'))

g.V().
  has('book','title','The Guide').
  addE('belongsTo').
  to(__.V().has('genre','name','Fiction'))

g.V().
  has('book','title','Malgudi Days').
  addE('belongsTo').
  to(__.V().has('genre','name','Fiction'))

g.V().
  has('book','title','The God of Small Things').
  addE('belongsTo').
  to(__.V().has('genre','name','Fiction'))

g.V().
  has('book','title','Five Point Someone').
  addE('belongsTo').
  to(__.V().has('genre','name','Fiction'))

g.V().
  has('book','title','Development as Freedom').
  addE('belongsTo').
  to(__.V().has('genre','name','Non-Fiction'))
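Before running the examples, it can help to confirm that the graph was built as intended. The following is a minimal sanity check, assuming it is run in the same Gremlin Console session so that `g` is the traversal source from Step 1 and Steps 2 to 4 completed without errors.

```groovy
// Count vertices per label.
// With the data above this should report 5 authors, 6 books and 3 genres.
g.V().groupCount().by(label)

// Count edges per label.
// With the data above this should report 6 'wrote' and 6 'belongsTo' edges.
g.E().groupCount().by(label)
```

If the counts are higher than expected, a step was probably run twice; reopening the graph with TinkerGraph.open() and repeating the steps gives a clean start.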
2. Examples
Example 1: Return only the first 3 books
g.V().
  hasLabel('book').
  limit(3).
  values('title')

gremlin> g.V().
......1>   hasLabel('book').
......2>   limit(3).
......3>   values('title')
==>Gitanjali
==>The Guide
==>Malgudi Days
Example 2: Return only the last book
Using the tail() step with no argument, we can get the last book.
g.V().
  hasLabel('book').
  tail().
  values('title')

gremlin> g.V().
......1>   hasLabel('book').
......2>   tail().
......3>   values('title')
==>Development as Freedom
Example 3: Return only the last 2 books
Using the tail(2) step, we can get the last 2 books.
g.V().
  hasLabel('book').
  tail(2).
  values('title')

gremlin> g.V().
......1>   hasLabel('book').
......2>   tail(2).
......3>   values('title')
==>Five Point Someone
==>Development as Freedom
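In the examples so far, "last" simply means last in the order the graph happens to return vertices, which Gremlin does not guarantee in general. The sketch below adds an explicit sort so that "last" means "most recent by year", and then shows the same two books obtained with a descending sort plus limit(). It assumes the sample graph from section 1 and a TinkerPop version in which the `desc` order token is available in the console.

```groovy
// Sort by year ascending, then keep the last 2 elements of the sorted stream.
g.V().hasLabel('book').
  order().by('year').
  tail(2).
  values('title')
// ==>Development as Freedom
// ==>Five Point Someone

// Same two books, newest first: sort descending and cap with limit() instead of tail().
g.V().hasLabel('book').
  order().by('year', desc).
  limit(2).
  values('title')
// ==>Five Point Someone
// ==>Development as Freedom
```

Which form is preferable depends on whether the results are wanted oldest-first or newest-first.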
Example 4: Using range Instead of limit
range(0,3) is equivalent to limit(3): the lower bound is inclusive and the upper bound is exclusive.
g.V().
  hasLabel('book').
  range(0,3).
  values('title')

gremlin> g.V().
......1>   hasLabel('book').
......2>   range(0,3).
......3>   values('title')
==>Gitanjali
==>The Guide
==>Malgudi Days
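Because range() takes a start and an end index, it lends itself to paging through results. The following is a small sketch of that idea, assuming the sample graph from section 1; the variables `pageSize` and `page` are hypothetical names introduced only for this illustration, and the explicit order() is there so that each page is stable between calls.

```groovy
pageSize = 2   // hypothetical page size
page = 1       // zero-based page index, so this requests the second page

// Sort by year, then keep indices [page * pageSize, (page + 1) * pageSize), i.e. range(2, 4).
g.V().hasLabel('book').
  order().by('year').
  range(page * pageSize, (page + 1) * pageSize).
  values('title')
// ==>The Guide
// ==>The God of Small Things
```

Note that the sort still has to see every book before range() can pick a page, so paging deep into a very large result set is not free.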
Example 5: Find up to 2 books written by 'R.K. Narayan'
g.V().has('author','name','R.K. Narayan').
  out('wrote').
  limit(2).
  values('title')

gremlin> g.V().has('author','name','R.K. Narayan').
......1>   out('wrote').
......2>   limit(2).
......3>   values('title')
==>The Guide
==>Malgudi Days
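Example 6: Search for books by traversing relationships, but stop after 10 ms
The timeLimit() step takes its argument in milliseconds and stops passing traversers once that much time has elapsed. On this small graph the traversal finishes within the limit, so all results are returned.
g.V().
  has('author','name','R.K. Narayan').
  repeat(timeLimit(10).out('wrote')).
  emit().
  values('title')

gremlin> g.V().
......1>   has('author','name','R.K. Narayan').
......2>   repeat(timeLimit(10).out('wrote')).
......3>   emit().
......4>   values('title')
==>The Guide
==>Malgudi Days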
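The same idea can be applied to open-ended path exploration, where the number of hops is not known in advance. The following is a rough sketch, assuming the sample graph from section 1; the 50 ms budget is an arbitrary value chosen for illustration, and coalesce() is used only because authors and genres carry a 'name' property while books carry a 'title'.

```groovy
// Walk outgoing edges from the author for as many hops as exist,
// emitting every vertex reached along the way.
// simplePath() drops any path that revisits a vertex, which guards against cycles.
// timeLimit(50) stops the traversal from yielding results after 50 ms.
g.V().has('author','name','R.K. Narayan').
  repeat(out().simplePath()).
  emit().
  timeLimit(50).
  path().by(coalesce(values('name'), values('title')))
```

On a large graph, a query like this returns whatever paths were found within the time budget, so the result may be partial rather than complete.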
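In summary:
· limit(n): the simplest way to cap results; the traversal stops pulling elements once n have passed through
· range(low, high): pagination-friendly; low is inclusive and high is exclusive
· tail(n): must consume the whole incoming stream before it can return the last n elements; use carefully on large result sets
· timeLimit(ms): guards against runaway traversals by cutting off results after the given number of milliseconds
Limit as early in the traversal as the meaning of the query allows, so that later steps process fewer traversers.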
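Where the limit sits changes both the amount of work and the meaning of the query. The sketch below, assuming the sample graph from section 1, contrasts two placements and then uses the profile() step to inspect how many traversers each step handled.

```groovy
// Limit first: pick at most 2 authors, then return every book they wrote.
g.V().hasLabel('author').
  limit(2).
  out('wrote').
  values('title')

// Limit last: follow 'wrote' edges from authors, then keep at most 2 titles overall.
g.V().hasLabel('author').
  out('wrote').
  limit(2).
  values('title')

// profile() reports per-step traverser counts and timings for a traversal.
g.V().hasLabel('author').limit(2).out('wrote').values('title').profile()
```

The two queries answer different questions, so moving a limit earlier is only an optimization when it does not change what is being asked.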