
TinkerPop gremlin count vertices only in a path()

When I query for a path, e.g.:

g.V(1).inE().outV().inE().outV().inE().outV().path()

The path() contains both vertices and edges. Is there any way to count only the vertices in the path and ignore the edges?

Henry Bai asked Sep 05 '25 03:09


2 Answers

Gremlin is missing something important that would make this really easy to do: it doesn't discern types very well for purposes of filtering, hence TINKERPOP-2234. I've altered your example a bit so that we have something a little trickier to work with:

gremlin> g.V(1).repeat(outE().inV()).emit().path()
==>[v[1],e[9][1-created->3],v[3]]
==>[v[1],e[7][1-knows->2],v[2]]
==>[v[1],e[8][1-knows->4],v[4]]
==>[v[1],e[8][1-knows->4],v[4],e[10][4-created->5],v[5]]
==>[v[1],e[8][1-knows->4],v[4],e[11][4-created->3],v[3]]

With repeat() we get variable-length Path instances, so counting the vertices dynamically is a bit trickier than in the fixed example in your question, where the pattern of the path is known and the count is easy to discern from the Gremlin itself. So, with a dynamic number of vertices and without TINKERPOP-2234, you have to get creative. A typical strategy is to filter away the edges by way of some label or property value that is unique to vertices:

gremlin> g.V(1).repeat(outE().inV()).emit().path().map(unfold().hasLabel('person','software').fold())
==>[v[1],v[3]]
==>[v[1],v[2]]
==>[v[1],v[4]]
==>[v[1],v[4],v[5]]
==>[v[1],v[4],v[3]]
gremlin> g.V(1).repeat(outE().inV()).emit().path().map(unfold().hasLabel('person','software').fold()).count(local)
==>2
==>2
==>2
==>3
==>3

Or perhaps use a property that is unique to edges (one that every edge has and no vertex has) and exclude anything that carries it:

gremlin> g.V(1).repeat(outE().inV()).emit().path().map(unfold().not(has('weight')).fold())
==>[v[1],v[3]]
==>[v[1],v[2]]
==>[v[1],v[4]]
==>[v[1],v[4],v[5]]
==>[v[1],v[4],v[3]]
gremlin> g.V(1).repeat(outE().inV()).emit().path().map(unfold().not(has('weight')).fold()).count(local)
==>2
==>2
==>2
==>3
==>3
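
As a rough illustration of what unfold(), the filter, fold() and count(local) do to each path, here is a plain-Python model. It does not use any Gremlin API. The dictionaries are hypothetical stand-ins for the path elements, and the vertex labels and weight values are assumed to match TinkerPop's "modern" toy graph, which the output above appears to come from.

```python
# Plain-Python model of path().map(unfold().<filter>.fold()).count(local).
# The dicts are hypothetical stand-ins for graph elements, not a Gremlin API.

def v(vid, label):
    """Model a vertex: it has a label but no 'weight' property."""
    return {"id": vid, "label": label}

def e(eid, label, weight):
    """Model an edge: it has a label and a 'weight' property."""
    return {"id": eid, "label": label, "weight": weight}

# Assumed labels and weights (TinkerPop "modern" toy graph).
v1, v2, v4 = v(1, "person"), v(2, "person"), v(4, "person")
v3, v5 = v(3, "software"), v(5, "software")

# The five paths from the console output, as lists of elements.
paths = [
    [v1, e(9, "created", 0.4), v3],
    [v1, e(7, "knows", 0.5), v2],
    [v1, e(8, "knows", 1.0), v4],
    [v1, e(8, "knows", 1.0), v4, e(10, "created", 1.0), v5],
    [v1, e(8, "knows", 1.0), v4, e(11, "created", 0.4), v3],
]

VERTEX_LABELS = {"person", "software"}

def count_by_label(path):
    # Models unfold().hasLabel('person','software').fold().count(local)
    return len([x for x in path if x["label"] in VERTEX_LABELS])

def count_by_missing_weight(path):
    # Models unfold().not(has('weight')).fold().count(local)
    return len([x for x in path if "weight" not in x])

by_label = [count_by_label(p) for p in paths]
by_weight = [count_by_missing_weight(p) for p in paths]
print(by_label)   # [2, 2, 2, 3, 3]
print(by_weight)  # [2, 2, 2, 3, 3]
```

Both filters only work if the chosen label set or property really separates vertices from edges in your schema. A vertex that happens to have a weight property, or an edge labeled person, would be miscounted.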

If your schema doesn't have properties or labels that allow for this, you could probably use your traversal pattern to come up with some math to figure it out. In my case, I know that the number of vertices in my Path will always be (pathLength + 1) / 2, so:

gremlin> g.V(1).repeat(outE().inV()).emit().path().as('p').math('(p + 1) / 2').by(count(local))
==>2.0
==>2.0
==>2.0
==>3.0
==>3.0
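
The formula follows from the shape of the paths: repeat(outE().inV()) starts at one vertex and each iteration appends one edge and one vertex, so a path with k hops has 2k + 1 elements, of which k + 1 are vertices. Below is a small plain-Python check of that arithmetic against the five paths printed earlier. The strings are just the console output copied as text, and the helper name is made up for this sketch.

```python
def vertices_in_alternating_path(path_length):
    """Vertex count for a path shaped vertex, edge, vertex, ..., vertex."""
    if path_length % 2 == 0:
        # An alternating path that starts and ends on a vertex has odd length.
        raise ValueError("expected an odd path length")
    return (path_length + 1) // 2

# Console output of the five paths, copied as plain strings.
paths = [
    ["v[1]", "e[9][1-created->3]", "v[3]"],
    ["v[1]", "e[7][1-knows->2]", "v[2]"],
    ["v[1]", "e[8][1-knows->4]", "v[4]"],
    ["v[1]", "e[8][1-knows->4]", "v[4]", "e[10][4-created->5]", "v[5]"],
    ["v[1]", "e[8][1-knows->4]", "v[4]", "e[11][4-created->3]", "v[3]"],
]

# Count via the formula, and directly by looking at the "v[" prefix.
formula = [vertices_in_alternating_path(len(p)) for p in paths]
direct = [sum(1 for x in p if x.startswith("v[")) for p in paths]
print(formula)  # [2, 2, 2, 3, 3]
print(direct)   # [2, 2, 2, 3, 3]
```

This sketch uses integer division, whereas math() in Gremlin returns a double, which is why the console shows 2.0 and 3.0. The same arithmetic applies to the fixed traversal in the question: its path has 7 elements, so it contains (7 + 1) / 2 = 4 vertices.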

Hopefully, one of those ways will inspire you to a solution.

stephen mallette answered Sep 08 '25 00:09


+1 for typeOf predicate support in Gremlin (TINKERPOP-2234).

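In addition to @stephen's answer, you can also label the vertices as you traverse and select only those (note that the start vertex is not labeled 'v' in this example, so it is not part of the result):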

g.V().repeat(outE().inV().as('v')).times(3).select(all,'v')

Also, if the graph provider supports lambdas, you can group the path elements by their class with {it.class}:

g.V().repeat(outE().inV().as('v')).times(3).path()
    .map(unfold().groupCount().by({it.class}))

Kfir Dadosh answered Sep 07 '25 23:09