Is it possible to bound the length of a property path? For example, getting all the paths whose length is between m and n, or all those whose length is outside this range? For instance, how could this be done with the following query?
select ?x ?y
where {?x p* ?y}
A property path is a possible route through a graph between two graph nodes. A trivial case is a property path of length exactly 1, which is a triple pattern. Property paths allow for more concise expression of some SPARQL basic graph patterns and also add the ability to match arbitrary length paths.
RDF is a directed, labeled graph data format for representing information in the Web. This specification defines the syntax and semantics of the SPARQL query language for RDF. SPARQL can be used to express queries across diverse data sources, whether the data is stored natively as RDF or viewed as RDF via middleware.
The query consists of two parts: the SELECT clause identifies the variables to appear in the query results, and the WHERE clause provides the basic graph pattern to match against the data graph.
BIND. SPARQL's BIND clause assigns the value of an expression to a variable.
Some SPARQL engines support a method for doing this directly, with a regular-expression-like syntax. E.g.,
?s :p{n,m} ?o
would be a path with a length between n and m. That syntax is described in SPARQL 1.1 Property Paths: W3C Working Draft 26 January 2010. There is also support for exact lengths, minimum lengths, and maximum lengths. For better or for worse, that syntax didn't make it into the final SPARQL 1.1 standard. Some SPARQL endpoints will still accept it, though, so it's worth trying.
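If the endpoint rejects the `{n,m}` form, small fixed bounds can also be spelled out with the sequence (`/`) and zero-or-one (`?`) path operators, which are part of standard SPARQL 1.1. A minimal sketch, assuming a hypothetical `:` prefix bound to `http://example.org/` and bounds of 2 to 4 steps:

```sparql
prefix : <http://example.org/>   # hypothetical namespace, for illustration only

# Two mandatory :p steps followed by up to two optional ones,
# i.e. a path of 2, 3, or 4 :p edges from ?s to ?o.
select distinct ?s ?o
where {
  ?s :p/:p/:p?/:p? ?o
}
```

This only suits small, fixed bounds, since the pattern grows with the upper limit.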
There is also a workaround that works in standard SPARQL 1.1. The idea is to split the candidate path into two parts at an intermediate node. By counting how many ways the path can be split into two parts, you can find its length. That is, you do something like this to, for instance, find ?s and ?o where they are joined by a path of length ten:
select ?s ?o {
  ?s :p* ?mid .
  ?mid :p* ?o .
}
group by ?s ?o
having (count(?mid) = 10)
Be sure to check the actual counts if you use this approach. It's easy to get an off-by-one (or -two) error depending on how you want to calculate length. There are a few options (whether to count the properties or the nodes, whether to count the endpoints or not, etc.), so a little bit of experimentation is worthwhile.
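The same counting trick extends to a range by changing the `having` condition. The following is a sketch under stated assumptions rather than a definitive query: it assumes a hypothetical `:` prefix, that ?s and ?o are connected by a single chain of `:p` edges, and that length means the number of edges. Because `*` also matches zero-length paths, both endpoints count as midpoints, so a chain of k edges should yield k + 1 values of ?mid. Under those assumptions, paths of 2 to 4 edges correspond to counts of 3 to 5:

```sparql
prefix : <http://example.org/>   # hypothetical namespace, for illustration only

select ?s ?o {
  ?s :p* ?mid .
  ?mid :p* ?o .
}
group by ?s ?o
# 2 to 4 edges, assuming each chain of k edges gives k + 1 midpoints
having (count(?mid) >= 3 && count(?mid) <= 5)
```

For the complementary question (lengths outside the range), the condition would become `having (count(?mid) < 3 || count(?mid) > 5)`. If several distinct paths connect the same pair of nodes, the count covers the nodes on all of them, so it no longer measures the length of any single path.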
For some more examples of how you can use this pattern, have a look at: