I am trying to compare users according to their common interests in this graph.
I know why the following query produces duplicate pairs, but I can't think of a good way to avoid it in Cypher. Is there a way to do it without looping?
neo4j-sh (?)$ start n=node(*) match p=n-[:LIKES]->item<-[:LIKES]-other where n <> other return n.name,other.name,collect(item.name) as common, count(*) as freq order by freq desc;
==> +-----------------------------------------------+
==> | n.name | other.name | common | freq |
==> +-----------------------------------------------+
==> | "u1" | "u2" | ["f1","f2","f3"] | 3 |
==> | "u2" | "u1" | ["f1","f2","f3"] | 3 |
==> | "u1" | "u3" | ["f1","f2"] | 2 |
==> | "u3" | "u2" | ["f1","f2"] | 2 |
==> | "u2" | "u3" | ["f1","f2"] | 2 |
==> | "u3" | "u1" | ["f1","f2"] | 2 |
==> | "u4" | "u3" | ["f1"] | 1 |
==> | "u4" | "u2" | ["f1"] | 1 |
==> | "u4" | "u1" | ["f1"] | 1 |
==> | "u2" | "u4" | ["f1"] | 1 |
==> | "u1" | "u4" | ["f1"] | 1 |
==> | "u3" | "u4" | ["f1"] | 1 |
==> +-----------------------------------------------+
To avoid getting each pair twice, as a--b and b--a, keep only one ordering of each pair by comparing node IDs in your WHERE clause:
WHERE ID(n) < ID(other)
This also replaces the n <> other check, since a node's ID is never less than itself. Your query then becomes
start n=node(*) match p=n-[:LIKES]->item<-[:LIKES]-other where ID(n) < ID(other) return n.name,other.name,collect(item.name) as common, count(*) as freq order by freq desc;
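To see why the ID comparison leaves exactly one row per pair, here is a small Python sketch of the same filter outside Neo4j. The `likes` data is reconstructed from the output in the question, and the numeric values in `node_ids` are made up for illustration (real Neo4j IDs will differ); `common_interests` is a hypothetical helper, not part of any Neo4j API.

```python
# Likes per user, reconstructed from the query output in the question.
likes = {
    "u1": ["f1", "f2", "f3"],
    "u2": ["f1", "f2", "f3"],
    "u3": ["f1", "f2"],
    "u4": ["f1"],
}

# Hypothetical internal node IDs; only their relative order matters.
node_ids = {"u1": 1, "u2": 2, "u3": 3, "u4": 4}


def common_interests(likes, node_ids):
    """Return (n, other, common, freq) rows, one per unordered pair."""
    rows = []
    for n in likes:
        for other in likes:
            # Same idea as WHERE ID(n) < ID(other): of the two orderings
            # (n, other) and (other, n), only one passes, and n == other
            # never passes.
            if node_ids[n] < node_ids[other]:
                common = sorted(set(likes[n]) & set(likes[other]))
                if common:
                    rows.append((n, other, common, len(common)))
    # Equivalent of ORDER BY freq DESC.
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows


if __name__ == "__main__":
    for row in common_interests(likes, node_ids):
        print(row)
```

With four users this yields six rows instead of the twelve in the original output. Which of the two orderings survives depends on the IDs Neo4j happened to assign, so do not rely on a particular user appearing in the first column.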