
How to count tag-to-tag relationships without having it explode?

Tags:

neo4j

I'm using neo4j, storing a simple "content has-many tags" data structure. I'd like to find out "what tags co-exist with what other tags the most?"

I've got around 500K content-to-tag relationships, so unfortunately that works out to (0.5M)^2 possible co-occurrence relationships, and then you need to count how many times each pair occurs! Or do you? Am I doing this the long way?

The query below never seems to return, and my CPU has been pegged for quite some time now.

final ExecutionResult result = engine.execute(
 "START metag=node(*)\n"
 + "MATCH metag<-[:HAS_TAG]-content-[:HAS_TAG]->othertag\n"
 + "WHERE metag.name>othertag.name\n"
 + "RETURN metag.name, othertag.name, count(content)\n"
 + "ORDER BY count(content) DESC");
for (Map<String, Object> row : result) {
 System.out.println(row.get("metag.name") + "\t" + row.get("othertag.name") + "\t" + row.get("count(content)"));
}
Benjamin H asked Jan 29 '26 21:01



1 Answer

You should try to reduce the number of bound points to make the traversal faster; START metag=node(*) binds every node in the graph. I assume your graph will always have more tags than content, so you should make the content nodes your bound points. Something like:

start
     content = node:node_auto_index(' type:"CONTENT" ')
match
     metatag<-[:HAS_TAG]-content-[:HAS_TAG]->othertag
where
     metatag<>othertag
return
     metatag.name, othertag.name, count(content)
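
To illustrate why binding on content keeps the work manageable, here is a minimal plain-Java sketch of the same idea outside Neo4j. It assumes you already have each content item's tag set in memory; the class name `TagCooccurrence`, the method `countPairs`, and the sample data are hypothetical and only for illustration. For each content item it enumerates the pairs among that item's own tags, so the work grows with the sum of k*(k-1)/2 over content items (k being the number of tags on an item), not with the square of the total number of relationships.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class TagCooccurrence {

    // Hypothetical helper: for every unordered pair of tags, count how many
    // content items carry both. Keys have the form "tagA<TAB>tagB" with
    // tagA sorted before tagB, so each pair is counted under one key only.
    public static Map<String, Integer> countPairs(Map<String, Set<String>> tagsByContent) {
        Map<String, Integer> counts = new HashMap<>();
        for (Set<String> tags : tagsByContent.values()) {
            // Sort the tags so the pair key is independent of iteration order.
            List<String> sorted = new ArrayList<>(new TreeSet<>(tags));
            for (int i = 0; i < sorted.size(); i++) {
                for (int j = i + 1; j < sorted.size(); j++) {
                    String key = sorted.get(i) + "\t" + sorted.get(j);
                    counts.merge(key, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample data: content id -> tags on that content.
        Map<String, Set<String>> tagsByContent = new HashMap<>();
        tagsByContent.put("post-1", Set.of("neo4j", "java", "cypher"));
        tagsByContent.put("post-2", Set.of("neo4j", "java"));
        tagsByContent.put("post-3", Set.of("neo4j"));

        // Print each pair with the number of content items sharing it.
        countPairs(tagsByContent)
                .forEach((pair, n) -> System.out.println(pair + "\t" + n));
    }
}
```

Because the tags are sorted before pairing, each pair appears once, which is the same effect as a name comparison such as metag.name > othertag.name in the question; a plain inequity check like metatag<>othertag would report every pair in both orders.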
Amit answered Feb 02 '26 01:02



