2016-03-15 31 views

This Cypher query is very slow with Neo4J 2.1.5; can it be optimized?

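Data:

2000 persons
Goal: for each person, compute the total number of friends, friends of friends, and friends of friends of friends.
The result should look like this:
Person FullName | Friends total | Friends-2 total | Friends-3 total | Global total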

MATCH (person:Person) 
WITH person 
OPTIONAL MATCH person-[:KNOWS]-(p2:Person) 
WITH person, count(p2) as f1 
OPTIONAL MATCH path = shortestPath(person-[:KNOWS*..2]-(f2:Person)) 
WHERE length(path) = 2 
WITH count(nodes(path)[-1]) AS f2, person, f1 
OPTIONAL MATCH path = shortestPath(person-[:KNOWS*..3]-(f3:Person)) 
WHERE length(path) = 3 
WITH count(nodes(path)[-1]) AS f3, person, f2, f1 
RETURN person._firstName + " " + person._lastName, f1, f2, f3, f1+f2+f3 AS total 

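The tricks above are there to avoid wrong counts in a cyclic graph; that is why I use shortestPath, so that each person is only counted at their shortest distance.

However, this query takes a long time: 60 seconds! Is there any possible optimization?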
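
To make the intended counting concrete, here is a minimal sketch in Python (not Cypher) of the same idea: a breadth-first search that assigns every person to the level of their shortest distance, so cycles cannot cause double counting. The function name `friends_by_distance` and the toy graph are hypothetical and only serve as an illustration; they are not part of the actual data set.

```python
from collections import deque


def friends_by_distance(graph, start, max_depth=3):
    """Return [f1, f2, ..., f_max_depth]: how many people have a shortest
    undirected KNOWS distance of exactly 1, 2, ... from `start`."""
    distance = {start: 0}          # shortest distance found for each person
    counts = [0] * max_depth       # counts[i] holds the total for distance i + 1
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if distance[current] == max_depth:
            continue               # do not expand beyond the last level
        for neighbour in graph.get(current, ()):
            if neighbour not in distance:   # first visit is the shortest path
                distance[neighbour] = distance[current] + 1
                counts[distance[neighbour] - 1] += 1
                queue.append(neighbour)
    return counts


# Hypothetical toy graph containing a cycle (a-b-c-a), plus a chain c-d-e.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}

f1, f2, f3 = friends_by_distance(graph, "a")
print(f1, f2, f3, f1 + f2 + f3)
```

In this toy graph, `c` is reachable from `a` both directly and via `b`, but it is only counted once, at distance 1. That is the behaviour the `length(path) = 2` and `length(path) = 3` filters on shortestPath are meant to achieve.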

Answers

[EDITED]

Does this work for you?

MATCH (person:Person) 
OPTIONAL MATCH (person)-[:KNOWS]-(p1:Person) 
WITH person, COALESCE(COLLECT(p1),[]) AS p1s 
WITH person, CASE p1s WHEN [] THEN [NULL] ELSE p1s END AS p1s 
UNWIND p1s AS p1 
OPTIONAL MATCH (p1)-[:KNOWS]-(p2:Person) 
WHERE NOT ((p2 = person) OR (p2 IN p1s)) 
WITH person, p1s, COALESCE(COLLECT(DISTINCT p2),[]) AS p2s 
WITH person, p1s, CASE p2s WHEN [] THEN [NULL] ELSE p2s END AS p2s 
UNWIND p2s AS p2 
OPTIONAL MATCH (p2)-[:KNOWS]-(p3:Person) 
WHERE NOT ((p3 = person) OR (p3 IN p1s) OR (p3 IN p2s)) 
WITH person, 
    CASE p1s WHEN [NULL] THEN 0 ELSE SIZE(p1s) END AS f1, 
    CASE p2s WHEN [NULL] THEN 0 ELSE SIZE(p2s) END AS f2, 
    COUNT(DISTINCT p3) AS f3 
RETURN person.firstName + " " + person.lastName, f1, f2, f3, f1+f2+f3 AS total; 

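Each friend is counted only once.

Here is an explanation of some of the more obscure tactics. The query must replace empty p1s and p2s collections with [NULL], so that UNWIND does not abort the rest of the query for that person: unwinding an empty collection produces no rows at all. Then, when computing the sizes of the collections, a [NULL] collection has to be counted as 0.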
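
The effect of the [NULL] replacement can be sketched outside Cypher. The following is a minimal Python illustration, assuming a hypothetical `unwind` helper that mimics UNWIND by emitting one row per collection element; the names `unwind`, `pad_empty`, `count_friends` and the sample rows are made up for this example.

```python
def unwind(rows, key):
    """Mimic UNWIND: emit one output row per element of row[key].
    A row whose collection is empty produces no output rows at all."""
    return [{**row, key: item} for row in rows for item in row[key]]


def pad_empty(rows, key):
    """Replace an empty collection with [None] so the row survives unwind()."""
    return [{**row, key: row[key] or [None]} for row in rows]


def count_friends(rows, key):
    """Count non-None entries per person, so a padded [None] counts as 0."""
    totals = {}
    for row in rows:
        totals.setdefault(row["person"], 0)
        if row[key] is not None:
            totals[row["person"]] += 1
    return totals


# Hypothetical sample: dave has no friends at all.
rows = [
    {"person": "alice", "p1": ["bob", "carol"]},
    {"person": "dave", "p1": []},
]

without_padding = count_friends(unwind(rows, "p1"), "p1")
with_padding = count_friends(unwind(pad_empty(rows, "p1"), "p1"), "p1")
print(without_padding)  # dave is missing entirely
print(with_padding)     # dave is kept, with a count of 0
```

Without the padding step, a person with no friends disappears from the result instead of being reported with zero friends, which is the situation the CASE expressions in the query guard against.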

f1 is correct, but the query does not return good results for f2 and f3. I am expecting 51 for f2, but it returns 73. – Mik378

I don't know what could be wrong in your query... – Mik378

Adding a 'distinct', as in 'COLLECT(distinct p2)', helps, but it still returns 53 instead of 51. – Mik378