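Speeding up py2neo with Cypher

I am using py2neo with Python 3.2 on Ubuntu Linux to fill a graph in neo4j from a SQLite3 database. Speed is not my main concern, but after about 3 hours the graph has received only about 40K rows (one relationship per SQL row) out of a total of 5 million.
Here is the main loop: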
from py2neo import neo4j as neo
import sqlite3 as sql
#select all 5M rows from sql-database
sql_str = """select * from bigram_with_number"""
#loop through each row
for (freq, first, firstfreq, second, secondfreq) in sql_cursor.execute(sql_str):
    # create the Cypher query string using cypher 2.0 with merge
    # so that nodes are created only if needed
    query = neo.CypherQuery(neo4j_db, """
        CYPHER 2.0
        merge (n:word {form: {firstvar}, freq: {freqfirst}})
        merge (m:word {form: {secondvar}, freq: {freqsecond}})
        create unique (n)-[:bigram {freq: {freqbigram}}]->(m) return n, m""")
    # execute the string with parameters from sql-query
    result = query.execute(freqbigram=freq, firstvar=first, freqfirst=firstfreq,
                           secondvar=second, freqsecond=secondfreq)
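The database fills up correctly, but at this rate it will take weeks to finish. I suspect this can be done faster.
I moved the creation of the Cypher query out of the loop and removed the `return n, m` clause from it, which made it about 5 times faster. But it is still too slow.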
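For reference, one direction I have been considering is to stop sending one request per SQL row and hand the rows over in chunks instead. The following is only a minimal sketch under assumptions, not a measured fix: `iter_chunks`, `load_in_chunks` and the `send_chunk` callback are hypothetical names of my own, and `send_chunk` stands in for whatever would submit a whole chunk to neo4j in a single request. The column order is assumed to match the tuple unpacking in the loop above.

```python
import sqlite3


def iter_chunks(cursor, size):
    """Yield lists of at most `size` rows from an already executed cursor."""
    while True:
        rows = cursor.fetchmany(size)
        if not rows:
            break
        yield rows


def load_in_chunks(conn, send_chunk, size=1000):
    """Read every bigram row and pass the rows on one chunk at a time.

    `send_chunk` is a hypothetical callback (an assumption, not a py2neo
    function): it receives a list of parameter dicts, one per SQL row,
    using the same parameter names as the Cypher query above.
    Returns the number of rows handed over.
    """
    cursor = conn.execute("select * from bigram_with_number")
    total = 0
    for rows in iter_chunks(cursor, size):
        params = [
            {
                "freqbigram": freq,
                "firstvar": first,
                "freqfirst": firstfreq,
                "secondvar": second,
                "freqsecond": secondfreq,
            }
            for (freq, first, firstfreq, second, secondfreq) in rows
        ]
        send_chunk(params)
        total += len(params)
    return total
```

With a chunk size of 1000, the 5M rows would turn into roughly 5000 calls to `send_chunk` instead of 5M separate query executions. Whether that actually helps depends on how each chunk is sent to the server, which this sketch deliberately leaves open.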