2014-02-21 26 views

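Which approach is most efficient in terms of memory management and computation speed? Networkx: nodes as objects, or nodes as identifiers with a dictionary table of attributes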

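The simple test below suggests that storing attributes on nodes as Python objects is slightly faster than looking them up in a dictionary through an attribute table. Is this always the case, given how memory is allocated?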

As a test, I constructed a simple example:

class country():
    def __init__(self, name, gdp):
        self.name = name
        self.gdp = gdp
    def __repr__(self):
        return str(self.name)

#Country Objects 
countries = dict() 
countries['AUS'] = country('AUS', 2000) 
countries['USA'] = country('USA', 10000) 
countries['ZWE'] = country('ZWE', 13) 

#Attribute Dictionary 
gdp = dict() 
gdp['AUS'] = 2000 
gdp['USA'] = 10000 
gdp['ZWE'] = 13 

Building the networks:

import networkx as nx

#Nodes as ID's
G1 = nx.Graph() 
G1.add_nodes_from(countries.keys()) 
G1.nodes() 

#Nodes as Objects 
G2 = nx.Graph() 
for c in countries.keys(): 
    G2.add_node(countries[c]) 
G2.nodes() 

Running %timeit in IPython:

G1f()

#Lookup Data from Strings Network
def G1f():
    for n in G1.nodes():
        print("Node: %s" % n)
        print("\tGDP: %s" % gdp[n])
# Note: without parentheses, %timeit measures evaluating the name G1f, not calling it.
%timeit G1f

Output for G1f():

10000000 loops, best of 3: 26.4 ns per loop 

G2f()

#Lookup Data from Objects
def G2f():
    for n in G2.nodes():
        print("Node: %s" % n.name)
        print("\tGDP: %s" % n.gdp)
# Note: without parentheses, %timeit measures evaluating the name G2f, not calling it.
%timeit G2f

Output for G2f():

10000000 loops, best of 3: 21.8 ns per loop 
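
To compare the two lookup styles directly, it may help to time the function calls themselves and to leave out `print`, whose I/O cost would otherwise dominate the measurement. The following is a minimal sketch using only the standard `timeit` module, with no NetworkX graph involved. The `Country` class, the helper names `lookup_by_dict`, `lookup_by_attribute`, and `best_time`, and the repeat counts are illustrative assumptions, and the numbers it prints will vary by machine and Python version.

```python
import timeit


class Country:
    """Small record type, mirroring the country class in the question."""

    def __init__(self, name, gdp):
        self.name = name
        self.gdp = gdp


# Same sample data as in the question.
gdp = {"AUS": 2000, "USA": 10000, "ZWE": 13}
countries = {name: Country(name, value) for name, value in gdp.items()}

node_ids = list(gdp)                      # nodes stored as string identifiers
node_objects = list(countries.values())   # nodes stored as objects


def lookup_by_dict():
    """Sum GDP by looking each node ID up in the attribute dictionary."""
    return sum(gdp[n] for n in node_ids)


def lookup_by_attribute():
    """Sum GDP by reading the attribute off each node object."""
    return sum(n.gdp for n in node_objects)


def best_time(func, number=10000, repeat=5):
    """Return the best per-call time in seconds over several repeats."""
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number


if __name__ == "__main__":
    print("dict lookup:      %.3e s per call" % best_time(lookup_by_dict))
    print("attribute lookup: %.3e s per call" % best_time(lookup_by_attribute))
```

In IPython, the equivalent would be `%timeit lookup_by_dict()` and `%timeit lookup_by_attribute()`, with the parentheses included so that the call is what gets timed.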

Update

G3f() (from the answer):

G3 = nx.Graph() 
for c,v in gdp.items(): 
    G3.add_node(c, gdp=v) 
def G3f():
    for n,d in G3.nodes(data=True):
        print("Node: %s" % n)
        print("\tGDP: %s" % d['gdp'])

Output for G3f():

10000 loops, best of 3: 63 µs per loop 

Answer


You can also use node attributes, like this:

import networkx as nx 
#Attribute Dictionary 
gdp = dict() 
gdp['AUS'] = 2000 
gdp['USA'] = 10000 
gdp['ZWE'] = 13 

G3 = nx.Graph() 
for c,v in gdp.items(): 
    G3.add_node(c, gdp=v) 

print(G3.nodes(data=True))

def G3f():
    for n,d in G3.nodes(data=True):
        print("Node: %s" % n)
        print("\tGDP: %s" % d['gdp'])

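It's not clear to me that the timing in your test matters much. Unless this is a very large problem (maybe one day everyone will have their own country!), there probably won't be much difference in speed or memory. I suspect the overhead of creating many small custom objects (country()) would end up using more memory and time.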
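
The suspicion about memory can be checked empirically on a given machine. The sketch below uses the standard `tracemalloc` module (Python 3.4 and later) to measure how much memory stays allocated after building each structure. The helper `measure`, the two `build_*` functions, the node count `N`, and the use of integer IDs in place of country codes are all hypothetical choices made for illustration; the results depend on the interpreter version, so this is a way to measure rather than a statement of what the numbers will be.

```python
import tracemalloc


class Country:
    """Small record type, mirroring the country class in the question."""

    def __init__(self, name, gdp):
        self.name = name
        self.gdp = gdp


N = 10000  # hypothetical number of nodes


def build_attribute_table():
    """One dictionary mapping node ID -> GDP value."""
    return {i: float(i) for i in range(N)}


def build_objects():
    """One dictionary mapping node ID -> Country object."""
    return {i: Country(i, float(i)) for i in range(N)}


def measure(build):
    """Return the bytes still allocated after calling build()."""
    tracemalloc.start()
    result = build()  # hold a reference so nothing is freed before measuring
    current, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del result
    return current


if __name__ == "__main__":
    print("attribute table: %d bytes" % measure(build_attribute_table))
    print("custom objects:  %d bytes" % measure(build_objects))
```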


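Thanks. I'm interested in the timing because I also want to add attributes to edges, and the number of edges could become very large. I'll add %timeit for G3f() above as another option. – sanguineturtle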
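
For the edge case mentioned in the comment, attributes can be attached to edges in the same way as to nodes. The following is a minimal sketch, assuming NetworkX is installed; the `trade` list, its values, and the attribute name `trade` are made up for illustration and do not come from the thread.

```python
import networkx as nx

G = nx.Graph()

# Hypothetical edge data: (source, target, trade volume).
trade = [("AUS", "USA", 50), ("USA", "ZWE", 5), ("AUS", "ZWE", 1)]

for u, v, volume in trade:
    # Keyword arguments to add_edge are stored in the edge's attribute dict.
    G.add_edge(u, v, trade=volume)

# Iterate over edges together with their attribute dictionaries.
for u, v, d in G.edges(data=True):
    print("Edge: %s-%s\tTrade: %s" % (u, v, d["trade"]))

# Direct lookup of a single edge attribute.
aus_usa = G["AUS"]["USA"]["trade"]
```

Whether this scales acceptably to a very large number of edges would need to be measured on the actual data.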