2014-01-26 24 views
8

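pyparsing performance and memory usage

Pyparsing works fine for very small grammars, but as the grammar grows, performance drops and memory usage goes through the roof.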

My current grammar is:

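from pyparsing import (Forward, Group, Keyword, LineEnd, Literal, OneOrMore,
                       Optional, Word, ZeroOrMore, alphas, nums)

newline = LineEnd()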
minus = Literal ('-') 
plus = Literal ('+') 
star = Literal ('*') 
dash = Literal ('/') 
dashdash = Literal ('//') 
percent = Literal ('%') 
starstar = Literal ('**') 
lparen = Literal ('(') 
rparen = Literal (')') 
dot = Literal ('.') 
comma = Literal (',') 
eq = Literal ('=') 
eqeq = Literal ('==') 
lt = Literal ('<') 
gt = Literal ('>') 
le = Literal ('<=') 
ge = Literal ('>=') 
not_ = Keyword ('not') 
and_ = Keyword ('and') 
or_ = Keyword ('or') 
ident = Word (alphas) 
integer = Word (nums) 

expr = Forward() 
parenthized = Group (lparen + expr + rparen) 
trailer = (dot + ident) 
atom = ident | integer | parenthized 
factor = Forward() 
power = atom + ZeroOrMore (trailer) + Optional (starstar + factor) 
factor << (ZeroOrMore (minus | plus) + power) 
term = ZeroOrMore (factor + (star | dashdash | dash | percent)) + factor 
arith = ZeroOrMore (term + (minus | plus)) + term 
comp = ZeroOrMore (arith + (eqeq | le | ge | lt | gt)) + arith 
boolNot = ZeroOrMore (not_) + comp 
boolAnd = ZeroOrMore (boolNot + and_) + boolNot 
boolOr = ZeroOrMore (boolAnd + or_) + boolAnd 
match = ZeroOrMore (ident + eq) + boolOr 
expr << match 
statement = expr + newline 
program = OneOrMore (statement) 

When I parse the following:

print (program.parseString ('3*(1+2*3*(4+5))\n')) 

it takes quite a long time:

~/Desktop/m2/pyp$ time python3 slow.py 
['3', '*', ['(', '1', '+', '2', '*', '3', '*', ['(', '4', '+', '5', ')'], ')']] 

real 0m27.280s 
user 0m25.844s 
sys 0m1.364s 

And the memory usage climbs to 1.7 GiB (sic!).

Have I made some serious mistake in implementing this grammar, or how else can I keep memory usage within bearable margins?

+0

lex and yacc do the same thing in a fraction of a second. – Hyperboreus

Answers

11

After importing pyparsing, enable packrat parsing to memoize the parsing behavior:

ParserElement.enablePackrat() 

This should improve performance dramatically.
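As a rough sketch of where the call goes, here is the question's grammar in condensed form with packrat enabled right after the import. The `chain` helper is a hypothetical shorthand introduced only for this example; it builds the same `ZeroOrMore(x + op) + x` shape the question uses. The camelCase names (`enablePackrat`, `parseString`) follow the thread; newer pyparsing releases also offer snake_case spellings.

```python
from pyparsing import (Forward, Group, Keyword, LineEnd, Literal, OneOrMore,
                       Optional, ParserElement, Word, ZeroOrMore, alphas, nums)

# Turn on memoization right after the import, before any grammar is built.
ParserElement.enablePackrat()


def chain(operand, operator):
    """Hypothetical helper: the question's 'ZeroOrMore(x + op) + x' shape."""
    return ZeroOrMore(operand + operator) + operand


lparen, rparen, dot, eq = map(Literal, "().=")
ident, integer = Word(alphas), Word(nums)

expr = Forward()
factor = Forward()
atom = ident | integer | Group(lparen + expr + rparen)
power = atom + ZeroOrMore(dot + ident) + Optional(Literal("**") + factor)
factor <<= ZeroOrMore(Literal("-") | Literal("+")) + power
term = chain(factor, Literal("*") | Literal("//") | Literal("/") | Literal("%"))
arith = chain(term, Literal("-") | Literal("+"))
comp = chain(arith, Literal("==") | Literal("<=") | Literal(">=")
             | Literal("<") | Literal(">"))
bool_not = ZeroOrMore(Keyword("not")) + comp
bool_and = chain(bool_not, Keyword("and"))
bool_or = chain(bool_and, Keyword("or"))
expr <<= ZeroOrMore(ident + eq) + bool_or
program = OneOrMore(expr + LineEnd())

# Same input as in the question.
result = program.parseString("3*(1+2*3*(4+5))\n")
print(result)
```

Timing this script with and without the `enablePackrat()` line is an easy way to check the effect on your own machine.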

+0

Thanks, I'll give it a try. – Hyperboreus

+1

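For the record, this took the run time from 3.5 seconds to 0.036 seconds on my computer, a nearly 100x improvement. Is there any reason memoization isn't turned on automatically? Does it fail in some edge cases? – Hooked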

+1

@Hooked: For details on packrat parsing in pyparsing, see [this item in the pyparsing FAQ](http://pyparsing-public.wikispaces.com/FAQs#toc3). For packrat parsing in general, see [this SO thread](https://stackoverflow.com/q/1410477/857390). –
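To illustrate why memoization pays off for a grammar of this shape, here is a toy backtracking matcher in plain Python. It is not pyparsing's implementation; the four-level operator table `LEVELS` and the function `count_atom_calls` are assumptions made up for this sketch. Each level mimics `ZeroOrMore(operand + operator) + operand`: when the operator is missing, the operand that was just matched is thrown away and parsed again, and that repetition compounds across precedence levels and nested parentheses. Caching results per (rule, position), which is the idea behind packrat parsing, removes the repeated work.

```python
from functools import lru_cache

# Hypothetical mini-grammar: one operator set per precedence level, lowest first.
LEVELS = ["|", "&", "+-", "*/"]


def count_atom_calls(text, memoize):
    """Match `text` and return (end position, number of atom() calls)."""
    calls = 0

    def atom(pos):
        nonlocal calls
        calls += 1
        if pos < len(text) and text[pos].isdigit():
            return pos + 1
        if pos < len(text) and text[pos] == "(":
            end = level(0, pos + 1)
            if end is not None and end < len(text) and text[end] == ")":
                return end + 1
        return None  # no match at this position

    def level(k, pos):
        if k == len(LEVELS):
            return atom(pos)
        # ZeroOrMore(operand + operator): keep going while both match.
        while True:
            end = level(k + 1, pos)
            if end is None or end >= len(text) or text[end] not in LEVELS[k]:
                break  # operator missing: the operand just parsed is discarded
            pos = end + 1
        # "+ operand": the final operand is parsed a second time.
        return level(k + 1, pos)

    if memoize:
        # Packrat-style cache keyed on (rule, position).
        level = lru_cache(maxsize=None)(level)

    return level(0, 0), calls


text = "3*(1+2*3*(4+5))"
print("plain:   ", count_atom_calls(text, memoize=False))
print("memoized:", count_atom_calls(text, memoize=True))
```

With the cache on, `atom` runs at most once per input position, while the plain version repeats the same work many times over. The cost of the cache is memory for the stored results.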