2015-12-09 49 views

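Error using rpy2 with run_line_magic in Jupyter/IPython

The IPython and Jupyter documentation says that get_ipython().magic() is deprecated. However, when I change my code to use run_line_magic, it fails to push a variable to R (see below). This may be related to this issue: https://bitbucket.org/rpy2/rpy2/issues/184/valueerror-call-stack-is-not-deep-enough

I am on Mac OS X Yosemite, using Anaconda with Python 2.7. I updated Anaconda and rpy2 just yesterday. The code below is from a Jupyter notebook.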

%load_ext rpy2.ipython 
import pandas as pd 

'''Two test functions with rpy2. 
The only difference between them is that 
rpy2fun_magic uses 'magic' to push variable to R and 
rpy2fun_linemagic uses 'run_line_magic' to push variable. 
'magic' works fine. 'run_line_magic' returns an error.''' 

def rpy2fun_magic(df):
    get_ipython().magic('R -i df')
    get_ipython().run_line_magic('R','df_cor <- cor(df)')
    get_ipython().run_line_magic('R','-o df_cor')
    return (df_cor)

def rpy2fun_linemagic(df):
    get_ipython().run_line_magic('R','-i df')
    get_ipython().run_line_magic('R','df_cor <- cor(df)')
    get_ipython().run_line_magic('R','-o df_cor')
    return (df_cor)

dataframetest = pd.DataFrame([[1,2,3,4],[6,3,4,5],[9,1,7,3]]) 

df_cor_magic = rpy2fun_magic(dataframetest) 
print 'Using magic to push variable works fine\n' 
print df_cor_magic 

print '\nBut using run_line_magic returns an error\n' 

df_cor_linemagic = rpy2fun_linemagic(dataframetest) 

Using magic to push variable works fine 

[[ 1.   -0.37115374 0.91129318 -0.37115374] 
[-0.37115374 1.   -0.72057669 1.  ] 
[ 0.91129318 -0.72057669 1.   -0.72057669] 
[-0.37115374 1.   -0.72057669 1.  ]] 

But using run_line_magic returns an error 

--------------------------------------------------------------------------- 
NameError         Traceback (most recent call last) 
<ipython-input-1-e418b72a8621> in <module>() 
     28 print '\nBut using run_line_magic returns an error\n' 
     29 
---> 30 df_cor_linemagic = rpy2fun_linemagic(dataframetest) 

<ipython-input-1-e418b72a8621> in rpy2fun_linemagic(df) 
     15 
     16 def rpy2fun_linemagic(df): 
---> 17  get_ipython().run_line_magic('R','-i df') 
     18  get_ipython().run_line_magic('R','df_cor <- cor(df)') 
     19  get_ipython().run_line_magic('R','-o df_cor') 

/Users/alexmillner/anaconda/lib/python2.7/site-packages/IPython/core/interactiveshell.pyc in run_line_magic(self, magic_name, line) 
     2255     kwargs['local_ns'] = sys._getframe(stack_depth).f_locals 
     2256    with self.builtin_trap: 
    -> 2257     result = fn(*args,**kwargs) 
     2258    return result 
     2259 

/Users/alexmillner/anaconda/lib/python2.7/site-packages/rpy2/ipython/rmagic.pyc in R(self, line, cell, local_ns) 

/Users/alexmillner/anaconda/lib/python2.7/site-packages/IPython/core/magic.pyc in <lambda>(f, *a, **k) 
     191  # but it's overkill for just that one bit of state. 
     192  def magic_deco(arg): 
    --> 193   call = lambda f, *a, **k: f(*a, **k) 
     194 
     195   if callable(arg): 

/Users/alexmillner/anaconda/lib/python2.7/site-packages/rpy2/ipython/rmagic.pyc in R(self, line, cell, local_ns) 
     657       val = self.shell.user_ns[input] 
     658      except KeyError: 
    --> 659       raise NameError("name '%s' is not defined" % input) 
     660     if args.converter is None: 
     661      ro.r.assign(input, self.pyconverter(val)) 

NameError: name 'df' is not defined 
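
It might also help to add the IPython/Jupyter version you are using. – ely

I updated my original response; there are two workaround options at the bottom that might help. – ely

I suspect that 'run_line_magic()' has dark corners (see https://github.com/ipython/ipython/issues/8941 for something similar with ipython 0.4.0), and we can help the ipython developers by reporting the issue. – lgautier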

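Answers

Some discussion of the same problem with %timeit comes first, followed by workarounds at the bottom. I am using IPython 3.1.0 in Anaconda 2.7.10, so my observations may differ from yours depending on version differences.

This is not unique to the R extension; you can reproduce it with something simpler, such as %timeit: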

In [47]: dfrm 
Out[47]: 
      A   B   C 
0 0.690466 0.370793 0.963782 
1 0.478427 0.358897 0.689173 
2 0.189277 0.268237 0.570624 
3 0.735665 0.342549 0.509810 
4 0.929736 0.090079 0.384444 
5 0.210941 0.347164 0.852408 
6 0.241940 0.187266 0.961489 
7 0.768143 0.548450 0.604004 
8 0.055765 0.842224 0.668782 
9 0.717827 0.047011 0.948673 

In [48]: def run_timeit(df):
   ....:     get_ipython().run_line_magic('timeit', 'df.sum()')
   ....:

In [49]: run_timeit(dfrm) 
--------------------------------------------------------------------------- 
NameError         Traceback (most recent call last) 
<ipython-input-49-1e62302232b6> in <module>() 
----> 1 run_timeit(dfrm) 

<ipython-input-48-0a3e09ec1e0c> in run_timeit(df) 
     1 def run_timeit(df): 
----> 2  get_ipython().run_line_magic('timeit', 'df.sum()') 
     3 

/home/ely/anaconda/lib/python2.7/site-packages/IPython/core/interactiveshell.pyc in run_line_magic(self, magic_name, line) 
    2226     kwargs['local_ns'] = sys._getframe(stack_depth).f_locals 
    2227    with self.builtin_trap: 
-> 2228     result = fn(*args,**kwargs) 
    2229    return result 
    2230 

/home/ely/anaconda/lib/python2.7/site-packages/IPython/core/magics/execution.pyc in timeit(self, line, cell) 

/home/ely/anaconda/lib/python2.7/site-packages/IPython/core/magic.pyc in <lambda>(f, *a, **k) 
    191  # but it's overkill for just that one bit of state. 
    192  def magic_deco(arg): 
--> 193   call = lambda f, *a, **k: f(*a, **k) 
    194 
    195   if callable(arg): 

/home/ely/anaconda/lib/python2.7/site-packages/IPython/core/magics/execution.pyc in timeit(self, line, cell) 
    1034    number = 1 
    1035    for _ in range(1, 10): 
-> 1036     time_number = timer.timeit(number) 
    1037     worst_tuning = max(worst_tuning, time_number/number) 
    1038     if time_number >= 0.2: 

/home/ely/anaconda/lib/python2.7/site-packages/IPython/core/magics/execution.pyc in timeit(self, number) 
    130   gc.disable() 
    131   try: 
--> 132    timing = self.inner(it, self.timer) 
    133   finally: 
    134    if gcold: 

<magic-timeit> in inner(_it, _timer) 

NameError: global name 'df' is not defined 

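The problem is that the line magic is set up to look for variable names in the global scope, not in the function scope. If the argument passed to the function rpy2fun_linemagic happens to have the same name as a global variable, the inner code will pick it up, for example: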

In [52]: def run_timeit(dfrm):
   ....:     get_ipython().run_line_magic('timeit', 'dfrm.sum()')
   ....:

In [53]: run_timeit(dfrm) 
The slowest run took 5.67 times longer than the fastest. This could mean that an intermediate result is being cached 
10000 loops, best of 3: 99.1 µs per loop 

But this works only by accident, because the string passed to run_line_magic happens to contain a name that can be found globally.
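
The scoping behaviour can be reproduced without IPython. What follows is only a sketch under my own assumptions: `user_ns` is a plain dict standing in for the interactive namespace, and `fake_line_magic` is a hypothetical stand-in that evaluates the string against that dict. It is not IPython's actual code.

```python
# Plain dict standing in for the interactive (global) namespace. Assumption.
user_ns = {"dfrm": [1, 2, 3]}


def fake_line_magic(line):
    """Hypothetical stand-in for a line magic: evaluate `line` in user_ns only."""
    return eval(line, user_ns)


def run_with_local_name(df):
    # 'df' exists only as a local of this function, so the lookup fails.
    try:
        return fake_line_magic("sum(df)")
    except NameError as exc:
        return "NameError: %s" % exc


def run_with_global_name(dfrm):
    # Succeeds only because 'dfrm' also happens to be a key in user_ns.
    # The argument itself is never consulted.
    return fake_line_magic("sum(dfrm)")


print(run_with_local_name(user_ns["dfrm"]))
print(run_with_global_name(user_ns["dfrm"]))
```

Note that run_with_global_name ignores whatever object you pass in: the string is resolved against the namespace, so a different argument would still produce the sum of the global dfrm.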

However, I got the same error even when using the plain magic function:

In [58]: def run_timeit(df):
   ....:     get_ipython().magic('timeit df.sum()')
   ....:

In [59]: run_timeit(dfrm) 
--------------------------------------------------------------------------- 
NameError         Traceback (most recent call last) 
<ipython-input-59-1e62302232b6> in <module>() 
----> 1 run_timeit(dfrm) 

<ipython-input-58-e98c720ea7e8> in run_timeit(df) 
     1 def run_timeit(df): 
----> 2  get_ipython().magic('timeit df.sum()') 
     3 

/home/ely/anaconda/lib/python2.7/site-packages/IPython/core/interactiveshell.pyc in magic(self, arg_s) 
    2305   magic_name, _, magic_arg_s = arg_s.partition(' ') 
    2306   magic_name = magic_name.lstrip(prefilter.ESC_MAGIC) 
-> 2307   return self.run_line_magic(magic_name, magic_arg_s) 
    2308 
    2309  #------------------------------------------------------------------------- 

/home/ely/anaconda/lib/python2.7/site-packages/IPython/core/interactiveshell.pyc in run_line_magic(self, magic_name, line) 
    2226     kwargs['local_ns'] = sys._getframe(stack_depth).f_locals 
    2227    with self.builtin_trap: 
-> 2228     result = fn(*args,**kwargs) 
    2229    return result 
    2230 

/home/ely/anaconda/lib/python2.7/site-packages/IPython/core/magics/execution.pyc in timeit(self, line, cell) 

/home/ely/anaconda/lib/python2.7/site-packages/IPython/core/magic.pyc in <lambda>(f, *a, **k) 
    191  # but it's overkill for just that one bit of state. 
    192  def magic_deco(arg): 
--> 193   call = lambda f, *a, **k: f(*a, **k) 
    194 
    195   if callable(arg): 

/home/ely/anaconda/lib/python2.7/site-packages/IPython/core/magics/execution.pyc in timeit(self, line, cell) 
    1034    number = 1 
    1035    for _ in range(1, 10): 
-> 1036     time_number = timer.timeit(number) 
    1037     worst_tuning = max(worst_tuning, time_number/number) 
    1038     if time_number >= 0.2: 

/home/ely/anaconda/lib/python2.7/site-packages/IPython/core/magics/execution.pyc in timeit(self, number) 
    130   gc.disable() 
    131   try: 
--> 132    timing = self.inner(it, self.timer) 
    133   finally: 
    134    if gcold: 

<magic-timeit> in inner(_it, _timer) 

NameError: global name 'df' is not defined 

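One (super bad) way to work around this is to use globals to locate the same object that was passed to your function as an argument; then you will have a global name for it.

For example:

In [68]: def run_timeit(df):
   ....:     for var_name, var_val in globals().iteritems():
   ....:         if df is var_val:
   ....:             get_ipython().run_line_magic('timeit', '%s.sum()'%(var_name))
   ....:             break
   ....: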

In [69]: run_timeit(dfrm) 
The slowest run took 5.72 times longer than the fastest. This could mean that an intermediate result is being cached 
10000 loops, best of 3: 99.2 µs per loop 

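But this is very fragile, because it relies on names in Python being passed by reference. If I pass in an object such as an integer or a string, I would have to check whether it was interned or something, otherwise it would not be found in the global namespace.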
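
To make the fragility concrete, here is a small sketch of the identity-based lookup on its own. The helper `find_name` and the placeholder class `Frame` are hypothetical names I made up for illustration, and the plain dict stands in for the global namespace.

```python
def find_name(obj, namespace):
    """Return a name in `namespace` bound to this exact object, else None."""
    for name, value in list(namespace.items()):
        if value is obj:
            return name
    return None


class Frame(object):
    """Tiny placeholder class standing in for a DataFrame."""


dfrm = Frame()
alias = dfrm  # a second name for the same object
namespace = {"dfrm": dfrm, "alias": alias}

# Ambiguous: either 'dfrm' or 'alias' may come back, depending on dict order.
print(find_name(dfrm, namespace))

# A freshly created object has no name in the namespace at all.
print(find_name(Frame(), namespace))
```

So the lookup silently finds nothing when the caller passes a temporary, and it can return a different name than the one you expected when the object has aliases.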

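Another, possibly slightly better, approach is to use the user_ns namespace dict that IPython stores. Then at least you are not looking through globals, and you get a bit more stability with respect to the specific variables that were named when the user assigned them in IPython:

In [71]: def run_timeit(df):
   ....:     g = get_ipython()
   ....:     for var_name, var_val in g.user_ns.iteritems():
   ....:         if df is var_val:
   ....:             g.run_line_magic('timeit', '%s.sum()'%(var_name))
   ....:             break
   ....: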

In [72]: run_timeit(dfrm) 
The slowest run took 5.58 times longer than the fastest. This could mean that an intermediate result is being cached 
10000 loops, best of 3: 99 µs per loop 

For your R-specific function call, I would try:

def rpy2fun_linemagic(df):
    g = get_ipython()
    for var_name, var_val in g.user_ns.iteritems():
        if df is var_val:
            g.run_line_magic('R', '-i %s'%(var_name))
            g.run_line_magic('R', 'df_cor <- cor(%s)'%(var_name))
            g.run_line_magic('R', '-o df_cor')
            return df_cor

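You may also need to be careful with the return statement. If the result of converting the output back to Python is also created as a variable in the global scope rather than the function scope, you might need to use return g.user_ns['df_cor'] or something similar. Or, if the variable is created as a side effect, you may not want to return anything at all. I don't like relying on that kind of implicit mutation, but it may work for you.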
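
If you do go down the route of relying on a shared namespace, one way to keep the mutation contained is to bind the input under a known name only for the duration of the call and to fetch the result explicitly. This is a sketch, not something I have run against rpy2: `temporarily_bound` and `fake_magic` are hypothetical helpers, and the plain dict stands in for user_ns.

```python
from contextlib import contextmanager


@contextmanager
def temporarily_bound(namespace, name, value):
    """Bind `name` to `value` in `namespace`, restoring the old state on exit."""
    missing = object()
    previous = namespace.get(name, missing)
    namespace[name] = value
    try:
        yield namespace
    finally:
        if previous is missing:
            namespace.pop(name, None)
        else:
            namespace[name] = previous


def fake_magic(line, namespace):
    """Hypothetical stand-in for a magic that stores its result as 'df_cor'."""
    namespace["df_cor"] = eval(line, namespace)


def total(df, namespace):
    with temporarily_bound(namespace, "df", df) as ns:
        fake_magic("sum(df)", ns)
    # Fetch (and remove) the result explicitly instead of using a bare name.
    return namespace.pop("df_cor")


user_ns = {}
print(total([1, 2, 3], user_ns))
```

The function still depends on a namespace side effect, but any earlier binding of the input name is put back afterwards and the output name does not linger.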

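Thank you. So is the fact that line magics look at global variables intentional? Or is it a bug? Should I expect this to change? Thanks again. –

I don't know the Jupyter team's design intent, but my gut feeling is that it is meant to look only at global values, and that whatever is happening in the special case where plain 'magic' works for you is not something to rely on. Also, I would say you shouldn't *want* them to support looking in the local function scope. From a programming standpoint, writing a function that depends on looking up values by name in a namespace is a very bad idea. A better option is to use 'rpy2' directly, if you want to call it as a function rather than as a magic. – ely

Thanks. That is what I think I will do: use rpy2 directly. Thanks again. –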

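I suspect the code example you provided was only meant to demonstrate the problem with run_line_magic(), but for reference I am adding a way to do the same thing without involving ipython.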

from rpy2.robjects import globalenv 
def rpy2cor(df): 
    fun = globalenv.get('cor', wantfun=True) 
    df_cor = fun(df) 
    return df_cor