2017-04-02 83 views

I am using a Python script to scrape a page (for example, a Facebook page) and want to write each post to a file (similar to the GetTwitter processor). I use the Apache NiFi ExecuteScript processor to invoke the Python script.

However, I get `SyntaxError: no viable alternative at input`.

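I can transfer data between processors successfully, but as soon as I add the scraping code I keep getting this error. I am using Python 2.7.8. As far as I know, ExecuteScript uses Jython internally, and Jython can interpret Python code.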

Below is the code. If the NiFi-related parts (the inner class and the flow file handling) are removed, it runs fine with regular Python outside NiFi.

import urllib2
import json
import datetime
import csv
import time
import sys
import traceback
from org.apache.nifi.processor.io import OutputStreamCallback
from org.python.core.util import StringUtil

class WriteContentCallback(OutputStreamCallback):
    def __init__(self, content):
        self.content_text = content

    def process(self, outputStream):
        try:
            outputStream.write(StringUtil.toBytes(self.content_text))
        except:
            traceback.print_exc(file=sys.stdout)
            raise

#app_id = "<FILL IN>"
#app_secret = "<FILL IN>" # DO NOT SHARE WITH ANYONE!
page_id = "dsssssss"
#page_id = raw_input("Please Paste Public Page Name:")

#access_token = app_id + "|" + app_secret

access_token = "sdfsdfsf%sdfsdf"

#access_token = raw_input("Please Paste Your Access Token:")


def scrapeFacebookPageFeedStatus(page_id, access_token):
    flowFile = session.create()
    flowFile = session.write(flowFile, WriteContentCallback("Hello there this is my data")
    #flowFile = session.write()
    #session.transfer(flowFile, REL_SUCCESS)


    has_next_page = False
    num_processed = 0 # keep a count on how many we've processed
    scrape_starttime = datetime.datetime.now()


    while has_next_page:
        print "Scraping %s Facebook Page: %s\n" % (page_id, scrape_starttime)
        has_next_page = False

    print "\nDone!\n%s Statuses Processed in %s" % \
        (num_processed, datetime.datetime.now() - scrape_starttime)


if __name__ == '__main__':
    #scrapeFacebookPageFeedStatus(page_id, access_token)
    flowFile = session.create()
    flowFile = session.write(flowFile, WriteContentCallback("Hello there this is my data")
    session.transfer(flowFile, REL_SUCCESS)

Below is the full stack trace of the error:

> 2017-04-02 16:19:22,817 ERROR [Timer-Driven Process Thread-10] 
> o.a.nifi.processors.script.ExecuteScript 
> ExecuteScript[id=a62f4b97-8fd7-15cd-95b9-505e1b960805] 
> ExecuteScript[id=a62f4b97-8fd7-15cd-95b9-505e1b960805] failed to 
> process due to org.apache.nifi.processor.exception.ProcessException: 
> javax.script.ScriptException: SyntaxError: no viable alternative at 
> input 'has_next_page' in <script> at line number 178 at column number 
> 8; rolling back session: 
> org.apache.nifi.processor.exception.ProcessException: 
> javax.script.ScriptException: SyntaxError: no viable alternative at 
> input 'has_next_page' in <script> at line number 178 at column number 
> 8 2017-04-02 16:19:22,819 ERROR [Timer-Driven Process Thread-10] 
> o.a.nifi.processors.script.ExecuteScript 
> org.apache.nifi.processor.exception.ProcessException: 
> javax.script.ScriptException: SyntaxError: no viable alternative at 
> input 'has_next_page' in <script> at line number 178 at column number 
> 8 
>   at org.apache.nifi.processors.script.ExecuteScript.onTrigger(ExecuteScript.java:214) 
> ~[nifi-scripting-processors-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2] 
>   at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1099) 
> [nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2] 
>   at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) 
> [nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2] 
>   at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) 
> [nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2] 
>   at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132) 
> [nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2] 
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_77] 
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) 
> [na:1.8.0_77] 
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) 
> [na:1.8.0_77] 
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) 
> [na:1.8.0_77] 
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
> [na:1.8.0_77] 
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
> [na:1.8.0_77] 
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77] Caused by: javax.script.ScriptException: SyntaxError: no viable alternative 
> at input 'has_next_page' in <script> at line number 178 at column 
> number 8 
>   at org.python.jsr223.PyScriptEngine.scriptException(PyScriptEngine.java:190) 
> ~[jython-standalone-2.7.0.jar:na] 
>   at org.python.jsr223.PyScriptEngine.compileScript(PyScriptEngine.java:75) 
> ~[jython-standalone-2.7.0.jar:na] 
>   at org.python.jsr223.PyScriptEngine.eval(PyScriptEngine.java:31) 
> ~[jython-standalone-2.7.0.jar:na] 
>   at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:264) 
> ~[na:1.8.0_77] 
>   at org.apache.nifi.processors.script.impl.JythonScriptEngineConfigurator.eval(JythonScriptEngineConfigurator.java:59) 
> ~[nifi-scripting-processors-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2] 
>   at org.apache.nifi.processors.script.ExecuteScript.onTrigger(ExecuteScript.java:204) 
> ~[nifi-scripting-processors-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2] 
>   ... 11 common frames omitted Caused by: org.python.core.PySyntaxError: null 
>   at org.python.core.ParserFacade.fixParseError(ParserFacade.java:95) 
> ~[jython-standalone-2.7.0.jar:na] 
>   at org.python.core.ParserFacade.parseExpressionOrModule(ParserFacade.java:136) 
> ~[jython-standalone-2.7.0.jar:na] 
>   at org.python.util.PythonInterpreter.compile(PythonInterpreter.java:320) 
> ~[jython-standalone-2.7.0.jar:na] 
>   at org.python.util.PythonInterpreter.compile(PythonInterpreter.java:316) 
> ~[jython-standalone-2.7.0.jar:na] 
>   at org.python.util.PythonInterpreter.compile(PythonInterpreter.java:308) 
> ~[jython-standalone-2.7.0.jar:na] 
>   at org.python.jsr223.PyScriptEngine.compileScript(PyScriptEngine.java:70) 
> ~[jython-standalone-2.7.0.jar:na] 
>   ... 15 common frames omitted 

Thank you for your guidance.


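Could you post a [minimal, complete, and verifiable example](https://stackoverflow.com/help/mcve) of the script you are using, the command used to run it, and the full traceback of the error message you received? – Montmons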


@SB87 I have updated the original question and added the requested details. Thanks. – omer

Answer


From the source code, it looks like you are missing a closing parenthesis on the following line:

flowFile = session.write(flowFile, WriteContentCallback("Hello there this is my data") 
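A quick way to catch this kind of mistake before pasting a script into ExecuteScript is to compile it without running it. The sketch below is only an illustration and not part of the original answer: `check_syntax` is a hypothetical helper name, and it relies on Python's built-in `compile()`, so the message it reports comes from whichever interpreter runs it and will be worded differently from Jython's "no viable alternative". Running it under Python 3 would also reject Python 2 `print` statements, so the sample strings avoid them.

```python
def check_syntax(source, filename="<script>"):
    """Return None if source compiles, else a short error description."""
    try:
        # compile() only parses the code; nothing is executed, so NiFi
        # names such as `session` do not need to exist here.
        compile(source, filename, "exec")
    except SyntaxError as err:
        return "line %s: %s" % (err.lineno, err.msg)
    return None


# The line from the question, still missing its final ")".
broken = (
    'flowFile = session.write(flowFile, WriteContentCallback("Hello there this is my data")\n'
    'has_next_page = False\n'
)
# The same line with the closing parenthesis added.
fixed = (
    'flowFile = session.write(flowFile, WriteContentCallback("Hello there this is my data"))\n'
    'has_next_page = False\n'
)

print(check_syntax(broken))  # some syntax error description
print(check_syntax(fixed))   # None
```

Jython only notices the unclosed parenthesis when it reaches the next statement, which is why the trace points at `has_next_page` rather than at the line that is actually wrong.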

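If you still get the error after that, most likely one of the imported modules contains (or depends on) native (CPython) modules rather than pure Python. Native modules are not supported in Jython; imported modules must be pure Python code. For more details, see this related SO post.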
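To get a rough idea of which imports might be affected, the modules can be inspected under regular CPython. The following is a heuristic sketch only: `is_pure_python` is a hypothetical helper, it assumes that a module loaded from a `.py` or `.pyc` file is pure Python, and it does not detect a pure-Python module that internally depends on a native one.

```python
import importlib


def is_pure_python(name):
    """Heuristic: True if the module was loaded from Python source/bytecode."""
    module = importlib.import_module(name)
    # Built-in modules (for example `sys` in CPython) have no __file__;
    # compiled extensions usually end in .so or .pyd instead of .py.
    path = getattr(module, "__file__", None)
    return path is not None and path.endswith((".py", ".pyc", ".pyo"))


# Check a few of the modules imported by the script in the question.
for name in ("json", "csv", "sys"):
    print("%s: %s" % (name, is_pure_python(name)))
```

A `False` result under CPython does not by itself mean the import fails in Jython, since Jython ships its own implementations of many standard modules; treat the output as a hint for where to look.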


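Thanks for the feedback. The error was indeed caused by the missing closing parenthesis. After fixing it there is no compile error, but I am not receiving any flow files (0 bytes transferred), and I don't see anything in nifi-app.log either. – omer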