I am running Spark 1.6 and experimenting with remote data processing in Spark. After fetching data from a remote database via JDBC, I created a Spark DataFrame and registered it as a temporary table with the registerTempTable('') method. Up to that point it works. But when I run a query through the SQL context I get this error: Spark SQL execution failed. Got java.lang.RuntimeException: [1.227] failure: ``union'' expected but `.' found
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/ubuntu/spark-1.6.2-bin-hadoop2.6/python/pyspark/sql/context.py", line 580, in sql
return DataFrame(self._ssql_ctx.sql(sqlQuery), self)
File "/home/ubuntu/spark-1.6.2-bin-hadoop2.6/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
File "/home/ubuntu/spark-1.6.2-bin-hadoop2.6/python/pyspark/sql/utils.py", line 45, in deco
return f(*a, **kw)
File "/home/ubuntu/spark-1.6.2-bin-hadoop2.6/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o21.sql.
: java.lang.RuntimeException: [1.227] failure: ``union'' expected but `.' found
I am using pyspark at the command prompt; here is my code:
from pyspark.sql import SQLContext  # SQLContext lives in pyspark.sql, not pyspark

sqlContext = SQLContext(sc)

df = sqlContext.read.format('jdbc').options(
    url='jdbc:sqlserver://<ipaddress>;user=xyz;password=pw',
    dbtable='JOURNAL'
).load()
df.registerTempTable('JOURNAL')

df = sqlContext.read.format('jdbc').options(
    url='jdbc:sqlserver://<ipaddress>;user=xyz;password=pw',
    dbtable='GHIS'
).load()
df.registerTempTable('GHIS')

df = sqlContext.read.format('jdbc').options(
    url='jdbc:sqlserver://<ipaddress>;user=xyz;password=pw',
    dbtable='LEAS'
).load()
df.registerTempTable('LEAS')
Up to this point everything works and the data loads.
Now here is where I have the problem:
doubtaccount = sqlContext.sql("""
    SELECT ENTITYID as EntityID,
           SUBSTRING(DESCRPN, 1, CHARINDEX('-', DESCRPN, 1) - 1) as BldgID,
           SUBSTRING(DESCRPN, CHARINDEX('-', DESCRPN, 1) + 1, 20) as LeaseID,
           PERIOD * 100 + 15 as TxnDateInt,
           PERIOD as Period,
           0 - AMT as BDAmt
    FROM BI_Staging.dbo.JOURNAL
    WHERE SOURCE = 'DA' and ACCTNUM = 'RE078201000' and STATUS = 'P'
""")
When I run this query I hit the error above. I searched Stack Overflow for similar errors but did not find anything. Is there something wrong with my query? It does work when run directly against the database.
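To make the intent of the query clear, here is the SUBSTRING/CHARINDEX split expressed in plain Python (the DESCRPN value below is a made-up example, not real data):

```python
# Plain-Python sketch of what the SUBSTRING/CHARINDEX logic in the query
# is meant to do. DESCRPN holds "<BldgID>-<LeaseID>" and the query splits
# it on the first dash.
descrpn = "BLDG01-LEASE001"  # hypothetical DESCRPN value

dash = descrpn.index('-') + 1          # CHARINDEX('-', DESCRPN, 1): 1-based position of '-'
bldg_id = descrpn[:dash - 1]           # SUBSTRING(DESCRPN, 1, dash - 1)
lease_id = descrpn[dash:dash + 20]     # SUBSTRING(DESCRPN, dash + 1, 20)

print(bldg_id)   # BLDG01
print(lease_id)  # LEASE001
```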