2015-02-09

From my investigation of Spark SQL, I understand that more than two tables cannot be joined directly; we have to use subqueries to make it work. So I used a subquery and was able to join three tables (related question: Apache Spark SQL issue: java.lang.RuntimeException: [1.517] failure: identifier expected)

with the query below:

SELECT name, age, gender, dpi.msisdn, subscriptionType, maritalStatus, isHighARPU, ipAddress, startTime, endTime, isRoaming, dpi.totalCount, dpi.website FROM (SELECT subsc.name, subsc.age, subsc.gender, subsc.msisdn, subsc.subscriptionType, subsc.maritalStatus, subsc.isHighARPU, cdr.ipAddress, cdr.startTime, cdr.endTime, cdr.isRoaming FROM SUBSCRIBER_META subsc, CDR_FACT cdr WHERE subsc.msisdn = cdr.msisdn AND cdr.isRoaming = 'Y') temp, DPI_FACT dpi WHERE temp.msisdn = dpi.msisdn

But when, following the same pattern, I tried to join 4 tables, it threw the following exception:

java.lang.RuntimeException: [1.517] failure: identifier expected

Query joining the 4 tables:

SELECT name, dueAmount FROM (SELECT name, age, gender, dpi.msisdn, subscriptionType, maritalStatus, isHighARPU, ipAddress, startTime, endTime, isRoaming, dpi.totalCount, dpi.website FROM (SELECT subsc.name, subsc.age, subsc.gender, subsc.msisdn, subsc.subscriptionType, subsc.maritalStatus, subsc.isHighARPU, cdr.ipAddress, cdr.startTime, cdr.endTime, cdr.isRoaming FROM SUBSCRIBER_META subsc, CDR_FACT cdr WHERE subsc.msisdn = cdr.msisdn AND cdr.isRoaming = 'Y') temp, DPI_FACT dpi WHERE temp.msisdn = dpi.msisdn) inner, BILLING_META billing WHERE inner.msisdn = billing.msisdn

Can anyone please help me make this query work?

Thanks in advance. The error is as follows:

09/02/2015 02:55:24 [ERROR] org.apache.spark.Logging$class: Error running job streaming job 1423479307000 ms.0 
java.lang.RuntimeException: [1.517] failure: identifier expected 

SELECT name, dueAmount FROM (SELECT name, age, gender, dpi.msisdn, subscriptionType, maritalStatus, isHighARPU, ipAddress, startTime, endTime, isRoaming, dpi.totalCount, dpi.website FROM (SELECT subsc.name, subsc.age, subsc.gender, subsc.msisdn, subsc.subscriptionType, subsc.maritalStatus, subsc.isHighARPU, cdr.ipAddress, cdr.startTime, cdr.endTime, cdr.isRoaming FROM SUBSCRIBER_META subsc, CDR_FACT cdr WHERE subsc.msisdn = cdr.msisdn AND cdr.isRoaming = 'Y') temp, DPI_FACT dpi WHERE temp.msisdn = dpi.msisdn) inner, BILLING_META billing where inner.msisdn = billing.msisdn 
                                                                                                                                    ^
     at scala.sys.package$.error(package.scala:27) 
     at org.apache.spark.sql.catalyst.SqlParser.apply(SqlParser.scala:60) 
     at org.apache.spark.sql.SQLContext.parseSql(SQLContext.scala:73) 
     at org.apache.spark.sql.api.java.JavaSQLContext.sql(JavaSQLContext.scala:49) 
     at com.hp.tbda.rta.examples.JdbcRDDStreaming5$7.call(JdbcRDDStreaming5.java:596) 
     at com.hp.tbda.rta.examples.JdbcRDDStreaming5$7.call(JdbcRDDStreaming5.java:546) 
     at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$1.apply(JavaDStreamLike.scala:274) 
     at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$1.apply(JavaDStreamLike.scala:274) 
     at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1.apply(DStream.scala:527) 
     at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1.apply(DStream.scala:527) 
     at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:41) 
     at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40) 
     at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40) 
     at scala.util.Try$.apply(Try.scala:161) 
     at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32) 
     at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:172) 
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
     at java.lang.Thread.run(Thread.java:745) 

You could try changing the alias name from inner to something else – 2015-02-09 14:06:43

Answer


The exception occurs because you used the reserved keyword `inner` as an alias in your Spark SQL query. Avoid using Keywords in Spark SQL as custom identifiers; renaming the alias from `inner` to a non-reserved name (for example `sub`) should let the query parse.
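The failure mode can be reproduced outside Spark, since `INNER` (as in `INNER JOIN`) is reserved in standard SQL as well. Below is a minimal sketch using SQLite's parser for illustration; the table name `billing` and the replacement alias `sub` are assumptions for the example, and the original error of course came from Spark's own SQL parser, not SQLite:

```python
import sqlite3

# In-memory database with a stand-in table (names are illustrative only).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE billing (msisdn TEXT, dueAmount REAL)")

# Using the reserved word "inner" as a subquery alias fails to parse:
# the parser reads INNER as the start of an INNER JOIN clause.
bad = "SELECT * FROM (SELECT msisdn FROM billing) inner WHERE inner.msisdn = '1'"
try:
    con.execute(bad)
    print("unexpectedly succeeded")
except sqlite3.OperationalError as e:
    print("reserved-word alias rejected:", e)

# Renaming the alias to a non-reserved identifier makes the same query valid.
good = "SELECT * FROM (SELECT msisdn FROM billing) sub WHERE sub.msisdn = '1'"
print("renamed alias works, rows:", con.execute(good).fetchall())
```

The same rename applied to the original query (the outer alias `inner` becoming `sub`, and the `inner.msisdn` reference becoming `sub.msisdn`) is what the comment above suggests.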