pyspark SQL module does not recognize column

I have a DataFrame like this:

| Cid|               Cname| acctype|  accnum|country| currency|   rank|
+----+--------------------+--------+--------+-------+---------+-------+
|6489|   u'Kristi Bradley'| current|62814653|     US|      USD|    296|
|4204| u'Elizabeth Ande...| current|18476174|     US|      USD|    237|
|6020|    u'Melody Miller'| current|84315491|     US|      USD|    285|
|9512|     u'William Wise'| deposit|37582740|     US|      USD|    277|
|7223|  u'Jackie Arellano'| current|46939546|     US|      USD|    498|
|8514|  u'Michael Hawkins'| current|90554826|     US|      USD|    498|
|6075|     u'James Garcia'| current|77363343|     US|      USD|     53|
|7700| u'Margaret Phill...| deposit|18799392|     US|      USD|    399|

My pyspark SQL query looks like this:

result = sqlContext.sql("SELECT sum(rank) FROM US_df WHERE acctype ='current' ")

But the result I get is null, like this:

| _c0| 
+----+ 
|null| 
+----+

What am I doing wrong here?

Source

2016-09-19 higgs

This should work. Does "SELECT rank FROM US_df WHERE acctype = 'current'" return the expected values? –

No, @BorisShchegolev, the query above did not work either. However, I got the expected result by using a GROUP BY clause. Thanks! – higgs
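
A null from sum() usually means the aggregate saw no rows: in standard SQL, and in Spark SQL as well, SUM over an empty input returns a single row holding NULL rather than an empty result. The sketch below reproduces that symptom with Python's built-in sqlite3 module standing in for Spark SQL (an assumption, made so the snippet runs without a Spark installation). The rows are a trimmed copy of the table in the question, and the filter value 'savings' is a hypothetical value chosen because it matches nothing.

```python
import sqlite3

# sqlite3 is only a stand-in for Spark SQL here; the NULL-on-empty-input
# behaviour of SUM is standard SQL and is the point being illustrated.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE US_df (Cid INTEGER, acctype TEXT, rank INTEGER)")
conn.executemany(
    "INSERT INTO US_df VALUES (?, ?, ?)",
    [(6489, "current", 296), (4204, "current", 237), (9512, "deposit", 277)],
)

# The filter matches two rows, so SUM returns their total.
matched = conn.execute(
    "SELECT sum(rank) FROM US_df WHERE acctype = 'current'"
).fetchall()

# Hypothetical filter value that matches no row: SUM still returns one row,
# but that row holds NULL (None in Python).
unmatched = conn.execute(
    "SELECT sum(rank) FROM US_df WHERE acctype = 'savings'"
).fetchall()

print(matched)    # [(533,)]
print(unmatched)  # [(None,)]
```

If the same pattern applies to the query in the question, the null would point at the WHERE condition matching no rows rather than at the rank column itself.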

Hi, I can't comment yet, so apologies for the short answer. Did you register your table?

US_df.createOrReplaceTempView("us_table")

http://spark.apache.org/docs/latest/sql-programming-guide.html#running-sql-queries-programmatically

Source

2016-09-19 08:51:48 GwydionFR

Yes, I registered the table. I don't know why it didn't work, but the query did work with a GROUP BY clause. – higgs
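
The thread never establishes why GROUP BY succeeded where the WHERE filter did not. One possible explanation, offered here purely as an assumption, is that the stored acctype values differ from the literal 'current', for example through stray whitespace. The acctype column in the question is rendered 8 characters wide, one more than 'current' needs, which would be consistent with that but does not prove it. The sketch below illustrates the effect with sqlite3 standing in for Spark SQL, using hypothetical values padded with a leading space.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE US_df (acctype TEXT, rank INTEGER)")
# Hypothetical data: note the leading space in ' current' and ' deposit'.
conn.executemany(
    "INSERT INTO US_df VALUES (?, ?)",
    [(" current", 296), (" current", 237), (" deposit", 277)],
)

# The exact-match filter finds nothing, so SUM yields NULL.
exact = conn.execute(
    "SELECT sum(rank) FROM US_df WHERE acctype = 'current'"
).fetchone()[0]

# GROUP BY totals every stored value as-is, padding included.
grouped = conn.execute(
    "SELECT acctype, sum(rank) FROM US_df GROUP BY acctype ORDER BY acctype"
).fetchall()

# Trimming the column before comparing makes the filter match again.
trimmed = conn.execute(
    "SELECT sum(rank) FROM US_df WHERE trim(acctype) = 'current'"
).fetchone()[0]

print(exact)    # None
print(grouped)  # [(' current', 533), (' deposit', 277)]
print(trimmed)  # 533
```

Listing the distinct acctype values with their quotes visible, as the printed tuples do here, is a quick way to check for padding; Spark SQL also provides trim() for the comparison.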
