I am currently using the approach shown below. Can I use SELECT on the DataFrame directly instead of creating this temp table?
+---+-------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
|id |sen |attributes |
+---+-------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
|1 |Stanford is good college.|[[Stanford,ORGANIZATION,NNP], [is,O,VBZ], [good,O,JJ], [college,O,NN], [.,O,.], [Stanford,ORGANIZATION,NNP], [is,O,VBZ], [good,O,JJ], [college,O,NN], [.,O,.]]|
+---+-------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
I want to get the DataFrame above from this one:
+----------+--------+--------------------+
|article_id| sen| attribute|
+----------+--------+--------------------+
| 1|example1|[Standford,Organi...|
| 1|example1| [is,O,VP]|
| 1|example1| [good,LOCATION,ADP]|
+----------+--------+--------------------+
using:
df3.registerTempTable("d1")
val df4 = sqlContext.sql("select article_id,sen,collect(attribute) as attributes from d1 group by article_id,sen")
Is there a way to do this without registering a temp table? When I save the DataFrame, it produces a lot of garbage. Something like df3.select(...)?
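I don't understand what you are trying to do. Please review your question! I'm voting to close for now because it is unclear. – eliasah
@eliasah - Look at the answer, that's what I meant! –
If the given answer solved your problem, please accept it; otherwise, comment on why it does not work for you! – eliasah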
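
For reference, the grouping in the question can probably be expressed with the DataFrame API directly, with no temp table involved. The following is a minimal sketch, not the original answer. It assumes Spark 2.x or later (where `collect_list` from `org.apache.spark.sql.functions` accepts array columns), it swaps the `collect` function used in the question for `collect_list`, and it models `attribute` as an array of strings. The object name `GroupAttributes`, the helper `groupAttributes`, and the sample rows are made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, collect_list}

object GroupAttributes {
  // Hypothetical helper: groups by (article_id, sen) and gathers every
  // `attribute` value of the group into one array column, `attributes`.
  def groupAttributes(df: DataFrame): DataFrame =
    df.groupBy(col("article_id"), col("sen"))
      .agg(collect_list(col("attribute")).as("attributes"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("group-attributes")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Stand-in for df3 from the question; these rows are illustrative only.
    val df3 = Seq(
      (1, "example1", Seq("Stanford", "ORGANIZATION", "NNP")),
      (1, "example1", Seq("is", "O", "VBZ")),
      (1, "example1", Seq("good", "O", "JJ"))
    ).toDF("article_id", "sen", "attribute")

    // Same grouping as the SQL query, without registerTempTable.
    val df4 = groupAttributes(df3)
    df4.show(false)

    spark.stop()
  }
}
```

Note that `collect_list` keeps duplicates and does not guarantee the order of the collected elements, so if token order within a sentence matters, an explicit position column would be needed.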