SPARK SQL: Implementing an AND condition in a CASE statement

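I know how to implement a simple CASE WHEN-THEN clause in Spark SQL using Scala. I am using version 1.6.2. However, I need to specify an AND condition across multiple columns inside the CASE-WHEN clause. How can I do this in Spark using Scala?

Thanks in advance for your time and help!

Here is the SQL query I have: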

select sd.standardizationId, 
    case when sd.numberOfShares = 0 and 
      isnull(sd.derivatives,0) = 0 and 
      sd.holdingTypeId not in (3,10) 
     then 
      8 
     else 
      holdingTypeId 
     end 
    as holdingTypeId 
from sd;


2016-11-10 Prash

Does this query work? If not, what error do you get? Don't make us guess. Please see [mcve]

An alternative, if you want to avoid writing the whole expression as a string, is the following:

import org.apache.spark.sql.Column 
import org.apache.spark.sql.functions._ 

val sd = sqlContext.table("sd") 

val conditionedColumn: Column = when(
    (sd("numberOfShares") === 0) and 
    (coalesce(sd("derivatives"), lit(0)) === 0) and 
    (!sd("holdingTypeId").isin(Seq(3,10): _*)), 8 
).otherwise(sd("holdingTypeId")).as("holdingTypeId") 

val result = sd.select(sd("standardizationId"), conditionedColumn)
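
To see how the combined condition behaves, here is a minimal sketch that applies the same `when(...).otherwise(...)` column to a few made-up rows. The sample data, the app name, and the local master setting are assumptions for illustration and are not part of the original question; `&&` is used here as an equivalent of `and` on columns.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions._

// Reuse an existing context (e.g. in spark-shell) or start a local one for the demo.
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("case-when-and").setMaster("local[1]"))
val sqlContext = SQLContext.getOrCreate(sc)
import sqlContext.implicits._

// Hypothetical rows: (standardizationId, numberOfShares, derivatives, holdingTypeId)
val sd = Seq[(Int, Int, Option[Int], Int)](
  (1, 0, None, 5),    // all three conditions hold, so the result is 8
  (2, 0, Some(0), 3), // holdingTypeId is in (3, 10), so it stays 3
  (3, 7, Some(0), 5), // numberOfShares is not 0, so it stays 5
  (4, 0, Some(2), 5)  // derivatives is not 0, so it stays 5
).toDF("standardizationId", "numberOfShares", "derivatives", "holdingTypeId")

val result = sd.select(
  sd("standardizationId"),
  when(
    (sd("numberOfShares") === 0) &&
      (coalesce(sd("derivatives"), lit(0)) === 0) && // a null derivatives counts as 0
      !sd("holdingTypeId").isin(3, 10),
    8
  ).otherwise(sd("holdingTypeId")).as("holdingTypeId")
)

// Collect into a map of standardizationId -> holdingTypeId for easy inspection.
val byId = result.collect().map(r => r.getInt(0) -> r.getInt(1)).toMap
```

Only the first row satisfies all three conditions, so only its `holdingTypeId` is replaced by 8.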


2016-11-10 13:01:01

First, read the table as a DataFrame:

val table = sqlContext.table("sd")

Then select with an expression. Adjust the syntax to match your database.

val result = table.selectExpr("standardizationId","case when numberOfShares = 0 and isnull(derivatives,0) = 0 and holdingTypeId not in (3,10) then 8 else holdingTypeId end as holdingTypeId")

Then show the result:

result.show
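
If the two-argument `isnull(derivatives, 0)` is not accepted in your setup, one possible adaptation is to use `coalesce` for the null default inside the same expression string. The following is a sketch under that assumption; the sample rows, the app name, and the local master setting are made up for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Reuse an existing context (e.g. in spark-shell) or start a local one for the demo.
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("case-when-expr").setMaster("local[1]"))
val sqlContext = SQLContext.getOrCreate(sc)
import sqlContext.implicits._

// Hypothetical rows standing in for the "sd" table:
// (standardizationId, numberOfShares, derivatives, holdingTypeId)
val table = Seq[(Int, Int, Option[Int], Int)](
  (1, 0, None, 5),
  (2, 0, Some(0), 10),
  (3, 4, Some(0), 5)
).toDF("standardizationId", "numberOfShares", "derivatives", "holdingTypeId")

// Same CASE expression as above, with coalesce supplying the default for nulls.
val adapted = table.selectExpr(
  "standardizationId",
  "case when numberOfShares = 0 and coalesce(derivatives, 0) = 0 " +
    "and holdingTypeId not in (3, 10) then 8 else holdingTypeId end as holdingTypeId"
)

// Collect into a map of standardizationId -> holdingTypeId for easy inspection.
val adaptedById = adapted.collect().map(r => r.getInt(0) -> r.getInt(1)).toMap
adapted.show()
```

With these rows, only `standardizationId` 1 meets all three conditions; the other two keep their original `holdingTypeId`.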


2016-11-10 08:31:30 FaigB

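Just to understand: if this works, the query in the question above should work as well... right? What is the difference between the two? – Shankar

In the first example, the statement is used as SQL that both queries the table and creates the DataFrame (if the author ran it that way), and it should raise a parse exception at "not in". In the second example, the select is applied to a DataFrame that has already been created from the table, and a parse exception is raised only where the expression syntax is unsupported. – FaigB

Thanks a lot @FaigB! That did the trick! – Prash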
