2015-11-24

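Hive with multiple subqueries

I want to run multiple subqueries in the WHERE clause, and I get the error below. Does this mean Hive does not support it? If so, is there a different way to write the query below?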

Error while executing the Hive query: OK FAILED: SemanticException [Error 10249]: Line 14 Unsupported SubQuery Expression 'adh': Only 1 SubQuery expression is supported.

select 
    first_name, 
    last_name, 
    salary, 
    title, 
    department 
from 
    employee_t1 emp 
where 
    emp.salary <= 100000 
    and (
     (emp.code in (select comp from history_t2 where code_hist <> 10)) 
     or 
     (emp.adh in (select comp from sector_t3 where code_hist <> 50)) 
    ) 
    and department = 'Pediatrics'; 

[The documentation says subqueries are supported](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SubQueries#LanguageManualSubQueries-SubqueriesintheWHEREClause). I am not sure about multiple subqueries, though.

Answers


Two options: one is joins and the other is union all:

where emp.salary <= 100000 and 
     emp.code in (select comp 
        from history_t2 
        where code_hist <> 10 
        union all 
        select comp 
        from sector_t3 
        where code_hist <> 50 
       ) and 
     emp.department = 'Pediatrics'; 

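This is generally not recommended, because it leaves the optimizer fewer options. But if Hive has this limitation (I have not tried this type of query in Hive), it may serve as a workaround. Note that this version tests only emp.code against both tables, whereas the original query tested emp.adh against sector_t3, so the two are equivalent only when that difference does not matter for your data.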

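If the comp field is unique in both tables, the join approach is the most appropriate. Otherwise, you need to remove duplicates first so that the join does not return the same employee row more than once.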
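The join option could be sketched roughly as follows. This is an untested sketch, not a verified Hive query: it assumes comp may repeat in history_t2 and sector_t3, so each lookup is deduplicated with select distinct before joining. The aliases h and s are illustrative.

```sql
-- Each derived table holds every comp value at most once,
-- so the left joins cannot multiply employee rows.
select
    emp.first_name,
    emp.last_name,
    emp.salary,
    emp.title,
    emp.department
from
    employee_t1 emp
left outer join
    (select distinct comp from history_t2 where code_hist <> 10) h
    on emp.code = h.comp
left outer join
    (select distinct comp from sector_t3 where code_hist <> 50) s
    on emp.adh = s.comp
where
    emp.salary <= 100000
    and emp.department = 'Pediatrics'
    -- equivalent of "code in (...) or adh in (...)": at least one lookup matched
    and (h.comp is not null or s.comp is not null);
```

Because both conditions are expressed as joins, the WHERE clause contains no subquery expression at all, which sidesteps the "only 1 SubQuery expression" restriction.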


I agree with Gordon. Using joins, you can try the following query (untested):

select 
    a.first_name, 
    a.last_name, 
    a.salary, 
    a.title, 
    a.department 
from 
    -- filter employees first; the alias emp must be declared before it is used
    (select * from employee_t1 emp where 
    emp.salary <= 100000 
    and emp.department = 'Pediatrics') a 
left outer join (select comp from history_t2 where code_hist <> 10) b 
on a.code = b.comp 
left outer join (select comp from sector_t3 where code_hist <> 50) c 
on a.adh = c.comp 
-- keep only rows that matched at least one of the two lookups
where b.comp is not null 
or c.comp is not null 
;