SQL - partition by one column and some field type
My table is very large, but a small snippet looks like this:
+--------+---+----------+--------+------------+---
|distance|qtt|deliver_by| store  |deliver_time| ...
+--------+---+----------+--------+------------+---
|      11|  1| pa       | store_a|        1111|
|     123|  2| pa       | store_a|        1112|
|      33|  3| pb       | store_a|        1113|
|      33|  2| pa       | store_b|        2221|
|      44|  2| pb       | store_b|        2222|
|       5|  2| pc       | store_b|        2223|
|       5|  2| pc       | store_b|        2224|
|       6|  5| pb       | store_c|        3331|
|       7|  5| pb       | store_c|        3332|
----------------------------------------------....
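There are several stores but only 3 possible deliverers (deliver_by: pa, pb and pc), each of which delivers products at a specific time. Treat deliver_time as a timestamp.
I want to select the whole table and add 6 new columns: the min and max deliver_time per deliver_by within each store. A store can be served by any of the 3 deliverers (pa, pb, pc), but none of them is required.
I can get almost all of the correct results with the query below. The problem is that when a deliverer pX does not exist for a store, I do not get a null but some min/max of the deliveries in that store.
I really want to use a partition by, so I wrote this to add the new min/max columns: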
select
  min(deliver_time) over (partition by store, deliver_by='pa') as min_time_sd_pa
, max(deliver_time) over (partition by store, deliver_by='pa') as max_time_sd_pa
, min(deliver_time) over (partition by store, deliver_by='pb') as min_time_sd_pb
, max(deliver_time) over (partition by store, deliver_by='pb') as max_time_sd_pb
, min(deliver_time) over (partition by store, deliver_by='pc') as min_time_sd_pc
, max(deliver_time) over (partition by store, deliver_by='pc') as max_time_sd_pc
, distance, qtt, ....
from mytable
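To see why a missing deliverer does not come out as null, it helps to run the pc columns of this query on the sample rows. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for Hive (an assumption, and it needs SQLite 3.25 or later for window functions), and the helper names `load_sample` and `boolean_partition_result` are made up for this example. Partitioning by `deliver_by = 'pc'` splits each store into a "true" group and a "false" group, so rows not delivered by pc receive the min/max of the "false" group instead of null.

```python
import sqlite3

# Sample rows from the question: (distance, qtt, deliver_by, store, deliver_time)
ROWS = [
    (11, 1, "pa", "store_a", 1111),
    (123, 2, "pa", "store_a", 1112),
    (33, 3, "pb", "store_a", 1113),
    (33, 2, "pa", "store_b", 2221),
    (44, 2, "pb", "store_b", 2222),
    (5, 2, "pc", "store_b", 2223),
    (5, 2, "pc", "store_b", 2224),
    (6, 5, "pb", "store_c", 3331),
    (7, 5, "pb", "store_c", 3332),
]

# Same window definition as in the question, restricted to the pc columns.
QUERY = """
select store, deliver_by, deliver_time
     , min(deliver_time) over (partition by store, deliver_by = 'pc') as min_time_sd_pc
     , max(deliver_time) over (partition by store, deliver_by = 'pc') as max_time_sd_pc
from mytable
order by deliver_time
"""


def load_sample():
    """Create an in-memory table holding the sample rows (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table mytable "
        "(distance int, qtt int, deliver_by text, store text, deliver_time int)"
    )
    conn.executemany("insert into mytable values (?, ?, ?, ?, ?)", ROWS)
    return conn


def boolean_partition_result():
    """Run the boolean-partition query and return all rows."""
    conn = load_sample()
    try:
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in boolean_partition_result():
        print(row)
```

On this data no row gets a null: the store_a rows, which have no pc delivery at all, all fall into the "false" group and get 1111 and 1113, and the pa and pb rows of store_b get 2221 and 2222 rather than the pc times.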
The correct output would be:
min_time_sd_pa|max_time_sd_pa|min_time_sd_pb|max_time_sd_pb|min_time_sd_pc|max_time_sd_pc|distance|qtt|deliver_by| store |deliver_time
--------------+--------------+--------------+--------------+--------------+--------------+--------+---+----------+--------+------------
1111 | 1112 | 1113 | 1113 | null | null | 11 | 1| pa | store_a| 1111
1111 | 1112 | 1113 | 1113 | null | null | 123 | 2| pa | store_a| 1112
1111 | 1112 | 1113 | 1113 | null | null | 33 | 3| pb | store_a| 1113
2221 | 2221 | 2222 | 2222 | 2223 | 2224 | 33 | 2| pa | store_b| 2221
2221 | 2221 | 2222 | 2222 | 2223 | 2224 | 44 | 2| pb | store_b| 2222
2221 | 2221 | 2222 | 2222 | 2223 | 2224 | 5 | 2| pc | store_b| 2223
2221 | 2221 | 2222 | 2222 | 2223 | 2224 | 5 | 2| pc | store_b| 2224
null | null | 3331 | 3332 | null | null | 6 | 5| pb | store_c| 3331
null | null | 3331 | 3332 | null | null | 7 | 5| pb | store_c| 3332
---------------------------------------------------------------------------------------------------------------------------------------
What is missing in my select min(..) over.. statement, or what is the simplest way to get this result? I am using Hive QL, but I guess this is generic to most SQL DBMSs.
Thanks
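One common way to get columns of this shape is to partition by store only and move the deliver_by condition inside the aggregate with a CASE expression, since min and max ignore nulls. What follows is a minimal sketch under stated assumptions, not a tested Hive solution: the SQL is run through Python's built-in sqlite3 module (SQLite 3.25 or later) on the sample rows, and the helper name `conditional_window_result` is made up for this example. The CASE and window syntax is standard SQL, so it should carry over to Hive QL, but that has not been verified here.

```python
import sqlite3

# Sample rows from the question: (distance, qtt, deliver_by, store, deliver_time)
ROWS = [
    (11, 1, "pa", "store_a", 1111),
    (123, 2, "pa", "store_a", 1112),
    (33, 3, "pb", "store_a", 1113),
    (33, 2, "pa", "store_b", 2221),
    (44, 2, "pb", "store_b", 2222),
    (5, 2, "pc", "store_b", 2223),
    (5, 2, "pc", "store_b", 2224),
    (6, 5, "pb", "store_c", 3331),
    (7, 5, "pb", "store_c", 3332),
]

# Partition by store only; the CASE yields null for other deliverers,
# and min/max skip nulls, so a deliverer absent from a store gives null.
QUERY = """
select min(case when deliver_by = 'pa' then deliver_time end) over (partition by store) as min_time_sd_pa
     , max(case when deliver_by = 'pa' then deliver_time end) over (partition by store) as max_time_sd_pa
     , min(case when deliver_by = 'pb' then deliver_time end) over (partition by store) as min_time_sd_pb
     , max(case when deliver_by = 'pb' then deliver_time end) over (partition by store) as max_time_sd_pb
     , min(case when deliver_by = 'pc' then deliver_time end) over (partition by store) as min_time_sd_pc
     , max(case when deliver_by = 'pc' then deliver_time end) over (partition by store) as max_time_sd_pc
     , distance, qtt, deliver_by, store, deliver_time
from mytable
order by deliver_time
"""


def conditional_window_result():
    """Load the sample rows into an in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table mytable "
            "(distance int, qtt int, deliver_by text, store text, deliver_time int)"
        )
        conn.executemany("insert into mytable values (?, ?, ?, ?, ?)", ROWS)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in conditional_window_result():
        print(row)
```

With this form every row of a store carries the same six values. The store_a rows get null in both pc columns, and the store_c rows, which the sample lists as served only by pb, get values in the pb columns and null in the pa and pc columns.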