2017-04-24 35 views
3

Array intersection as an aggregate function for GROUP BY

I have the following table:

CREATE TABLE person 
AS 
    SELECT name, preferences 
    FROM (VALUES 
    ('John', ARRAY['pizza', 'meat']), 
    ('John', ARRAY['pizza', 'spaghetti']), 
    ('Bill', ARRAY['lettuce', 'pizza']), 
    ('Bill', ARRAY['tomatoes']) 
) AS t(name, preferences); 

I want to group by person and use intersect(preferences) as the aggregate function. So I want the following output:

person | preferences 
------------------------------- 
John | ['pizza'] 
Bill | [] 

How should this be done in SQL? I guess I need to do something like the following, but what would the X function look like?

SELECT person.name, array_agg(X) 
FROM  person 
LEFT JOIN unnest(preferences) preferences 
ON  true 
GROUP BY name 
+0

Maybe join on unnest(preferences)? –

+0

@VaoTsun I think that's a good idea, but how do I intersect with that join (and apply 'array_agg' afterwards)? –

+0

Is there a chance that the arrays contain duplicate values? –

Answers

2

Using FILTER with ARRAY_AGG:

SELECT name, array_agg(pref) FILTER (WHERE namepref = total) 
FROM (
    SELECT name, pref, t1.count AS total, count(*) AS namepref 
    FROM (
        SELECT name, preferences, count(*) OVER (PARTITION BY name) 
        FROM person 
    ) AS t1 
    CROSS JOIN LATERAL unnest(preferences) AS pref 
    GROUP BY name, total, pref 
) AS t2 
GROUP BY name; 

Here is another way to do it, using the ARRAY constructor and DISTINCT.

WITH t AS (
    SELECT name, pref, t1.count AS total, count(*) AS namepref 
    FROM (
        SELECT name, preferences, count(*) OVER (PARTITION BY name) 
        FROM person 
    ) AS t1 
    CROSS JOIN LATERAL unnest(preferences) AS pref 
    GROUP BY name, total, pref 
) 
SELECT DISTINCT 
    name, 
    ARRAY(SELECT pref FROM t AS t2 WHERE total=namepref AND t.name = t2.name) 
FROM t; 
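
Both queries above count how often each value occurs per name, so a value that is repeated inside a single array gets counted more than once. A minimal sketch of one way to guard against that, assuming the arrays may contain duplicates: de-duplicate each array in the lateral subquery before counting. The inline person CTE, the aliases u and p, and the coalesce to an empty array are additions for this example, not part of the original query.

```sql
-- Inline sample data so the example is self-contained (hypothetical rows,
-- one array deliberately contains 'pizza' twice).
WITH person(name, preferences) AS (
    VALUES ('John', ARRAY['pizza', 'pizza', 'meat']),
           ('John', ARRAY['pizza', 'spaghetti']),
           ('John', ARRAY['meat'])
)
SELECT name,
       -- array_agg with a FILTER that matches nothing yields NULL,
       -- so fall back to an empty array.
       coalesce(array_agg(pref) FILTER (WHERE namepref = total), '{}') AS preferences
FROM (
    SELECT t1.name, p.pref, t1.total, count(*) AS namepref
    FROM (
        -- total = number of rows (arrays) per name
        SELECT name, preferences, count(*) OVER (PARTITION BY name) AS total
        FROM person
    ) AS t1
    -- DISTINCT makes each array contribute a value at most once
    CROSS JOIN LATERAL (
        SELECT DISTINCT u.pref FROM unnest(t1.preferences) AS u(pref)
    ) AS p
    GROUP BY t1.name, t1.total, p.pref
) AS t2
GROUP BY name;
```

For these three rows the query returns John with an empty array. Without the DISTINCT, 'pizza' would be counted three times, match the total of three rows, and be reported even though the third array does not contain it.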
+1

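This will *not* compute the intersection of the arrays, but will produce an array containing all preferences that occur more than once. Try the following three records: ('Paul', ARRAY['pizza', 'meat']), ('Paul', ARRAY['pizza', 'salad']) and ('Paul', ARRAY['salad', 'beer']). The result should be empty, but your query yields {pizza,salad}. –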

+0

@LaurenzAlbe Fixed. –

+0

This will work unless there are arrays that contain the same value more than once. –

2

You can create your own aggregate function:

CREATE OR REPLACE FUNCTION arr_sec_agg_f(anyarray, anyarray) RETURNS anyarray 
    LANGUAGE sql IMMUTABLE AS 
    'SELECT CASE 
       WHEN $1 IS NULL 
       THEN $2 
       WHEN $2 IS NULL 
       THEN $1 
       ELSE array_agg(x) 
      END 
    FROM (SELECT x FROM unnest($1) a(x) 
      INTERSECT 
      SELECT x FROM unnest($2) a(x) 
     ) q'; 

CREATE AGGREGATE arr_sec_agg(anyarray) (
    SFUNC = arr_sec_agg_f(anyarray, anyarray), 
    STYPE = anyarray 
); 

SELECT name, arr_sec_agg(preferences) 
FROM person 
GROUP BY name; 

┌──────┬─────────────┐ 
│ name │ arr_sec_agg │ 
├──────┼─────────────┤ 
│ John │ {pizza}     │ 
│ Bill │             │ 
└──────┴─────────────┘ 
(2 rows) 
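
One caveat with the state function above: array_agg over zero rows yields NULL, and a NULL state is treated as "no state yet", so once an intersection becomes empty, the next array in the group starts the aggregate over. A minimal sketch of a variant, assuming an empty array (rather than NULL) is the wanted result for an empty intersection. The names arr_intersect_f and arr_intersect_agg are made up for this example. The ARRAY(...) constructor returns an empty array when its subquery has no rows, so the state stays non-NULL:

```sql
-- Hypothetical names, chosen for this sketch only.
CREATE OR REPLACE FUNCTION arr_intersect_f(anyarray, anyarray) RETURNS anyarray
    LANGUAGE sql IMMUTABLE AS
    'SELECT CASE
        WHEN $1 IS NULL THEN $2   -- no state yet: start from this array
        WHEN $2 IS NULL THEN $1   -- NULL input: keep the current state
        ELSE ARRAY(SELECT x FROM unnest($1) a(x)
                   INTERSECT
                   SELECT x FROM unnest($2) a(x))
     END';

CREATE AGGREGATE arr_intersect_agg(anyarray) (
    SFUNC = arr_intersect_f,
    STYPE = anyarray
);

-- Three arrays whose intersection is empty (inline sample data).
SELECT name, arr_intersect_agg(preferences)
FROM (VALUES
    ('Bill', ARRAY['lettuce', 'pizza']),
    ('Bill', ARRAY['tomatoes']),
    ('Bill', ARRAY['tomatoes'])
) AS t(name, preferences)
GROUP BY name;
```

With this variant the query returns {} for Bill regardless of the order in which the rows are aggregated. With the original state function, the same rows can produce {tomatoes}, because the NULL left by the second row is replaced by the third array.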
+0

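This is neat, I didn't know this was possible. I will keep looking for a query, because I currently can't change the schema. What should I do in this case? Accept this answer and ask the question again, noting that I can't create my own functions and am therefore looking for a plain query? –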

+0

No, just don't accept my answer. –

1

If writing a custom aggregate (like the one @LaurenzAlbe provided) is not an option for you, you can usually express the same logic in a recursive CTE:

with recursive cte(name, pref_intersect, pref_prev, iteration) as (
    select name, 
      min(preferences), 
      min(preferences), 
      0 
    from  your_table 
    group by name 
    union all 
    select name, 
      array(select e from unnest(pref_intersect) e 
        intersect 
        select e from unnest(pref_next) e), 
      pref_next, 
      iteration + 1 
    from  cte, 
    lateral (select your_table.preferences pref_next 
       from  your_table 
       where your_table.name  = cte.name 
       and  your_table.preferences > cte.pref_prev 
       order by your_table.preferences 
       limit 1) n 
) 
select distinct on (name) name, pref_intersect 
from  cte 
order by name, iteration desc 

http://rextester.com/ZQMGW66052

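The main idea here is to find an ordering in which you can "walk" through your rows. I used the natural ordering of the preferences array (since not many columns were shown). Ideally, this ordering should be over (a) unique field(s) (preferably the primary key), but here, because duplicates in the preferences column do not affect the result of the intersection, it is good enough.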
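
As a sketch of the same walk over a unique key, assume your_table has an integer primary key column id. That column is hypothetical (the question's table does not show one) and is supplied inline here so the example is self-contained. Each step intersects the running result with the row that has the next higher id for the same name:

```sql
WITH RECURSIVE your_table(id, name, preferences) AS (
    -- Inline sample data; id is an assumed primary key.
    VALUES (1, 'John', ARRAY['pizza', 'meat']),
           (2, 'John', ARRAY['pizza', 'spaghetti']),
           (3, 'Bill', ARRAY['lettuce', 'pizza']),
           (4, 'Bill', ARRAY['tomatoes'])
),
cte(name, pref_intersect, id_prev) AS (
    -- Start each name at its row with the lowest id.
    (SELECT DISTINCT ON (name) name, preferences, id
     FROM your_table
     ORDER BY name, id)
    UNION ALL
    -- Intersect the running result with the next row of the same name.
    SELECT cte.name,
           ARRAY(SELECT e FROM unnest(cte.pref_intersect) AS e
                 INTERSECT
                 SELECT e FROM unnest(n.pref_next) AS e),
           n.id_next
    FROM cte
    CROSS JOIN LATERAL (
        SELECT t.preferences AS pref_next, t.id AS id_next
        FROM your_table AS t
        WHERE t.name = cte.name
          AND t.id > cte.id_prev
        ORDER BY t.id
        LIMIT 1
    ) AS n
)
-- Keep only the last step per name, i.e. the one with the highest id.
SELECT DISTINCT ON (name) name, pref_intersect
FROM cte
ORDER BY name, id_prev DESC;
```

For the sample rows this yields {} for Bill and {pizza} for John. Walking by id also removes the need for a separate iteration counter, since the last visited id already identifies the final step for each name.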