2012-08-24 96 views
2

PostgreSQL: combining DISTINCT / GROUP BY with ORDER BY random()

I have a table like this:

id| page | text 
------------------------ 
1 | page1 | Hello World 
2 | page1 | Foo Bar 
3 | page2 | Baz Baz 
3 | page2 | Some Text 
4 | page3 | Some Other Text 

I want to select 2 random entries, but each page may appear only once in the result.

 
SELECT * FROM mydata ORDER BY RANDOM() LIMIT 2;

But can I combine this with DISTINCT or GROUP BY?

+0

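Do you want 2 random pages drawn from the base table (which gives pages with many entries a better chance), or 2 random pages where every page has the same chance? –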

+0

[http://stackoverflow.com/questions/12007297/postgresql-select-with-unique-value/12012445](http://stackoverflow.com/questions/12007297/postgresql-select-with-unique-value/12012445)? – aymeric

Answers

2

Something like:

select id, page, text 
from (
    select id, page, text, 
     row_number() over (partition by page order by random()) as rn 
    from mydata 
) t -- alias for the subquery (required before PostgreSQL 16)
where rn <= 2 
+0

I think the OP is looking for two rows *in total* from the base table. Your query retrieves two random rows *per "page"*. –

+0

@ErwinBrandstetter: That was my understanding (retrieve two rows per page) –

1

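If you want:
... two rows in total from the base table
... and to give every page an equal chance of appearing in the sample, no matter how many entries it has in the table: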

SELECT * 
FROM (
    SELECT DISTINCT ON (page) * 
    FROM mydata 
    ORDER BY page, random() -- pick one random entry per page 
    ) x 
ORDER BY random() -- pick two random pages 
LIMIT 2; 

Or, with a window function:

WITH x AS (
    SELECT *, row_number() OVER (PARTITION BY page ORDER BY random()) AS rn 
    FROM mydata 
    ) 
SELECT id, page, text 
FROM x 
WHERE rn = 1 
ORDER BY random() 
LIMIT 2; 

You will have to test which one is faster.
If you are dealing with a big table and need fast performance, you can do better. Here is one way how.
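The two-step logic shared by both queries (one random row per page, then two random pages) can also be sketched outside the database. The following is a minimal Python emulation of that logic, not the SQL itself. The row keys `r1` to `r5` and the function name `sample_pages` are made up for illustration, because the sample table repeats id 3.

```python
import random
from collections import defaultdict

# Sample rows from the question as (row_key, page, text).
# row_key is a hypothetical label; the original table repeats id 3.
rows = [
    ("r1", "page1", "Hello World"),
    ("r2", "page1", "Foo Bar"),
    ("r3", "page2", "Baz Baz"),
    ("r4", "page2", "Some Text"),
    ("r5", "page3", "Some Other Text"),
]


def sample_pages(rows, k=2, rng=random):
    """Pick one random row per page, then k of those rows at random."""
    by_page = defaultdict(list)
    for row in rows:
        by_page[row[1]].append(row)
    # Step 1: one random entry per page (the inner query).
    one_per_page = [rng.choice(group) for group in by_page.values()]
    # Step 2: k random pages (the outer ORDER BY random() LIMIT k).
    return rng.sample(one_per_page, min(k, len(one_per_page)))


picked = sample_pages(rows)
print(picked)
```

Every page enters step 2 with exactly one row, so the number of entries a page has does not affect its chance of being picked.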


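If, on the other hand, you want:
... two rows in total from table mydata
... and to give every entry a nearly equal chance of appearing in the sample, which effectively gives pages with more entries in the table a better chance.
The chances are still not truly equal: by definition, your restriction increases the chances of entries from rare pages.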

WITH x AS (
    SELECT * 
    FROM mydata 
    ORDER BY random() 
    LIMIT 1 
    ) 
SELECT * FROM x 
UNION ALL 
(
SELECT m.* 
FROM mydata m 
    , x 
WHERE m.page <> x.page -- assuming page IS NOT NULL 
ORDER BY random() 
LIMIT 1 
); 

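The parentheses around the second SELECT of the UNION are required to allow an individual ORDER BY and LIMIT.
Tested with PostgreSQL 9.1. Window functions require version 8.4 or later.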
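To see why the chances in the second variant are only nearly equal, the inclusion probability of each row can be computed exactly. This is a small Python sketch of the sampling logic (first pick uniform over all rows, second pick uniform over rows from other pages), not a PostgreSQL feature. The row keys `r1` to `r5` and the function name `inclusion_probabilities` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

# Sample rows from the question as (row_key, page).
# row_key is a hypothetical label; the original table repeats id 3.
rows = [("r1", "page1"), ("r2", "page1"), ("r3", "page2"),
        ("r4", "page2"), ("r5", "page3")]


def inclusion_probabilities(rows):
    """Exact chance of each row appearing in the two-row sample."""
    prob = defaultdict(Fraction)
    n = len(rows)
    for first_key, first_page in rows:
        p_first = Fraction(1, n)      # first pick: uniform over all rows
        prob[first_key] += p_first
        others = [key for key, page in rows if page != first_page]
        for key in others:            # second pick: uniform over other pages
            prob[key] += p_first / len(others)
    return dict(prob)


probs = inclusion_probabilities(rows)
for key, page in rows:
    print(key, page, probs[key])
```

With this data, each row on page1 and page2 comes out at 23/60, while the single row on page3 comes out at 7/15 (that is, 28/60), which is the small advantage for entries on rare pages mentioned above.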

1

Same as Erwin's answer, just a bit more structured: http://www.sqlfiddle.com/#!1/d3e83/6

with first_random as 
(
    select * from tbl order by random() limit 1 
) 
, second_random as 
(
    select * 
    from tbl 
    where page <> (select page from first_random) 
    order by random() limit 1 
) 
select * from first_random 
union 
select * from second_random; 

Likewise, the same as a_horse_with_no_name's answer, except that this one is correct: http://www.sqlfiddle.com/#!1/d3e83/12

select id, page, text, rn 
from (
    select id, page, text, 
     row_number() over (partition by page order by random()) as rn 
    from tbl 
) x 
where rn = 1 
order by random() 
limit 2; 

Choose the latter; it has a simpler execution plan.

0

This might work:

SELECT * FROM 
    (SELECT * FROM mydata GROUP BY page) t 
ORDER BY RANDOM() LIMIT 2 
+0

No, that way I will always get the text of the first occurrence in my main table. – Alex