Is there a way to search for partial matches in a multivalued field in PostgreSQL?

I have a table that looks roughly like this:

CREATE TABLE myTable (
    family text, 
    names text[] 
)

I can search it like this:

SELECT family 
FROM myTable where names @> array['B0WP04'];

But I would like to do this:

SELECT family 
FROM myTable where names @> array['%P0%'];

Is this possible?


2015-05-07 pidupuis

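You can use the parray_gin extension: https://github.com/theirix/parray_gin

The extension is said to work only up to PostgreSQL 9.2, but I just installed and tested it on 9.3 and it works well.

Here is how to install it on Ubuntu-like systems :)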

# install postgresql extension network client and postgresql extension build tools 
sudo apt-get install python-setuptools 
easy_install pgxnclient 
sudo apt-get install postgresql-server-dev-9.3 

# get the extension 
pgxn install parray_gin

With it installed, here is my test:

-- as a superuser: add the extension to the current database 
CREATE EXTENSION parray_gin; 

-- as a normal user 
CREATE TABLE test (
    id SERIAL PRIMARY KEY, 
    names TEXT [] 
); 

INSERT INTO test (names) VALUES 
    (ARRAY ['nam1', 'nam2']), 
    (ARRAY ['2nam1', '2nam2']), 
    (ARRAY ['Hello', 'Woooorld']), 
    (ARRAY ['Woooorld', 'Hello']), 
    (ARRAY [] :: TEXT []), 
    (NULL), 
    (ARRAY ['Hello', 'is', 'it', 'me', 'you''re', 'looking', 'for', '?']); 

-- double up the rows in test table, with many rows, the index is used 
INSERT INTO test (names) (SELECT names FROM test); 

SELECT count(*) from test; /* 
count 
-------- 
997376 
(1 row) 
*/

Now that we have some test data, it's magic time:

-- http://pgxn.org/dist/parray_gin/doc/parray_gin.html 
CREATE INDEX names_idx ON test USING GIN (names parray_gin_ops); 

--- now it's time for some tests 
EXPLAIN ANALYZE SELECT * FROM test WHERE names @> ARRAY ['is']; /* 

-- WITHOUT INDEX ON NAMES 
               QUERY PLAN             
------------------------------------------------------------------------------------------------------------ 
Seq Scan on test (cost=0.00..25667.00 rows=1138 width=49) (actual time=0.021..508.599 rows=51200 loops=1) 
    Filter: (names @> '{is}'::text[]) 
    Rows Removed by Filter: 946176 
Total runtime: 653.879 ms 
(4 rows) 

-- WITH INDEX ON NAMES 
                 QUERY PLAN               
---------------------------------------------------------------------------------------------------------------------------- 
Bitmap Heap Scan on test (cost=455.73..3463.37 rows=997 width=49) (actual time=14.327..240.365 rows=51200 loops=1) 
    Recheck Cond: (names @> '{is}'::text[]) 
    -> Bitmap Index Scan on names_idx (cost=0.00..455.48 rows=997 width=0) (actual time=12.241..12.241 rows=51200 loops=1) 
     Index Cond: (names @> '{is}'::text[]) 
Total runtime: 341.750 ms 
(5 rows) 

*/ 

EXPLAIN ANALYZE SELECT * FROM test WHERE names @@> ARRAY ['%nam%']; /* 

-- WITHOUT INDEX ON NAMES 
               QUERY PLAN             
------------------------------------------------------------------------------------------------------------ 
Seq Scan on test (cost=0.00..23914.20 rows=997 width=49) (actual time=0.023..590.093 rows=102400 loops=1) 
    Filter: (names @@> '{%nam%}'::text[]) 
    Rows Removed by Filter: 894976 
Total runtime: 796.636 ms 
(4 rows) 

-- WITH INDEX ON NAMES 
                 QUERY PLAN               
----------------------------------------------------------------------------------------------------------------------------- 
Bitmap Heap Scan on test (cost=159.73..3167.37 rows=997 width=49) (actual time=20.164..293.942 rows=102400 loops=1) 
    Recheck Cond: (names @@> '{%nam%}'::text[]) 
    -> Bitmap Index Scan on names_idx (cost=0.00..159.48 rows=997 width=0) (actual time=18.539..18.539 rows=102400 loops=1) 
     Index Cond: (names @@> '{%nam%}'::text[]) 
Total runtime: 490.060 ms 
(5 rows) 

*/

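In the end, performance depends entirely on your data and queries, but in my dummy example this extension is quite effective, roughly halving the query time.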


2015-05-07 10:10:14

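This works really well! However, sometimes queries that use the index return no results. Basically, a query with '%a%', which usually returns the whole table, has returned 0 rows since I created the index. – pidupuis

In PostgreSQL 9.3 you can do: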

select family 
from myTable 
join lateral unnest(mytable.names) as un(name) on true 
where un.name like '%P0%';

Keep in mind that this can produce duplicates, so perhaps you want to add DISTINCT.
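A minimal sketch of that deduplicated form, assuming the myTable definition from the question:

```sql
-- Same lateral unnest as above, but each family value is returned once
-- even when several elements of names match the pattern.
SELECT DISTINCT family
FROM myTable
JOIN LATERAL unnest(myTable.names) AS un(name) ON true
WHERE un.name LIKE '%P0%';
```

Note that DISTINCT also collapses separate rows that happen to share the same family value, which may or may not be what you want.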

For earlier versions:

select family 
from myTable where 
exists (select 1 from unnest(names) as un(name) where un.name like '%P0%');


2015-05-07 09:46:49

Adding a bit to Radek's answer, I tried

select family 
from myTable where 
exists (select 1 from unnest(names) as name where name like '%P0%');

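and it works as well. I searched the PostgreSQL documentation for the un() function but couldn't find anything.

I'm not saying it does nothing; I'm just curious what the un() function is supposed to do (and happy to have my problem solved).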


2016-07-26 17:01:30

'as un(name)' is just a fancy alias ;). More info: "Another form of table aliasing gives temporary names to the columns of the table, as well as the table itself", see https://www.postgresql.org/docs/current/static/queries-table-expressions.html
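To illustrate that alias form, here is a small self-contained sketch that needs no table. The alias t, the column name val, and the second array element are arbitrary names chosen for this example:

```sql
-- "t" names the set returned by unnest(), "val" names its single column.
SELECT t.val
FROM unnest(ARRAY['B0WP04', 'A1XY99']) AS t(val)
WHERE t.val LIKE '%P0%';
```

Only 'B0WP04' contains the substring P0, so that is the only row returned. Replacing t(val) with un(name) gives exactly the construct used in the answer above.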
