Using JSONB for Postgres columns other than primary and foreign keys

I just discovered PostgreSQL's JSONB and am wondering what could go wrong if I used it for all of my tables' columns.

That is to say, all of my tables would have primary and foreign key columns, plus a "field" column of type JSONB for any other data.

Apart from the extra space taken up by JSONB's overhead, and losing typing on the "columns", what would I be missing?

Source

2017-02-15 dynamic_cast

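It turns out you're onto something here.

The major points of using a relational database are:

Well-defined relationships.
A well-defined and detailed schema.
High performance for large data sets.

You get to keep the relationships. But you lose the schema and a lot of the performance. The schema is more than just data validation. Losing it also means you can't declare triggers or constraints on individual fields the way you can on real columns.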
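To make the lost schema concrete, here is a minimal sketch. The table names `reviews_typed_demo` and `reviews_jb_demo` are hypothetical and not part of the test schema used later, and the 1 to 5 rating range is an assumption made for illustration.

```sql
-- Hypothetical demo tables, separate from the benchmark schema.
create table reviews_typed_demo (
    -- The type, NOT NULL and CHECK are all declared on the column itself.
    rating integer not null check (rating between 1 and 5)
);

create table reviews_jb_demo (
    -- The column only guarantees that the value is valid JSON.
    fields jsonb
);

-- Rejected with a check-constraint violation (this statement is expected to fail).
insert into reviews_typed_demo (rating) values (11);

-- Accepted: nothing in the table definition says what "rating" must look like.
insert into reviews_jb_demo (fields) values ('{"rating": "eleven"}');
```

The second insert is the sort of row a typed column would have stopped at the door.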

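As for performance... you'll notice that most tests of JSONB performance compare it against other similar data types. They are never against normal SQL tables. That's because, while JSONB is astonishingly efficient, it is not nearly as efficient as regular SQL. So let's test it. It turns out you're onto something here.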

Using the dataset from this JSONB performance presentation, I created a proper SQL schema...

create table customers (
    id text primary key 
); 

create table products (
    id text primary key, 
    title text, 
    sales_rank integer, 
    "group" text, 
    category text, 
    subcategory text, 
    similar_ids text[] 
); 

create table reviews (
    customer_id text references customers(id), 
    product_id text references products(id), 
    "date" timestamp, 
    rating integer, 
    votes integer, 
    helpful_votes integer 
);

...another that uses SQL relationships but JSONB for the data...

create table customers (
    id text primary key 
); 

create table products_jb (
    id text primary key, 
    fields jsonb 
); 

create table reviews_jb (
    customer_id text references customers(id), 
    product_id text references products_jb(id), 
    fields jsonb 
);

...and a single full JSONB table.

create table reviews_jsonb (
    review jsonb 
);

Then I imported the same data into each set of tables using a little script: 589,859 reviews, 93,319 products, and 98,761 customers.

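Let's try the same query as the JSONB performance article: getting the average review rating for a product category. First, without indexes.

Traditional SQL: 138 ms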

test=> select round(avg(r.rating), 2) 
from reviews r 
join products p on p.id = r.product_id 
where p.category = 'Home & Garden'; 
round 
------- 
    4.59 
(1 row) 

Time: 138.631 ms

Full JSONB: 380 ms

test=> select round(avg((review#>>'{review,rating}')::numeric),2)
from reviews_jsonb
where review #>>'{product,category}' = 'Home & Garden';
round 
------- 
    4.59 
(1 row) 

Time: 380.697 ms
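The shape of the documents in reviews_jsonb is never shown, only the two paths the query reads. As a rough illustration of how `#>>` walks such a path, here is a sketch over a literal document; the document itself is made up and contains just those two paths.

```sql
-- A made-up document containing only the two paths the query reads.
-- #>> follows the path given as a text array and returns the value as text.
select doc #>> '{review,rating}'    as rating,
       doc #>> '{product,category}' as category
from (
    values ('{"product": {"category": "Home & Garden"}, "review": {"rating": 5}}'::jsonb)
) as t(doc);
```

Because the rating comes back as text, the benchmark query has to cast it to numeric before averaging, which is work the typed integer column never needs.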

Hybrid JSONB: 190 ms

test=> select round(avg((r.fields#>>'{rating}')::numeric),2) 
from reviews_jb r 
join products_jb p on p.id = r.product_id 
where p.fields#>>'{category}' = 'Home & Garden'; 
round 
------- 
    4.59 
(1 row) 

Time: 192.333 ms

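That honestly went better than I thought. The hybrid approach is twice as fast as full JSONB, but roughly 40% slower than normal SQL (192 ms versus 138 ms). Now how about with indexes?

Traditional SQL: 130 ms (+500 ms for the index)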

test=> create index products_category on products(category); 
CREATE INDEX 
Time: 491.969 ms 

test=> select round(avg(r.rating), 2) 
from reviews r 
join products p on p.id = r.product_id 
where p.category = 'Home & Garden'; 
round 
------- 
    4.59 
(1 row) 

Time: 128.212 ms

Full JSONB: 360 ms (+25,000 ms for the index)

test=> create index on reviews_jsonb using gin(review); 
CREATE INDEX 
Time: 25253.348 ms 
test=> select round(avg((review#>>'{review,rating}')::numeric),2) 
from reviews_jsonb 
where review #>>'{product,category}' = 'Home & Garden'; 
round 
------- 
    4.59 
(1 row) 

Time: 363.222 ms

Hybrid JSONB: 185 ms (+6,900 ms for the indexes)

test=> create index on products_jb using gin(fields); 
CREATE INDEX 
Time: 3654.894 ms 
test=> create index on reviews_jb using gin(fields); 
CREATE INDEX 
Time: 3237.534 ms 
test=> select round(avg((r.fields#>>'{rating}')::numeric),2) 
from reviews_jb r 
join products_jb p on p.id = r.product_id 
where p.fields#>>'{category}' = 'Home & Garden'; 
round 
------- 
    4.59 
(1 row) 

Time: 183.679 ms

It turns out this is a query that indexes aren't going to help much with.
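One likely reason on the JSONB side, offered as an assumption rather than something measured here: the default GIN operator class for jsonb supports containment and key-existence operators such as `@>` and `?`, while these queries compare the text returned by `#>>` using `=`, which that index cannot serve. Below are two untested variations that give the planner an index it could use. The index name `products_jb_category` is made up for illustration.

```sql
-- Variation 1: phrase the filter as containment, which a GIN index on
-- products_jb(fields) can serve.
explain analyze
select round(avg((r.fields#>>'{rating}')::numeric), 2)
from reviews_jb r
join products_jb p on p.id = r.product_id
where p.fields @> '{"category": "Home & Garden"}';

-- Variation 2: a B-tree index on the extracted value (hypothetical index name).
create index products_jb_category on products_jb ((fields->>'category'));

-- The filter has to use the same expression as the index to match it.
explain analyze
select round(avg((r.fields#>>'{rating}')::numeric), 2)
from reviews_jb r
join products_jb p on p.id = r.product_id
where p.fields->>'category' = 'Home & Garden';
```

The plain B-tree index only moved the traditional SQL query from 138 ms to 128 ms, so most of the time probably goes into the join and the aggregate, and neither variation is guaranteed to change much. The `explain analyze` output will show which plan was actually chosen.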

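That's what I see from playing with the data a bit: hybrid JSONB is always slower than full SQL, but faster than full JSONB. It seems like a good compromise. You get to use traditional foreign keys and joins, but have the flexibility of adding whatever fields you like.

I'd recommend taking the hybrid approach a step further: use SQL columns for the fields you know are going to be there, and have a JSONB column to pick up any additional fields for flexibility.

I encourage you to play around with the test data here and see what the performance is like.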
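A minimal sketch of what that could look like. The table `products_hybrid`, the `extra` column, and the sample row are all hypothetical and were not part of the timings above.

```sql
-- Hypothetical table: known fields as typed columns, everything else in "extra".
create table products_hybrid (
    id text primary key,
    title text not null,
    category text,
    sales_rank integer,
    extra jsonb not null default '{}'
);

-- Made-up sample row: "color" is a field the schema never anticipated.
insert into products_hybrid (id, title, category, extra)
values ('demo-1', 'Garden hose', 'Home & Garden', '{"color": "green"}');

-- Filter on the typed column, read the ad hoc field out of the JSONB.
select id, title, extra->>'color' as color
from products_hybrid
where category = 'Home & Garden';
```

Fields that turn out to be used all the time can later be promoted from `extra` to real columns, which gets their types, constraints, and ordinary indexes back.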

Source

2017-02-15 20:28:45 Schwern
