Query execution time on Couchbase is too long

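I am new to Couchbase and I am running some queries with N1QL, but they take a very long time (9 minutes). My data has 200,000 documents with nested types; in total there are 6,000,000 nested elements spread across those 200,000 documents, so the UNNEST operation is important. A sample of my data is: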

{"p_partkey": 2, "lineorder": [{"customer": [{"c_city": "INDONESIA1"}], "lo_supplycost": 54120, "orderdate": [{"d_weeknuminyear": 19}], "supplier": [{"s_phone": "16-789-973-6601|"}], "commitdate": [{"d_year": 1993}], "lo_tax": 7}, {"customer": [{...

One of the queries is:

SELECT SUM(l.lo_extendedprice*l.lo_discount*0.01) as revenue 
from part p UNNEST p.lineorder l UNNEST l.orderdate o 
where o.d_year=1993 and l.lo_discount between 1 and 3 and l.lo_quantity<25;

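The data has the fields mentioned above, but the query takes 9 minutes to execute. I am running everything on my own computer, so there is only one node. The computer has 16 GB of memory, the cluster RAM quota is 3.2 GB, and there is a single 3 GB bucket. My data totals 2.45 GB. I used the calculation described here: http://docs.couchbase.com/admin/admin/Concepts/bp-sizingGuidelines.html to size my cluster and bucket. Am I doing something wrong, or is this time expected for this amount of data?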

I have now created indexes such as:

CREATE INDEX idx_discount ON part(DISTINCT ARRAY l.lo_discount FOR l IN lineorder END); 

CREATE INDEX idx_quantity ON part(DISTINCT ARRAY l.lo_quantity FOR l IN lineorder END); 

CREATE INDEX idx_year ON part(DISTINCT ARRAY o.d_year FOR o IN (DISTINCT ARRAY l.orderdate FOR l IN lineorder END) END);

But the database does not use them.

An example of a query is:

SELECT SUM(l.lo_extendedprice*l.lo_discount*0.01) as revenue 
from part p UNNEST p.lineorder l UNNEST l.orderdate o 
where o.d_year=1993 and l.lo_discount between 1 and 3 and l.lo_quantity<25;

As another example, I created this index:

CREATE INDEX teste3 ON `part` (DISTINCT ARRAY l.lo_quantity FOR l IN lineorder END);

and ran this query:

select l.lo_quantity from part as p UNNEST p.lineorder l where l.lo_quantity>20 limit 3

Because I had dropped the primary index, it does not execute. It returns the error: "No primary index on keyspace part. Use CREATE PRIMARY INDEX to create one."
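For reference, the statement that the error message points to would look like the sketch below. It assumes the keyspace is named part, as in the queries above. A primary index only lets the query run through a full scan of the keyspace; it is not expected to make the array predicates fast, so it mainly serves as a fallback while the array indexes are being sorted out.

```sql
/* Recreate the primary index on the part keyspace so that
   queries without a usable secondary index can still run. */
CREATE PRIMARY INDEX ON part;
```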

Source

2016-05-11 Raphael

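You can use Couchbase 4.5 (GA coming soon) and array indexing. Array indexes can be used with UNNEST. They let you index the individual elements of an array, including arrays nested inside other arrays.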

You can create the following indexes, and then use EXPLAIN to make sure the plan contains an IndexScan on the index you intended.

CREATE INDEX idx_discount ON part(DISTINCT ARRAY l.lo_discount FOR l IN lineorder END); 

CREATE INDEX idx_quantity ON part(DISTINCT ARRAY l.lo_quantity FOR l IN lineorder END); 

CREATE INDEX idx_year ON part(DISTINCT ARRAY (DISTINCT ARRAY o.d_year FOR o IN l.orderdate END) FOR l IN lineorder END);
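A sketch of that check, assuming the indexes above have been built and the query keeps the variable names l and o used in the index definitions. EXPLAIN returns the plan as JSON; the thing to look for is an IndexScan operator that names one of the idx_ indexes, rather than a PrimaryScan.

```sql
/* Ask the query service for the plan instead of running the query. */
EXPLAIN
SELECT SUM(l.lo_extendedprice * l.lo_discount * 0.01) AS revenue
FROM part p
UNNEST p.lineorder l
UNNEST l.orderdate o
WHERE o.d_year = 1993
  AND l.lo_discount BETWEEN 1 AND 3
  AND l.lo_quantity < 25;
```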

Source

2016-05-11 03:49:45 geraldss

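Hi @geraldss, I am already using 4.5. My intention is to use indexes, because I will be running different queries. Could you tell me whether I have configured Couchbase correctly and, if indexes are not used, whether there is another way to get better performance? Thanks for your help. – Raphael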

Hi @Raphael, even if you have many queries, you need to use indexes. Couchbase lets you create multiple indexes. – geraldss

OK @geraldss, thank you very much. – Raphael

After reading this blog post: http://blog.couchbase.com/2016/may/1.making-most-of-your-arrays..-with-covering-array-indexes-and-more I discovered the problem:

If you create an index like this:

CREATE INDEX iflight_day 
     ON `travel-sample` (DISTINCT ARRAY v.flight FOR v IN schedule END);

you must use the same variable name in the query, in this case 'v':

SELECT v.day from `travel-sample` as t UNNEST t.schedule v where v.flight="LY104";
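To make the point concrete, the two statements below differ only in the UNNEST variable. Based on the behavior described in this post (not verified here against a specific release), the first is expected to be able to use iflight_day and the second is not; running each under EXPLAIN shows which scan is chosen.

```sql
/* Variable v matches the "FOR v IN schedule" binding in iflight_day. */
EXPLAIN SELECT v.day FROM `travel-sample` AS t UNNEST t.schedule AS v WHERE v.flight = "LY104";

/* Variable s does not match the index binding, so, per the
   description above, the array index may not be selected. */
EXPLAIN SELECT s.day FROM `travel-sample` AS t UNNEST t.schedule AS s WHERE s.flight = "LY104";
```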

The same applies to deeper levels of nesting:

CREATE INDEX inested ON `travel-sample` 
(DISTINCT ARRAY (DISTINCT ARRAY y.flight FOR y IN x.special_flights END) FOR x IN schedule END);

In this case you must use 'y' and 'x':

SELECT x.day from `travel-sample` as t UNNEST t.schedule x UNNEST x.special_flights y where y.flight="AI444";

Now everything works fine.

But another problem appeared when I run a query like this:

SELECT * from `travel-sample` as t UNNEST t.schedule x UNNEST x.special_flights y 
where x.day=7 and y.flight="AI444";

together with an index on day alone, created in the same way as the ones above:

CREATE INDEX day 
      ON `travel-sample` (DISTINCT ARRAY y.day FOR y IN schedule END);

It uses only one index, sometimes 'day' and sometimes 'inested'.

Source

2016-08-12 22:09:52 Raphael

The variables must match between the UNNEST and the array index. Try this: SELECT s.day FROM `travel-sample` AS t UNNEST t.schedule AS v WHERE v.flight = "LY146"; – geraldss

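@geraldss, I have just installed the latest version of the Enterprise Edition. My query is like yours; the only difference is v versus s. But it does not use the index iflight_day. It is used only when I query like this: SELECT s.day FROM (SELECT schedule FROM `travel-sample` WHERE ANY v IN schedule SATISFIES v.flight = "LY146" END) AS t UNNEST t.schedule s WHERE s.flight = "LY146"; – Raphael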

Not sure why that is happening. Please try the upcoming 4.5.1. They all work for me. – geraldss
