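How to store relational data in Elasticsearch

What are the options for storing relational data in Elasticsearch? I know of the following approaches:

Nested objects: I don't want to store the data in nested format, because I want to update one document without changing the others. If I use nested objects, the child data is repeated in every parent document that contains it.
Parent-child: I don't want to store the data in a single index, but parent-child requires the parent and child documents to live in one index (as different types). I know this restriction will be removed in a future release, as described in issue https://github.com/elastic/elasticsearch/issues/15613, but I need a solution that works with version 5.5.

Are there any other approaches besides these?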

2017-09-07 mkalsi

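There are two more approaches: denormalization, and running multiple queries to perform the join yourself (application-side joins).

Denormalization takes more space and increases write time, but you only need to run a single query to retrieve the data, so reads get faster. Since you don't want to store the data in a single index, application-side joins might be what solves your problem.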
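
As a rough illustration of an application-side join, here is a minimal Python sketch. It assumes the parents and children have already been fetched by two separate queries against two indices and are available as lists of dicts; the field names (`parent_id`, `child_id`, `childs`) and the sample values are only illustrative assumptions, and no Elasticsearch client call is shown.

```python
from collections import defaultdict


def join_parents_children(parent_hits, child_hits):
    """Attach each child to its parent via the child's parent_id field.

    Both arguments stand in for the _source documents returned by two
    separate queries (hypothetical data, not real Elasticsearch responses).
    """
    # Group the children by the parent they point to.
    children_by_parent = defaultdict(list)
    for child in child_hits:
        children_by_parent[child["parent_id"]].append(child)

    # Build new parent dicts so the input lists are left unmodified.
    return [
        {**parent, "childs": children_by_parent.get(parent["parent_id"], [])}
        for parent in parent_hits
    ]


parents = [
    {"parent_id": 1, "parent_name": "ABC"},
    {"parent_id": 2, "parent_name": "DEF"},
]
children = [
    {"child_id": 1, "parent_id": 1, "child_name": "PQR"},
    {"child_id": 2, "parent_id": 1, "child_name": "XYZ"},
]
joined = join_parents_children(parents, children)
```

The trade-off is the one described above: nothing is duplicated at write time, but every read costs at least two queries plus the merge in your own code.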

2017-09-08 00:04:45 rndus2r

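Nested objects are a perfectly good approach for this. If you update the child objects correctly, they will not be repeated in the parent document. I use the same approach for one of my use cases, where I need to maintain relational data with a Master-Child one-to-many relationship. I wrote a Painless script for the Update API that adds new nested child objects and updates existing ones inside the parent document without creating duplicate or repeated entries.

Updated answer:

Below is the structure of a parent document that embeds its children as a nested-type field, "childs".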

{
    "parent_id": 1,
    "parent_name": "ABC",
    "parent_number": 123,
    "parent_addr": "123 6th St. Melbourne, FL 32904",
    "childs": [
        {
            "child_id": 1,
            "child_name": "PQR",
            "child_number": 456,
            "child_age": 10
        },
        {
            "child_id": 2,
            "child_name": "XYZ",
            "child_number": 789,
            "child_age": 12
        },
        {
            "child_id": 3,
            "child_name": "QWE",
            "child_number": 234,
            "child_age": 16
        }
    ]
}

The mapping would be as follows:

PUT parent
{
    "mappings": {
        "parent": {
            "properties": {
                "parent_id": {
                    "type": "long"
                },
                "parent_name": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "parent_number": {
                    "type": "long"
                },
                "parent_addr": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "childs": {
                    "type": "nested",
                    "properties": {
                        "child_id": {
                            "type": "long"
                        },
                        "child_name": {
                            "type": "text",
                            "fields": {
                                "keyword": {
                                    "type": "keyword",
                                    "ignore_above": 256
                                }
                            }
                        },
                        "child_number": {
                            "type": "long"
                        },
                        "child_age": {
                            "type": "long"
                        }
                    }
                }
            }
        }
    }
}

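In a relational database management system, these two entities (parent, child) are two different tables with a one-to-many relationship Parent -> Child. The Parent's id is a foreign key in the Child's rows. (An id is mandatory in both tables.)

Now in Elasticsearch, to index the parent document we need an id to index it under; in this case that is parent_id. Request to index the parent document (parent_id is the id I am talking about, and the document is indexed with document id (_id) = 1):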

POST parent/parent/1 
{ 
    "parent_id": 1, 
    "parent_name": "ABC", 
    "parent_number": 123, 
    "parent_addr": "123 6th St. Melbourne, FL 32904" 
}

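Now, add the child (or children) to the parent. For this you need a child document that carries a child id, and you need the parent id: adding a child is not possible without the parent's id. Below is the update request that adds a new child or updates an already existing one.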

POST parent/parent/1/_update
{
    "script": {
        "lang": "painless",
        "inline": "if (!ctx._source.containsKey(\"childs\")) {
                ctx._source.childs = [];
                ctx._source.childs.add(params.child);
            } else {
                int flag = 0;
                for (int i = 0; i < ctx._source.childs.size(); i++) {
                    if (ctx._source.childs[i].child_id == params.child.child_id) {
                        ctx._source.childs[i] = params.child;
                        flag++;
                    }
                }
                if (flag == 0) {
                    ctx._source.childs.add(params.child);
                }
            }",
        "params": {
            "child": {
                "child_id": 1,
                "child_name": "PQR",
                "child_number": 456,
                "child_age": 10
            }
        }
    }
}
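
To make the add-or-replace behavior of the script easier to check, here is a minimal Python sketch that mirrors its logic on a plain dict standing in for `ctx._source`. The function name `upsert_child` is hypothetical, and this runs outside Elasticsearch; it only illustrates the logic and is not a replacement for the Painless script.

```python
import copy


def upsert_child(parent_source, child):
    """Return a copy of the parent with `child` added or replaced by child_id."""
    source = copy.deepcopy(parent_source)
    # Create the "childs" list if the parent does not have one yet.
    childs = source.setdefault("childs", [])

    replaced = False
    for i, existing in enumerate(childs):
        if existing["child_id"] == child["child_id"]:
            # Same child_id: overwrite in place instead of appending a duplicate.
            childs[i] = child
            replaced = True

    if not replaced:
        childs.append(child)
    return source


doc = {"parent_id": 1, "parent_name": "ABC"}
doc = upsert_child(doc, {"child_id": 1, "child_name": "PQR", "child_number": 456, "child_age": 10})
# Sending the same child_id again replaces the entry rather than duplicating it.
doc = upsert_child(doc, {"child_id": 1, "child_name": "PQR", "child_number": 456, "child_age": 11})
doc = upsert_child(doc, {"child_id": 2, "child_name": "XYZ", "child_number": 789, "child_age": 12})
```

After these three calls the parent holds two children, and the first one carries the updated age.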

Give it a shot. Cheers!

Let me know if you need anything else.

2017-09-08 21:20:25

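Thanks Hatim, could you share your script for storing nested objects without duplication? – mkalsi

@mkalsi I have updated my answer with an example and the script. Please check. –

If our data set is very large (we plan to index 20 million documents), how well will the inline script perform? – mkalsi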
