2015-06-10

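Elasticsearch: fetching search results in batches?

When I query Elasticsearch with the Python Elasticsearch API, I get about 5000 results. Setting the "size" parameter in the search query to a large value (rather than to the actual number of results) leads to the Java OOM error below: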

File "MGDFinder.py", line 114, in <module> 
    res = es.search(index="_all", body=queryMaker(state)) 
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/client/utils.py", line 68, in _wrapped 
    return func(*args, params=params, **kwargs) 
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/client/__init__.py", line 440, in search 
    params=params, body=body) 
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/transport.py", line 276, in perform_request 
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout) 
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/http_urllib3.py", line 55, in perform_request 
    self._raise_error(response.status, raw_data) 
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/base.py", line 97, in _raise_error 
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info) 
elasticsearch.exceptions.TransportError: TransportError(500, u'OutOfMemoryError[Java heap space]') 

I noticed that this happens even when size is set to just 700. I don't want to increase my Java heap size. Is there a way to run my search in batches?

Answer


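I don't think you can batch the request without increasing the Java heap space; the server would still hold all 5000 results and return them.

I think you can use scroll for this request. Scroll retrieves a large number of results efficiently, and it works much like a cursor in a traditional database.

Sample request: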

$ curl -XGET 'localhost:9200/world/test/_search?scroll=1m&pretty' -d '
{
    "size": 50,
    "query": {
        "match_all": {}
    }
}'

Sample response:

{ 
    "_scroll_id" : "cXVlcnlUaGVuRmV0Y2g7NTszNjpXZW9lRnJXSFItT0U2YUtIM1hOa0FBOzM3Oldlb2VGcldIUi1PRTZhS0gzWE5rQUE7Mzg6V2VvZUZyV0hSLU9FNmFLSDNYTmtBQTs0MDpXZW9lRnJXSFItT0U2YUtIM1hOa0FBOzM5Oldlb2VGcldIUi1PRT 
    "took" : 5, 
    "timed_out" : false, 
    "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
    },
    "hits" : {.... 

The response includes a scroll ID, which can be used to fetch the next batch of hits.

Sample scroll request (pass the _scroll_id with -d):

$ curl -XGET 'localhost:9200/_search/scroll?scroll=1m&pretty' -d 'cXVlcnlUaGVuRmV0Y2g7NTszMTpXZW9lRnJXSFItT0U2YUtIM1hOa0FBOzMyOldlb2VGcldIUi1PRTZhS0gzWE5rQUE7MzM6V2VvZUZyV0hSLU9FNmFLSDNYTmtBQTszNDpXZW9lRnJXSFItT0U2YUtIM1hOa0FBOzM1Oldlb2VGcldIUi1PRTZhS0gzWE5rQUE7MDs='
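
Since the question uses the Python client, the same two-step flow (one search that opens the scroll, then repeated scroll calls) might look roughly like the sketch below. It assumes an elasticsearch-py client whose search() and scroll() methods accept the keyword arguments shown, which matches the client versions of that era; newer releases may differ. The name scroll_hits is made up for this example.

```python
def scroll_hits(es, index, body, scroll="1m", size=50):
    """Yield every hit for a query, fetching `size` hits per request.

    `es` is assumed to be an elasticsearch-py client (or anything with
    compatible search() and scroll() methods).
    """
    # The first request opens the scroll context and returns the first batch.
    page = es.search(index=index, body=body, scroll=scroll, size=size)
    while True:
        hits = page["hits"]["hits"]
        if not hits:
            # An empty batch means the scroll is exhausted.
            break
        for hit in hits:
            yield hit
        # Always pass the most recent scroll ID; it may change between calls.
        page = es.scroll(scroll_id=page["_scroll_id"], scroll=scroll)
```

With the objects from the question, usage would be something like `for hit in scroll_hits(es, "_all", queryMaker(state)): ...`. The client library also ships a helpers.scan function that wraps this pattern, which may be more convenient than a hand-written loop.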

Documentation: Scroll