2016-05-31 95 views

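Importing a Google API JSON file into Elasticsearch

I am completely new to the ELK stack, and to ES in particular. I have a JSON file obtained from the Google Admin SDK API, and I would like to import it into Elasticsearch.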

So far, this is the JSON structure of my data:

{
  "kind": "reports#activities",
  "nextPageToken": string,
  "items": [
    {
      "kind": "audit#activity",
      "id": {
        "time": datetime,
        "uniqueQualifier": long,
        "applicationName": string,
        "customerId": string
      },
      "actor": {
        "callerType": string,
        "email": string,
        "profileId": long,
        "key": string
      },
      "ownerDomain": string,
      "ipAddress": string,
      "events": [
        {
          "type": string,
          "name": string,
          "parameters": [
            {
              "name": string,
              "value": string,
              "intValue": long,
              "boolValue": boolean
            }
          ]
        }
      ]
    }
  ]
}

So I decided to start by uploading the JSON file to ES with this command:

curl -s -XPOST 'localhost:9200/_bulk' --data-binary @documents.json 

but I got an error:

{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Malformed action/metadata line [1], expected START_OBJECT or END_OBJECT but found [START_ARRAY]"}],"type":"illegal_argument_exception","reason":"Malformed action/metadata line [1], expected START_OBJECT or END_OBJECT but found [START_ARRAY]"},"status":400} 

What should I do?

Thanks for your help!

Answer

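The JSON you posted seems to define the structure of your documents, so you first need to create an index with a mapping that matches that structure. In your case, you could do it like this: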

curl -XPUT localhost:9200/reports -d '{
  "mappings": {
    "activity": {
      "properties": {
        "nextPageToken": {
          "type": "string"
        },
        "items": {
          "properties": {
            "kind": {
              "type": "string"
            },
            "id": {
              "properties": {
                "time": {
                  "type": "date",
                  "format": "date_time"
                },
                "uniqueQualifier": {
                  "type": "long"
                },
                "applicationName": {
                  "type": "string"
                },
                "customerId": {
                  "type": "string"
                }
              }
            },
            "actor": {
              "properties": {
                "callerType": {
                  "type": "string"
                },
                "email": {
                  "type": "string"
                },
                "profileId": {
                  "type": "long"
                },
                "key": {
                  "type": "string"
                }
              }
            },
            "ownerDomain": {
              "type": "string"
            },
            "ipAddress": {
              "type": "string"
            },
            "events": {
              "properties": {
                "type": {
                  "type": "string"
                },
                "name": {
                  "type": "string"
                },
                "parameters": {
                  "properties": {
                    "name": {
                      "type": "string"
                    },
                    "value": {
                      "type": "string"
                    },
                    "intValue": {
                      "type": "long"
                    },
                    "boolValue": {
                      "type": "boolean"
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}'

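Once this is done, you can use a bulk call to index your reports#activities documents that follow the structure above. The syntax of a bulk call is precisely defined in the Elasticsearch bulk API documentation: you need a command line (what to do), followed on the next line by the document source (what to index), and the source must not contain any newlines!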
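
The two-line rule can be checked before sending anything to ES. The following is a minimal sketch in Python, not part of Elasticsearch; the function name `find_bulk_problem` and the list of action names are assumptions made for illustration. It reports the first line that breaks the "action line, then one-line source" pattern, which is what happens when a pretty-printed file such as the original documents.json is posted to `_bulk`.

```python
import json

# Action names assumed for a bulk action line; "delete" takes no source line.
ACTIONS = {"index", "create", "update", "delete"}


def find_bulk_problem(text):
    """Return a description of the first problem in a bulk body, or None."""
    if not text.endswith("\n"):
        return "body must end with a newline"
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        # Every entry starts with an action line: one JSON object, one key.
        try:
            action = json.loads(lines[i])
        except ValueError:
            return f"line {i + 1}: action line is not valid JSON"
        if not (isinstance(action, dict) and len(action) == 1
                and set(action) <= ACTIONS):
            return f"line {i + 1}: expected a single action object"
        i += 1
        if "delete" in action:
            continue
        # All other actions are followed by the source on exactly one line.
        if i >= len(lines):
            return f"line {i + 1}: missing document source"
        try:
            source = json.loads(lines[i])
        except ValueError:
            return f"line {i + 1}: source is not valid one-line JSON"
        if not isinstance(source, dict):
            return f"line {i + 1}: source must be a JSON object"
        i += 1
    return None
```

Calling `find_bulk_problem(open("documents.json").read())` on a pretty-printed file would flag line 1, since a lone `{` is not a complete action object.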

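So you need to reformat your documents.json file like this (make sure to add a newline after the second line). Also note that I have filled in some dummy data to illustrate the process: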

{"index": {"_index": "reports", "_type": "activity"}} 
{"kind":"reports#activities","nextPageToken":"string","items":[{"kind":"audit#activity","id":{"time":"2016-05-31T00:00:00.000Z","uniqueQualifier":1,"applicationName":"string","customerId":"string"},"actor":{"callerType":"string","email":"string","profileId":1,"key":"string"},"ownerDomain":"string","ipAddress":"string","events":[{"type":"string","name":"string","parameters":[{"name":"string","value":"string","intValue":1,"boolValue":true}]}]}]} 

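Thanks for the tip, Val! My JSON data actually contains arrays (items[], events[] and parameters[]), so I slightly edited the index creation code by replacing the curly braces with square brackets, and it works fine! – Felz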

No, you shouldn't do that; ES will create those arrays for you. See [this](https://www.elastic.co/guide/en/elasticsearch/reference/current/array.html) – Val
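
To see why the mapping needs no array syntax: as the linked page describes, any field can hold zero or more values, and an array of objects ends up as multi-value fields under the same dotted paths. The Python sketch below only imitates that idea and is not how Elasticsearch is implemented; `flatten` is a hypothetical helper written for this illustration.

```python
def flatten(value, prefix="", out=None):
    """Collect leaf values under dotted field paths; arrays add no path level."""
    if out is None:
        out = {}
    if isinstance(value, dict):
        # Objects extend the path: events -> events.name
        for key, child in value.items():
            flatten(child, f"{prefix}.{key}" if prefix else key, out)
    elif isinstance(value, list):
        # Arrays keep the same path and contribute more values to it.
        for child in value:
            flatten(child, prefix, out)
    else:
        out.setdefault(prefix, []).append(value)
    return out


single = {"events": {"name": "login"}}
many = {"events": [{"name": "login"}, {"name": "logout"}]}
print(flatten(single))
print(flatten(many))
```

Both documents produce the same field path, `events.name`, which is why one `properties` definition covers a single object and an array of objects alike.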