2015-09-07

I am using Flume to process log lines into HDFS and to index them into ElasticSearch with the ElasticSearchSink. The Flume ElasticSearchSink does not consume all of the messages.

Here is my configuration:

agent.channels.memory-channel.type = memory 

agent.sources.tail-source.type = exec 
agent.sources.tail-source.command = tail -4000 /home/cto/hs_err_pid11679.log 
agent.sources.tail-source.channels = memory-channel 

agent.sinks.log-sink.channel = memory-channel 
agent.sinks.log-sink.type = logger 

#####INTERCEPTORS 

agent.sources.tail-source.interceptors = timestampInterceptor 
agent.sources.tail-source.interceptors.timestampInterceptor.type = org.apache.flume.interceptor.TimestampInterceptor$Builder 

####SINK 
# Setting the sink to HDFS 
agent.sinks.hdfs-sink.channel = memory-channel 
agent.sinks.hdfs-sink.type = hdfs 
agent.sinks.hdfs-sink.hdfs.path = hdfs://localhost:8020/data/flume/%y-%m-%d/ 
agent.sinks.hdfs-sink.hdfs.fileType = DataStream 
agent.sinks.hdfs-sink.hdfs.inUsePrefix = . 
agent.sinks.hdfs-sink.hdfs.rollCount = 0 
agent.sinks.hdfs-sink.hdfs.rollInterval = 0 
agent.sinks.hdfs-sink.hdfs.rollSize = 10000000 
agent.sinks.hdfs-sink.hdfs.idleTimeout = 10 
agent.sinks.hdfs-sink.hdfs.writeFormat = Text 

agent.sinks.elastic-sink.channel = memory-channel 
agent.sinks.elastic-sink.type = org.apache.flume.sink.elasticsearch.ElasticSearchSink 
agent.sinks.elastic-sink.hostNames = 127.0.0.1:9300 
agent.sinks.elastic-sink.indexName = flume_index 
agent.sinks.elastic-sink.indexType = logs_type 
agent.sinks.elastic-sink.clusterName = elasticsearch 
agent.sinks.elastic-sink.batchSize = 500 
agent.sinks.elastic-sink.ttl = 5d 
agent.sinks.elastic-sink.serializer = org.apache.flume.sink.elasticsearch.ElasticSearchDynamicSerializer 


# Finally, activate. 
agent.channels = memory-channel 
agent.sources = tail-source 
agent.sinks = log-sink hdfs-sink elastic-sink 

The problem is that I see only 1-2 messages in ElasticSearch/Kibana, while the HDFS files contain a large number of messages.

Any idea what I am missing here?


Not sure what the problem is, but I wonder whether it could be caused by using a single channel to send events to multiple sinks? [See here](https://mail-archives.apache.org/mod_mbox/flume-user/201306.mbox/%[email protected]%3E). I also learned that `agent.sinks.elastic-sink.serializer = org.apache.flume.sink.elasticsearch.ElasticSearchDynamicSerializer` should be removed, otherwise the timestamp field is created with the wrong type. –
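If the single shared channel is indeed the issue, the usual fix is to give each sink its own channel, since multiple sinks draining one channel split the events between them rather than each receiving a copy, whereas a source with the (default) replicating selector copies every event to all of its channels. A sketch of that layout, with illustrative channel names of my own:

```
# Hypothetical per-sink-channel layout (channel names are mine)
agent.channels = hdfs-channel elastic-channel log-channel
agent.channels.hdfs-channel.type = memory
agent.channels.elastic-channel.type = memory
agent.channels.log-channel.type = memory

# The source's replicating selector (the default) copies each
# event into every listed channel
agent.sources.tail-source.channels = hdfs-channel elastic-channel log-channel

# Each sink drains its own dedicated channel
agent.sinks.hdfs-sink.channel = hdfs-channel
agent.sinks.elastic-sink.channel = elastic-channel
agent.sinks.log-sink.channel = log-channel
```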

Answer


The problem is related to a bug in the serializer. If we drop the line:

agent.sinks.elastic-sink.serializer = org.apache.flume.sink.elasticsearch.ElasticSearchDynamicSerializer 

the messages are consumed without any problem. The issue lies in how the @timestamp field is created when the serializer is used.
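For clarity, the sink section from the original configuration then reduces to the following (unchanged apart from the removed serializer line, so the sink falls back to its default serializer):

```
agent.sinks.elastic-sink.channel = memory-channel
agent.sinks.elastic-sink.type = org.apache.flume.sink.elasticsearch.ElasticSearchSink
agent.sinks.elastic-sink.hostNames = 127.0.0.1:9300
agent.sinks.elastic-sink.indexName = flume_index
agent.sinks.elastic-sink.indexType = logs_type
agent.sinks.elastic-sink.clusterName = elasticsearch
agent.sinks.elastic-sink.batchSize = 500
agent.sinks.elastic-sink.ttl = 5d
```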