0
我正在使用flume将日志行处理为hdfs,并使用ElasticSearchSink将它们记录到ElasticSearch中。Flume ElasticSearchSink不会消耗所有消息
这里是我的配置:
agent.channels.memory-channel.type = memory
agent.sources.tail-source.type = exec
agent.sources.tail-source.command = tail -4000 /home/cto/hs_err_pid11679.log
agent.sources.tail-source.channels = memory-channel
agent.sinks.log-sink.channel = memory-channel
agent.sinks.log-sink.type = logger
#####INTERCEPTORS
agent.sources.tail-source.interceptors = timestampInterceptor
agent.sources.tail-source.interceptors.timestampInterceptor.type = org.apache.flume.interceptor.TimestampInterceptor$Builder
####SINK
# Setting the sink to HDFS
agent.sinks.hdfs-sink.channel = memory-channel
agent.sinks.hdfs-sink.type = hdfs
agent.sinks.hdfs-sink.hdfs.path = hdfs://localhost:8020/data/flume/%y-%m-%d/
agent.sinks.hdfs-sink.hdfs.fileType = DataStream
agent.sinks.hdfs-sink.hdfs.inUsePrefix =.
agent.sinks.hdfs-sink.hdfs.rollCount = 0
agent.sinks.hdfs-sink.hdfs.rollInterval = 0
agent.sinks.hdfs-sink.hdfs.rollSize = 10000000
agent.sinks.hdfs-sink.hdfs.idleTimeout = 10
agent.sinks.hdfs-sink.hdfs.writeFormat = Text
agent.sinks.elastic-sink.channel = memory-channel
agent.sinks.elastic-sink.type = org.apache.flume.sink.elasticsearch.ElasticSearchSink
agent.sinks.elastic-sink.hostNames = 127.0.0.1:9300
agent.sinks.elastic-sink.indexName = flume_index
agent.sinks.elastic-sink.indexType = logs_type
agent.sinks.elastic-sink.clusterName = elasticsearch
agent.sinks.elastic-sink.batchSize = 500
agent.sinks.elastic-sink.ttl = 5d
agent.sinks.elastic-sink.serializer = org.apache.flume.sink.elasticsearch.ElasticSearchDynamicSerializer
# Finally, activate.
agent.channels = memory-channel
agent.sources = tail-source
agent.sinks = log-sink hdfs-sink elastic-sink
的问题是,我只看到在HDFS文件中使用弹性1-2 kibana消息和大量消息。
任何想法我在这里失踪?
不知道是什么问题,但我想知道这是否可能是由于使用一个通道发送事件到多个接收器? [见这里](https://mail-archives.apache.org/mod_mbox/flume-user/201306.mbox/%[email protected]%3E)也了解到' agent.sinks.elastic-sink.serializer = org.apache.flume.sink.elasticsearch.ElasticSearchDynamicSerializer'应该被删除,否则时间戳字段被创建为错误的类型。 –