
Druid Kafka ingestion (Imply 2.2.3): Kafka error NoReplicaOnlineException

I am using the Druid Kafka Indexing Service to load my own Kafka stream, following the Load from Kafka tutorial.

All Kafka settings are the defaults (just extracted from the tgz).

When I start Imply 2.2.3 (Druid) with empty data (after deleting the var folder), everything works fine.

But once I stop Kafka 2.11-0.10.2.0 and start it again, the error below appears and Druid Kafka ingestion stops working until I stop Imply (Druid) and delete all of its data again (i.e. remove the var folder).

Sometimes Druid simply stops ingesting data from Kafka even though Kafka reports no errors. Deleting Druid's var folder fixes everything until the same error occurs again.

The error:

kafka.common.NoReplicaOnlineException: No replica for partition [__consumer_offsets,19] is alive. Live brokers are: [Set()], Assigned replicas are: [List(0)] 
    at kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:73) ~[kafka_2.11-0.10.2.0.jar:?] 
    at kafka.controller.PartitionStateMachine.electLeaderForPartition(PartitionStateMachine.scala:339) ~[kafka_2.11-0.10.2.0.jar:?] 
    at kafka.controller.PartitionStateMachine.kafka$controller$PartitionStateMachine$$handleStateChange(PartitionStateMachine.scala:200) [kafka_2.11-0.10.2.0.jar:?] 
    at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:115) [kafka_2.11-0.10.2.0.jar:?] 
    at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:112) [kafka_2.11-0.10.2.0.jar:?] 
    at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733) [scala-library-2.11.8.jar:?] 
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99) [scala-library-2.11.8.jar:?] 
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99) [scala-library-2.11.8.jar:?] 
    at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230) [scala-library-2.11.8.jar:?] 
    at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40) [scala-library-2.11.8.jar:?] 
    at scala.collection.mutable.HashMap.foreach(HashMap.scala:99) [scala-library-2.11.8.jar:?] 
    at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732) [scala-library-2.11.8.jar:?] 
    at kafka.controller.PartitionStateMachine.triggerOnlinePartitionStateChange(PartitionStateMachine.scala:112) [kafka_2.11-0.10.2.0.jar:?] 
    at kafka.controller.PartitionStateMachine.startup(PartitionStateMachine.scala:67) [kafka_2.11-0.10.2.0.jar:?] 
    at kafka.controller.KafkaController.onControllerFailover(KafkaController.scala:342) [kafka_2.11-0.10.2.0.jar:?] 
    at kafka.controller.KafkaController$$anonfun$1.apply$mcV$sp(KafkaController.scala:160) [kafka_2.11-0.10.2.0.jar:?] 
    at kafka.server.ZookeeperLeaderElector.elect(ZookeeperLeaderElector.scala:85) [kafka_2.11-0.10.2.0.jar:?] 
    at kafka.server.ZookeeperLeaderElector$$anonfun$startup$1.apply$mcZ$sp(ZookeeperLeaderElector.scala:51) [kafka_2.11-0.10.2.0.jar:?] 
    at kafka.server.ZookeeperLeaderElector$$anonfun$startup$1.apply(ZookeeperLeaderElector.scala:49) [kafka_2.11-0.10.2.0.jar:?] 
    at kafka.server.ZookeeperLeaderElector$$anonfun$startup$1.apply(ZookeeperLeaderElector.scala:49) [kafka_2.11-0.10.2.0.jar:?] 
    at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:213) [kafka_2.11-0.10.2.0.jar:?] 
    at kafka.server.ZookeeperLeaderElector.startup(ZookeeperLeaderElector.scala:49) [kafka_2.11-0.10.2.0.jar:?] 
    at kafka.controller.KafkaController$$anonfun$startup$1.apply$mcV$sp(KafkaController.scala:681) [kafka_2.11-0.10.2.0.jar:?] 
    at kafka.controller.KafkaController$$anonfun$startup$1.apply(KafkaController.scala:677) [kafka_2.11-0.10.2.0.jar:?] 
    at kafka.controller.KafkaController$$anonfun$startup$1.apply(KafkaController.scala:677) [kafka_2.11-0.10.2.0.jar:?] 
    at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:213) [kafka_2.11-0.10.2.0.jar:?] 
    at kafka.controller.KafkaController.startup(KafkaController.scala:677) [kafka_2.11-0.10.2.0.jar:?] 
    at kafka.server.KafkaServer.startup(KafkaServer.scala:224) [kafka_2.11-0.10.2.0.jar:?] 
    at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:39) [kafka_2.11-0.10.2.0.jar:?] 
    at kafka.Kafka$.main(Kafka.scala:67) [kafka_2.11-0.10.2.0.jar:?] 
    at kafka.Kafka.main(Kafka.scala) [kafka_2.11-0.10.2.0.jar:?] 

The steps I perform:

1. Start Imply:

bin/supervise -c conf/supervise/quickstart.conf 

2. Start Kafka:

./bin/kafka-server-start.sh config/server.properties 
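To confirm the broker actually came up and registered itself, you can list the broker ids in ZooKeeper (a sketch using the zookeeper-shell script that ships with Kafka; /brokers/ids is the path where brokers register):

./bin/zookeeper-shell.sh localhost:2181 ls /brokers/ids 

With the default config this should print [0]; an empty list means no broker is alive, which is exactly what the NoReplicaOnlineException above complains about ("Live brokers are: [Set()]").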

3. Create the topic:

./bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic wikiticker 
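To check that the topic exists and its single partition has a live leader, one can run (a sketch):

./bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic wikiticker 

A healthy partition shows Leader: 0 and Isr: 0; Leader: -1 means no replica is online.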

4. Enable Druid Kafka ingestion:

curl -XPOST -H'Content-Type: application/json' -d @quickstart/wikiticker-kafka-supervisor.json http://localhost:8090/druid/indexer/v1/supervisor 
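To verify the supervisor was accepted and is running, the overlord exposes status endpoints (a sketch; <supervisorId> is whatever id the first call returns, normally the dataSource name from the spec):

curl http://localhost:8090/druid/indexer/v1/supervisor 
curl http://localhost:8090/druid/indexer/v1/supervisor/<supervisorId>/status 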

5. Post events to the Kafka topic, which are then ingested into Druid by the Kafka Indexing Service.

I added the following property to all the .properties files (common.runtime.properties, broker, coordinator, historical, middleManager, overlord):

druid.extensions.loadList=["druid-caffeine-cache", "druid-histogram", "druid-datasketches", "druid-kafka-indexing-service"] 

which includes "druid-kafka-indexing-service" to provide the ingestion service.

I believe a problem like this should not happen with Druid Kafka Indexing.

Is there a way to track down this problem?

Answers


That message means the broker with id 0 is down, and since it is the only broker hosting that partition, the partition cannot be used right now. You have to make sure broker 0 is up and serving.
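One way to confirm which partitions lost their leader (a sketch; the error above refers to the internal __consumer_offsets topic):

./bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic __consumer_offsets 

Partitions showing Leader: -1 with an empty Isr are the ones that cannot be served until broker 0 comes back.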


It looks like you have a single-node Kafka cluster and its only broker is down. That is not a very fault-tolerant setup. You should run 3 Kafka brokers and create all topics with a replication factor of 3, so the system keeps working even when one or two brokers are down. Single-node clusters are normally used only for development.
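A minimal sketch of such a setup, following Kafka's multi-broker quickstart (broker ids, ports and log directories below are examples): give each broker its own server.properties and create topics with a replication factor of 3.

# config/server-1.properties (repeat for server-2 / server-3 with unique values) 
broker.id=1 
listeners=PLAINTEXT://:9093 
log.dirs=/tmp/kafka-logs-1 
offsets.topic.replication.factor=3 

./bin/kafka-server-start.sh config/server-1.properties 
./bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic wikiticker 

Note that __consumer_offsets is created automatically the first time a consumer group connects, with at most as many replicas as there are live brokers; if it was created while only one broker existed, it keeps replication factor 1 until you reassign or recreate it.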


I fixed the Kafka stability problem by adding 3 Kafka brokers and setting a replication factor of 3 for all topics.

On the Druid side I fixed the problem by increasing druid.worker.capacity on the middleManager and decreasing taskDuration in the ioConfig of the supervisor spec.

Details are in another question.
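For reference, a sketch of those two changes (the values are illustrative, not the exact ones I used): raise the worker capacity in the middleManager runtime.properties and shorten taskDuration in the supervisor spec's ioConfig.

# middleManager runtime.properties 
druid.worker.capacity=4 

# ioConfig section of wikiticker-kafka-supervisor.json 
"ioConfig": { 
  "topic": "wikiticker", 
  "consumerProperties": {"bootstrap.servers": "localhost:9092"}, 
  "taskDuration": "PT10M" 
} 

A shorter taskDuration (the default is PT1H) makes each indexing task publish its segments and release its worker slot sooner, while a higher druid.worker.capacity lets the middleManager run more indexing tasks in parallel.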