2016-11-14
1

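etcd cluster ID mismatch

Hi, I am getting a cluster ID mismatch for some reason. At first I had it on one node; it disappeared after I cleared the data directory a few times and changed the cluster token and node names, but then it appeared on another node.

Here is the script I use: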

IP0=10.150.0.1 
IP1=10.150.0.2 
IP2=10.150.0.3 
IP3=10.150.0.4 
NODENAME0=node0 
NODENAME1=node1 
NODENAME2=node2 
NODENAME3=node3 

# changing these on each box 
THISIP=$IP2 
THISNODENAME=$NODENAME2 

etcd --name $THISNODENAME --initial-advertise-peer-urls http://$THISIP:2380 \
--data-dir /root/etcd-data \
--listen-peer-urls http://$THISIP:2380 \
--listen-client-urls http://$THISIP:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://$THISIP:2379 \
--initial-cluster-token etcd-cluster-2 \
--initial-cluster $NODENAME0=http://$IP0:2380,$NODENAME1=http://$IP1:2380,$NODENAME2=http://$IP2:2380,$NODENAME3=http://$IP3:2380 \
--initial-cluster-state new

And this is what I get:

2016-11-11 22:13:12.090515 I | etcdmain: etcd Version: 2.3.7 
2016-11-11 22:13:12.090643 N | etcdmain: the server is already initialized as member before, starting as etcd member... 
2016-11-11 22:13:12.090713 I | etcdmain: listening for peers on http://10.150.0.3:2380 
2016-11-11 22:13:12.090745 I | etcdmain: listening for client requests on http://10.150.0.3:2379 
2016-11-11 22:13:12.090771 I | etcdmain: listening for client requests on http://127.0.0.1:2379 
2016-11-11 22:13:12.090960 I | etcdserver: name = node2 
2016-11-11 22:13:12.090976 I | etcdserver: data dir = /root/etcd-data 
2016-11-11 22:13:12.090983 I | etcdserver: member dir = /root/etcd-data/member 
2016-11-11 22:13:12.090990 I | etcdserver: heartbeat = 100ms 
2016-11-11 22:13:12.090995 I | etcdserver: election = 1000ms 
2016-11-11 22:13:12.091001 I | etcdserver: snapshot count = 10000 
2016-11-11 22:13:12.091011 I | etcdserver: advertise client URLs = http://10.150.0.3:2379 
2016-11-11 22:13:12.091269 I | etcdserver: restarting member 7fbd572038b372f6 in cluster 4e73d7b9b94fe83b at commit index 4 
2016-11-11 22:13:12.091317 I | raft: 7fbd572038b372f6 became follower at term 8 
2016-11-11 22:13:12.091346 I | raft: newRaft 7fbd572038b372f6 [peers: [], term: 8, commit: 4, applied: 0, lastindex: 4, lastterm: 1] 
2016-11-11 22:13:12.091516 I | etcdserver: starting server... [version: 2.3.7, cluster version: to_be_decided] 
2016-11-11 22:13:12.091869 E | etcdmain: failed to notify systemd for readiness: No socket 
2016-11-11 22:13:12.091894 E | etcdmain: forgot to set Type=notify in systemd service file? 
2016-11-11 22:13:12.096380 N | etcdserver: added member 7508b3e625cfed5 [http://10.150.0.4:2380] to cluster 4e73d7b9b94fe83b 
2016-11-11 22:13:12.099800 N | etcdserver: added member 14c76eb5d27acbc5 [http://10.150.0.1:2380] to cluster 4e73d7b9b94fe83b 
2016-11-11 22:13:12.100957 N | etcdserver: added local member 7fbd572038b372f6 [http://10.150.0.2:2380] to cluster 4e73d7b9b94fe83b 
2016-11-11 22:13:12.102711 N | etcdserver: added member d416fca114f17871 [http://10.150.0.3:2380] to cluster 4e73d7b9b94fe83b 
2016-11-11 22:13:12.134330 E | rafthttp: request cluster ID mismatch (got cfd5ef74b3dcf6fe want 4e73d7b9b94fe83b) 

The other members are not even running, so how is this possible?

Thanks

Answers

1

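For all those who came here from Google:

The error is about the peer's member ID: the member that is trying to join has the same name as another member (probably an old instance) that already exists in the cluster. It has the same peer name but a different ID, and that is the problem.

You should remove the peer and re-add it, as shown in this helpful post: https://ngineered.co.uk/blog/how-to-replace-a-etcd-node

Use "etcdctl member list" to find the IDs of the current members and identify the one that is trying to join the cluster with the wrong ID, then use "etcdctl member remove" to delete that peer from the member list and try to rejoin. Hope it helps.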
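To make the lookup less manual, here is a minimal sketch of picking the stale member's ID out of the member list by its peer URL. The helper name `member_id_for_peer` is made up for this example, and the input layout (`<id>: name=... peerURLs=... clientURLs=...`) is what I recall `etcdctl member list` printing on etcd 2.x, so check it against your own output. The sample lines reuse IDs and URLs from the log in the question; the node names attached to them are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: print the ID of the member whose peerURLs contain $1.
# Expects v2-style `etcdctl member list` output on stdin, one member per line:
#   <id>: name=<name> peerURLs=<url>[,<url>] clientURLs=<url>[,<url>]
member_id_for_peer() {
    awk -v url="$1" '
        {
            for (i = 2; i <= NF; i++) {
                if (index($i, "peerURLs=") == 1) {
                    n = split(substr($i, 10), urls, ",")
                    for (j = 1; j <= n; j++) {
                        if (urls[j] == url) {
                            id = $1
                            sub(/:$/, "", id)
                            print id
                        }
                    }
                }
            }
        }'
}

# Illustrative input only (assumed layout, assumed node names).
sample='14c76eb5d27acbc5: name=node0 peerURLs=http://10.150.0.1:2380 clientURLs=http://10.150.0.1:2379
7fbd572038b372f6: name=node1 peerURLs=http://10.150.0.2:2380 clientURLs=http://10.150.0.2:2379'

stale_id=$(printf '%s\n' "$sample" | member_id_for_peer "http://10.150.0.2:2380")
echo "would run: etcdctl member remove $stale_id"
```

Against a live cluster you would pipe the real output instead of the sample: `etcdctl member list | member_id_for_peer http://10.150.0.2:2380`.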

0

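My --data-dir is /var/etcd/data. Deleting and recreating it worked for me. It seems that something from a previous etcd cluster had been left in this directory, which can affect the etcd setup.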
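If you would rather not lose the old state outright, a small sketch like the following moves the directory aside and recreates it empty. The function name `reset_data_dir` and the `.bak.<timestamp>` suffix are my own choices, not anything etcd prescribes, and it assumes etcd is already stopped on that node.

```shell
#!/bin/sh
# Hypothetical helper: move an etcd data directory aside and recreate it empty.
# Assumption: the etcd process on this node has already been stopped.
reset_data_dir() {
    data_dir=$1
    backup="$data_dir.bak.$(date +%Y%m%d%H%M%S)"
    if [ -e "$data_dir" ]; then
        # Keep the old contents so they can be inspected or restored later.
        mv "$data_dir" "$backup" || return 1
        echo "old data moved to $backup"
    fi
    mkdir -p "$data_dir"
}
```

Usage with the path from this answer would be `reset_data_dir /var/etcd/data`, run as a user that can write to /var/etcd.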

0

I faced the same problem: our leader etcd server went down, and after replacing it with a new one we got this error:

rafthttp: request sent was ignored (cluster ID mismatch) 

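One side was still looking for the old cluster ID while the other had generated a random local cluster; a configuration mistake was behind it.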

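Follow the steps below to resolve the issue. If the etcd process is running, stop it first with systemctl stop etcd2.

  • Remove the data

    1. Log in to another working node and remove the unreachable member from the cluster:

      etcdctl cluster-health
      etcdctl member remove member-id

    2. Log in to the new server, stop etcd, and delete the data directory with rm -rf /var/etcd2/data. Before deleting it, back the data up to some other folder.

    3. Now start etcd with the --initial-cluster-state existing parameter. Since you are adding the server to an existing cluster, do not use --initial-cluster-state new.

    4. Now go back to one of the running etcd servers and add the new member to the cluster: etcdctl member add node0 http://$IP:2380

    I spent a lot of time debugging this issue, and now my cluster is running with all members healthy. Hope this information helps.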
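The steps above can be sketched as a dry-run script that only prints the commands unless DRY_RUN=0 is set. The function names and arguments are placeholders of mine. One deliberate difference: etcd's runtime reconfiguration guide registers the new member with `etcdctl member add` before the new etcd process is started, so this sketch puts the add right after the remove.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run on a healthy member.
# $1 = ID of the dead member, $2 = new member name, $3 = new member peer URL
# (all three are placeholders supplied by the operator).
swap_member() {
    run etcdctl cluster-health
    run etcdctl member remove "$1"
    run etcdctl member add "$2" "$3"
}

# Run on the replacement server. $1 = data directory, e.g. /var/etcd2/data.
prepare_new_server() {
    run systemctl stop etcd2
    # Move the old data aside instead of rm -rf, so a backup is kept.
    run mv "$1" "$1.bak"
    # Afterwards, start etcd with --initial-cluster-state existing.
}
```

`etcdctl member add` prints the ETCD_NAME, ETCD_INITIAL_CLUSTER and ETCD_INITIAL_CLUSTER_STATE values to use when starting the new member, which saves assembling the member list by hand.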

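0

In my case I got the error

rafthttp: request cluster ID mismatch (got 1b3a88599e79f82b want b33939d80a381a57)

because one node was configured incorrectly.

Two of my nodes had this in their configuration:

env ETCD_INITIAL_CLUSTER="etcd-01=http://172.16.50.101:2380,etcd-02=http://172.16.50.102:2380,etcd-03=http://172.16.50.103:2380"

and one node had

env ETCD_INITIAL_CLUSTER="etcd-01=http://172.16.50.101:2380"

To solve the problem I stopped etcd on all nodes, fixed the incorrect configuration, deleted the /var/lib/etcd/member folder on all nodes, and restarted etcd on all nodes. Voilà!

P.S.

/var/lib/etcd is the folder where etcd stores its data.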
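A quick way to catch this kind of drift before bootstrapping is to compare the ETCD_INITIAL_CLUSTER value from every node. Below is a minimal sketch, assuming you have already collected the values (by ssh, config management, or copy and paste); the function name `same_initial_cluster` is made up for this example. As far as I know, etcd derives the cluster ID from the initial member set, which would explain why a node bootstrapped with a different list ends up with a different ID.

```shell
#!/bin/sh
# Hypothetical check: succeed only if every argument (one ETCD_INITIAL_CLUSTER
# value per node) is identical to the first one.
same_initial_cluster() {
    first=$1
    for value in "$@"; do
        if [ "$value" != "$first" ]; then
            echo "mismatch: $value" >&2
            return 1
        fi
    done
    return 0
}

# The two values from this answer.
good="etcd-01=http://172.16.50.101:2380,etcd-02=http://172.16.50.102:2380,etcd-03=http://172.16.50.103:2380"
bad="etcd-01=http://172.16.50.101:2380"

if same_initial_cluster "$good" "$good" "$bad"; then
    result=consistent
else
    result=inconsistent
fi
echo "$result"
```

With the values above it reports the single-entry node as the odd one out. It is a plain string comparison, so the same members listed in a different order would also be flagged.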