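I have two VirtualBox VMs running openSUSE on a Windows 10 host, each running a JBoss DataGrid instance. iptables
rules are disabled system-wide on both VMs. The network adapters of both VMs are configured as Bridged Adapters.
The setup concerns JBoss DataGrid UDP replication from one VM to the other.
When the second instance starts, the first one sees it and becomes the master. The second becomes the slave.
Each data grid has a distributed cache configured as follows: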
<cache-container name="clustered" default-cache="oaas-properties-cache" statistics="true">
    <transport stack="udp" cluster="oaas-cluster" lock-timeout="60000"/>
    <distributed-cache name="code-error-message-cache" mode="ASYNC" batching="false">
        <eviction strategy="LIRS" max-entries="10000"/>
        <expiration max-idle="${oaas.maxidle.lifespan:87400000}" lifespan="${oaas.properties.lifespan:86400000}"/>
    </distributed-cache>
</cache-container>
The socket bindings are also configured:
<socket-binding-group name="standard-sockets" default-interface="public" port-offset="${jboss.socket.binding.port-offset:5}">
    <socket-binding name="management-native" interface="management" port="${jboss.management.native.port:9999}"/>
    <socket-binding name="management-http" interface="management" port="${jboss.management.http.port:9990}"/>
    <socket-binding name="management-https" interface="management" port="${jboss.management.https.port:9443}"/>
    <socket-binding name="ajp" port="8009"/>
    <socket-binding name="hotrod" interface="management" port="11222"/>
    <socket-binding name="http" port="8080"/>
    <socket-binding name="https" port="8443"/>
    <socket-binding name="jgroups-mping" port="0" multicast-address="${jboss.default.multicast.address:234.99.54.14}" multicast-port="45700"/>
    <socket-binding name="jgroups-tcp" port="7600"/>
    <socket-binding name="jgroups-tcp-fd" port="57600"/>
    <socket-binding name="jgroups-udp" port="55200" multicast-address="${jboss.default.multicast.address:234.99.54.14}" multicast-port="45688"/>
    <socket-binding name="jgroups-udp-fd" port="54200"/>
    <socket-binding name="memcached" interface="management" port="11211"/>
    <socket-binding name="modcluster" port="0" multicast-address="<A_REAL_IP_HOES_HERE>" multicast-port="23364"/>
    <socket-binding name="remoting" port="4447"/>
    <socket-binding name="txn-recovery-environment" port="4712"/>
    <socket-binding name="txn-status-manager" port="4713"/>
</socket-binding-group>
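According to Wireshark on both VMs, UDP packets travel from one side to the other and back. However, I do not see these packets passing through the host in Wireshark (strange, isn't it?).
Checking the DataGrid REST interface, I noticed that cache replication does not actually work. After putting a value on the master, I cannot get it from the slave.
Finally, while the slave is running, the master periodically logs this: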
10:26:03,839 WARN [org.jgroups.protocols.TP$ProtocolAdapter] (INT-1,shared=udp) JGRP000031: linux-bb91/oaas-cluster: dropping unicast message to wrong destination linux-bb91/oaas-cluster
And the slave logs this:
10:26:03,903 WARN [org.jgroups.protocols.UDP] (TransferQueueBundler,shared=udp) JGRP000032: null: no physical address for 4d05dc4d-66ac-1943-4e97-92c6e2b471c0, dropping message
I can't figure out what exactly is wrong. Here are the results of the iperf utility (from slave to master and vice versa):
#iperf -s -u -B $MASTER_IP -i 1
bind failed: Cannot assign requested address
------------------------------------------------------------
Server listening on UDP port 5001
Binding to local address $MASTER_IP
Receiving 1470 byte datagrams
UDP buffer size: 208 KByte (default)
------------------------------------------------------------
#iperf -c $MASTER_IP -u -T 32 -t 3 -i 1
------------------------------------------------------------
Client connecting to $MASTER_IP, UDP port 5001
Sending 1470 byte datagrams
UDP buffer size: 208 KByte (default)
------------------------------------------------------------
[ 3] local 10.27.11.11 port 36857 connected with 10.27.11.87 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0- 1.0 sec 121 KBytes 988 Kbits/sec
[ 3] 1.0- 2.0 sec 126 KBytes 1.03 Mbits/sec
[ 3] 2.0- 3.0 sec 126 KBytes 1.03 Mbits/sec
[ 3] 0.0- 3.0 sec 375 KBytes 1.02 Mbits/sec
[ 3] Sent 269 datagrams
read failed: Connection refused
[ 3] WARNING: did not receive ack of last datagram after 5 tries.
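However, when I run nc $SLAVE_HOST 45688 from the master machine, Wireshark
shows the incoming packets and the ACK.
I need help. I don't even know where to start digging. Thanks.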
UPD
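UDP packets now seem to get through in both directions. When I run iperf3 in server mode on one VM and check from the other VM as a client, the result is: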
# iperf3 -c 10.27.11.87 -u -T 32 -t 3 -i 1
32: Connecting to host 10.27.11.87, port 5201
32: [ 4] local 10.27.11.11 port 58036 connected to 10.27.11.87 port 5201
32: [ ID] Interval Transfer Bandwidth Total Datagrams
32: [ 4] 0.00-1.00 sec 120 KBytes 983 Kbits/sec 15
32: [ 4] 1.00-2.00 sec 128 KBytes 1.05 Mbits/sec 16
32: [ 4] 2.00-3.00 sec 128 KBytes 1.05 Mbits/sec 16
32: - - - - - - - - - - - - - - - - - - - - - - - - -
32: [ ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams
32: [ 4] 0.00-3.00 sec 376 KBytes 1.03 Mbits/sec 2.462 ms 0/47 (0%)
32: [ 4] Sent 47 datagrams
32:
32: iperf Done.
And the server receives the packets:
# iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 10.27.11.11, port 50940
[ 5] local 10.27.11.87 port 5201 connected to 10.27.11.11 port 58036
[ ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams
[ 5] 0.00-1.00 sec 120 KBytes 983 Kbits/sec 0.361 ms 0/15 (0%)
[ 5] 1.00-2.00 sec 128 KBytes 1.05 Mbits/sec 6.556 ms 0/16 (0%)
[ 5] 2.00-3.00 sec 128 KBytes 1.05 Mbits/sec 2.462 ms 0/16 (0%)
[ 5] 3.00-3.04 sec 0.00 Bytes 0.00 bits/sec 2.462 ms 0/0 (-nan%)
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams
[ 5] 0.00-3.04 sec 376 KBytes 1.01 Mbits/sec 2.462 ms 0/47 (0%)
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
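It seems the problem is that JBoss cannot form the cluster.
Any time nodes do not cluster, the first thing to do is enable TRACE logging on 'org.jgroups'. That gives far more clues than Wireshark. Also, check 'netstat' to see whether it is an IPv4/v6 issue. Are all addresses IPv4?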
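For reference, TRACE logging on 'org.jgroups' would typically be enabled with a logger entry like the one below. This is only a sketch: it assumes the standard JBoss logging subsystem in the server's standalone/clustered XML configuration, and the element has to go inside the existing logging `<subsystem>` of that file. Log handlers carry their own level threshold, so a handler limited to INFO would still filter these messages out.

```xml
<!-- Assumption: placed inside the existing logging <subsystem> element
     of the server configuration file used to start DataGrid. -->
<logger category="org.jgroups">
    <!-- TRACE is very verbose; revert once the clustering issue is found. -->
    <level name="TRACE"/>
</logger>
```

With this in place, both nodes should log discovery and membership traffic in detail, which can be compared against the JGRP000031 and JGRP000032 warnings above.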