2015-12-30

Below is a snapshot of the health issues reported on CM (NameNode Connectivity, Web Server Status); the DataNodes in the list keep changing. Here are some errors from the DataNode logs:

3:59:31.859 PM ERROR org.apache.hadoop.hdfs.server.datanode.DataNode 
    datanode05.hadoop.com:50010:DataXceiver error processing WRITE_BLOCK operation src: /10.248.200.113:45252 dest: /10.248.200.105:50010 
    java.io.IOException: Premature EOF from inputStream 
     at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:194) 
     at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213) 
     at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134) 
     at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109) 
     at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:414) 
     at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:635) 
     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:564) 
     at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:103) 
     at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:67) 
     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221) 
     at java.lang.Thread.run(Thread.java:662) 
5:46:03.606 PM INFO org.apache.hadoop.hdfs.server.datanode.DataNode 
    Exception for BP-846315089-10.248.200.4-1369774276029:blk_-780307518048042460_200374997 
    java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.248.200.105:50010 remote=/10.248.200.122:43572] 
     at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:165) 
     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:156) 
     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:129) 
     at java.io.BufferedInputStream.fill(BufferedInputStream.java:218) 
     at java.io.BufferedInputStream.read1(BufferedInputStream.java:258) 
     at java.io.BufferedInputStream.read(BufferedInputStream.java:317) 
     at java.io.DataInputStream.read(DataInputStream.java:132) 
     at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192) 
     at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213) 
     at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134) 
     at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109) 
     at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:414) 
     at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:635) 
     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:564) 
     at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:103) 
     at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:67) 
     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221) 
     at java.lang.Thread.run(Thread.java:662) 
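
Both errors indicate that the upstream writer stopped mid-transfer: "Premature EOF" means the sender dropped the connection before finishing a packet, and the SocketTimeoutException means it went silent for longer than the 60-second HDFS socket read timeout. One way to tell an overloaded peer from a broken network path is to look at the node named in the trace while the errors are happening. The following is only a rough diagnostic sketch, using the default CDH 4 DataXceiver port and the addresses taken from the traces above:

    # The remote side that went quiet in the second trace is 10.248.200.122;
    # confirm its block-transfer port (CDH 4 default 50010) is reachable while
    # the errors are occurring, not just afterwards:
    telnet 10.248.200.122 50010

    # On that node, check whether its disks are saturated -- a writer that
    # stalls on I/O for more than 60000 ms triggers exactly this timeout:
    iostat -x 5 3
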

Snapshot:

[Screenshot: Health issues reported on CM]

I cannot figure out the root cause of the problem. I can manually connect from one DataNode to another without any issue, so I don't believe this is a network problem. In addition, the missing block and under-replicated block counts also keep changing (up & down).
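
For reference, the same counts CM reports can also be watched from the command line, which makes it easier to line them up in time with the DataNode errors above. A minimal sketch, assuming a standard CDH 4 client is available on a cluster node:

    # Print the block-health summary; the last lines include the
    # under-replicated, corrupt and missing-replica totals.
    hadoop fsck / 2>/dev/null | tail -20

    # Re-run it every few minutes to see whether the totals swing up and
    # down together with the DataNodes that CM flags and un-flags.
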

Cloudera Manager: Cloudera Standard 4.8.1

CDH 4.7

Any help in resolving this issue is appreciated.

Update: January 1, 2016

For the DataNodes that are listed as bad, when I look at the DataNode logs I see this message a lot...

11:58:30.066 AM INFO org.apache.hadoop.hdfs.server.datanode.DataNode 
Receiving BP-846315089-10.248.200.4-1369774276029:blk_-706861374092956879_36606459 src: /10.248.200.123:56795 dest: /10.248.200.112:50010 

Why is this DataNode receiving so many blocks from other DataNodes at the same time? It looks like, because of this activity, the DataNode cannot respond to NameNode requests in time and therefore times out. All the bad DataNodes show the same pattern.
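
A burst of "Receiving BP-..." lines usually means the NameNode is scheduling re-replication of under-replicated blocks onto that node, and the resulting pile-up of DataXceiver threads can keep the DataNode from answering NameNode and CM checks in time, which matches the pattern described above. A rough way to check this (a sketch, assuming the default port and the standard CDH 4 log location, which may differ on your install):

    # On a DataNode currently flagged as bad, count open block-transfer
    # connections; if this approaches the configured transfer-thread limit
    # (dfs.datanode.max.xcievers / dfs.datanode.max.transfer.threads),
    # the node is likely too busy to respond to the NameNode in time.
    netstat -tn | grep ':50010' | grep ESTABLISHED | wc -l

    # Count how many blocks this node has started receiving according to its log
    # (log path is the CDH 4 default and may differ on your install):
    grep -c 'Receiving BP-' /var/log/hadoop-hdfs/hadoop-hdfs-datanode-*.log
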

Answer


A similar question has been answered here:

hdfs data node disconnected from namenode
Please check your firewall. Use

telnet ipaddress port 

to check connectivity.
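
For example, a hedged sketch using the CDH 4 default ports (the NameNode host name below is only illustrative):

    telnet datanode05.hadoop.com 50010   # DataNode data transfer (DataXceiver)
    telnet datanode05.hadoop.com 50020   # DataNode IPC
    telnet namenode.hadoop.com 8020      # NameNode RPC (host name is illustrative)
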


I tried telnet and it connects to the other nodes successfully, so it doesn't appear to be a firewall issue. The nodes currently listed with connectivity problems... don't show up in the list a few minutes later. It keeps flipping back and forth. – scott