Help: Hadoop cluster slave times out accessing the NameNode, even though every service is running and all nodes can communicate

wx_start_ag 2019-06-17 04:48:27
Hoping someone here can help:
I deployed the cluster following the "Hadoop Cluster Setup" chapter of the official documentation. When I run the Python Streaming example from section 2.6.2 of *Hadoop: The Definitive Guide*, I get the error below.
The code (copied from the book):
max_temprature_map.py:
#!/usr/bin/python
import re
import sys

for line in sys.stdin:
    val = line.strip()
    (year, temp, q) = (val[15:19], val[87:92], val[92:93])
    if temp != "+9999" and re.match("[01459]", q):
        print "%s\t%s" % (year, temp)
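As a quick local sanity check (independent of the cluster problem), the mapper's fixed-offset slicing can be exercised without Hadoop. This is a Python 3 sketch; the record below is synthetic, built only so that the offsets the code reads land on known values:

```python
import re

def map_record(val):
    # Same slicing as max_temprature_map.py: year, temperature, quality flag
    (year, temp, q) = (val[15:19], val[87:92], val[92:93])
    if temp != "+9999" and re.match("[01459]", q):
        return "%s\t%s" % (year, temp)
    return None  # record filtered out

# Synthetic 93-character record: year at [15:19], temp at [87:92], quality at [92]
record = "0" * 15 + "1902" + "0" * 68 + "-0011" + "1"
print(map_record(record))  # → 1902	-0011
```

A record with temperature "+9999" (the missing-value sentinel) is dropped, which is the filtering the `if` implements.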

max_temprature_reduce.py:
#!/usr/bin/python
import sys

(last_key, max_val) = (None, -sys.maxint)
for line in sys.stdin:
    (key, values) = line.strip().split("\t")
    if last_key and last_key != key:
        print "%s\t%s" % (last_key, max_val)
        (last_key, max_val) = (key, int(values))
    else:
        (last_key, max_val) = (key, max(max_val, int(values)))
if last_key:
    print "%s\t%s" % (last_key, max_val)
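The reducer logic can likewise be tested locally. A Python 3 sketch of the same algorithm, assuming its input is already sorted by key (which the MapReduce shuffle guarantees before the reducer runs):

```python
import sys

def reduce_lines(lines):
    # Same logic as max_temprature_reduce.py: track the running maximum
    # per key, emitting a result each time the key changes
    out = []
    (last_key, max_val) = (None, -sys.maxsize)
    for line in lines:
        (key, value) = line.strip().split("\t")
        if last_key and last_key != key:
            out.append("%s\t%s" % (last_key, max_val))
            (last_key, max_val) = (key, int(value))
        else:
            (last_key, max_val) = (key, max(max_val, int(value)))
    if last_key:
        out.append("%s\t%s" % (last_key, max_val))
    return out

print(reduce_lines(["1902\t-0011", "1902\t0022", "1903\t0005"]))
# → ['1902\t22', '1903\t5']
```

Feeding mapper output through `sort` and then this function reproduces the Streaming pipeline end to end on a sample file, which helps rule the scripts out as the cause of the failure below.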

The command:
hadoop jar /opt/hadoopCluster/hadoop-2.9.2/share/hadoop/tools/lib/hadoop-streaming-2.9.2.jar \
-input /input/1902 \
-output /output/max_temprature_20910617 \
-mapper /root/code/upload_from_pycharm/max_temprature_map.py \
-reducer /root/code/upload_from_pycharm/max_temprature_reduce.py \
-file /root/code/upload_from_pycharm/max_temprature_map.py \
-file /root/code/upload_from_pycharm/max_temprature_reduce.py

The error output:
2019-06-17 23:19:56,046 FATAL [IPC Server handler 8 on 59456] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1560760626985_0002_m_000001_0 - exited : java.io.IOException: Failed on local exception: java.io.IOException: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.111.140:42113 remote=client001/192.168.111.129:9400]; Host Details : local host is: "client003/192.168.111.140"; destination host is: "client001":9400; 
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:805)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1511)
at org.apache.hadoop.ipc.Client.call(Client.java:1453)
at org.apache.hadoop.ipc.Client.call(Client.java:1363)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy14.getBlockLocations(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:259)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy15.getBlockLocations(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:847)
at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:836)
at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:825)
at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:330)
at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:289)
at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:274)
at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1064)
at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:332)
at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:329)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:329)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:914)
at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:356)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:177)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:171)
Caused by: java.io.IOException: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.111.140:42113 remote=client001/192.168.111.129:9400]
at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:760)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:723)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:817)
at org.apache.hadoop.ipc.Client$Connection.access$3600(Client.java:412)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1568)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
... 35 more
Caused by: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.111.140:42113 remote=client001/192.168.111.129:9400]
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at org.apache.hadoop.ipc.Client$IpcStreams.readResponse(Client.java:1812)
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:365)
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:618)
at org.apache.hadoop.ipc.Client$Connection.access$2200(Client.java:412)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:804)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:800)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:799)
... 38 more

Since the error reports a connection timeout, I checked from the client003 node with telnet and found the port reachable, and ssh between all three nodes also works fine:
[root@client003 hadoop-2.9.2]# telnet client001 9400
Trying 192.168.111.129...
Connected to client001.
Escape character is '^]'.

[root@client001 ~]# jps
1856 DataNode
7045 Jps
1991 SecondaryNameNode
2317 NodeManager
1759 NameNode
[root@client001 ~]# netstat -nap|grep 9400
tcp 0 0 192.168.111.129:9400 0.0.0.0:* LISTEN 1759/java
tcp 0 0 192.168.111.129:9400 192.168.111.129:59726 ESTABLISHED 1759/java
tcp 0 0 192.168.111.129:59763 192.168.111.129:9400 TIME_WAIT -
tcp 0 0 192.168.111.129:59726 192.168.111.129:9400 ESTABLISHED 1856/java
tcp 0 0 192.168.111.129:9400 192.168.111.140:42224 ESTABLISHED 1759/java
tcp 0 0 192.168.111.129:9400 192.168.111.139:54617 ESTABLISHED 1759/java
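Worth noting: a successful telnet connect only proves the port accepts connections. The stack trace shows the connection *was* established (`SocketChannel[connected ...]`) and the 60 s timeout happened while *reading* the SASL response. The distinction can be demonstrated with a Python 3 sketch using a toy in-process server that accepts but never replies, mimicking the symptom:

```python
import socket
import threading

def probe(host, port, timeout=1.0):
    # Connect, then try to read one byte: distinguishes "port accepts
    # connections" (all telnet proves) from "server answers reads"
    with socket.create_connection((host, port), timeout=timeout) as s:
        s.settimeout(timeout)
        try:
            s.recv(1)
            return "responded"
        except socket.timeout:
            return "connected but read timed out"

# Toy server that accepts the connection but never writes anything
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
held = []  # keep the accepted socket referenced so it is not closed by GC
threading.Thread(target=lambda: held.append(srv.accept()), daemon=True).start()

result = probe("127.0.0.1", srv.getsockname()[1])
print(result)  # → connected but read timed out
```

So the telnet check above does not contradict the error: the NameNode (or something in between) is accepting the TCP connection but not completing the RPC/SASL handshake.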
guolindi 2019-10-11
Check whether the nodes' clocks are out of sync.
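Clock skew matters mainly when the cluster uses Kerberos, and the stack trace does fail inside the SASL connection path (`handleSaslConnectionFailure`), so the hint is plausible. A hedged Python 3 sketch for checking it; `clock_offset` is a hypothetical helper that relies on the passwordless ssh the poster already has, and is not executed here:

```python
import subprocess
import time

def clock_offset(host):
    # Hypothetical helper: remote epoch seconds (over passwordless ssh)
    # minus local epoch seconds
    remote = int(subprocess.check_output(["ssh", host, "date", "+%s"]))
    return remote - int(time.time())

def skew_within_tolerance(offset_seconds, max_skew=300):
    # Kerberos rejects authentication when clocks differ by more than
    # the configured tolerance (300 seconds by default)
    return abs(offset_seconds) <= max_skew

# Example with made-up offsets instead of live hosts:
print(skew_within_tolerance(12))    # → True
print(skew_within_tolerance(-480))  # → False
```

Running something like `clock_offset("client001")` from client003 (or simply `date` on each node) and enabling NTP if the offsets exceed the tolerance would confirm or rule this out.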
