HBase cluster: java.net.NoRouteToHostException: No route to host

TimeStarRipple 2016-12-21 04:14:13
Problem description:
When the HBase cluster starts, the HRegionServer nodes cannot connect to the HMaster node and report java.net.NoRouteToHostException: No route to host.

Environment:
I set up a Hadoop 2.6.5 + ZooKeeper 3.4.6 + HBase 1.2.4 cluster on Rancher. There are 3 nodes in total, and each Docker container runs Hadoop, ZooKeeper, and HBase. The base environment is Ubuntu 14.04 + OpenJDK 8.

The hosts file is as follows:
10.42.127.91 master
10.42.131.60 slave1
10.42.232.200 slave2
10.42.127.91 13f1b9519c3f
10.42.232.200 5872757960fc
172.17.0.9 604bd3dbcad6
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
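
One thing the hosts file above makes easy to check is whether every name maps into the same network. The following is a minimal Python sketch, not part of the original post. The function names `parse_hosts` and `names_outside` are hypothetical, and treating 10.42.0.0/16 as the Rancher-managed subnet is an assumption based on the addresses listed above.

```python
import ipaddress

# Entries copied from the hosts file in the post (IPv6 lines omitted).
HOSTS_TEXT = """\
10.42.127.91 master
10.42.131.60 slave1
10.42.232.200 slave2
10.42.127.91 13f1b9519c3f
10.42.232.200 5872757960fc
172.17.0.9 604bd3dbcad6
127.0.0.1 localhost
"""


def parse_hosts(text):
    """Return a list of (ip, hostname) pairs from hosts-file text."""
    pairs = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        ip, *names = line.split()
        pairs.extend((ip, name) for name in names)
    return pairs


def names_outside(pairs, subnet):
    """Return (hostname, ip) pairs whose address is not inside `subnet`.

    Loopback addresses and addresses of a different IP version are skipped.
    """
    net = ipaddress.ip_network(subnet)
    flagged = []
    for ip, name in pairs:
        addr = ipaddress.ip_address(ip)
        if addr.is_loopback or addr.version != net.version:
            continue
        if addr not in net:
            flagged.append((name, ip))
    return flagged


if __name__ == "__main__":
    # Assumption: 10.42.0.0/16 is the network the containers share.
    for name, ip in names_outside(parse_hosts(HOSTS_TEXT), "10.42.0.0/16"):
        print(f"{name} -> {ip} is outside 10.42.0.0/16")
```

With the entries above, the only name flagged is 604bd3dbcad6 (172.17.0.9), which is the hostname and address the RegionServer log reports for itself. Note also that slave1 (10.42.131.60) has no container-hostname entry, unlike master and slave2.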


hbase-site.xml:
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://master:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>master,slave1,slave2</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/usr/data/zookeeper-3.4.6/data</value>
  </property>
  <property>
    <name>zookeeper.session.timeout</name>
    <value>120000</value>
  </property>
  <property>
    <name>hbase.rpc.timeout</name>
    <value>300000</value>
  </property>
</configuration>
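
To see which hostnames this configuration depends on, so that each can be checked against the hosts file, a small Python sketch like the following may help. It is not from the original post: `load_properties` and `referenced_hosts` are hypothetical helper names, and only the two host-bearing properties shown above are inspected.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# A trimmed copy of the hbase-site.xml shown above (host-bearing properties only).
HBASE_SITE = """<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://master:9000/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>master,slave1,slave2</value>
  </property>
</configuration>"""


def load_properties(xml_text):
    """Parse Hadoop-style <property> elements into a name -> value dict."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.findall("property")}


def referenced_hosts(props):
    """Collect hostnames named by hbase.rootdir and hbase.zookeeper.quorum."""
    hosts = set()
    rootdir = props.get("hbase.rootdir")
    if rootdir:
        host = urlparse(rootdir).hostname
        if host:
            hosts.add(host)
    for entry in props.get("hbase.zookeeper.quorum", "").split(","):
        entry = entry.strip()
        if entry:
            hosts.add(entry.split(":", 1)[0])  # tolerate an optional host:port form
    return sorted(hosts)


if __name__ == "__main__":
    print(referenced_hosts(load_properties(HBASE_SITE)))
```

The configuration only names master, slave1, and slave2. The RegionServer log further down, however, reports the master as 13f1b9519c3f, so the container hostnames matter as well as the names in this file.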


Symptoms:
The Hadoop cluster starts successfully, as shown below.


jps output on the master node:
841 SecondaryNameNode
5436 Jps
670 NameNode
990 ResourceManager


jps output on a slave node:

582 NodeManager
2360 Jps
521 DataNode


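HBase on master manages the ZooKeeper cluster, and the cluster is started from HBase on master.
ZooKeeper starts successfully and the jps output looks correct, but HRegionServer cannot connect to HMaster.
The HBase web UI shows that no RegionServer has connected.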

jps output on master:
5890 Jps
5623 HQuorumPeer
841 SecondaryNameNode
5690 HMaster
670 NameNode
990 ResourceManager


jps output on a slave:
2610 Jps
582 NodeManager
2407 HQuorumPeer
521 DataNode
2475 HRegionServer



The master log shows:
2016-12-21 10:54:32,971 INFO  [13f1b9519c3f:16000.activeMasterManager] master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 580884 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2016-12-21 10:54:34,521 INFO [13f1b9519c3f:16000.activeMasterManager] master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 582434 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2016-12-21 10:54:36,041 INFO [13f1b9519c3f:16000.activeMasterManager] master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 583954 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2016-12-21 10:54:37,560 INFO [13f1b9519c3f:16000.activeMasterManager] master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 585473 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2016-12-21 10:54:39,074 INFO [13f1b9519c3f:16000.activeMasterManager] master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 586987 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.


The slave log shows:
2016-12-21 10:54:49,104 WARN  [regionserver/604bd3dbcad6/172.17.0.9:16020] regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
2016-12-21 10:54:52,177 INFO [regionserver/604bd3dbcad6/172.17.0.9:16020] regionserver.HRegionServer: reportForDuty to master=13f1b9519c3f,16000,1482288284770 with port=16020, startcode=1482288327260
2016-12-21 10:54:55,225 WARN [regionserver/604bd3dbcad6/172.17.0.9:16020] regionserver.HRegionServer: error telling master we are up
com.google.protobuf.ServiceException: java.net.NoRouteToHostException: No route to host
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:240)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
    at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$BlockingStub.regionServerStartup(RegionServerStatusProtos.java:8982)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2296)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:906)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.NoRouteToHostException: No route to host
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupConnection(RpcClientImpl.java:416)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:722)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:906)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:873)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1241)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
    ... 5 more
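
The trace above fails inside a plain TCP connect, so the same failure should be reproducible outside HBase. Below is a minimal Python sketch for probing a host and port from inside a slave container. It is not from the original post: `probe` is a hypothetical helper, and the mapping from the OS error EHOSTUNREACH to Java's NoRouteToHostException is stated here as the usual behaviour rather than something the logs confirm.

```python
import errno
import socket


def probe(host, port, timeout=3.0):
    """Attempt a TCP connect and return a short label for the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "reachable"
    except socket.gaierror:
        # The name could not be resolved at all.
        return "name does not resolve"
    except socket.timeout:
        return "timed out"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "connection refused"
    except OSError as exc:
        if exc.errno == errno.EHOSTUNREACH:
            # Typically what Java reports as NoRouteToHostException.
            return "no route to host"
        if exc.errno == errno.ENETUNREACH:
            return "network unreachable"
        return f"error: {exc}"
```

Run from a slave container, `probe("13f1b9519c3f", 16000)` and `probe("master", 16000)` would exercise the two names for the master that appear in the log and in the hosts file. A different result for the two names would point at name resolution rather than at HBase itself.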


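My own view:
My guess is that this is a communication problem. The image I deployed to Rancher worked without any issue when I tested it locally, but it fails once deployed on Rancher. At first I suspected network latency and added timeout settings to the HBase configuration, but that did not help. Most of the material I found says it is a firewall problem, but there is no firewall service inside the containers, so there is nothing to turn off. As a beginner I have run out of ideas. I would appreciate any pointers. Thanks to all the experts.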
3 replies
book_reinforce 2017-02-15
This is definitely the firewall blocking the port.
winds_xp 2017-02-03
I think you may be using the wrong firewall command. Try turning the firewall on, check its status, then turn it off again and see.
TimeStarRipple 2016-12-21
Bump! If anyone has ideas, please leave a reply. Thanks.
