NameNode fails to start
Hello,
I have 5 machines running CentOS 7 with Hadoop 2.8.3. Two are NameNodes and the other three are DataNodes. Details in the table below:
HostName          Software                Processes
kencentos1        JDK, Hadoop             NameNode, ZKFC (active), ResourceManager
kencentos2        JDK, Hadoop             NameNode, ZKFC (standby), ResourceManager
kencentosClient1  JDK, Hadoop, ZooKeeper  QuorumPeerMain (ZooKeeper), JournalNode, DataNode, NodeManager
kencentosClient2  JDK, Hadoop, ZooKeeper  QuorumPeerMain (ZooKeeper), JournalNode, DataNode, NodeManager
kencentosClient3  JDK, Hadoop, ZooKeeper  QuorumPeerMain (ZooKeeper), JournalNode, DataNode, NodeManager
After configuring everything according to the documentation, my core-site.xml looks like this:
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop/data</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://kenns</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>kencentosClient1:2181,kencentosClient3:2181,kencentosClient2:2181</value>
  </property>
</configuration>
And hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.block.size</name>
    <value>64M</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>kenns</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.kenns</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.kenns.nn1</name>
    <value>kencentos1:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.kenns.nn1</name>
    <value>kencentos1:50070</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.kenns.nn2</name>
    <value>kencentos2:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.kenns.nn1</name>
    <value>kencentos2:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://kencentosClient1:8485;kencentosClient2:8485;kencentosClient3:8485/kenns</value>
  </property>
  <!-- JournalNode data location -->
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/opt/hadoop/journal</value>
  </property>
  <!-- automatic NameNode failover -->
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <!-- client failover proxy -->
  <property>
    <name>dfs.client.failover.proxy.provider.kenns</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>
      sshfence
      shell(/bin/true)
    </value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
</configuration>
The slaves file and /etc/hosts are configured on every node, passwordless SSH login is set up, and all machines can reach one another. The five machines are VMs running on a single physical host.
After formatting the NameNode and starting up, the NameNode that does start keeps reporting that port 50070 of the second NameNode is in use. The log on the other NameNode says it was never formatted.
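For context (since one NameNode complains it was never formatted), my understanding of the standard first-start sequence for a Hadoop 2.x HA cluster is roughly the following; the command names come from the stock Hadoop scripts, and it is possible I deviated from this order:

```shell
# on each JournalNode host (kencentosClient1..3)
hadoop-daemon.sh start journalnode

# on kencentos1: format the namespace and start the first NameNode
hdfs namenode -format
hadoop-daemon.sh start namenode

# on kencentos2: copy the freshly formatted metadata from nn1
# with -bootstrapStandby instead of running -format a second time
hdfs namenode -bootstrapStandby
hadoop-daemon.sh start namenode

# once, from either NameNode: initialize the HA state in ZooKeeper
hdfs zkfc -formatZK
```

These commands only make sense on the cluster nodes themselves, so I can't show their output here.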
Here is the error from the first NameNode:
2018-01-09 12:34:31,773 INFO org.apache.hadoop.http.HttpServer2: HttpServer.start() threw a non Bind IOException
java.net.BindException: Port in use: kencentos2:50070
at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:998)
at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:935)
at org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:171)
at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:842)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:693)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:906)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:885)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1626)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1694)
Caused by: java.net.BindException: Cannot assign requested address
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
And the error from the second NameNode:
2018-01-09 12:34:32,747 WARN org.apache.hadoop.hdfs.server.common.Storage: Storage directory /opt/hadoop/data/dfs/name does not exist
2018-01-09 12:34:32,748 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /opt/hadoop/data/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:369)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:220)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1044)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:707)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:635)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:696)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:906)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:885)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1626)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1694)
The hosts file:
127.0.0.1 localhost.localdomain localhost
192.168.1.237 kencentosClient2
192.168.1.221 kencentosClient1
192.168.1.248 kencentos1
192.168.1.252 kencentos2
192.168.1.217 kencentosClient3
0.0.0.0 kencentos2
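One thing I notice is that kencentos2 appears twice above (once as 192.168.1.252 and once as 0.0.0.0). A quick sketch to flag any hostname with more than one entry, run against a pasted copy of the file (the /tmp path is just for illustration):

```shell
# save a copy of the /etc/hosts contents shown above
cat > /tmp/hosts.sample <<'EOF'
127.0.0.1 localhost.localdomain localhost
192.168.1.237 kencentosClient2
192.168.1.221 kencentosClient1
192.168.1.248 kencentos1
192.168.1.252 kencentos2
192.168.1.217 kencentosClient3
0.0.0.0 kencentos2
EOF

# count how many lines mention each hostname; print any seen more than once
awk '{for (i = 2; i <= NF; i++) count[$i]++}
     END {for (h in count) if (count[h] > 1) print h, count[h]}' /tmp/hosts.sample
# prints: kencentos2 2
```

So kencentos2 is the only hostname with two mappings, and which one wins may differ between nodes.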
Any help would be much appreciated. Thanks!