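When Hadoop is deployed on Windows under Cygwin, differences in how the two environments handle directory links cause a FileNotFoundException.
Submitting and running the bundled WordCount example produces the following errors: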
2013-12-05 17:11:21,956 WARN org.apache.hadoop.mapred.TaskRunner: attempt_201312051709_0001_m_000002_1 : Child Error
java.io.IOException: Task process exit with nonzero status of -1.
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
2013-12-05 17:11:22,221 WARN org.apache.hadoop.mapred.TaskLog: Failed to retrieve stdout log for task: attempt_201312051709_0001_m_000002_0
java.io.FileNotFoundException: D:\hadoop\logs\userlogs\job_201312051709_0001\attempt_201312051709_0001_m_000002_0\stdout (系统找不到指定的路径。)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:120)
at org.apache.hadoop.io.SecureIOUtils.openForRead(SecureIOUtils.java:102)
at org.apache.hadoop.mapred.TaskLog$Reader.<init>(TaskLog.java:455)
at org.apache.hadoop.mapred.TaskLogServlet.printTaskLog(TaskLogServlet.java:81)
at org.apache.hadoop.mapred.TaskLogServlet.doGet(TaskLogServlet.java:296)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:914)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
After a lot of searching, I found that this is the same problem as the one reported here:
http://lucene.472066.n3.nabble.com/In-cygwin-hadoop-throws-exception-when-running-wordcount-td3863923.html
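My own analysis:
mapred.child.tmp is set to: /cygdrive/d/hadoop/tmp
Because the Windows JVM does not understand the Cygwin /cygdrive prefix, the following directory is created at run time: D:\cygdrive\d\hadoop\tmp\mapred\local\userlogs\job_201312051709_0001\attempt_201312051709_0001_m_000002_0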
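The path mix-up above can be illustrated with a minimal Java sketch. Both helpers (`asSeenByWindowsJvm` and `toWindowsPath`) are hypothetical names introduced here for illustration and are not part of Hadoop or Cygwin. The first assumes a Windows JVM roots a path with a leading slash on the current drive, which is consistent with the directory observed above. The second shows the translation Cygwin itself would perform. Whether configuring a native Windows path avoids the exception has not been verified here.

```java
public class CygdrivePath {
    private static final String PREFIX = "/cygdrive/";

    // Hypothetical helper: approximates how a Windows JVM whose current
    // drive is `currentDrive` interprets a POSIX-style path. A leading
    // slash is assumed to mean "root of the current drive".
    public static String asSeenByWindowsJvm(String posixPath, char currentDrive) {
        String converted = posixPath.replace('/', '\\');
        if (posixPath.startsWith("/")) {
            return Character.toUpperCase(currentDrive) + ":" + converted;
        }
        return converted;
    }

    // Hypothetical helper: the translation Cygwin applies to
    // /cygdrive/<letter>/... paths. Other paths are returned unchanged.
    public static String toWindowsPath(String posixPath) {
        if (posixPath.startsWith(PREFIX) && posixPath.length() > PREFIX.length()) {
            char drive = posixPath.charAt(PREFIX.length());
            String rest = posixPath.substring(PREFIX.length() + 1);
            if (Character.isLetter(drive) && (rest.isEmpty() || rest.startsWith("/"))) {
                String tail = rest.isEmpty() ? "\\" : rest.replace('/', '\\');
                return Character.toUpperCase(drive) + ":" + tail;
            }
        }
        return posixPath;
    }

    public static void main(String[] args) {
        String configured = "/cygdrive/d/hadoop/tmp";
        System.out.println("JVM sees:    " + asSeenByWindowsJvm(configured, 'd'));
        System.out.println("Cygwin means: " + toWindowsPath(configured));
    }
}
```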
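The logs directory holds a link to it, created by Cygwin:
D:\hadoop\logs\userlogs\job_201312051709_0001\attempt_201312051709_0001_m_000002_0
However, at run time the JDK treats the entry under logs (D:\hadoop\logs\userlogs\job_201312051709_0001\attempt_201312051709_0001_m_000002_0) as a regular file rather than a directory, so opening the stdout file beneath it fails with this exception.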
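One way to confirm this diagnosis is to ask the JVM directly how it sees the link. The sketch below is an illustration, not part of Hadoop. It assumes Cygwin's default symlink format: a plain file with the SYSTEM attribute whose content begins with the cookie `!<symlink>`. The class name `CygwinLinkCheck` and the method `looksLikeCygwinSymlink` are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

public class CygwinLinkCheck {
    // Assumed magic cookie at the start of a default Cygwin symlink file.
    private static final byte[] COOKIE =
            "!<symlink>".getBytes(StandardCharsets.US_ASCII);

    // Returns true if the JVM sees `p` as a regular file whose first
    // bytes match the assumed Cygwin symlink cookie.
    public static boolean looksLikeCygwinSymlink(Path p) throws IOException {
        if (!Files.isRegularFile(p)) {
            return false;
        }
        byte[] head = new byte[COOKIE.length];
        int filled = 0;
        try (InputStream in = Files.newInputStream(p)) {
            // Read until the buffer is full or the file ends.
            while (filled < head.length) {
                int n = in.read(head, filled, head.length - filled);
                if (n < 0) {
                    break;
                }
                filled += n;
            }
        }
        return filled == COOKIE.length && Arrays.equals(head, COOKIE);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java CygwinLinkCheck <path>");
            return;
        }
        Path p = Paths.get(args[0]);
        System.out.println("isDirectory:   " + Files.isDirectory(p));
        System.out.println("isRegularFile: " + Files.isRegularFile(p));
        System.out.println("cygwin link?:  " + looksLikeCygwinSymlink(p));
    }
}
```

Running it against the attempt_... entry under D:\hadoop\logs\userlogs should show whether the JVM sees a directory or a regular file. Reading the file may require sufficient permissions, since Cygwin marks it with the SYSTEM attribute.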
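Probably everyone who runs Hadoop under Cygwin runs into this problem, but after going through the material I could find, I still have not found a good solution.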