Spark invoked from a web app in yarn-client mode fails with FileNotFoundException: __spark_libs__*.zip

bfz0d003 2017-05-18 12:49:23
I call the Spark API from Java in yarn-client mode. The server environment uses Tomcat 8 and jobs are triggered through the web. The Tomcat log reports the following error (a sketch of what such a submission might look like follows the log):
http-nio-8080-exec-3 2017-05-17 23:55:16-[INFO] [org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:174)] Submitted application application_1495078972652_0002 to ResourceManager at /0.0.0.0:8032
http-nio-8080-exec-3 2017-05-17 23:55:16-[INFO] [org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)] Starting Yarn extension services with app application_1495078972652_0002 and attemptId None
http-nio-8080-exec-3 2017-05-17 23:55:17-[INFO] [org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)] Application report for application_1495078972652_0002 (state: FAILED)
http-nio-8080-exec-3 2017-05-17 23:55:17-[INFO] [org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)]
client token: N/A
diagnostics: Application application_1495078972652_0002 failed 2 times due to AM Container for appattempt_1495078972652_0002_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://dev1:8088/cluster/app/application_1495078972652_0002Then, click on links to logs of each attempt.
Diagnostics: File file:/usr/local/tomcat/temp/spark-aae1afc9-5738-4e5b-ae29-f8935adf53b8/__spark_libs__1867469074993542381.zip does not exist
java.io.FileNotFoundException: File file:/usr/local/tomcat/temp/spark-aae1afc9-5738-4e5b-ae29-f8935adf53b8/__spark_libs__1867469074993542381.zip does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:611)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:824)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:601)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
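
The post does not include the Java code that performs the submission. Purely as a hedged illustration, a minimal yarn-client submission from inside a web application on Spark 2.x might look like the sketch below; the class name, app name, and sample job are assumptions, not the original code.

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

// Minimal sketch only -- the original submission code is not shown in the post.
public class SparkJobTrigger {
    public void run() {
        SparkConf conf = new SparkConf()
                .setAppName("web-triggered-job")             // hypothetical app name
                .setMaster("yarn")                           // Spark 2.x: master "yarn" ...
                .set("spark.submit.deployMode", "client");   // ... plus client deploy mode = yarn-client
        // The Tomcat JVM needs HADOOP_CONF_DIR/YARN_CONF_DIR (or core-site.xml/yarn-site.xml
        // on its classpath); without them Spark may resolve the default filesystem as file:///
        // when it stages __spark_libs__*.zip for the YARN containers.
        JavaSparkContext sc = new JavaSparkContext(conf);
        try {
            // Trivial sample job, stands in for the real workload.
            long count = sc.parallelize(java.util.Arrays.asList(1, 2, 3)).count();
            System.out.println("count = " + count);
        } finally {
            sc.stop();
        }
    }
}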


spark-env.sh is as follows:

export SPARK_LOCAL_IP=dev1
export SPARK_LOCAL_DIRS=/usr/local/spark/local_dirs
export SPARK_CLASSPATH=$SPARK_CLASSPATH:${SPARK_HOME}/lib/
export HADOOP_CONF_DIR=/usr/local/hadoop/hadoop-2.7.3/etc/hadoop
export SPARK_EXECUTOR_INSTANCES=3
export SPARK_EXECUTOR_CORES=2
export SPARK_EXECUTOR_MEMORY=1G
export SPARK_DRIVER_MEMORY=2G
export SCALA_HOME=/usr/local/scala
export JAVA_HOME=/usr/local/java/jdk1.8.0_121
export SPARK_MASTER_IP=dev1
export SPARK_MASTER_WEBUI_PORT=8090
export SPARK_WORKER_WEBUI_PORT=8099
export SPARK_WORKER_MEMORY=8g


spark-defaults.conf is as follows:

spark.master spark://dev1:7077
spark.eventLog.enabled true
spark.eventLog.dir hdfs://dev1:9000/spark/eventlogs
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.executor.memory 2g
spark.executor.extraJavaOptions -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
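
None of the following lines appear in the original spark-defaults.conf; they are shown only as a hedged illustration of the Spark-on-YARN settings that govern where __spark_libs__*.zip is staged, and the HDFS paths are assumptions:

# Illustration only -- not part of the original configuration.
# Pre-stage the Spark jars so each submission does not repackage __spark_libs__*.zip:
spark.yarn.archive      hdfs://dev1:9000/spark/spark-libs.zip
# Or list the jars individually instead:
# spark.yarn.jars       hdfs://dev1:9000/spark/jars/*.jar
# Per-application staging directory (defaults to the submitting user's home directory
# on the default filesystem):
spark.yarn.stagingDir   hdfs://dev1:9000/user/spark/.sparkStaging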


The error says that the __spark_libs__*.zip file cannot be found under the Tomcat temp directory. I can find this file on the master (Tomcat runs on the master), but the other nodes do not have Tomcat, and the error is reported from those other nodes. Why does this happen when running against the cluster? Any advice would be appreciated.
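
For what it's worth (this check is hypothetical and not part of the original post), the failing path starts with file:/, which would be consistent with the submitting JVM staging __spark_libs__*.zip on the master's local disk rather than on a filesystem the NodeManagers on the other hosts can read. A small check like the following, run inside the same Tomcat JVM, would show which default filesystem that JVM resolves:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

// Hypothetical diagnostic, not from the original post: print the default filesystem
// seen by the JVM that submits the Spark job.
public class CheckDefaultFs {
    public static void main(String[] args) throws Exception {
        Configuration hadoopConf = new Configuration();
        // If core-site.xml is not visible to this JVM, fs.defaultFS resolves to file:///
        // instead of hdfs://dev1:9000 as configured on the cluster.
        System.out.println("fs.defaultFS = " + hadoopConf.get("fs.defaultFS"));
        System.out.println("default FileSystem URI = " + FileSystem.get(hadoopConf).getUri());
    }
}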