s3distcp fails | multiple RuntimeExceptions

woshiitsuperman 2014-10-21 03:35:23
We are trying to move 1000 files, about 100 MB in total, from one S3 bucket to another (i.e., a transfer within the same region). For testing purposes, we spun up an EMR cluster with five X-Large instances and used s3distcp to copy the 1000 files.
We keep hitting runtime exceptions, which eventually cause the job to fail.
We run s3distcp with the following command:
hadoop jar /home/hadoop/lib/emr-s3distcp-1.0.jar --src s3://<source-bucket>/srcdistcp --dest s3://<destination-bucket>/destdistcp
We then see the following exceptions:
14/04/03 11:36:38 INFO mapred.JobClient: Task Id : attempt_201404031115_0004_r_000009_0, Status : FAILED
java.lang.RuntimeException: Reducer task failed to copy 1 files: s3://<source-bucket>/testdistcp/sampletestfileversion1sequence19 etc
at com.amazon.elasticmapreduce.s3distcp.CopyFilesReducer.close(CopyFilesReducer.java:75)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:538)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:429)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
at org.apache.hadoop.mapred.Child.main(Child.java:249)

attempt_201404031115_0004_r_000009_0: log4j:WARN No appenders could be found for logger (org.apache.hadoop.hdfs.DFSClient).
attempt_201404031115_0004_r_000009_0: log4j:WARN Please initialize the log4j system properly.
attempt_201404031115_0004_r_000009_0: log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
14/04/03 11:36:39 INFO mapred.JobClient: map 100% reduce 52%
14/04/03 11:36:41 INFO mapred.JobClient: Task Id : attempt_201404031115_0004_r_000006_0, Status : FAILED
java.lang.RuntimeException: Reducer task failed to copy 1 files: s3://<source-bucket>/testdistcp/sampletestfileversion1sequence16 etc
at com.amazon.elasticmapreduce.s3distcp.CopyFilesReducer.close(CopyFilesReducer.java:75)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:538)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:429)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
What could be going wrong here? Any suggestions?
johnleewokao 2014-10-22
The error described here can have several causes. For example, EMR AMI versions 3.2.0 and 3.2.1 both have a bug that causes Hadoop to be misconfigured to use the root instance volume. To fix this on those AMI versions, add the following bootstrap action when launching the cluster (a sketch follows below):
s3://support.elasticmapreduce/bootstrapactions/ami/3.2.1/CheckandFixMisconfiguredMounts.bash
This issue is quite old and has since been resolved; see http://stackoverflow.com/questions/14631152/copy-files-from-amazon-s3-to-hdfs-using-s3distcp-fails for a related case. If users on current EMR versions still see these errors, please provide an example job flow.
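For reference, a minimal sketch of launching such a cluster with that bootstrap action applied, using the AWS CLI of the AMI-version era; the cluster name, instance type, instance count, and step arguments are illustrative assumptions, not taken from the original post:

# Launch an AMI 3.2.1 cluster with the mount-fix bootstrap action,
# then run s3distcp as a custom JAR step (buckets/prefixes are placeholders).
aws emr create-cluster \
  --name "s3distcp-test" \
  --ami-version 3.2.1 \
  --instance-type m1.xlarge \
  --instance-count 5 \
  --bootstrap-actions Path=s3://support.elasticmapreduce/bootstrapactions/ami/3.2.1/CheckandFixMisconfiguredMounts.bash \
  --steps Type=CUSTOM_JAR,Name=s3distcp,ActionOnFailure=CONTINUE,Jar=/home/hadoop/lib/emr-s3distcp-1.0.jar,Args=[--src,s3://<source-bucket>/srcdistcp,--dest,s3://<destination-bucket>/destdistcp]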
