Tomcat + Spring for Hadoop: "Mapper class not found" when running a MapReduce job

lycankiss 2014-02-13 06:09:46
If I package the MR program into a JAR and run it on the master node, everything works. But I am building a Java web project, which is why I'm using Spring for Hadoop: I want to invoke the MR job directly, like any other Java class, without having to pull the MR code out and package it into a JAR each time.
Hadoop and Tomcat are already talking to each other — HDFS operations such as file upload and delete work fine. But MR fails. The job does run, but it reports the following error:

2014-02-13 17:50:30,616 WARN [http-8080-1] conf.Configuration (Configuration.java:warnOnceIfDeprecated(981)) - mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
2014-02-13 17:50:30,617 WARN [http-8080-1] conf.Configuration (Configuration.java:warnOnceIfDeprecated(981)) - mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
2014-02-13 17:50:30,825 INFO [http-8080-1] mapreduce.JobSubmitter (JobSubmitter.java:printTokens(454)) - Submitting tokens for job: job_1392171916215_0091
2014-02-13 17:50:31,246 INFO [http-8080-1] mapred.YARNRunner (YARNRunner.java:createApplicationSubmissionContext(367)) - Job jar is not present. Not adding any jar to the list of resources.
2014-02-13 17:50:31,290 INFO [http-8080-1] client.YarnClientImpl (YarnClientImpl.java:submitApplication(124)) - Submitted application application_1392171916215_0091 to ResourceManager at HADOOP1/172.16.0.137:9080
2014-02-13 17:50:31,354 INFO [http-8080-1] mapreduce.Job (Job.java:submit(1273)) - The url to track the job: http://HADOOP1:8088/proxy/application_1392171916215_0091/
2014-02-13 17:50:31,355 INFO [http-8080-1] mapreduce.Job (Job.java:monitorAndPrintJob(1318)) - Running job: job_1392171916215_0091
2014-02-13 17:50:37,880 INFO [http-8080-1] mapreduce.Job (Job.java:monitorAndPrintJob(1339)) - Job job_1392171916215_0091 running in uber mode : false
2014-02-13 17:50:37,882 INFO [http-8080-1] mapreduce.Job (Job.java:monitorAndPrintJob(1346)) - map 0% reduce 0%
2014-02-13 17:50:41,953 INFO [http-8080-1] mapreduce.Job (Job.java:printTaskEvents(1425)) - Task Id : attempt_1392171916215_0091_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.springforhadoop.mapReduce.MyMapper not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1774)
    at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:196)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:715)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:338)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:160)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:155)
Caused by: java.lang.ClassNotFoundException: Class com.springforhadoop.mapReduce.MyMapper not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1680)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1772)
    ... 8 more

[The same ClassNotFoundException stack trace repeats for attempt_1392171916215_0091_m_000000_1 (17:50:46) and attempt_1392171916215_0091_m_000000_2 (17:50:50).]

2014-02-13 17:52:32,879 INFO [http-8080-1] mapreduce.Job (Job.java:monitorAndPrintJob(1346)) - map 100% reduce 0%
2014-02-13 17:52:32,885 INFO [http-8080-1] mapreduce.Job (Job.java:monitorAndPrintJob(1359)) - Job job_1392171916215_0091 failed with state FAILED due to: Task failed task_1392171916215_0091_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

2014-02-13 17:52:33,009 INFO [http-8080-1] mapreduce.Job (Job.java:monitorAndPrintJob(1364)) - Counters: 6
Job Counters
Failed map tasks=4
Launched map tasks=4
Other local map tasks=3
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=10238
Total time spent by all reduces in occupied slots (ms)=0


So every attempt fails, and the root cause is simply that the Mapper class cannot be found. I even moved the Mapper and Reducer classes into the same file as the job class, and I call job.setJarByClass(HellozyxMapRed.class) as suggested online, but it still doesn't work. Any advice would be appreciated.
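The "Job jar is not present" line in the log above is the real clue: setJarByClass works by asking the classloader where the given class was loaded from and shipping that jar to the cluster. Inside Tomcat, the MR classes are loaded from the unpacked WEB-INF/classes directory, not from a jar, so there is nothing to ship and the YARN task JVMs never see MyMapper. A minimal, self-contained sketch of that lookup (the class name here is illustrative, not from the poster's project):

```java
import java.security.CodeSource;

public class JarLocator {
    // Roughly what Hadoop's setJarByClass does internally: ask where a
    // class was loaded from. A class packaged in a jar reports a path
    // ending in ".jar"; a class in an unpacked directory (e.g. Tomcat's
    // WEB-INF/classes) reports a directory path, which Hadoop cannot
    // ship to the cluster -- hence "Job jar is not present".
    public static String locationOf(Class<?> clazz) {
        CodeSource src = clazz.getProtectionDomain().getCodeSource();
        return src == null ? null : src.getLocation().toString();
    }

    public static void main(String[] args) {
        // For a locally compiled class this prints the classpath entry
        // (a directory or a jar) that the class came from.
        System.out.println("JarLocator loaded from: " + locationOf(JarLocator.class));
    }
}
```

If this prints a directory rather than a .jar path for your Mapper class, setJarByClass alone cannot help, no matter how many times it is called.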


Here is the source of the MapReduce class:
package com.springforhadoop.mapReduce;

import java.io.IOException;
import java.util.Iterator;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;
import org.springframework.stereotype.Service;
import org.springframework.test.context.ContextConfiguration;

@ContextConfiguration("spring/spring3mvc-servlet.xml")
@Service
public class HellozyxMapRed {

    private static final Log tLog = LogFactory.getLog(HellozyxMapRed.class);
    private static final Configuration conf = new Configuration();

    @Autowired
    private MyInputFormat myInputFormat;

    public MyInputFormat getMyInputFormat() {
        return myInputFormat;
    }

    public void setMyInputFormat(MyInputFormat myInputFormat) {
        this.myInputFormat = myInputFormat;
    }

    public void HellozyxMapRedRun() throws IOException, InterruptedException, ClassNotFoundException {

        Path file = new Path("upload/mobile2.txt");
        Path outFile = new Path("upload/countResult");
        Job job = Job.getInstance(conf, "TxtCounter");
        // setJarByClass keeps only the last value set, so calling it three
        // times in a row is redundant; one call covers every class that
        // lives in the same jar.
        job.setJarByClass(HellozyxMapRed.class);
        FileInputFormat.addInputPath(job, file);
        FileOutputFormat.setOutputPath(job, outFile);

        job.setMapperClass(MyMapper.class);
        job.setReducerClass(MyReduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.waitForCompletion(true);
    }
}

class MyMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    // Was LogFactory.getLog(MyMapper1.class), which does not compile:
    // there is no MyMapper1 in this file.
    private static final Log tLog = LogFactory.getLog(MyMapper.class);

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        tLog.info("_+_+_+_+_+_+ now is map");
        String[] strs = value.toString().split(",");
        for (String str : strs) {
            context.write(new Text(str), new IntWritable(1));
        }
    }
}

class MyReduce extends Reducer<Text, IntWritable, Text, IntWritable> {
    // Was LogFactory.getLog(MyReduce1.class); fixed to reference this class.
    private static final Log tLog = LogFactory.getLog(MyReduce.class);

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        tLog.info("++++++++++++++++++start reduce job");
        int sum = 0;

        Iterator<IntWritable> it = values.iterator();
        while (it.hasNext()) {
            IntWritable value = it.next();
            sum += value.get();
            tLog.info("key is " + key + " sum is " + sum);
        }
        context.write(key, new IntWritable(sum));
    }
}
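One commonly suggested way around the missing job jar (a sketch only, not verified against this setup) is to also package the Mapper/Reducer classes into a standalone jar deployed alongside the webapp, and point the job at it explicitly — either with job.setJar("/path/to/mr-classes.jar") in code before submission, or via the jar attribute of Spring for Hadoop's job element. The id, paths, and jar name below are all hypothetical:

```xml
<!-- Hypothetical sketch: mr-classes.jar must contain MyMapper/MyReduce
     and be readable by the Tomcat process; id, jar path, and input/output
     paths are illustrative only. -->
<hdp:job id="txtCounterJob"
         input-path="upload/mobile2.txt"
         output-path="upload/countResult"
         jar="/opt/tomcat/webapps/myapp/WEB-INF/mr/mr-classes.jar"
         mapper="com.springforhadoop.mapReduce.MyMapper"
         reducer="com.springforhadoop.mapReduce.MyReduce"/>
```

Either route gives YARN a real jar file to localize onto the worker nodes, which is what the failing tasks are missing.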
5 replies
新世界的海贼 2015-04-15
OP, did you ever solve this? How should it be configured so the job can run directly, without packaging a jar and uploading it to the cluster?
lycankiss 2014-02-18
Do you have QQ? Mine is 157715022 — add me and we can chat. I set this up with Spring for Hadoop. Here is my Spring configuration file (the opening <beans> tag reconstructed from the schema locations that survived the paste):

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:context="http://www.springframework.org/schema/context"
       xmlns:mvc="http://www.springframework.org/schema/mvc"
       xmlns:hdp="http://www.springframework.org/schema/hadoop"
       xsi:schemaLocation="
           http://www.springframework.org/schema/beans
           http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
           http://www.springframework.org/schema/context
           http://www.springframework.org/schema/context/spring-context-3.0.xsd
           http://www.springframework.org/schema/mvc
           http://www.springframework.org/schema/mvc/spring-mvc-3.0.xsd
           http://www.springframework.org/schema/hadoop
           http://www.springframework.org/schema/hadoop/spring-hadoop.xsd">

    <!-- Default package scan paths -->
    <context:component-scan base-package="net.spring.controller" />
    <!-- Enable annotation-driven MVC -->
    <mvc:annotation-driven />
    <context:property-placeholder location="classpath*:*.properties" />
    <context:component-scan base-package="com.ibm.crl" />

    <hdp:configuration id="hadoopConfiguration"
        resources="classpath:/core-site.xml, classpath:/hdfs-site.xml, classpath:/mapred-site.xml, classpath:/yarn-site.xml"/>

    <!-- View resolver prefix/suffix -->
    <bean id="viewResolver" class="org.springframework.web.servlet.view.UrlBasedViewResolver">
        <property name="viewClass" value="org.springframework.web.servlet.view.JstlView" />
        <property name="prefix" value="/" />
        <property name="suffix" value=".jsp" />
    </bean>

    <bean id="multipartResolver" class="org.springframework.web.multipart.commons.CommonsMultipartResolver">
        <property name="maxUploadSize" value="8000000"/>
    </bean>

    <bean id="hadoopDao" class="com.springforhadoop.dao.HadoopDao"/>
    <bean id="myInputFormat" class="com.springforhadoop.mapReduce.MyInputFormat"/>
    <!-- <bean id="myInputSplit" class="com.springforhadoop.mapReduce.MyInputSplit"/> -->
    <!-- <bean id="myMapper" class="com.springforhadoop.mapReduce.MyMapper"/>
         <bean id="myReduce" class="com.springforhadoop.mapReduce.MyReduce"/> -->
    <bean id="hellozyxMapRed" class="com.springforhadoop.mapReduce.HellozyxMapRed">
        <property name="myInputFormat" ref="myInputFormat"/>
    </bean>
</beans>

And here is yarn-site.xml:

<?xml version="1.0"?>
<configuration>
    <!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>HADOOP1:9080</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>HADOOP1:9081</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>HADOOP1:9082</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce.shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.app.mapreduce.am.staging-dir</name>
        <value>/user</value>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
</configuration>
撸大湿 2014-02-17
OK, OP didn't get my point. From your logs you are on a post-2.0 Hadoop release — I can see YARN in there. Go look into how jobs are dispatched under YARN. Also, I don't see any code that loads the classpath — how do your YARN and MR configs get loaded?
lycankiss 2014-02-17
This is the entire mapred.job.tracker-related configuration:

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop1:10020</value>
    </property>
    <property>
        <name>mapreduce.shuffle.port</name>
        <value>8888</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop1:19888</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.intermediate-done-dir</name>
        <value>/user/history/done_intermediate</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.done-dir</name>
        <value>/user/history/done</value>
    </property>
</configuration>
撸大湿 2014-02-13
Have you configured mapred.job.tracker?
