Nutch 2.2.1 error when running in a Hadoop environment

末日周五 2014-11-21 03:03:44
When I run the crawl script, the Solr step fails with the following error:
14/11/20 22:34:49 INFO solr.SolrDeleteDuplicates: SolrDeleteDuplicates: starting...
14/11/20 22:34:49 INFO solr.SolrDeleteDuplicates: SolrDeleteDuplicates: Solr url: http://192.168.83.208:8983/solr/xhnutch
14/11/20 22:34:50 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
14/11/20 22:34:51 INFO mapred.JobClient: Running job: job_201411190341_0076
14/11/20 22:34:52 INFO mapred.JobClient: map 0% reduce 0%
14/11/20 22:35:02 INFO mapred.JobClient: Task Id : attempt_201411190341_0076_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:857)
at org.apache.hadoop.mapreduce.JobContext.getInputFormatClass(JobContext.java:187)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:722)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:855)
... 8 more

14/11/20 22:35:06 INFO mapred.JobClient: Task Id : attempt_201411190341_0076_m_000001_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:857)
at org.apache.hadoop.mapreduce.JobContext.getInputFormatClass(JobContext.java:187)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:722)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:855)
... 8 more

14/11/20 22:35:14 INFO mapred.JobClient: Task Id : attempt_201411190341_0076_m_000000_1, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:857)
at org.apache.hadoop.mapreduce.JobContext.getInputFormatClass(JobContext.java:187)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:722)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:855)
... 8 more

14/11/20 22:35:16 INFO mapred.JobClient: Task Id : attempt_201411190341_0076_m_000001_1, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:857)
at org.apache.hadoop.mapreduce.JobContext.getInputFormatClass(JobContext.java:187)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:722)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:855)
... 8 more

14/11/20 22:35:20 INFO mapred.JobClient: Task Id : attempt_201411190341_0076_m_000000_2, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:857)
at org.apache.hadoop.mapreduce.JobContext.getInputFormatClass(JobContext.java:187)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:722)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:855)
... 8 more

14/11/20 22:35:24 INFO mapred.JobClient: Task Id : attempt_201411190341_0076_m_000001_2, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:857)
at org.apache.hadoop.mapreduce.JobContext.getInputFormatClass(JobContext.java:187)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:722)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)

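However, the class is present in the compiled job file, and the same run works without errors in local mode. I searched on Google and found almost nothing useful. I hope someone can give me an answer.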
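One way to double-check the claim that the class is inside the compiled job file is to look up its entry in the archive. The following is a minimal sketch, assuming the job file is an ordinary zip/jar archive; the class name `JarClassCheck` and its methods are hypothetical helpers, not part of Nutch or Hadoop. Note that a nested class such as `SolrDeleteDuplicates$SolrInputFormat` is stored under a file name that contains the `$`.

```java
import java.io.IOException;
import java.util.jar.JarFile;

public class JarClassCheck {

    /** Converts a binary class name (nested classes use '$') to its entry path in an archive. */
    static String toEntryName(String binaryClassName) {
        return binaryClassName.replace('.', '/') + ".class";
    }

    /** Returns true if the archive at jarPath contains an entry for the given class. */
    static boolean containsClass(String jarPath, String binaryClassName) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(toEntryName(binaryClassName)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: JarClassCheck <path-to-archive> <binary-class-name>");
            return;
        }
        System.out.println(containsClass(args[0], args[1]) ? "found" : "missing");
    }
}
```

If the class is found, the archive itself is probably fine, and the warning "No job jar file set" in the log above becomes the more likely lead: the task JVMs on the cluster may simply never have received the archive.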
Jeelon 2016-08-02
Quoting nihaoaqi's reply (#13):
Hello everyone. On Nutch 2.3 I added job.setJarByClass(SolrDeleteDuplicates.class); and the error changed. Does anyone know the cause? Any help would be appreciated.
Error: java.lang.NullPointerException
at org.apache.hadoop.io.Text.encode(Text.java:450)
at org.apache.hadoop.io.Text.set(Text.java:198)
at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrRecordReader.nextKeyValue(SolrDeleteDuplicates.java:233)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

Has this been solved? How did you fix it? Please share.
nihaoaqi 2015-09-01
Hello everyone. On Nutch 2.3 I added job.setJarByClass(SolrDeleteDuplicates.class); and the error changed. Does anyone know the cause? Any help would be appreciated.
Error: java.lang.NullPointerException
at org.apache.hadoop.io.Text.encode(Text.java:450)
at org.apache.hadoop.io.Text.set(Text.java:198)
at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrRecordReader.nextKeyValue(SolrDeleteDuplicates.java:233)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
sinat_15987731 2015-03-22
Could someone tell me where to make this change?
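skyWalker_ONLY 2014-11-24
Quoting tianjintd2008's reply (#7):
Quoting tianjintd2008's reply (#6):
Quoting sky_walker85's reply (#5):
It works now. After Job job = new Job(getConf(), "solrdedup"); add
job.setJarByClass(SolrDeleteDuplicates.class);
Which file is that in, exactly? Please tell me precisely what to modify.
I found it, thanks! I'll recompile and try it too.
It should work, give it a try. Once you've tested it, please award the points.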
末日周五 2014-11-24
Quoting tianjintd2008's reply (#6):
Quoting sky_walker85's reply (#5):
It works now. After Job job = new Job(getConf(), "solrdedup"); add
job.setJarByClass(SolrDeleteDuplicates.class);
Which file is that in, exactly? Please tell me precisely what to modify.
I found it, thanks! I'll recompile and try it too.
末日周五 2014-11-24
Quoting sky_walker85's reply (#5):
It works now. After Job job = new Job(getConf(), "solrdedup"); add
job.setJarByClass(SolrDeleteDuplicates.class);
Which file is that in, exactly? Please tell me precisely what to modify.
skyWalker_ONLY 2014-11-24
It works now. After Job job = new Job(getConf(), "solrdedup"); add
job.setJarByClass(SolrDeleteDuplicates.class);
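Judging by the class name in the stack trace, the added line presumably belongs in SolrDeleteDuplicates.java, although the thread never names the file explicitly. The reason it helps is that setJarByClass tells Hadoop which archive to ship to the task nodes: it looks up the archive that the given class was loaded from and registers it as the job jar. Without it, the log shows "No job jar file set" and the task JVMs cannot load SolrDeleteDuplicates$SolrInputFormat. The sketch below illustrates that lookup with standard library calls only. It is not Hadoop's actual code, and the names `ContainingJar`, `classResource`, and `jarPathFromUrl` are hypothetical.

```java
import java.net.URL;

public class ContainingJar {

    /** Resource path of a class, for example "java/util/Map$Entry.class". */
    static String classResource(Class<?> cls) {
        return cls.getName().replace('.', '/') + ".class";
    }

    /**
     * Extracts the archive path from a URL of the form
     * jar:file:/path/to/archive!/pkg/Name.class, or returns null when the
     * URL does not point into an archive. Percent-encoded characters are
     * left as they are in this simplified version.
     */
    static String jarPathFromUrl(URL url) {
        if (url == null || !"jar".equals(url.getProtocol())) {
            return null;
        }
        String path = url.getPath();
        if (path.startsWith("file:")) {
            path = path.substring("file:".length());
        }
        int bang = path.indexOf('!');
        return bang >= 0 ? path.substring(0, bang) : path;
    }

    /** Returns the archive a class was loaded from, or null if it was not loaded from one. */
    static String findContainingJar(Class<?> cls) {
        ClassLoader loader = cls.getClassLoader();
        String resource = classResource(cls);
        URL url = (loader != null)
                ? loader.getResource(resource)
                : ClassLoader.getSystemResource(resource);
        return jarPathFromUrl(url);
    }

    public static void main(String[] args) {
        // Prints null when this class runs from a plain classes directory.
        System.out.println(findContainingJar(ContainingJar.class));
    }
}
```

This also suggests why local mode never showed the problem: there the job runs inside the client JVM, where the class is already on the classpath, so nothing has to be shipped.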
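skyWalker_ONLY 2014-11-24
Quoting tianjintd2008's reply (#3):
Quoting sky_walker85's reply (#2):
Have you found a solution?
Not yet. I compared the lib directories of local and deploy and found that local ships hadoop-core 1.2.0, while my cluster runs 1.2.1. Local mode runs without errors, so I want to check whether the cluster version is the problem. I'm testing that now.
I modified the source code of that class and am testing it after recompiling. Let's tell each other when we have results.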
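末日周五 2014-11-24
Quoting sky_walker85's reply (#2):
Have you found a solution?
Not yet. I compared the lib directories of local and deploy and found that local ships hadoop-core 1.2.0, while my cluster runs 1.2.1. Local mode runs without errors, so I want to check whether the cluster version is the problem. I'm testing that now.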
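The lib comparison described above can be automated. The following is a rough sketch, assuming both lib directories have been extracted to ordinary folders; the class name `LibDiff` and its methods are hypothetical, and the two directory paths are supplied by the user. In this thread the version difference turned out not to be the cause, but a listing like this makes such a hypothesis quick to check.

```java
import java.io.File;
import java.util.Set;
import java.util.TreeSet;

public class LibDiff {

    /** Names of the .jar files directly inside dir (empty if dir is missing or unreadable). */
    static Set<String> jarNames(File dir) {
        Set<String> names = new TreeSet<>();
        String[] entries = dir.list();
        if (entries != null) {
            for (String name : entries) {
                if (name.endsWith(".jar")) {
                    names.add(name);
                }
            }
        }
        return names;
    }

    /** Names that appear in first but not in second. */
    static Set<String> onlyIn(Set<String> first, Set<String> second) {
        Set<String> diff = new TreeSet<>(first);
        diff.removeAll(second);
        return diff;
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: LibDiff <lib-dir-a> <lib-dir-b>");
            return;
        }
        Set<String> a = jarNames(new File(args[0]));
        Set<String> b = jarNames(new File(args[1]));
        System.out.println("only in " + args[0] + ": " + onlyIn(a, b));
        System.out.println("only in " + args[1] + ": " + onlyIn(b, a));
    }
}
```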
skyWalker_ONLY 2014-11-24
Have you found a solution?
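末日周五 2014-11-24
Quoting sky_walker85's reply (#10):
Quoting tianjintd2008's reply (#9):
Quoting sky_walker85's reply (#8):
Quoting tianjintd2008's reply (#7):
Quoting tianjintd2008's reply (#6):
Quoting sky_walker85's reply (#5):
It works now. After Job job = new Job(getConf(), "solrdedup"); add
job.setJarByClass(SolrDeleteDuplicates.class);
Which file is that in, exactly? Please tell me precisely what to modify.
I found it, thanks! I'll recompile and try it too.
It should work, give it a try. Once you've tested it, please award the points.
I've finished testing and it really does work. Thank you! Please accept the 100 points, modest as they are. Also, could I add you on QQ?
qq:1044610527
Well... the verification prompt asks for your name, so... may I ask what it is?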
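skyWalker_ONLY 2014-11-24
Quoting tianjintd2008's reply (#9):
Quoting sky_walker85's reply (#8):
Quoting tianjintd2008's reply (#7):
Quoting tianjintd2008's reply (#6):
Quoting sky_walker85's reply (#5):
It works now. After Job job = new Job(getConf(), "solrdedup"); add
job.setJarByClass(SolrDeleteDuplicates.class);
Which file is that in, exactly? Please tell me precisely what to modify.
I found it, thanks! I'll recompile and try it too.
It should work, give it a try. Once you've tested it, please award the points.
I've finished testing and it really does work. Thank you! Please accept the 100 points, modest as they are. Also, could I add you on QQ?
qq:1044610527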
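末日周五 2014-11-24
Quoting sky_walker85's reply (#8):
Quoting tianjintd2008's reply (#7):
Quoting tianjintd2008's reply (#6):
Quoting sky_walker85's reply (#5):
It works now. After Job job = new Job(getConf(), "solrdedup"); add
job.setJarByClass(SolrDeleteDuplicates.class);
Which file is that in, exactly? Please tell me precisely what to modify.
I found it, thanks! I'll recompile and try it too.
It should work, give it a try. Once you've tested it, please award the points.
I've finished testing and it really does work. Thank you! Please accept the 100 points, modest as they are. Also, could I add you on QQ?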
skyWalker_ONLY 2014-11-21
Waiting for an answer. I ran into this problem before and never solved it.