Nutch throws an error when starting a web crawl on an Ubuntu server

a_1515 2016-05-18 02:49:07
The error is shown below. I hope someone who understands this can help. It is urgent. Thanks.

java.lang.Exception: java.lang.RuntimeException: java.util.NoSuchElementException
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: java.util.NoSuchElementException
at org.apache.gora.mapreduce.GoraRecordReader.nextKeyValue(GoraRecordReader.java:122)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.NoSuchElementException
at java.util.concurrent.ConcurrentSkipListMap.firstKey(ConcurrentSkipListMap.java:1959)
at org.apache.gora.memory.store.MemStore.execute(MemStore.java:128)
at org.apache.gora.query.impl.QueryBase.execute(QueryBase.java:73)
at org.apache.gora.mapreduce.GoraRecordReader.executeQuery(GoraRecordReader.java:67)
at org.apache.gora.mapreduce.GoraRecordReader.nextKeyValue(GoraRecordReader.java:109)
... 12 more
2016-05-17 23:10:08,431 ERROR crawl.GeneratorJob - GeneratorJob: java.lang.RuntimeException: job failed: name=[test]generate: 1463497802-24076, jobid=job_local629936811_0001
at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:119)
at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:227)
at org.apache.nutch.crawl.GeneratorJob.generate(GeneratorJob.java:256)
at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:322)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.crawl.GeneratorJob.main(GeneratorJob.java:330)

2016-05-17 23:33:48,415 INFO crawl.InjectorJob - InjectorJob: starting at 2016-05-17 23:33:48
2016-05-17 23:33:48,416 INFO crawl.InjectorJob - InjectorJob: Injecting urlDir: urls
2016-05-17 23:33:49,136 WARN util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-05-17 23:33:49,994 INFO crawl.InjectorJob - InjectorJob: Using class org.apache.gora.memory.store.MemStore as the Gora storage class.
2016-05-17 23:33:50,965 WARN conf.Configuration - file:/tmp/hadoop-yangfan/mapred/staging/yangfan1969384701/.staging/job_local1969384701_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2016-05-17 23:33:50,989 WARN conf.Configuration - file:/tmp/hadoop-yangfan/mapred/staging/yangfan1969384701/.staging/job_local1969384701_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2016-05-17 23:33:51,438 WARN conf.Configuration - file:/tmp/hadoop-yangfan/mapred/local/localRunner/yangfan/job_local1969384701_0001/job_local1969384701_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2016-05-17 23:33:51,459 WARN conf.Configuration - file:/tmp/hadoop-yangfan/mapred/local/localRunner/yangfan/job_local1969384701_0001/job_local1969384701_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2016-05-17 23:33:51,885 INFO regex.RegexURLNormalizer - can't find rules for scope 'inject', using default
2016-05-17 23:33:52,591 INFO crawl.InjectorJob - InjectorJob: total number of urls rejected by filters: 1
2016-05-17 23:33:52,591 INFO crawl.InjectorJob - InjectorJob: total number of urls injected after normalization and filtering: 0
2016-05-17 23:33:52,593 INFO crawl.InjectorJob - Injector: finished at 2016-05-17 23:33:52, elapsed: 00:00:04
2016-05-17 23:33:54,312 INFO crawl.GeneratorJob - GeneratorJob: starting at 2016-05-17 23:33:54
2016-05-17 23:33:54,313 INFO crawl.GeneratorJob - GeneratorJob: Selecting best-scoring urls due for fetch.
2016-05-17 23:33:54,313 INFO crawl.GeneratorJob - GeneratorJob: starting
2016-05-17 23:33:54,313 INFO crawl.GeneratorJob - GeneratorJob: filtering: false
2016-05-17 23:33:54,313 INFO crawl.GeneratorJob - GeneratorJob: normalizing: false
2016-05-17 23:33:54,313 INFO crawl.GeneratorJob - GeneratorJob: topN: 50000
2016-05-17 23:33:54,744 WARN util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-05-17 23:33:54,758 INFO crawl.FetchScheduleFactory - Using FetchSchedule impl: org.apache.nutch.crawl.DefaultFetchSchedule
2016-05-17 23:33:54,759 INFO crawl.AbstractFetchSchedule - defaultInterval=2592000
2016-05-17 23:33:54,759 INFO crawl.AbstractFetchSchedule - maxInterval=7776000
2016-05-17 23:33:56,425 WARN conf.Configuration - file:/tmp/hadoop-yangfan/mapred/staging/yangfan2089781770/.staging/job_local2089781770_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2016-05-17 23:33:56,445 WARN conf.Configuration - file:/tmp/hadoop-yangfan/mapred/staging/yangfan2089781770/.staging/job_local2089781770_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2016-05-17 23:33:56,711 WARN conf.Configuration - file:/tmp/hadoop-yangfan/mapred/local/localRunner/yangfan/job_local2089781770_0001/job_local2089781770_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2016-05-17 23:33:56,729 WARN conf.Configuration - file:/tmp/hadoop-yangfan/mapred/local/localRunner/yangfan/job_local2089781770_0001/job_local2089781770_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2016-05-17 23:33:57,392 INFO crawl.FetchScheduleFactory - Using FetchSchedule impl: org.apache.nutch.crawl.DefaultFetchSchedule
2016-05-17 23:33:57,392 INFO crawl.AbstractFetchSchedule - defaultInterval=2592000
2016-05-17 23:33:57,392 INFO crawl.AbstractFetchSchedule - maxInterval=7776000
2016-05-17 23:33:57,401 ERROR mapreduce.GoraRecordReader - Error reading Gora records: null
2016-05-17 23:33:57,472 WARN mapred.LocalJobRunner - job_local2089781770_0001
java.lang.Exception: java.lang.RuntimeException: java.util.NoSuchElementException
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: java.util.NoSuchElementException
at org.apache.gora.mapreduce.GoraRecordReader.nextKeyValue(GoraRecordReader.java:122)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.NoSuchElementException
at java.util.concurrent.ConcurrentSkipListMap.firstKey(ConcurrentSkipListMap.java:1959)
at org.apache.gora.memory.store.MemStore.execute(MemStore.java:128)
at org.apache.gora.query.impl.QueryBase.execute(QueryBase.java:73)
at org.apache.gora.mapreduce.GoraRecordReader.executeQuery(GoraRecordReader.java:67)
at org.apache.gora.mapreduce.GoraRecordReader.nextKeyValue(GoraRecordReader.java:109)
... 12 more
2016-05-17 23:33:57,769 ERROR crawl.GeneratorJob - GeneratorJob: java.lang.RuntimeException: job failed: name=[test]generate: 1463499232-7398, jobid=job_local2089781770_0001
at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:119)
at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:227)
at org.apache.nutch.crawl.GeneratorJob.generate(GeneratorJob.java:256)
at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:322)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.crawl.GeneratorJob.main(GeneratorJob.java:330)
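
One possible reading of the log above: the injector reports 0 URLs injected (1 rejected by filters), the storage class is MemStore, and the innermost frame of the trace is ConcurrentSkipListMap.firstKey(), which the JDK documents as throwing NoSuchElementException when the map is empty. The sketch below only reproduces that JDK behaviour. It is not Gora or Nutch code, and the class and helper names (EmptyStoreDemo, firstKeyThrows, firstKeyOrNull) are made up for illustration.

```java
import java.util.NoSuchElementException;
import java.util.concurrent.ConcurrentSkipListMap;

public class EmptyStoreDemo {

    // Returns true if firstKey() throws NoSuchElementException for this map.
    static boolean firstKeyThrows(ConcurrentSkipListMap<String, String> map) {
        try {
            map.firstKey();
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    // A guarded lookup: returns null instead of throwing when the map is empty.
    static String firstKeyOrNull(ConcurrentSkipListMap<String, String> map) {
        return map.isEmpty() ? null : map.firstKey();
    }

    public static void main(String[] args) {
        // Stand-in for a store that received no records (hypothetical).
        ConcurrentSkipListMap<String, String> store = new ConcurrentSkipListMap<>();
        System.out.println("empty map throws: " + firstKeyThrows(store));
        System.out.println("guarded lookup: " + firstKeyOrNull(store));

        // Once at least one key exists, firstKey() returns the smallest key.
        store.put("com.example:http/", "placeholder row");
        System.out.println("after one put throws: " + firstKeyThrows(store));
        System.out.println("first key: " + firstKeyOrNull(store));
    }
}
```

If this reading is right, the thing to check first is why the only seed URL was rejected by the filters during inject, since the generate step then has nothing to read.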
1 reply
ooo_elang 2017-04-05
Did the original poster ever solve this problem?
