Help needed: error when running a Nutch crawl under Cygwin

o_o_o0 2015-08-27 03:12:02
Injecting seed URLs
/cygdrive/d/nutch/apache-nutch-2.3/runtime/local/bin/nutch inject urls -crawlId TestCrawl
InjectorJob: starting at 2015-08-27 15:09:57
InjectorJob: Injecting urlDir: urls
InjectorJob: Using class org.apache.gora.memory.store.MemStore as the Gora storage class.
InjectorJob: total number of urls rejected by filters: 0
InjectorJob: total number of urls injected after normalization and filtering: 1
Injector: finished at 2015-08-27 15:10:00, elapsed: 00:00:02
2015年08月27日 15:10:00 : Iteration 1 of 2
Generating batchId
Generating a new fetchlist
/cygdrive/d/nutch/apache-nutch-2.3/runtime/local/bin/nutch generate -D mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D mapred.reduce.tasks.speculative.execution=false -D mapred.map.tasks.speculative.execution=false -D mapred.compress.map.output=true -topN 50000 -noNorm -noFilter -adddays 0 -crawlId TestCrawl -batchId 1440659400-22627
GeneratorJob: starting at 2015-08-27 15:10:02
GeneratorJob: Selecting best-scoring urls due for fetch.
GeneratorJob: starting
GeneratorJob: filtering: false
GeneratorJob: normalizing: false
GeneratorJob: topN: 50000
java.util.NoSuchElementException
at java.util.TreeMap.key(TreeMap.java:1323)
at java.util.TreeMap.firstKey(TreeMap.java:290)
at org.apache.gora.memory.store.MemStore.execute(MemStore.java:125)
at org.apache.gora.query.impl.QueryBase.execute(QueryBase.java:73)
at org.apache.gora.mapreduce.GoraRecordReader.executeQuery(GoraRecordReader.java:68)
at org.apache.gora.mapreduce.GoraRecordReader.nextKeyValue(GoraRecordReader.java:110)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:531)
at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
GeneratorJob: finished at 2015-08-27 15:10:04, time elapsed: 00:00:02
GeneratorJob: generated batch id: 1440659400-22627 containing 0 URLs
Generate returned 1 (no new segments created)
Escaping loop: no more URLs to fetch now
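
A possible reading of the trace above, offered as an assumption rather than a confirmed diagnosis: the log shows MemStore as the Gora storage class, and the exception comes from TreeMap.firstKey inside MemStore.execute. If MemStore keeps its records only in the memory of the running process, and each bin/nutch command starts a new JVM, then the URL injected by the inject step would no longer exist when the generate step runs, and the query would hit an empty map. The sketch below only reproduces the TreeMap behavior. The class name EmptyStoreDemo and the sample key are hypothetical and are not part of Nutch or Gora.

```java
import java.util.NoSuchElementException;
import java.util.TreeMap;

// Hypothetical demo class; it stands in for an in-memory store backed by a TreeMap.
public class EmptyStoreDemo {

    // Returns true if firstKey() throws NoSuchElementException for this map.
    static boolean firstKeyThrows(TreeMap<String, String> map) {
        try {
            map.firstKey();
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // A fresh process starts with an empty in-memory map.
        TreeMap<String, String> store = new TreeMap<>();
        System.out.println(firstKeyThrows(store)); // prints "true"

        // Once a record exists in the same process, firstKey() succeeds.
        store.put("com.example:http/", "injected"); // sample key, made up for illustration
        System.out.println(firstKeyThrows(store)); // prints "false"
    }
}
```

If this reading is right, pointing the storage.data.store.class property in nutch-site.xml at a persistent Gora backend instead of MemStore would be the thing to try. This has not been verified against this setup.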
a_1515 2016-05-18
Has this problem been solved? I've run into the same issue and have been stuck on it for a long time. Looking for a solution.
o_o_o0 2015-08-27
Experts, please help me figure out how to solve this problem.
