初学webmagic,使用了官网的例子,但是为什么控制台不显示抓取到的信息啊
恐惧大师 2017-03-24 11:08:29 如题,在学webmagic的时候用的抓取例子是这样的:
package com.example;
import us.codecraft.webmagic.Page;
import us.codecraft.webmagic.Site;
import us.codecraft.webmagic.Spider;
import us.codecraft.webmagic.pipeline.ConsolePipeline;
import us.codecraft.webmagic.processor.PageProcessor;
public class MyClass implements PageProcessor{
private Site site = Site.me().setRetryTimes(3).setSleepTime(100);
@Override
public void process(Page page) {
page.addTargetRequests(page.getHtml().links().regex("(https://github\\.com/\\w+/\\w+)").all());
page.putField("author", page.getUrl().regex("https://github\\.com/(\\w+)/.*").toString());
page.putField("name", page.getHtml().xpath("//h1[@class='entry-title public']/strong/a/text()").toString());
if (page.getResultItems().get("name")==null){
//skip this page
page.setSkip(true);
}
page.putField("readme", page.getHtml().xpath("//div[@id='readme']/tidyText()"));
}
@Override
public Site getSite() {
return site;
}
public static void main(String[] args) {
Spider.create(new MyClass()).addUrl("https://github.com/code4craft");//thread(5).run();
}
}
看网上的教程应该能够显示抓取到的数据,但是为什么我的运行结果不报错但是控制台就是只显示以下信息而不显示数据呢?
log4j:WARN No appenders could be found for logger (us.codecraft.webmagic.scheduler.QueueScheduler).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Process finished with exit code 0