Python: Scrapy crawler question
First, here is the run log from the crawler I wrote with Scrapy:
2016-09-01 14:39:48 [scrapy] INFO: Crawled 71 pages (at 71 pages/min), scraped 6969 items (at 6969 items/min)
2016-09-01 14:40:48 [scrapy] INFO: Crawled 155 pages (at 84 pages/min), scraped 15150 items (at 8181 items/min)
2016-09-01 14:41:48 [scrapy] INFO: Crawled 238 pages (at 83 pages/min), scraped 15251 items (at 101 items/min)
2016-09-01 14:42:48 [scrapy] INFO: Crawled 317 pages (at 79 pages/min), scraped 15263 items (at 12 items/min)
2016-09-01 14:43:48 [scrapy] INFO: Crawled 398 pages (at 81 pages/min), scraped 15344 items (at 81 items/min)
2016-09-01 14:44:48 [scrapy] INFO: Crawled 483 pages (at 85 pages/min), scraped 15428 items (at 84 items/min)
2016-09-01 14:45:48 [scrapy] INFO: Crawled 570 pages (at 87 pages/min), scraped 15430 items (at 2 items/min)
2016-09-01 14:46:48 [scrapy] INFO: Crawled 652 pages (at 82 pages/min), scraped 15449 items (at 19 items/min)
2016-09-01 14:47:48 [scrapy] INFO: Crawled 732 pages (at 80 pages/min), scraped 15527 items (at 78 items/min)
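The question: as the log shows, throughput is very high for the first two minutes, but from the third minute on it drops sharply, and CPU usage drops along with it. Pages are still crawled at roughly 70 to 90 pages/min the whole time, while the item rate falls from about 7,000 to 8,000 items/min to roughly 100 items/min or fewer. I have tried changing some configuration settings and reviewed my own code, but I have neither found nor fixed the problem. Could someone help me work out what is causing this?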
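To make the pattern in the log easier to see, here is a minimal sketch that parses these stats lines and computes how many items each crawled page yields per interval. The names `LOG_PATTERN`, `items_per_page`, and `SAMPLE` are made up for illustration and are not part of Scrapy; the regex only assumes the line format shown in the log above.

```python
import re

# Matches the periodic stats lines in the format shown in the log above.
LOG_PATTERN = re.compile(
    r"Crawled (\d+) pages \(at (\d+) pages/min\), "
    r"scraped (\d+) items \(at (\d+) items/min\)"
)


def items_per_page(log_text):
    """Return (pages_per_min, items_per_min, items_per_page) per stats line."""
    rows = []
    for match in LOG_PATTERN.finditer(log_text):
        _, page_rate, _, item_rate = map(int, match.groups())
        # Guard against an interval in which no page was crawled.
        ratio = item_rate / page_rate if page_rate else 0.0
        rows.append((page_rate, item_rate, round(ratio, 1)))
    return rows


# First three lines of the log, copied verbatim.
SAMPLE = """\
2016-09-01 14:39:48 [scrapy] INFO: Crawled 71 pages (at 71 pages/min), scraped 6969 items (at 6969 items/min)
2016-09-01 14:40:48 [scrapy] INFO: Crawled 155 pages (at 84 pages/min), scraped 15150 items (at 8181 items/min)
2016-09-01 14:41:48 [scrapy] INFO: Crawled 238 pages (at 83 pages/min), scraped 15251 items (at 101 items/min)
"""

if __name__ == "__main__":
    for page_rate, item_rate, ratio in items_per_page(SAMPLE):
        print(page_rate, item_rate, ratio)
```

On these three lines the ratio goes from about 98 items per page to about 1 item per page while the page rate stays flat. That suggests, though it does not prove, that the crawl speed itself is unchanged and that the later pages simply produce far fewer items, so it may be worth checking which URLs are being requested after the second minute.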
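P.S. I have only just started with web crawlers and plan to keep going down this path. I would be glad to get to know others who are also interested in crawling, so that we can learn together and share experience.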