Isn't it a failure for the Heritrix crawler to contain code like this?
The following is from the class org.archive.crawler.writer.MirrorWriterProcessor:
private URIToFileReturn uriToFile(CrawlURI curi, String host, int port,
        String uriPath, String query, String suffix, String baseDir,
        int maxSegLen, int maxPathLen, boolean caseSensitive,
        String dirFile, Map characterMap, String dotBegin, String dotEnd,
        String tooLongDir, boolean suffixAtEnd, Set underscoreSet)
        throws IOException

DirSegment(String uriPath, int beginIndex, int endIndex, int maxSegLen,
        boolean caseSensitive, CrawlURI curi, Map characterMap,
        String dotBegin, String dotEnd, Set underscoreSet) {
......
I checked, and the foreign developer who wrote this class goes by paul_jack.
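
To make the complaint concrete: the uriToFile signature quoted above takes 17 parameters and the DirSegment constructor takes 10, and most of them are the same configuration values passed along unchanged. One common way to shrink such signatures is to group the settings into a single parameter object. The sketch below is only an illustration of that idea. MirrorPathOptions and its builder are hypothetical names, not part of Heritrix; the generic types on the Map and Set are assumed (the original uses raw types), and the default values are placeholders, not Heritrix's real defaults.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical parameter object (not Heritrix code): bundles the ten
// configuration values that uriToFile currently receives one by one.
final class MirrorPathOptions {
    final int maxSegLen;
    final int maxPathLen;
    final boolean caseSensitive;
    final String dirFile;
    final Map<String, String> characterMap; // element types are an assumption
    final String dotBegin;
    final String dotEnd;
    final String tooLongDir;
    final boolean suffixAtEnd;
    final Set<String> underscoreSet;        // element type is an assumption

    private MirrorPathOptions(Builder b) {
        this.maxSegLen = b.maxSegLen;
        this.maxPathLen = b.maxPathLen;
        this.caseSensitive = b.caseSensitive;
        this.dirFile = b.dirFile;
        // Defensive copies so the options stay immutable after build().
        this.characterMap = Collections.unmodifiableMap(new HashMap<>(b.characterMap));
        this.dotBegin = b.dotBegin;
        this.dotEnd = b.dotEnd;
        this.tooLongDir = b.tooLongDir;
        this.suffixAtEnd = b.suffixAtEnd;
        this.underscoreSet = Collections.unmodifiableSet(new HashSet<>(b.underscoreSet));
    }

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        // Placeholder defaults chosen for this sketch only.
        private int maxSegLen = 255;
        private int maxPathLen = 1023;
        private boolean caseSensitive = true;
        private String dirFile = "index.html";
        private Map<String, String> characterMap = new HashMap<>();
        private String dotBegin = "%2E";
        private String dotEnd = ".";
        private String tooLongDir = "LONG";
        private boolean suffixAtEnd = true;
        private Set<String> underscoreSet = new HashSet<>();

        Builder maxSegLen(int v) { this.maxSegLen = v; return this; }
        Builder maxPathLen(int v) { this.maxPathLen = v; return this; }
        Builder caseSensitive(boolean v) { this.caseSensitive = v; return this; }
        Builder dirFile(String v) { this.dirFile = v; return this; }
        Builder characterMap(Map<String, String> v) { this.characterMap = v; return this; }
        Builder dotBegin(String v) { this.dotBegin = v; return this; }
        Builder dotEnd(String v) { this.dotEnd = v; return this; }
        Builder tooLongDir(String v) { this.tooLongDir = v; return this; }
        Builder suffixAtEnd(boolean v) { this.suffixAtEnd = v; return this; }
        Builder underscoreSet(Set<String> v) { this.underscoreSet = v; return this; }

        MirrorPathOptions build() {
            return new MirrorPathOptions(this);
        }
    }
}
```

With something like this, a call site would build the options once, for example MirrorPathOptions.builder().maxSegLen(100).caseSensitive(false).build(), and uriToFile could in principle take eight arguments (curi, host, port, uriPath, query, suffix, baseDir, plus the options object) instead of 17. Whether that fits the rest of the class is something only the actual source can answer.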