Lucene tokenization and indexing question
There are plenty of simple Lucene usage examples online.
For example:
try {
    Analyzer analyzer = new StandardAnalyzer();
    // Keep the index in memory
    Directory directory = new RAMDirectory();
    // To keep the index on disk instead, use the following line
    //Directory directory = FSDirectory.getDirectory("/tmp/testindex", true);
    IndexWriter iwriter = new IndexWriter(directory, analyzer, true);
    iwriter.setMaxFieldLength(25000);
    Document doc = new Document();
    TokenStream stream = analyzer.tokenStream("content", new StringReader("你们好啊,为什么"));
    /* String text = "你们好啊,为什么nn";
    doc.add(new Field("fieldname", text, Field.Store.YES,
            Field.Index.TOKENIZED)); */
    doc.add(new Field("fieldname", stream));
    iwriter.addDocument(doc);
    iwriter.close();
    IndexSearcher isearcher = new IndexSearcher(directory);
    // Parse a simple query that searches for "text":
    QueryParser parser = new QueryParser("fieldname", analyzer);
    Query query = parser.parse("为什么");
    Hits hits = isearcher.search(query);
    if (hits.length() == 1) {
        System.out.println("Searched for \"text\"");
        Document d = hits.doc(0);
        System.out.println(d.get("fieldname"));
    } else {
        System.out.println("No results found");
    }
    // Iterate over the search results:
    for (int i = 0; i < hits.length(); i++) {
        Document hitDoc = hits.doc(i);
        System.out.println(hitDoc.get("fieldname"));
    }
    isearcher.close();
    directory.close();
} catch (IOException e) {
    e.printStackTrace();
} catch (ParseException e) {
    e.printStackTrace();
}
}
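With the example above, I get results when I use the part in red (the commented-out Field built from the String text), but not when I use the part in blue (the TokenStream). Why is that?
I want to use the tokenization result from that TokenStream.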
Also, what is the difference or relationship between the fieldName in tokenStream(String fieldName, Reader reader)
and the "fieldname" in doc.add(new Field("fieldname",stream)); ?