c# - Lucene - How to index a value with special characters
I have a value I am trying to index that looks like this:
test (test)
Using StandardAnalyzer, I attempted to add it to a document using:
Field.Store.YES, Field.Index.TOKENIZED
When I search for the value 'test (test)', the QueryParser generates the following query:
+name:test +name:test
This operates as I expect, because I am not escaping the special characters.
However, if I call QueryParser.Escape('test (test)') while indexing the value, it creates the terms:
[test] , [test]
Then when I search like this:
QueryParser.Escape('test (test)')
I get the same 2 terms (as I expect). The problem is if I have 2 documents indexed with the names:
test
test (test)
It matches on both. If I specify a search value of 'test (test)', I want only the second document. I am curious why escaping the special characters does not preserve them in the created terms. Is there an alternate analyzer I should look at? I looked at WhitespaceAnalyzer and KeywordAnalyzer. WhitespaceAnalyzer is case sensitive, and KeywordAnalyzer stores the value as a single term of:
[test (test)]
which means that if I search for 'test', it will not be able to return both documents.
Any ideas on how to implement this? It doesn't seem like it should be that difficult.
If you search for 'test (test)' and want to retrieve only the documents that contain that exact expression, you must enclose the search expression in double quotes ("...") so that Lucene knows you want a phrase search.
See the Lucene documentation for details:
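A minimal sketch of what that could look like in Lucene.Net, assuming a 3.0.x release (the Version.LUCENE_30 constant and the two-argument-plus-version QueryParser constructor differ in older releases) and the field name "name" from the question. The class name and the userInput variable are made up for illustration. Note that StandardAnalyzer still drops the parentheses, so the phrase is matched as two adjacent "test" terms rather than as the literal text:

```csharp
using System;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Version = Lucene.Net.Util.Version;

class PhraseSearchSketch
{
    static void Main()
    {
        // Assumption: the same analyzer is used at index time and at query time.
        var analyzer = new StandardAnalyzer(Version.LUCENE_30);
        var parser = new QueryParser(Version.LUCENE_30, "name", analyzer);

        string userInput = "test (test)";

        // Escape the query-syntax characters, then wrap the whole value in
        // double quotes so the parser builds a phrase query instead of
        // two independent term clauses.
        string quoted = "\"" + QueryParser.Escape(userInput) + "\"";

        Query query = parser.Parse(quoted);

        // Print the parsed query to check that it is a phrase on "name".
        Console.WriteLine(query);
    }
}
```

With the two documents from the question, a phrase of two adjacent "test" terms should match only the document named 'test (test)', while a plain search for 'test' still matches both.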
http://lucene.apache.org/java/3_0_1/queryparsersyntax.html#terms