Hibernate Search: highlighting not-analyzed fields

Ilya Zinkovich

I'd like to highlight a not-analyzed field in its entirety when it matches the search query.
The indexed entity looks as follows:

@Entity
@Indexed
@AnalyzerDef(
        name = "documentAnalyzer",
        tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
        filters = {
                @TokenFilterDef(factory = ASCIIFoldingFilterFactory.class),
                @TokenFilterDef(factory = LowerCaseFilterFactory.class),
                @TokenFilterDef(
                        factory = StopFilterFactory.class,
                        params = {
                                @Parameter(name = "words", value = "stoplist.properties"),
                                @Parameter(name = "ignoreCase", value = "true")
                        }
                )
        }
)
public class Document {

    ...

    @Field(analyze = Analyze.NO)
    private String notAnalyzedField; // has "x-xxx-xxx" format

    @Field(analyze = Analyze.YES)
    private String analyzedField;   

}

Suppose I have a Document whose notAnalyzedField is "a-bbb-ccc". I run a search query with the same value and highlight the search results using the following code:

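// Needs java.io.IOException and org.apache.lucene.search.highlight.* on the classpath.
String highlightText(Query query, Analyzer analyzer, String fieldName, String text)
        throws IOException, InvalidTokenOffsetsException {
    // Score fragments by the terms of the query.
    QueryScorer queryScorer = new QueryScorer(query);
    SimpleHTMLFormatter formatter = new SimpleHTMLFormatter("<span>", "</span>");
    Highlighter highlighter = new Highlighter(formatter, queryScorer);
    // Re-analyzes the text with the given analyzer; returns null if no term matches.
    return highlighter.getBestFragment(analyzer, fieldName, text);
}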

As a result I get the following snippet: "a-<span>bbb</span>-<span>ccc</span>".
This seems reasonable, because the analyzer treats "a" as a stop word and "-" as a delimiter, so neither is highlighted. But I cannot figure out how to avoid using the analyzer when highlighting this field. The Highlighter class has a few methods that take a TokenStream instead of an Analyzer, but I'm not sure how to use them.

The result I want is the whole field highlighted: "<span>a-bbb-ccc</span>".
Is there a way to achieve this with hibernate-search?

java hibernate lucene hibernate-search

Answers

answered 5 months ago Yoann Rodière #1

Where does your analyzer come from?

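You might want to get it from Hibernate Search, which returns the scoped analyzer for the entity, so each field should be handled the way it was configured for indexing: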

FullTextEntityManager em = /*...*/;
Analyzer analyzer = em.getSearchFactory()
    .getAnalyzer(Document.class);
highlightText(query, analyzer, fieldName, text);

If that doesn't work, try a KeywordAnalyzer, which treats the entire text as a single token, so the field can only match as a whole:

highlightText(query, new KeywordAnalyzer(), fieldName, text);
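To see why the choice of analyzer changes the highlighted snippet, here is a toy model in plain Java. It does not use Lucene or Hibernate Search at all: the class ToyHighlight, its methods, the one-word stop list and the query terms are all assumptions made up for illustration. It only mimics the idea that a highlighter wraps each text token that matches a query term, so per-word tokens give per-word highlights, while a single whole-value token gives a single whole-value highlight.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy model only: this does NOT call Lucene. All names are hypothetical.
class ToyHighlight {

    /** A token together with its character offsets in the original text. */
    static final class Token {
        final String term;
        final int start;
        final int end;

        Token(String term, int start, int end) {
            this.term = term;
            this.start = start;
            this.end = end;
        }
    }

    // Assumed stop list; the real one lives in stoplist.properties.
    private static final Set<String> STOP_WORDS = Set.of("a");
    private static final Pattern WORD = Pattern.compile("[\\p{L}\\p{N}]+");

    /** Roughly mimics word tokenization + lowercasing + stop-word removal. */
    static List<Token> standardLikeTokens(String text) {
        List<Token> tokens = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            String term = m.group().toLowerCase();
            if (!STOP_WORDS.contains(term)) {
                tokens.add(new Token(term, m.start(), m.end()));
            }
        }
        return tokens;
    }

    /** Mimics keyword-style analysis: the whole value is one token. */
    static List<Token> keywordToken(String text) {
        return List.of(new Token(text, 0, text.length()));
    }

    /** Wraps every token whose term is among the query terms in span tags. */
    static String highlight(List<Token> textTokens, Set<String> queryTerms, String text) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        for (Token t : textTokens) {
            if (queryTerms.contains(t.term)) {
                out.append(text, pos, t.start)          // untouched text before the token
                   .append("<span>")
                   .append(text, t.start, t.end)        // the matching token itself
                   .append("</span>");
                pos = t.end;
            }
        }
        return out.append(text.substring(pos)).toString();
    }

    public static void main(String[] args) {
        String value = "a-bbb-ccc";
        // Per-word tokens, with the query terms assumed to be "bbb" and "ccc".
        System.out.println(highlight(standardLikeTokens(value), Set.of("bbb", "ccc"), value));
        // One whole-value token, with the query term assumed to be the raw field value.
        System.out.println(highlight(keywordToken(value), Set.of(value), value));
    }
}
```

In this model the first call reproduces the snippet from the question and the second produces the whole-field highlight. The real Highlighter does considerably more (fragmenting, scoring, field handling), so treat this as an explanation of the effect rather than a description of its implementation.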
