Built-in analyzers & custom analyzer
1. standard analyzer
- splits text at word boundaries and removes punctuation
  - done by the standard tokenizer
- lowercases letters
  - done by the lowercase token filter
- includes the stop token filter
  - disabled by default
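The behavior above can be inspected with the _analyze API; a minimal sketch (the sample text is an illustration, not from the original post):

```json
POST /_analyze
{
  "analyzer": "standard",
  "text": "The 2 QUICK Brown-Foxes!"
}
```

This should produce lowercased tokens with punctuation stripped, roughly: the, 2, quick, brown, foxes.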
2. simple analyzer
similar to the standard analyzer
- splits text into tokens when encountering anything other than letters
- lowercases letters
  - done by the lowercase tokenizer
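The difference from the standard analyzer shows up with non-letter characters; a quick check (sample text is an illustration):

```json
POST /_analyze
{
  "analyzer": "simple",
  "text": "The 2 QUICK Brown-Foxes!"
}
```

Since digits are not letters, the "2" is dropped entirely, leaving roughly: the, quick, brown, foxes.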
3. whitespace analyzer
- splits text into tokens on whitespace
- does not lowercase letters
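The same sample text run through the whitespace analyzer illustrates both points:

```json
POST /_analyze
{
  "analyzer": "whitespace",
  "text": "The 2 QUICK Brown-Foxes!"
}
```

Only whitespace splits tokens, so case and punctuation survive: The, 2, QUICK, Brown-Foxes!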
4. keyword analyzer
- no-op analyzer that leaves the text intact
- outputs the entire input as a single token
- used for keyword fields by default
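A quick sketch of the no-op behavior (sample text is an illustration):

```json
POST /_analyze
{
  "analyzer": "keyword",
  "text": "The 2 QUICK Brown-Foxes!"
}
```

The output is the entire input as one token: "The 2 QUICK Brown-Foxes!".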
5. pattern analyzer
- a regex is used to match token separators
- the default pattern matches all non-word characters
- lowercases letters
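A custom separator pattern can be supplied via the "pattern" parameter; a minimal sketch that splits on commas (the index name "pattern_test" and analyzer name "comma_analyzer" are hypothetical):

```json
PUT /pattern_test
{
  "settings": {
    "analysis": {
      "analyzer": {
        "comma_analyzer": {
          "type": "pattern",
          "pattern": ","
        }
      }
    }
  }
}
```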
6. language-specific analyzers
ex) english analyzer
- standard tokenizer
- filters: english possessive stemmer, lowercase, english stop, english keywords, english stemmer
ex) using the english analyzer
PUT /products
{
  "mappings": {
    "properties": {
      "description": {
        "type": "text",
        "analyzer": "english"
      }
    }
  }
}
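Once the mapping exists, the field's analyzer can be exercised directly via the field parameter of _analyze; a sketch (sample text is an illustration):

```json
GET /products/_analyze
{
  "field": "description",
  "text": "These are not the droids you're looking for."
}
```

With the english analyzer, stop words should be removed and the remaining tokens stemmed, e.g. "droids" to "droid" and "looking" to "look".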
7. configure built-in analyzers
ex) adding a stopwords filter to the standard analyzer (note: analyzers must be nested under "analysis" in the index settings)
PUT /products
{
  "settings": {
    "analysis": {
      "analyzer": {
        "remove_english_stop_words": {
          "type": "standard",
          "stopwords": "_english_"
        }
      }
    }
  }
}
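The configured analyzer can then be referenced by its custom name when analyzing against that index; a sketch (sample text is an illustration):

```json
POST /products/_analyze
{
  "analyzer": "remove_english_stop_words",
  "text": "a quick brown fox"
}
```

The English stop word "a" should be dropped, leaving: quick, brown, fox.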
- the parameters supported by each analyzer type are listed in the analyzer reference
8. custom analyzers
PUT /analyzer_test
{
  "settings": {
    "analysis": {
      "filter": {
        "danish_stop": {
          "type": "stop",
          "stopwords": "_danish_"
        }
      },
      "char_filter": {},
      "tokenizer": {},
      "analyzer": {
        "my_custom_analyzer": {
          "type": "custom",
          "char_filter": ["html_strip"],
          "tokenizer": "standard",
          "filter": ["lowercase", "danish_stop", "asciifolding"]
        }
      }
    }
  }
}
- custom analyzers must be declared within the "analyzer" object under "analysis"
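A sketch of testing the custom analyzer end to end (sample text is an illustration):

```json
POST /analyzer_test/_analyze
{
  "analyzer": "my_custom_analyzer",
  "text": "<p>Er du færdig?</p>"
}
```

The html_strip char filter should remove the tags, the lowercase and danish_stop filters normalize and drop Danish stop words such as "er" and "du", and asciifolding converts accented characters to their ASCII equivalents.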
Author And Source
About this topic (Built-in analyzers & custom analyzer), more material can be found at https://velog.io/@sangmin7648/Built-in-analyzers-custom-analyzer. Author attribution: the original author's information is contained in the URL above, and copyright belongs to the original author. (Collection and Share based on the CC Protocol.)