Types of Analyzers

| Analyzer | Description | Example Use |
| --- | --- | --- |
| Standard | Default; splits text on word boundaries, drops most punctuation, and lowercases tokens. | English prose, general search |
| Simple | Splits on any non-letter character (digits are discarded) and lowercases tokens. | Plain alphabetic text where digits and punctuation do not matter |
| Whitespace | Splits on whitespace only; preserves case and punctuation. | Code, serial numbers |
| Keyword | Does not split; treats the entire text as a single token. | Exact-match fields, IDs, tags |
| Pattern | Splits on a regular expression (non-word characters by default) and lowercases tokens by default. | Log files, custom tokenization |
| Stop | Like Simple, but also removes stop words (English by default). | Basic English filtering |
| Language | Language-specific analyzers that apply stemming and stop-word removal for a given language. | Multi-language text search |
| Fingerprint | Lowercases, sorts, and deduplicates tokens, then joins them into a single token; useful for deduplication. | Address or name normalization |
| Custom | Chains any character filters, one tokenizer, and any token filters as needed. | Highly specific domain use cases |
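
To make the differences in the table concrete, here is a rough sketch in plain Python that imitates a few of these analyzers on a sample string. It is an approximation for illustration only, not the implementation of any search engine: the function names, the `STOPWORDS` set, and the regular expressions are all assumptions, and the fingerprint version skips steps such as ASCII folding.

```python
import re

# Tiny illustrative stop-word list (an assumption, not a real engine's list).
STOPWORDS = {"a", "an", "and", "is", "of", "the"}


def whitespace_analyzer(text):
    """Split on whitespace only; case and punctuation are preserved."""
    return text.split()


def simple_analyzer(text):
    """Keep runs of letters only (digits and punctuation split tokens), lowercased."""
    return [token.lower() for token in re.findall(r"[^\W\d_]+", text)]


def keyword_analyzer(text):
    """Emit the whole input as one token."""
    return [text]


def stop_analyzer(text):
    """Simple analyzer followed by stop-word removal."""
    return [token for token in simple_analyzer(text) if token not in STOPWORDS]


def fingerprint_analyzer(text):
    """Lowercase, deduplicate, sort, and join the tokens into a single token."""
    tokens = re.findall(r"\w+", text.lower())
    return [" ".join(sorted(set(tokens)))]


if __name__ == "__main__":
    sample = "The QUICK brown-fox, part AB-123 the fox"
    for analyzer in (whitespace_analyzer, simple_analyzer, keyword_analyzer,
                     stop_analyzer, fingerprint_analyzer):
        print(f"{analyzer.__name__}: {analyzer(sample)}")
```

Note how the simple-style function drops `123` from `AB-123`, while the whitespace-style function keeps `AB-123` intact. That is why the choice of analyzer matters for identifiers that mix letters and digits.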