Morpheme Suggestion Strategy
When an out-of-vocabulary (OOV) word is a near-miss compound (one morpheme has a typo) or a near-miss reduplication, this strategy corrects the individual morpheme and reconstructs the valid word.
How It Works
- Split the OOV word into morphemes using CompoundResolver
- Identify which morpheme is invalid (exactly one must be OOV)
- Correct the invalid morpheme using SymSpell
- Reconstruct the compound with the corrected morpheme
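The four steps above can be sketched in a few lines. This is a minimal illustration and not the library's implementation: `suggest_by_morpheme`, `correct_morpheme`, and the two-entry lexicon are hypothetical names, a brute-force two-way split stands in for CompoundResolver, and a plain Levenshtein scan stands in for the SymSpell lookup.

```python
from typing import List, Set

# Toy lexicon (illustrative only). Escapes keep the code points unambiguous.
FLOWER = "\u1015\u1014\u103a\u1038"  # ပန်း
YARD = "\u1001\u103c\u1036"          # ခြံ
LEXICON: Set[str] = {FLOWER, YARD}


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance over code points."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def correct_morpheme(morpheme: str, lexicon: Set[str], max_distance: int = 1) -> List[str]:
    """Hypothetical stand-in for SymSpell: lexicon entries within max_distance."""
    return sorted(w for w in lexicon if levenshtein(morpheme, w) <= max_distance)


def suggest_by_morpheme(word: str, lexicon: Set[str]) -> List[str]:
    """Try every two-way split; repair the split only if exactly one part is OOV."""
    suggestions: List[str] = []
    for i in range(1, len(word)):
        parts = [word[:i], word[i:]]                       # step 1: split
        invalid = [k for k, p in enumerate(parts) if p not in lexicon]
        if len(invalid) != 1:                              # step 2: exactly one OOV part
            continue
        k = invalid[0]
        for candidate in correct_morpheme(parts[k], lexicon):  # step 3: correct
            rebuilt = "".join(parts[:k] + [candidate] + parts[k + 1:])  # step 4: rebuild
            if rebuilt not in suggestions:
                suggestions.append(rebuilt)
    return suggestions


# ပန်းချံ: the second morpheme has ya-pin typed instead of ya-yit.
TYPO = FLOWER + "\u1001\u103b\u1036"
print(suggest_by_morpheme(TYPO, LEXICON))
```

With this toy lexicon, the only split that leaves exactly one OOV part is ပန်း + ချံ, so the single suggestion is ပန်းခြံ. A real resolver would split on morpheme boundaries and handle compounds of more than two parts, which this sketch does not attempt.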
Example
Usage
The morpheme suggestion strategy is automatically included in the suggestion pipeline when compound resolution is enabled.
Configuration
Medial Swap Strategy
Myanmar’s most common error type is ya-pin/ya-yit confusion (ျ ↔ ြ). SymSpell’s delete-distance model cannot reliably generate medial swaps as edit-distance-1 candidates. This strategy directly generates orthographic variants by swapping, inserting, or deleting Myanmar medial consonants.
Myanmar Medials
| Medial | Unicode | Name | Example |
|---|---|---|---|
| ျ | U+103B | Ya-pin | ကျ |
| ြ | U+103C | Ya-yit | ကြ |
| ွ | U+103D | Wa-hswe | ကွ |
| ှ | U+103E | Ha-htoe | ကှ |
Operations
The strategy generates candidates through three operations:
- Swap: Replace one medial with another (e.g., ျ → ြ)
- Insert: Add a medial where none exists
- Delete: Remove an existing medial
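The three operations can be illustrated with a short candidate generator. This is a sketch under stated assumptions, not the library's code: `medial_variants` and `keep_valid` are hypothetical names, insertion is attempted only after a base consonant (U+1000 to U+1021) that is not followed by a medial, asat, or virama, and invalid forms are expected to be dropped by a later dictionary check.

```python
from typing import List, Set

# The four medials from the table above: ya-pin, ya-yit, wa-hswe, ha-htoe.
MEDIALS = ("\u103b", "\u103c", "\u103d", "\u103e")
ASAT = "\u103a"    # marks a syllable-final consonant
VIRAMA = "\u1039"  # marks a stacked consonant


def is_consonant(ch: str) -> bool:
    """True for Myanmar base consonants U+1000 (က) through U+1021 (အ)."""
    return "\u1000" <= ch <= "\u1021"


def medial_variants(word: str) -> Set[str]:
    """Generate variants by swapping, deleting, or inserting a single medial."""
    variants: Set[str] = set()
    for i, ch in enumerate(word):
        if ch in MEDIALS:
            # Swap: replace this medial with each of the other medials.
            for m in MEDIALS:
                if m != ch:
                    variants.add(word[:i] + m + word[i + 1:])
            # Delete: drop this medial.
            variants.add(word[:i] + word[i + 1:])
        elif is_consonant(ch):
            nxt = word[i + 1] if i + 1 < len(word) else ""
            # Insert: only after a consonant that starts a syllable and has no medial yet.
            if nxt not in MEDIALS and nxt not in (ASAT, VIRAMA):
                for m in MEDIALS:
                    variants.add(word[:i + 1] + m + word[i + 1:])
    variants.discard(word)
    return variants


def keep_valid(word: str, lexicon: Set[str]) -> List[str]:
    """Keep only generated variants that appear in the supplied lexicon."""
    return sorted(v for v in medial_variants(word) if v in lexicon)


# ချံ with ya-pin becomes ခြံ with ya-yit when that form is in the lexicon.
print(keep_valid("\u1001\u103b\u1036", {"\u1001\u103c\u1036"}))
```

Generating blindly and filtering afterwards keeps the generator simple: a swap on a syllable that already carries two medials can produce a duplicated medial, and the lexicon filter is what discards such forms.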
Medial Swap Pairs
Common swap pairs are defined in rules/medial_swap_pairs.yaml.
Usage
The medial swap strategy is automatically included in the suggestion pipeline.
Direct Usage
Integration
Both strategies are part of the multi-strategy suggestion pipeline.
See Also
- Suggestion Strategy — Strategy pattern for suggestions
- Suggestion Ranking — Multi-factor ranking
- Neural Reranker — ONNX-based reranking
- Compound Resolution — Compound word validation
- SymSpell — Primary suggestion algorithm