The core design principle is fail fast, go deeper only when needed. Cheap deterministic checks (syllable structure, dictionary lookups) run first and reject ~90% of errors before expensive operations (N-gram context, grammar rules, AI inference) are ever invoked. Each layer receives only the output that passed the layer below.
Design Philosophy
Myanmar text has no spaces between words, so splitting on whitespace doesn’t work. Instead, the pipeline starts from syllables and builds up:

- Break into syllables (deterministic, fast)
- Validate syllables (catches ~90% of errors)
- Assemble into words (only with valid syllables)
- Check grammar and context (only with valid words)
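
The bottom-up steps above can be sketched as a small layered filter. This is a toy illustration, not the library's implementation: the Latin placeholder syllables, the `VALID_SYLLABLES` and `KNOWN_WORDS` sets, the `Report` class, and the `check` function are all hypothetical, and the input is assumed to be already segmented into syllable tuples.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical toy data; a real system would use Myanmar syllable rules
# and a dictionary database instead of these placeholder sets.
VALID_SYLLABLES = {"ka", "la", "ma", "ta"}
KNOWN_WORDS = {("ka", "la"), ("ma",), ("ta", "ma")}


@dataclass
class Report:
    errors: list[tuple[str, str]] = field(default_factory=list)
    context_checked: list[str] = field(default_factory=list)


def check(words: list[tuple[str, ...]]) -> Report:
    """Run three layers; each layer sees only what passed the one below."""
    report = Report()

    # Layer 1: cheap syllable validation. Failures stop here.
    syllable_ok = []
    for word in words:
        if all(s in VALID_SYLLABLES for s in word):
            syllable_ok.append(word)
        else:
            report.errors.append(("syllable", "".join(word)))

    # Layer 2: dictionary lookup, only for words built from valid syllables.
    word_ok = []
    for word in syllable_ok:
        if word in KNOWN_WORDS:
            word_ok.append(word)
        else:
            report.errors.append(("word", "".join(word)))

    # Layer 3: placeholder for the expensive context checks.
    # Here we only record which words would reach that layer.
    report.context_checked = ["".join(w) for w in word_ok]
    return report


report = check([("ka", "la"), ("kx",), ("la", "ta"), ("ma",)])
print(report.errors)
print(report.context_checked)
```

In this sketch the malformed syllable and the unknown word are both rejected before the context layer, which only ever receives the two words that survived the cheaper checks.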
High-Level Architecture
Core Components
| Component | Purpose |
|---|---|
| SpellChecker | Main coordinator that orchestrates all validation layers |
| SpellCheckerBuilder | Fluent interface for constructing SpellChecker instances |
| DictionaryProvider | Pluggable storage backend (SQLite, Memory, JSON) |
| Segmenter | Text segmentation (syllable + word) |
| SyllableValidator | Layer 1: syllable structure validation |
| WordValidator | Layer 2: word lookup + SymSpell suggestions |
| ContextValidator | Layer 3: N-gram + validation strategies |
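
To illustrate what a pluggable storage backend might look like, here is a minimal sketch of two interchangeable providers behind one interface. The component name `DictionaryProvider` comes from the table above, but the `contains` method, the `MemoryProvider` and `SQLiteProvider` classes, and the `unknown_words` helper are assumptions made for this example and may not match the library's actual API.

```python
from __future__ import annotations

import sqlite3
from typing import Iterable, Protocol


class DictionaryProvider(Protocol):
    """Hypothetical interface: anything with a contains() method qualifies."""

    def contains(self, word: str) -> bool: ...


class MemoryProvider:
    """Keeps the vocabulary in a Python set."""

    def __init__(self, words: Iterable[str]) -> None:
        self._words = set(words)

    def contains(self, word: str) -> bool:
        return word in self._words


class SQLiteProvider:
    """Keeps the vocabulary in a SQLite table (in-memory by default)."""

    def __init__(self, words: Iterable[str], path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS words (word TEXT PRIMARY KEY)"
        )
        self._conn.executemany(
            "INSERT OR IGNORE INTO words VALUES (?)", [(w,) for w in words]
        )
        self._conn.commit()

    def contains(self, word: str) -> bool:
        row = self._conn.execute(
            "SELECT 1 FROM words WHERE word = ?", (word,)
        ).fetchone()
        return row is not None


def unknown_words(words: list[str], provider: DictionaryProvider) -> list[str]:
    """Caller code depends only on the interface, not on the backend."""
    return [w for w in words if not provider.contains(w)]


vocab = ["မြန်မာ", "စာ"]
for provider in (MemoryProvider(vocab), SQLiteProvider(vocab)):
    print(type(provider).__name__, unknown_words(["မြန်မာ", "xyz"], provider))
```

Because `unknown_words` is written against the protocol only, swapping the backend does not require touching the validation code.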
Validation Strategies
The context validation layer uses a Strategy pattern for modular, priority-ordered validation. Each strategy implements the ValidationStrategy interface; the number in parentheses is its priority:

- ToneValidationStrategy (10)
- OrthographyValidationStrategy (15)
- SyntacticValidationStrategy (20)
- StatisticalConfusableStrategy (24)
- BrokenCompoundStrategy (25)
- POSSequenceValidationStrategy (30)
- QuestionStructureValidationStrategy (40)
- HomophoneValidationStrategy (45)
- ConfusableCompoundClassifierStrategy (47), AI, opt-in
- ConfusableSemanticStrategy (48), AI, opt-in
- NgramContextValidationStrategy (50)
- SemanticValidationStrategy (70), AI, opt-in
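
A priority-ordered strategy chain like the one listed above could be sketched as follows. Only the name `ValidationStrategy` is taken from the text; the `validate` signature, the two toy strategies, and `run_strategies` are hypothetical, and the sketch assumes that a lower priority number runs earlier, which the listing suggests but does not state.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class ValidationStrategy(ABC):
    """Hypothetical base class: one check with a numeric priority."""

    priority: int = 100

    @abstractmethod
    def validate(self, words: list[str]) -> list[str]:
        """Return human-readable issues (empty list if none)."""


class EmptyTokenStrategy(ValidationStrategy):
    """Toy check: flags tokens that are empty or whitespace only."""

    priority = 10

    def validate(self, words: list[str]) -> list[str]:
        return [f"empty token at {i}" for i, w in enumerate(words) if not w.strip()]


class RepeatedWordStrategy(ValidationStrategy):
    """Toy check: flags a word immediately followed by itself."""

    priority = 50

    def validate(self, words: list[str]) -> list[str]:
        return [f"repeated: {b}" for a, b in zip(words, words[1:]) if a == b]


def run_strategies(
    strategies: list[ValidationStrategy], words: list[str]
) -> list[str]:
    """Run every strategy in ascending priority order (an assumption)."""
    issues: list[str] = []
    for strategy in sorted(strategies, key=lambda s: s.priority):
        issues.extend(strategy.validate(words))
    return issues


# Registration order does not matter; the priority decides execution order.
print(run_strategies([RepeatedWordStrategy(), EmptyTokenStrategy()], ["a", "a", " "]))
```

Adding a new check in this design means writing one more subclass and picking a priority; the runner itself stays unchanged.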
Offline Systems
Data Pipeline
Transforms a raw corpus into an optimized dictionary database.
Training Pipeline
Creates AI models for semantic checking (BYOM).
Design Principles
- Fail Fast: Catch errors at the earliest possible layer
- Layered Validation: Each layer adds accuracy at a cost
- Pluggable Components: Swap providers, segmenters, taggers
- Graceful Degradation: Continue working even if optional components fail
- Performance First: Optimize hot paths with Cython/OpenMP
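
As one way to picture the graceful-degradation principle, the sketch below builds a list of active layers and simply skips the optional semantic layer when its model cannot be loaded. Everything here is hypothetical: `load_semantic_model`, `build_layers`, the layer names, and the `model.onnx` path are stand-ins for illustration, not the library's API.

```python
from __future__ import annotations

import logging
from typing import Callable, Optional

logger = logging.getLogger("spellcheck_sketch")


def load_semantic_model(path: Optional[str]) -> Callable[[list[str]], list[str]]:
    """Stand-in loader: fails when no model path is configured."""
    if path is None:
        raise FileNotFoundError("no semantic model configured")
    # A real loader would read the model file; this stub returns a no-op checker.
    return lambda words: []


def build_layers(model_path: Optional[str] = None) -> list[str]:
    """Core layers are always active; the AI layer is added only if it loads."""
    layers = ["syllable", "word", "ngram"]
    try:
        load_semantic_model(model_path)
        layers.append("semantic")
    except (FileNotFoundError, ImportError) as exc:
        # Optional component failed: log it and keep going without it.
        logger.warning("semantic layer disabled: %s", exc)
    return layers


print(build_layers())
print(build_layers("model.onnx"))
```

The point of the design is that a missing optional dependency reduces coverage rather than raising an error to the caller.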
Architecture Documents
- System Design: Detailed component architecture and class responsibilities
- Validation Pipeline: Pipeline deep-dive with execution flow
- Component Diagram: Visual component relationships
- Data Flow: Data flow through the system
- Extension Points: How to extend the system
- Dependency Injection: DI container system