Overview
The data flows through four stages: Input Text → Preprocessing → Validation → Response.
Detailed Data Flow
Input Processing
+-----------------+     +-------------------+     +---------------------+     +--------------+
| Raw Text        | --> | Zawgyi Detection  | --> | Normalize           | --> | 'မြန်မာစာ'   |
| 'မြန်မာစာ'      |     |                   |     | - Remove zero-width |     |              |
+-----------------+     +-------------------+     | - Unicode NFC       |     +--------------+
                                                  | - Whitespace        |
                                                  +---------------------+
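The three Normalize steps in the diagram can be sketched in a few lines of Python. This is a minimal illustration, not the library's actual Normalizer: the set of zero-width code points and the whitespace policy (collapse runs to one space, then trim) are assumptions, and Zawgyi detection is left out.

```python
import re
import unicodedata

# Assumed set of zero-width code points: ZWSP, ZWNJ, ZWJ, and BOM/ZWNBSP.
# The real Normalizer may remove a different set.
_ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\ufeff"))


def normalize(text: str) -> str:
    """Apply the three steps from the diagram, in order."""
    text = text.translate(_ZERO_WIDTH)          # 1. remove zero-width characters
    text = unicodedata.normalize("NFC", text)   # 2. Unicode NFC composition
    return re.sub(r"\s+", " ", text).strip()    # 3. collapse and trim whitespace


cleaned = normalize("\ufeffမြန်\u200bမာစာ \n")
```

Removing zero-width characters before NFC keeps them from sitting between code points that should otherwise be composed or reordered together.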
Segmentation
+--------------+     +----------------------+     +------------------------+     +--------------------+
| 'မြန်မာစာ'   | --> | Syllable Segmenter   | --> | ['မြန်', 'မာ', 'စာ']   | --> | Word Assembly      |
|              |     | - Consonant bounds   |     |                        |     | - Dictionary lookup|
+--------------+     | - Handle stacking    |     +------------------------+     | - Statistical model|
                     +----------------------+                                    +--------+-----------+
                                                                                          |
                                                                                          v
                                                                                 +--------------------+
                                                                                 | ['မြန်မာ', 'စာ']   |
                                                                                 +--------------------+
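The dictionary-lookup half of Word Assembly can be illustrated with greedy longest-match over the syllable list. This is a sketch under assumptions: the function name `assemble_words`, the `max_syllables` limit, and the toy dictionary are hypothetical, and the statistical model mentioned in the diagram is not modeled here.

```python
from typing import List, Set


def assemble_words(syllables: List[str], dictionary: Set[str],
                   max_syllables: int = 4) -> List[str]:
    """Greedily join the longest run of syllables that forms a dictionary word."""
    words: List[str] = []
    i = 0
    while i < len(syllables):
        longest = min(max_syllables, len(syllables) - i)
        for n in range(longest, 0, -1):
            candidate = "".join(syllables[i:i + n])
            # A single syllable is always accepted so the loop makes progress,
            # even when the syllable is not in the dictionary.
            if n == 1 or candidate in dictionary:
                words.append(candidate)
                i += n
                break
    return words


# Toy dictionary (hypothetical contents) applied to the diagram's example.
words = assemble_words(["မြန်", "မာ", "စာ"], {"မြန်မာ", "စာ"})
```

Greedy matching is simple but can pick the wrong split when a long dictionary entry overlaps two shorter words, which is presumably where a statistical model helps.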
Validation Pipeline
Client SpellChecker Normalizer Segmenter SyllableVal WordVal ContextVal Provider
| | | | | | | |
|-- check() --->| | | | | | |
| | | | | | | |
| | [Pre-processing] | | | | |
| |-- normalize() --> | | | | |
| |<- "မြန်မာစာ" --| | | | | |
| | | | | | | |
| | [Segmentation] | | | | |
| |-- segment_syllables() ------>| | | | |
| |<- ["မြန်","မာ","စာ"] --------| | | | |
| | | | | | | |
| | [Layer 1: Syllable Validation] | | | |
| |-- validate(text) ----------------------> | | | |
| | | | |-- is_valid_syllable("မြန်") ------->|
| | | | |<- True (freq: 5000) ----------------|
| | | | |-- is_valid_syllable("မာ") --------->|
| | | | |<- True (freq: 3000) ----------------|
| | | | |-- is_valid_syllable("စာ") --------->|
| | | | |<- True (freq: 4000) ----------------|
| |<- [] (no errors) --------------------------| | | |
| | | | | | | |
| | [Layer 2: Word Validation] | | | | |
| |-- validate(text) --------------------------------------->| | |
| | | | | |-- is_valid_word ------->|
| | | | | |<- True (freq,POS) -----|
| |<- [] (no errors) ----------------------------------------| | |
| | | | | | | |
| | [Layer 2.5: Grammar Checking] | | | |
| | SyntacticValidationStrategy calls SyntacticRuleChecker.check_sequence(words) |
| | Grammar rules loaded from YAML config | | | |
| | | | | | | |
| | [Layer 3: Context Validation] | | | |
| |-- validate(text) -------------------------------------------------------------> |
| | | | | | |-- bigram ->|
| | | | | | |<-- 0.15 ---|
| |<- [] (no errors) ------------------------------------------------------------|
| | | | | | | |
|<-- Response(has_errors=false) | | | | | |
| | | | | | | |
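The layered calls in the sequence above can be approximated as three independent checks whose results are collected side by side. This is only a sketch: the function names, the in-memory tables standing in for the Provider, and the `min_prob` threshold are assumptions. The syllable frequencies and the 0.15 bigram probability are the values shown in the diagram. The grammar layer (2.5) is omitted.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical in-memory stand-ins for the Provider.
SYLLABLE_FREQ: Dict[str, int] = {"မြန်": 5000, "မာ": 3000, "စာ": 4000}
WORDS: Set[str] = {"မြန်မာ", "စာ"}
BIGRAM_PROB: Dict[Tuple[str, str], float] = {("မြန်မာ", "စာ"): 0.15}


def validate_syllables(syllables: List[str]) -> List[str]:
    """Layer 1: report syllables the provider does not know."""
    return [s for s in syllables if s not in SYLLABLE_FREQ]


def validate_words(words: List[str]) -> List[str]:
    """Layer 2: report words the provider does not know."""
    return [w for w in words if w not in WORDS]


def validate_context(words: List[str],
                     min_prob: float = 0.01) -> List[Tuple[str, str]]:
    """Layer 3: report adjacent word pairs with a low bigram probability."""
    pairs = zip(words, words[1:])
    return [p for p in pairs if BIGRAM_PROB.get(p, 0.0) < min_prob]


def check(syllables: List[str], words: List[str]) -> Dict[str, list]:
    """Run every layer and keep each layer's errors under its own key."""
    return {
        "syllable_errors": validate_syllables(syllables),
        "word_errors": validate_words(words),
        "context_errors": validate_context(words),
    }


result = check(["မြန်", "မာ", "စာ"], ["မြန်မာ", "စာ"])
has_errors = any(result.values())
```

Keeping one error list per layer mirrors the diagram, where each validator returns its own (here empty) list to the SpellChecker.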
Response Generation
+------------------------+     +------------------+     +-------------------+
| Validation Results     | --> | Response Builder | --> | Response          |
| - syllable_errors      |     |                  |     | - text, errors    |
| - word_errors          |     +------------------+     | - syllables, words|
| - grammar_errors       |                              | - stats           |
| - context_errors       |                              +-------------------+
+------------------------+
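One way to picture the Response Builder is as a function that merges the four error lists into a single record. The field names `text`, `errors`, `syllables`, `words`, and `stats` come from the diagram; everything else here (the `build_response` name, the keys inside `stats`, and `has_errors` as a derived property) is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Sequence


@dataclass
class Response:
    text: str
    errors: List[Any]
    syllables: List[str]
    words: List[str]
    stats: Dict[str, int]

    @property
    def has_errors(self) -> bool:
        return bool(self.errors)


def build_response(text: str, syllables: List[str], words: List[str],
                   syllable_errors: Sequence[Any] = (),
                   word_errors: Sequence[Any] = (),
                   grammar_errors: Sequence[Any] = (),
                   context_errors: Sequence[Any] = ()) -> Response:
    """Concatenate the per-layer error lists in pipeline order."""
    errors = [*syllable_errors, *word_errors, *grammar_errors, *context_errors]
    stats = {  # hypothetical stats keys
        "syllable_count": len(syllables),
        "word_count": len(words),
        "error_count": len(errors),
    }
    return Response(text, errors, list(syllables), list(words), stats)


response = build_response("မြန်မာစာ", ["မြန်", "မာ", "စာ"], ["မြန်မာ", "စာ"])
```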
Error Flow Example
Client SpellChecker SyllableVal RuleValidator SymSpell Provider
| | | | | |
|-- check() --->| | | | |
| | | | | |
| | [Layer 1: Syllable Validation] | |
| |-- validate() --> | | |
| | |-- validate("ကျြောင်") ----->| |
| | | | | |
| | | Check medial compatibility | |
| | | Found: both ျ and ြ | |
| | | No dictionary match | |
| | | | | |
| | |<- INVALID (not in dict) -----| |
| | | | | |
| | | [Generate Suggestions] | |
| | |-- lookup(max_dist=2) -----------------> |
| | | | |-- similar ->|
| | | | |<- cands ----|
| | |<- ["ကြောင်", "ကျောင်း"] ------------| |
| | | | | |
| |<- SyllableError(suggestions=["ကြောင်", "ကျောင်း"]) |
| | | | | |
|<-- Response(has_errors=true, errors=[SyllableError(error_type="invalid_syllable", ...)])
| | | | | |
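The suggestion step with `max_dist=2` can be illustrated with a plain Levenshtein distance over code points. Note that this is a brute-force stand-in, not the SymSpell algorithm named in the diagram (SymSpell precomputes a delete index so it does not have to scan the whole dictionary). The `suggest` function and the candidate list are hypothetical.

```python
from typing import Iterable, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, counted in code points."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def suggest(term: str, dictionary: Iterable[str], max_dist: int = 2) -> List[str]:
    """Return dictionary entries within max_dist edits, closest first."""
    scored = sorted((edit_distance(term, w), w) for w in dictionary)
    return [w for dist, w in scored if dist <= max_dist]


# The invalid syllable from the diagram against a toy candidate list.
suggestions = suggest("ကျြောင်", ["ကျောင်း", "စာ", "ကြောင်"])
```

Under this measure, dropping the extra medial is one edit, while dropping the other medial and adding a trailing visarga is two, so both of the diagram's suggestions fall within `max_dist=2`.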
Batch Processing Flow
+----------------------------+     +--------------------------------------+     +----------------------------------+
| [text1, text2, ..., textN] | --> | CorpusSegmenter (Cython + OpenMP)    | --> | [result1, result2, ..., resultN] |
+----------------------------+     |                                      |     +----------------------------------+
                                   |   +----------+      +----------+     |
                                   |   | Thread 1 |      | Thread 2 |     |
                                   |   +----------+      +----------+     |
                                   |   +----------+      +----------+     |
                                   |   | Thread 3 |      | Thread 4 |     |
                                   |   +----------+      +----------+     |
                                   +--------------------------------------+
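The fan-out and fan-in shape of the batch flow can be mimicked in pure Python with a thread pool. This is a rough analogue only: the diagram's CorpusSegmenter uses Cython and OpenMP, whereas the sketch below uses `concurrent.futures.ThreadPoolExecutor`, which will not give CPU parallelism for pure-Python work because of the GIL. `segment_batch` and the `segment` callback are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def segment_batch(texts: Sequence[str],
                  segment: Callable[[str], List[str]],
                  workers: int = 4) -> List[List[str]]:
    """Segment every text on a pool of worker threads.

    Executor.map yields results in input order, so result i
    always corresponds to text i, as in the diagram.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(segment, texts))


# str.split stands in for a real segmenter here.
results = segment_batch(["a b", "c d e", "f"], str.split)
```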
See Also
- Component Diagram - System components
- Architecture Overview - High-level design
- Extension Points - Customization options