Syllable validation is the first layer in the pipeline. It checks Myanmar character sequences against structural rules before attempting more expensive word or context analysis.

Why Start from Syllables?

Myanmar has no spaces between words, so you can’t split text into words without a dictionary. But you can split it into syllables using regex alone. Since every word is made of one or more syllables, checking syllable structure first catches the majority of errors cheaply.
Traditional spell checkers try to segment text into words first, which is:
  1. Expensive computationally
  2. Error-prone on misspelled text
  3. Wasteful when obvious typos exist
mySpellChecker inverts this:
  1. Break text into syllables first (fast, deterministic)
  2. Validate syllables (catches 90%+ of typos immediately)
  3. Only assemble valid syllables into words for deeper checking
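
The ordering above can be sketched as a small driver function. This is an illustration only, not mySpellChecker's internal pipeline: `layered_check`, `is_valid_syllable`, and `check_words` are hypothetical names, and the syllable list is assumed to come from an earlier segmentation step.

```python
from __future__ import annotations

from typing import Callable, Iterable


def layered_check(
    syllables: list[str],
    is_valid_syllable: Callable[[str], bool],
    check_words: Callable[[list[str]], Iterable[str]],
) -> tuple[list[tuple[int, str]], list[str]]:
    """Flag bad syllables first; pass only runs of valid syllables onward."""
    syllable_errors: list[tuple[int, str]] = []
    word_errors: list[str] = []
    run: list[str] = []  # consecutive valid syllables awaiting word-level checks

    for index, syllable in enumerate(syllables):
        if is_valid_syllable(syllable):
            run.append(syllable)
            continue
        syllable_errors.append((index, syllable))
        if run:  # an invalid syllable closes the current run
            word_errors.extend(check_words(run))
            run = []
    if run:
        word_errors.extend(check_words(run))
    return syllable_errors, word_errors
```

The expensive `check_words` callback never sees a syllable that already failed the cheap check, which is the point of validating syllables first.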

Syllable Anatomy

A Myanmar syllable follows this pattern:
Syllable = Consonant + [Stacked]* + [Medial]* + [Vowel]* + [Final]*

Component | Required | Position | Examples
Consonant | Yes | Initial | က, ခ, မ
Stacked | No | After consonant | ္က, ္ခ
Medial | No | After consonant | ျ, ြ, ွ, ှ
Vowel | No | Various | ါ, ိ, ု, ေ
Final | No | End | ်, ံ, း
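
The pattern can be written down as a regular expression over Unicode code points. The sketch below is an approximation under stated assumptions, not the library's own grammar: the names are hypothetical, the final slot is widened to accept a closing consonant plus asat (as in ကတ်) and the dot-below sign U+1037, and neither kinzi nor medial ordering is handled.

```python
import re

CONSONANT = "[\u1000-\u1021]"            # က .. အ
STACKED = "(?:\u1039[\u1000-\u1021])"    # virama + subjoined consonant
MEDIAL = "[\u103b-\u103e]"               # ya-pin, ya-yit, wa-hswe, ha-htoe
VOWEL = "[\u102b-\u1032]"                # dependent vowel signs
# Assumption: a final is an optional closing consonant + asat (U+103A),
# or one of U+1036 (anusvara), U+1037 (dot below), U+1038 (visarga).
FINAL = "(?:[\u1000-\u1021]?\u103a|[\u1036-\u1038])"

SYLLABLE_SHAPE = re.compile(f"{CONSONANT}{STACKED}*{MEDIAL}*{VOWEL}*{FINAL}*")


def has_syllable_shape(syllable: str) -> bool:
    """Return True if the whole string fits the simplified syllable pattern."""
    return SYLLABLE_SHAPE.fullmatch(syllable) is not None


print(has_syllable_shape("မြန်"))  # consonant + medial + closing consonant + asat
print(has_syllable_shape("ာက"))   # starts with a vowel sign
```

A shape check like this only answers whether the components appear in the right slots. Ordering and compatibility constraints inside a slot need separate rules.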

Simple Syllables

Note: Tonal information is omitted from these transcriptions for simplicity. Standard Burmese has four tones (low, high, creaky, checked).
Consonant only:     က = "ka",  မ = "ma",  သ = "tha"
Consonant + Vowel:  ကာ = "ka", ကိ = "ki", ကု = "ku", ကေ = "kay"
Consonant + Asat:   က် = "k" (inherent vowel killed), မ် = "m" (inherent vowel killed)
Checked syllable:   ကတ် = "kat" (final stop, glottal closure)

Complex Syllables (with Medials)

Single medial:  ကျ = "kya", ကြ = "kra" (merged to kya in modern Burmese), ကွ = "kwa"
Combined:       ကြွ = "krwa", ကျွ = "kywa"
Ha-htoe:        နှ = "hna", မှ = "hma", လှ = "hla" (voiceless sonorants)
Medial order (Unicode canonical order, UTN #11):
1. ျ (ya-pin/medial ya, U+103B)
2. ြ (ya-yit/medial ra, U+103C)
3. ွ (wa-hswe/medial wa, U+103D)
4. ှ (ha-htoe/medial ha, U+103E) - always last
Valid: ကြွ (ြ before ွ) | Invalid: ကွြ (wrong order)
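
Because the canonical medial order coincides with code-point order (U+103B through U+103E), both checking and repairing the order reduce to a comparison or a sort. A minimal sketch, with hypothetical helper names that are not part of mySpellChecker:

```python
import re

MEDIAL_RUN = re.compile("[\u103b-\u103e]+")  # one or more consecutive medials


def medials_in_order(syllable: str) -> bool:
    """True if medials appear in strictly increasing code-point order."""
    medials = [ch for ch in syllable if "\u103b" <= ch <= "\u103e"]
    # Strict comparison also rejects a repeated medial.
    return all(a < b for a, b in zip(medials, medials[1:]))


def reorder_medials(syllable: str) -> str:
    """Sort each run of medials into canonical order (a candidate repair)."""
    return MEDIAL_RUN.sub(lambda m: "".join(sorted(m.group())), syllable)


print(medials_in_order("ကြွ"))
print(reorder_medials("ကွြ") == "ကြွ")
```

Reordering is only a suggestion generator. The repaired string still has to pass the remaining checks before it is offered to the user.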

Common Syllable Patterns

Pattern | Example | Phonetic
CV (Consonant + Vowel) | မာ, နေ, သူ | ma, ne, thu
CVC (Consonant + Vowel + Consonant) | ကန်, သင်, ကိန်း | kan, thin, kein:
CMV (Consonant + Medial + Vowel) | မြေ, ကျော်, ကြီး | mye, kyaw, kyi:
Complex | ကြောင်, မြန်မာ | kyaung, myanma

How It Works

Step 1: Syllable Segmentation

Text is broken into syllables using Myanmar orthographic rules:
text = "မြန်မာနိုင်ငံ"
# Segments to: ["မြန်", "မာ", "နိုင်", "ငံ"]
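
A regex-only segmenter can be sketched in a few lines. This is a simplified rule in the spirit of common regex-based Myanmar syllable breakers, not the library's segmenter: it assumes a syllable starts at any consonant that is neither subjoined nor a closing consonant, and it ignores independent vowels, digits, and punctuation. `segment_syllables` here is a standalone illustrative function.

```python
import re

# A syllable starts at a consonant (U+1000..U+1021) that is not subjoined
# (preceded by virama U+1039) and not a closing consonant (followed by
# asat U+103A or virama U+1039).
SYLLABLE_START = re.compile("(?<!\u1039)[\u1000-\u1021](?![\u103a\u1039])")


def segment_syllables(text: str) -> list:
    """Split text at every syllable-initial consonant."""
    starts = [m.start() for m in SYLLABLE_START.finditer(text)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)  # keep any leading material as its own chunk
    bounds = starts + [len(text)]
    return [text[a:b] for a, b in zip(bounds, bounds[1:]) if a < b]


print(segment_syllables("မြန်မာနိုင်ငံ"))
```

Because the break rule looks only one character to each side, the pass is deterministic and linear in the length of the text.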

Step 2: Rule-Based Validation

Each syllable is checked against five structural rules:
Rule 1: Must start with a consonant
"ကာ" → Valid  |  "ာက" → Invalid (starts with vowel)
Rule 2: Medials in correct order (Ya < Ra < Wa < Ha)
"ကြွ" → Valid (ြ before ွ)  |  "ကွြ" → Invalid (wrong order)
Rule 3: No duplicate medials
"ကြ" → Valid  |  "ကြြ" → Invalid (duplicate ြ)
Rule 4: Vowel compatibility
"ကိ" → Valid  |  "ကိီ" → Invalid (both are above vowels)
"ကု" → Valid  |  "ကုူ" → Invalid (both are below vowels)
Rule 5: Finals at end position
"မြန်" → Valid (် at end)  |  Finals (်, း, ံ) must not precede non-finals

Step 3: Dictionary Lookup

Valid syllable structures are checked against the syllable dictionary:
# Syllable exists in dictionary
"မြန်" → Valid

# Valid structure but not in dictionary
"ဆြန်" → May be invalid (flagged for review)

Stacked Consonants

Kinzi (special stacking with င):
မင်္ဂလာ = /mingala/  (မ + င် + ္ (U+1039) + ဂ + လ + ာ)
Regular stacking (using virama ္, U+1039):
သ္တ = /sta/  (သ + ္ + တ)
ဗ္ဗ = /bba/  (ဗ + ္ + ဗ)
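
At the code-point level the two forms are easy to tell apart: kinzi is the fixed sequence င + asat + virama (U+1004 U+103A U+1039), while regular stacking places the virama directly between two consonants. A small sketch with hypothetical helper names:

```python
import re

KINZI = "\u1004\u103a\u1039"  # nga + asat + virama
# Consonant, virama, consonant: the virama directly follows the upper consonant.
STACK = re.compile("([\u1000-\u1021])\u1039([\u1000-\u1021])")


def has_kinzi(text: str) -> bool:
    """True if the text contains the kinzi sequence."""
    return KINZI in text


def stacked_pairs(text: str) -> list:
    """Return (upper, lower) consonant pairs joined by a virama."""
    return [(m.group(1), m.group(2)) for m in STACK.finditer(text)]


print(has_kinzi("မင်္ဂလာ"))
print(stacked_pairs("သ္တ"))
```

Kinzi does not show up in `stacked_pairs` because the character before its virama is an asat, not a consonant.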

Configuration

Enable/Disable Syllable Validation

from myspellchecker import SpellChecker
from myspellchecker.providers import SQLiteProvider
from myspellchecker.core.constants import ValidationLevel

# Validation level is specified per-check, not in configuration
provider = SQLiteProvider(database_path="path/to/dictionary.db")
checker = SpellChecker(provider=provider)

# Example input; any Myanmar string works here
text = "မြန်မာ"

# Syllable-only validation (fastest)
result = checker.check(text, level=ValidationLevel.SYLLABLE)

# Word validation includes syllable validation
result = checker.check(text, level=ValidationLevel.WORD)

Syllable Rule Configuration

from myspellchecker.core.syllable_rules import SyllableRuleValidator

# Custom rule validator
validator = SyllableRuleValidator(
    max_syllable_length=15,        # Max characters per syllable (default: 15)
    corruption_threshold=3,        # Max consecutive identical chars (default: 3)
    strict=True,                   # Enforce strict Pali/Sanskrit rules (default: True)
    allow_extended_myanmar=False,  # Accept Extended-A/B blocks (default: False)
)
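
To make the first two parameters concrete, here is one plausible reading of their comments as a standalone function. This is an interpretation, not the validator's code: `passes_sanity_limits` is a hypothetical name, and it assumes a run of exactly `corruption_threshold` identical characters is still accepted.

```python
from itertools import groupby


def passes_sanity_limits(
    syllable: str,
    max_syllable_length: int = 15,
    corruption_threshold: int = 3,
) -> bool:
    """Reject overlong syllables and long runs of one repeated character."""
    if len(syllable) > max_syllable_length:
        return False
    # Length of the longest run of consecutive identical characters.
    longest_run = max(
        (sum(1 for _ in group) for _, group in groupby(syllable)),
        default=0,
    )
    return longest_run <= corruption_threshold
```

Checks like these are cheap guards against corrupted input such as a stuck key, and they run before any linguistic rule is consulted.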

Syllable Error Types

Invalid Structure

Syllable doesn’t follow Myanmar orthographic rules:
result = checker.check("ကက")  # Invalid: double consonant without medial/vowel
# Error: SyllableError with error_type=ErrorType.SYLLABLE

Unknown Syllable

Valid structure but not in dictionary:
result = checker.check("ဆြန်")  # Valid structure, unknown syllable
# Error: SyllableError with suggestions from similar syllables

Medial Confusion

Common error with similar-looking medials:
# ျ (ya-pin) vs ြ (ya-yit) confusion
result = checker.check("ကျြောင်")  # Incompatible medials (both ya-pin and ya-yit)
# Suggestion: "ကြောင်"
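
Candidate corrections for this kind of confusion can be generated mechanically. The sketch below is illustrative only and says nothing about how mySpellChecker ranks suggestions: `medial_confusion_candidates` is a hypothetical helper that drops one medial when both are present and swaps them otherwise.

```python
YA_PIN = "\u103b"  # medial ya
YA_YIT = "\u103c"  # medial ra


def medial_confusion_candidates(syllable: str) -> list:
    """Propose spellings that resolve ya-pin / ya-yit confusion."""
    if YA_PIN in syllable and YA_YIT in syllable:
        # Both medials present: try keeping only one of them.
        return [syllable.replace(YA_PIN, ""), syllable.replace(YA_YIT, "")]
    if YA_PIN in syllable:
        return [syllable.replace(YA_PIN, YA_YIT)]
    if YA_YIT in syllable:
        return [syllable.replace(YA_YIT, YA_PIN)]
    return []


for candidate in medial_confusion_candidates("ကျြောင်"):
    print(candidate)
```

In practice each candidate would be kept only if it appears in the syllable dictionary.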

Performance Characteristics

Metric | Value
Speed | Very Fast
Time Complexity | O(n) where n = syllable count
Lookup Complexity | O(1) per syllable
Syllable validation is the fastest layer in the pipeline. Each syllable is validated independently with O(1) dictionary lookups, making it suitable for real-time typing feedback.
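
The linear cost follows from doing one constant-time membership test per syllable. A minimal sketch, assuming an in-memory set stands in for the syllable dictionary and `is_well_formed` is a placeholder for the structural rules; `Verdict`, `classify`, and `classify_all` are illustrative names, not library API.

```python
from enum import Enum
from typing import Callable, Collection


class Verdict(Enum):
    VALID = "valid"
    UNKNOWN = "unknown"      # well-formed but absent from the dictionary
    MALFORMED = "malformed"  # fails the structural rules


def classify(
    syllable: str,
    is_well_formed: Callable[[str], bool],
    known_syllables: Collection,
) -> Verdict:
    """Structural check first, then a single dictionary membership test."""
    if not is_well_formed(syllable):
        return Verdict.MALFORMED
    if syllable in known_syllables:  # average O(1) for a set or frozenset
        return Verdict.VALID
    return Verdict.UNKNOWN


def classify_all(syllables, is_well_formed, known_syllables) -> list:
    """One independent classification per syllable: O(n) overall."""
    return [classify(s, is_well_formed, known_syllables) for s in syllables]
```

Since no syllable's verdict depends on its neighbours, results can also be cached or recomputed only for the syllable being edited.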

API Reference

Using SpellChecker for Syllable Validation

from myspellchecker import SpellChecker
from myspellchecker.providers import SQLiteProvider
from myspellchecker.core.constants import ValidationLevel

provider = SQLiteProvider(database_path="path/to/dictionary.db")
checker = SpellChecker(provider=provider)

# Validate text at syllable level (default)
result = checker.check("မြန်မာ", level=ValidationLevel.SYLLABLE)

# Check for syllable-level errors
for error in result.errors:
    print(f"Error at position {error.position}: {error.text}")
    print(f"Suggestions: {error.suggestions}")

# Check if text is valid at syllable level
print(f"Has errors: {result.has_errors}")
Note: Direct instantiation of SyllableValidator requires a DI container setup. For most use cases, use SpellChecker.check() instead.

SyllableRuleValidator

from myspellchecker.core.syllable_rules import SyllableRuleValidator

rule_validator = SyllableRuleValidator()

# Check if syllable follows structural rules (returns bool)
is_valid = rule_validator.validate("မြန်")  # True

# Invalid syllable structures
is_valid = rule_validator.validate("ာက")   # False - starts with vowel
is_valid = rule_validator.validate("ကွြ")  # False - wrong medial order
is_valid = rule_validator.validate("ကိီ")  # False - incompatible vowels
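
For intuition about what a boolean like this summarizes, the five structural rules can be approximated in plain Python. This is a rough sketch, not the SyllableRuleValidator implementation: the character sets are assumptions (in particular, U+1037 is added to the finals), kinzi and stacked consonants are not handled, and `violated_rules` is a hypothetical name.

```python
CONSONANTS = {chr(cp) for cp in range(0x1000, 0x1022)}  # က .. အ
MEDIALS = {"\u103b", "\u103c", "\u103d", "\u103e"}
ABOVE_VOWELS = {"\u102d", "\u102e", "\u1032"}
BELOW_VOWELS = {"\u102f", "\u1030"}
# Assumption: U+1037 (dot below) is treated as a final alongside
# asat (U+103A), anusvara (U+1036), and visarga (U+1038).
FINALS = {"\u103a", "\u1036", "\u1037", "\u1038"}


def violated_rules(syllable: str) -> list:
    """Return the numbers of the structural rules the syllable breaks."""
    violations = []
    if not syllable or syllable[0] not in CONSONANTS:
        violations.append(1)  # must start with a consonant
    medials = [ch for ch in syllable if ch in MEDIALS]
    distinct = list(dict.fromkeys(medials))  # de-duplicated, order kept
    if distinct != sorted(distinct):
        violations.append(2)  # medials out of canonical order
    if len(distinct) != len(medials):
        violations.append(3)  # duplicate medial
    above = sum(ch in ABOVE_VOWELS for ch in syllable)
    below = sum(ch in BELOW_VOWELS for ch in syllable)
    if above > 1 or below > 1:
        violations.append(4)  # two vowels competing for the same position
    seen_final = False
    for ch in syllable:
        if ch in FINALS:
            seen_final = True
        elif seen_final:
            violations.append(5)  # a final precedes a non-final
            break
    return violations


print(violated_rules("ကွြ"))
```

Returning rule numbers instead of a single boolean makes it easier to explain to the user why a syllable was rejected.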

Common Patterns

Real-Time Validation

from myspellchecker import SpellChecker
from myspellchecker.providers import SQLiteProvider
from myspellchecker.core.constants import ValidationLevel

# Build the checker once and reuse it; this avoids reopening the
# database on every keystroke.
provider = SQLiteProvider(database_path="path/to/dictionary.db")
checker = SpellChecker(provider=provider)

def validate_realtime(text: str) -> dict:
    """Fast validation for typing feedback."""
    # Use syllable-level validation for fastest response
    result = checker.check(text, level=ValidationLevel.SYLLABLE)

    return {
        "valid": not result.has_errors,
        "errors": [
            {"position": e.position, "text": e.text}
            for e in result.errors
        ]
    }

Syllable-Only Suggestions

from myspellchecker import SpellChecker
from myspellchecker.providers import SQLiteProvider
from myspellchecker.core.constants import ValidationLevel

def get_syllable_suggestions(syllable: str) -> list:
    """Get suggestions for a single syllable."""
    provider = SQLiteProvider(database_path="path/to/dictionary.db")
    checker = SpellChecker(provider=provider)

    # Use syllable-level validation
    result = checker.check(syllable, level=ValidationLevel.SYLLABLE)

    if not result.has_errors:
        return []  # Already valid

    # Return suggestions from first error
    return result.errors[0].suggestions if result.errors else []

Troubleshooting

Issue: Valid syllables marked as errors

Cause: Syllable not in dictionary
Solution: Add to custom dictionary or update database:
myspellchecker build --input additional_syllables.txt --output dictionary.db --incremental

Issue: Slow syllable validation

Cause: Missing Cython extensions
Solution: Rebuild extensions:
python setup.py build_ext --inplace

Issue: Incorrect syllable segmentation

Cause: Complex stacked consonants or rare characters
Solution: Use custom segmenter or report issue:
from myspellchecker.segmenters import DefaultSegmenter

# Example input; replace with the text that segments incorrectly
text = "မြန်မာနိုင်ငံ"

# Use strict Myanmar-only mode (no extended characters)
segmenter = DefaultSegmenter(allow_extended_myanmar=False)
syllables = segmenter.segment_syllables(text)

Next Steps