No existing spell checker handles Myanmar’s unique challenges: no spaces between words, complex syllable structures, legacy Zawgyi encoding, and rich particle-based grammar. This page compares mySpellChecker against the commonly suggested alternatives and explains why general-purpose tools fall short.

The Gap

As of 2025, major spell checking tools do not support Myanmar:

| Tool | Myanmar Support | Why It Fails |
| --- | --- | --- |
| Hunspell | No usable dictionary | Requires word boundaries (spaces), which Myanmar lacks |
| LanguageTool | Not supported | No Myanmar grammar rules, Java dependency |
| Grammarly | Not supported | Cloud-only, English-focused |
| Microsoft Editor | Not supported | No Myanmar language pack |
| pyspellchecker | Not supported | ASCII/Latin focused, Levenshtein-based |
| SymSpell | Not supported | Requires pre-segmented words |
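
The word-boundary problem behind several of these failures is easy to reproduce with a plain whitespace tokenizer. The Python sketch below is an illustration only and is not mySpellChecker code; the sample string is an arbitrary short Myanmar phrase written with Unicode escapes, and the variable names are made up for this example.

```python
# Illustration only: a whitespace tokenizer sees unspaced Myanmar text as one token.
# The sample is a short Myanmar phrase (three syllables) with no spaces in it.
sample = "\u1019\u103c\u1014\u103a\u1019\u102c\u1005\u102c"

tokens = sample.split()  # what a space-based spell checker would look up

# The whole phrase comes back as a single "word", so a word-level
# dictionary lookup has nothing meaningful to compare against.
print(len(tokens))
```

Any tool that looks up `tokens` one by one in a word list will treat the entire phrase as a single unknown word, which is why segmentation has to happen before validation.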

Previous Attempts

| Project | Status | Limitation |
| --- | --- | --- |
| mySpellCorrect (2022) | Dormant | GitHub scripts only, not pip-installable, limited to character substitution |
| myoooext (ThanLwinSoft) | Dormant | OpenOffice extension from 2010s era, dictionary from 1918 Judson’s |
| Hunspell Myanmar dict | Unavailable | Broken links, unconfirmed compatibility |

Academic research exists (SymSpell4Burmese 2021, Tsetlin Machine error classification 2024), but none produced a maintained, installable library.

What Makes mySpellChecker Different

mySpellChecker is not a port of an English spell checker. It was designed from the ground up for Myanmar script.

Progressive Validation Pipeline

Traditional spell checkers split text on spaces. Myanmar has none, so they fail entirely. mySpellChecker starts from syllables instead, then runs up to 10 validation strategies in layers:
1. Input Text (no spaces)
2. Layer 1: Syllable Segmentation + Validation (tone, orthography)
3. Layer 2: Word Assembly + SymSpell Correction (broken compounds, POS sequence)
4. Layer 2.5: Grammar Rules + Homophone Detection (question structure, confusables)
5. Layer 3: N-gram Context + AI Semantic Analysis
6. Errors + Ranked Suggestions
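
To make Layer 1 concrete, here is a simplified regex-based syllable breaker in Python. It is a sketch in the spirit of publicly known regex approaches, not mySpellChecker's actual implementation: the function name `split_syllables` and the single rule it applies are assumptions for illustration. The rule starts a new syllable before any consonant (U+1000 to U+1021) unless that consonant is stacked (preceded by the virama U+1039) or is a syllable-final consonant (followed by the asat U+103A or the virama U+1039). Digits, punctuation, independent vowels, and non-Myanmar text are deliberately not handled.

```python
import re

# Zero-width split point: just before a consonant that opens a new syllable.
#   (?<!\u1039)            not stacked under the previous consonant
#   [\u1000-\u1021]        a Myanmar consonant
#   (?![\u103a\u1039])     not a final consonant (asat) or the top of a stack
SYLLABLE_BREAK = re.compile(r"(?<!\u1039)(?=[\u1000-\u1021](?![\u103a\u1039]))")


def split_syllables(text: str) -> list[str]:
    """Split unspaced Myanmar text into syllables (simplified sketch)."""
    # re.split accepts empty matches on Python 3.7+; drop the empty leading piece.
    return [piece for piece in SYLLABLE_BREAK.split(text) if piece]


# Same three-syllable phrase, written with Unicode escapes.
phrase = "\u1019\u103c\u1014\u103a\u1019\u102c\u1005\u102c"
syllables = split_syllables(phrase)
print(len(syllables))  # 3
```

Starting from syllables gives the later layers a unit they can validate and reassemble into words, which is the step that space-based checkers have no equivalent for.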

End-to-End Pipeline

Everything you need ships in one pip install:

| Capability | What It Does |
| --- | --- |
| Dictionary Building | Build optimized SQLite dictionaries from your own corpus (CLI + Python API) |
| Syllable Validation | Regex-based syllable segmentation with Cython acceleration |
| Word Validation | SymSpell O(1) lookup with phonetic matching |
| Context Checking | Bigram/trigram N-gram probabilities for real-word error detection |
| Grammar Checking | YAML-based rules with POS sequence validation |
| Homophone Detection | Context-aware homophone confusion resolution |
| POS Tagging | Pluggable: rule-based, Viterbi HMM, or transformer |
| NER | Named Entity Recognition for Myanmar text |
| Morphology | Stemming, reduplication detection, compound analysis |
| Segmentation | Syllable (regex) + word (myword/CRF/transformer) |
| Zawgyi Support | Auto-detection and conversion (via Google’s myanmartools) |
| Text Normalization | NFC normalization, diacritic reordering, zero-width removal |
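
As an illustration of the text normalization row, the Python sketch below applies two of the listed steps using only the standard library: zero-width character removal and NFC normalization. The function name `normalize_text` and the exact set of zero-width code points are assumptions, and Myanmar-specific diacritic reordering is left out; this is not the library's own normalizer.

```python
import unicodedata

# Zero-width characters commonly found in copied web text (assumed set):
# ZERO WIDTH SPACE, NON-JOINER, JOINER, and the BOM / ZERO WIDTH NO-BREAK SPACE.
ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\ufeff"))


def normalize_text(text: str) -> str:
    """Strip zero-width characters, then apply Unicode NFC (simplified sketch)."""
    stripped = text.translate(ZERO_WIDTH)  # mapping to None deletes the character
    return unicodedata.normalize("NFC", stripped)


# A phrase polluted with a zero-width space and a trailing BOM.
dirty = "\u1019\u103c\u1014\u103a\u200b\u1019\u102c\ufeff"
clean = normalize_text(dirty)
print(len(dirty), len(clean))
```

Normalizing before segmentation matters because two strings that render identically can otherwise fail to match the same dictionary entry.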

AI-Powered Validation (BYOM)

Two optional AI strategies that you train on your own corpus:

| Strategy | Approach | Speed | Output |
| --- | --- | --- | --- |
| Semantic Checking | Train RoBERTa/BERT masked language model | Slow | Error flags + suggestions |
| Neural Reranker | Train MLP on suggestion features | Fast | Re-ranked suggestions |

The library provides a complete training pipeline (train-model): you bring the corpus, and it handles tokenizer training, model training, and ONNX export.

Production-Ready

| Feature | Details |
| --- | --- |
| Async API | check_async(), check_batch_async() for web frameworks |
| Streaming | Memory-bounded processing for large files |
| Batch Processing | check_batch() with parallelization |
| Connection Pooling | Thread-safe SQLite for concurrent access |
| Docker | Multi-stage Dockerfile with GPU support |
| CLI | Full command-line interface for all operations |
| Cython | Optional C extensions for 2-10x performance |
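
For readers wondering what an async batch check looks like inside a web handler, the Python sketch below shows the general pattern of offloading blocking checks to worker threads with a concurrency cap. It does not use mySpellChecker's API: `check_blocking` is a hypothetical stand-in for any blocking spell check, and `check_many` and its `limit` parameter are invented for this example. It requires Python 3.9 or later for `asyncio.to_thread`.

```python
import asyncio


def check_blocking(text: str) -> list[str]:
    """Hypothetical stand-in for a blocking spell check; returns error messages."""
    return [] if text else ["empty input"]


async def check_many(texts: list[str], limit: int = 4) -> list[list[str]]:
    """Run blocking checks in worker threads, at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def check_one(text: str) -> list[str]:
        async with semaphore:
            # to_thread keeps the event loop responsive while the check runs.
            return await asyncio.to_thread(check_blocking, text)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(check_one(t) for t in texts))


results = asyncio.run(check_many(["first", "", "third"]))
print(results)
```

The semaphore is the design choice worth noting: it bounds how many checks run at once, so a burst of requests cannot exhaust the thread pool or the database connections behind it.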

Summary

| | mySpellChecker | Generic Spell Checkers |
| --- | --- | --- |
| Myanmar syllable validation | Built-in | Not possible |
| No-space word segmentation | myword / CRF / transformer | Requires whitespace |
| Context-aware checking | N-gram + semantic AI | LanguageTool only (no Myanmar) |
| Grammar rules | POS-aware, Myanmar-specific | Not for Myanmar |
| Dictionary building | End-to-end pipeline | Manual |
| AI training pipelines | Included | Not applicable |
| Python-native | Yes | Hunspell = C wrapper, LT = Java |
| Open source | MIT | Varies |

Acknowledgments

mySpellChecker integrates tools and research from the Myanmar NLP community:
