mySpellChecker is the first production-grade spell checking library built specifically for the Myanmar (Burmese) language. It fills a gap that has existed since the beginning of Myanmar digital text: no existing spell checker handles Myanmar's unique challenges, namely no spaces between words, complex syllable structures, legacy Zawgyi encoding, and rich particle-based grammar.

The Gap

Major spell checking tools do not support Myanmar:

| Tool | Myanmar Support | Why It Fails |
| --- | --- | --- |
| Hunspell | No usable dictionary | Requires word boundaries (spaces); Myanmar has none |
| LanguageTool | Not supported | No Myanmar grammar rules, Java dependency |
| Grammarly | Not supported | Cloud-only, English-focused |
| Microsoft Editor | Not supported | No Myanmar language pack |
| pyspellchecker | Not supported | ASCII/Latin focused, Levenshtein-based |
| SymSpell | Not supported | Requires pre-segmented words |

Previous Attempts

| Project | Status | Limitation |
| --- | --- | --- |
| mySpellCorrect (2022) | Dormant | GitHub scripts only, not pip-installable, limited to character substitution |
| myoooext (ThanLwinSoft) | Dormant | OpenOffice extension from the 2010s; dictionary from Judson's 1918 dictionary |
| Hunspell Myanmar dict | Unavailable | Broken links, unconfirmed compatibility |

Academic research exists (SymSpell4Burmese 2021, Tsetlin Machine error classification 2024), but none of it has produced a maintained, installable library.

What Makes mySpellChecker Different

mySpellChecker is not a port of an English spell checker — it was designed from the ground up for Myanmar script.

Syllable-First Architecture

Traditional spell checkers split text by whitespace, which fails entirely for Myanmar. mySpellChecker uses a syllable-first pipeline that works without spaces:
Input Text (no spaces)
    ↓
Layer 1: Syllable Segmentation + Validation
    ↓
Layer 2: Word Assembly + SymSpell Correction
    ↓
Layer 2.5: Grammar Rule Validation (POS-aware)
    ↓
Layer 3: N-gram Context + AI Semantic Analysis
    ↓
Errors + Ranked Suggestions
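To make Layer 1 concrete, here is a minimal sketch of regex-based syllable segmentation that needs no spaces. It is not mySpellChecker's actual pattern: the function name `segment_syllables` and the simplified break rule are assumptions for illustration. The rule starts a new syllable at any consonant that is neither stacked under a virama nor closed by an asat, and it ignores digits, punctuation, and independent vowels.

```python
import re

# Simplified break rule (an assumption, not the library's real pattern):
# a new syllable starts at a consonant (U+1000-U+1021) that is
#   - not preceded by the virama U+1039 (so not the lower half of a stack), and
#   - not followed by asat U+103A or virama U+1039 (so not a syllable-final consonant).
_SYLLABLE_START = re.compile(
    "(?<!\u1039)(?=[\u1000-\u1021](?![\u103A\u1039]))"
)


def segment_syllables(text: str) -> list[str]:
    """Split Myanmar text into syllable-like units without relying on spaces."""
    # Splitting on a zero-width match requires Python 3.7 or later.
    return [part for part in _SYLLABLE_START.split(text) if part]


word = "\u1019\u103C\u1014\u103A\u1019\u102C"  # "မြန်မာ" (Myanmar), no spaces

print(word.split())             # whitespace splitting yields a single token
print(segment_syllables(word))  # two units: "မြန်" and "မာ"
```

Under this rule a stacked cluster such as "ကမ္ဘာ" stays together as one unit, because the consonant under the virama never triggers a break. A production segmenter would need many more cases than this sketch covers.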

End-to-End Pipeline

Everything you need ships in one pip install:
| Capability | What It Does |
| --- | --- |
| Dictionary Building | Build optimized SQLite dictionaries from your own corpus (CLI + Python API) |
| Syllable Validation | Regex-based syllable segmentation with Cython acceleration |
| Word Validation | SymSpell O(1) lookup with phonetic matching |
| Context Checking | Bigram/trigram N-gram probabilities for real-word error detection |
| Grammar Checking | YAML-based rules with POS sequence validation |
| Homophone Detection | Context-aware homophone confusion resolution |
| POS Tagging | Pluggable: rule-based, Viterbi HMM, or transformer |
| NER | Named Entity Recognition for Myanmar text |
| Morphology | Stemming, reduplication detection, compound analysis |
| Segmentation | Syllable (regex) + word (myword/CRF/transformer) |
| Zawgyi Support | Auto-detection and conversion (via Google's myanmartools) |
| Text Normalization | NFC normalization, diacritic reordering, zero-width removal |
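The "SymSpell O(1) lookup" entry refers to the symmetric-delete idea: delete variants of every dictionary word are precomputed, so a lookup costs a few hash probes regardless of dictionary size. The sketch below illustrates that general technique only. It is not mySpellChecker's code, the names `deletes`, `build_index`, and `suggest` are hypothetical, and the ASCII words are placeholders (a Myanmar-aware version would presumably operate on syllables or code points).

```python
from collections import defaultdict


def deletes(term: str, max_distance: int = 1) -> set[str]:
    """Return `term` plus every string formed by deleting up to `max_distance` characters."""
    results = {term}
    frontier = {term}
    for _ in range(max_distance):
        frontier = {t[:i] + t[i + 1:] for t in frontier for i in range(len(t))}
        results |= frontier
    return results


def build_index(words: list[str], max_distance: int = 1) -> dict[str, set[str]]:
    """Map each delete variant to the dictionary words that produce it."""
    index: dict[str, set[str]] = defaultdict(set)
    for word in words:
        for variant in deletes(word, max_distance):
            index[variant].add(word)
    return index


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance, used to verify the candidates."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def suggest(term: str, index: dict[str, set[str]], max_distance: int = 1) -> list[str]:
    """Collect words sharing a delete variant with `term`, then keep the truly close ones."""
    found: set[str] = set()
    for variant in deletes(term, max_distance):
        found |= index.get(variant, set())
    # Shared variants can pair words up to 2 * max_distance apart, so verify each one.
    ranked = sorted((edit_distance(term, word), word) for word in found)
    return [word for distance, word in ranked if distance <= max_distance]


index = build_index(["spell", "spelt", "shell"])
print(suggest("spel", index))  # ['spell', 'spelt']
```

The verification step matters: "ab" and "bc" share the delete variant "b" but are two edits apart, so an unverified lookup would over-report.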

AI-Powered Validation (BYOM)

Two optional AI strategies that you train on your own corpus:
| Strategy | Approach | Speed | Output |
| --- | --- | --- | --- |
| Error Detection | Fine-tune XLM-RoBERTa for token classification | ~10 ms | Error flags |
| Semantic Checking | Train RoBERTa/BERT masked language model | ~200 ms | Error flags + suggestions |

The library provides complete training pipelines (train-model, train-detector): you bring the corpus, and it handles tokenizer training, model training, and ONNX export.

Production-Ready

| Feature | Details |
| --- | --- |
| Async API | check_async(), check_batch_async() for web frameworks |
| Streaming | Memory-bounded processing for large files |
| Batch Processing | check_batch() with parallelization |
| Connection Pooling | Thread-safe SQLite for concurrent access |
| Docker | Multi-stage Dockerfile with GPU support |
| CLI | Full command-line interface for all operations |
| Cython | Optional C extensions for 2-10x performance |
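The async and batch entries can be pictured with the standard library alone. The sketch below is a hypothetical stand-in and does not use or describe mySpellChecker's API: `check_text` is a placeholder synchronous checker over a toy word set, and `check_many_async` is an illustrative name showing one way blocking checks could be moved off the event loop with bounded concurrency.

```python
import asyncio

KNOWN = {"alpha", "beta"}  # toy dictionary, purely illustrative


def check_text(text: str) -> list[str]:
    """Placeholder blocking checker (hypothetical): flags the text if it is not in KNOWN."""
    return [] if text in KNOWN else [text]


async def check_many_async(texts: list[str], max_concurrency: int = 4) -> list[list[str]]:
    """Run the blocking checker in worker threads, at most `max_concurrency` at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(text: str) -> list[str]:
        async with semaphore:
            # asyncio.to_thread (Python 3.9+) keeps the event loop responsive.
            return await asyncio.to_thread(check_text, text)

    # gather preserves input order, so results line up with `texts`.
    return await asyncio.gather(*(run_one(t) for t in texts))


results = asyncio.run(check_many_async(["alpha", "alpa", "beta"]))
print(results)  # [[], ['alpa'], []]
```

Capping concurrency with a semaphore is one way to keep a web handler from starting an unbounded number of checks at once.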

Summary

| Feature | mySpellChecker | Generic Spell Checkers |
| --- | --- | --- |
| Myanmar syllable validation | Built-in | Not possible |
| No-space word segmentation | myword / CRF / transformer | Requires whitespace |
| Context-aware checking | N-gram + semantic AI | LanguageTool only (no Myanmar) |
| Grammar rules | POS-aware, Myanmar-specific | Not for Myanmar |
| Dictionary building | End-to-end pipeline | Manual |
| AI training pipelines | Included | Not applicable |
| Python-native | Yes | Hunspell = C++ wrapper, LanguageTool = Java |
| Open source | MIT | Varies |

Acknowledgments

mySpellChecker builds on foundational work in Myanmar NLP:

See Also