Use SpellCheckerBuilder for convenient construction, or inject dependencies directly via the constructor for advanced use cases.
Class: SpellChecker
Initialization
The recommended way to initialize SpellChecker is via the SpellCheckerBuilder.
SpellChecker(config, segmenter, provider, syllable_validator, word_validator, context_validator, factory) is still available for advanced users who need direct dependency injection but is less convenient. All parameters are optional (defaulting to None).
Factory Methods
Convenience class methods for common configurations.
Context Manager
SpellChecker implements the context manager protocol. Use a with statement to ensure resources (database connections, model sessions) are released automatically.
Call checker.close() manually if not using a context manager.
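The resource-handling pattern can be sketched with a stand-in class. DemoChecker below is hypothetical and is not the library's SpellChecker; it only illustrates how a context manager and an idempotent close() fit together.

```python
class DemoChecker:
    """Hypothetical stand-in for a checker that owns closable resources."""

    def __init__(self) -> None:
        self.closed = False
        self.close_calls = 0  # counts how often resources were actually released

    def close(self) -> None:
        # Idempotent: only the first call releases anything.
        if self.closed:
            return
        self.closed = True
        self.close_calls += 1

    def __enter__(self) -> "DemoChecker":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Runs on normal exit and also when the block raises.
        self.close()


with DemoChecker() as checker:
    pass  # use the checker here

checker.close()  # safe: a second close is a no-op
```

Because close() is a no-op after the first call, mixing the with statement and an explicit close() does no harm in this sketch.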
check()
Performs spell checking on the given text.
The input Myanmar text to check.
Validation depth. SYLLABLE for fast checks, WORD for full validation including context.
Override semantic checking for this call. None uses config default, True/False forces on/off.
Returns: Response
segment_and_tag(text: str) -> tuple[list[str], list[str]]
Segments text into words and assigns Part-of-Speech tags using the configured method (Joint or Sequential).
Returns:
- Tuple of (words, tags).
check_async()
Asynchronous version of check. Runs the CPU-intensive logic in a thread pool to avoid blocking the event loop.
The input Myanmar text to check.
Validation depth.
Override semantic checking for this call.
Returns: Response
Typical use cases:
- Web APIs (FastAPI/Sanic): Keeps the server responsive while processing text.
- Concurrent Batching: Processing multiple texts in parallel using asyncio.gather.
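A minimal sketch of the thread-pool pattern described for check_async, using only the standard library. slow_check is a hypothetical placeholder for a blocking check call and check_async_sketch is an illustrative name; the library's own implementation may differ.

```python
import asyncio
import time


def slow_check(text: str) -> dict:
    """Hypothetical blocking check; stands in for the real spell-check work."""
    time.sleep(0.01)
    return {"text": text, "has_errors": False}


async def check_async_sketch(text: str) -> dict:
    # Off-load the blocking call to a worker thread so the event loop stays free.
    return await asyncio.to_thread(slow_check, text)


async def main() -> list[dict]:
    texts = ["first", "second", "third"]
    # gather() runs the checks concurrently and returns results in input order.
    return await asyncio.gather(*(check_async_sketch(t) for t in texts))


results = asyncio.run(main())
```

asyncio.to_thread requires Python 3.9 or later; on older versions, loop.run_in_executor serves the same purpose.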
check_batch()
Efficiently checks a list of texts sequentially.
List of texts to check.
Validation depth applied to all texts.
Returns: list[Response]
check_batch_async()
Asynchronously checks multiple texts with configurable concurrency using a semaphore.
List of texts to check.
Validation depth applied to all texts.
Maximum concurrent operations.
Override semantic checking for this batch. True forces semantic checking on, False forces it off, None uses the config default.
Returns: list[Response] (same order as input)
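Semaphore-bounded concurrency with order-preserving results might look roughly like the sketch below. check_one, check_batch_sketch, and the max_concurrency parameter name are assumptions made for illustration; they are not the library's actual signatures.

```python
import asyncio

peak_running = 0  # highest number of checks observed in flight at once


async def check_one(text: str) -> str:
    """Hypothetical per-text check; returns a marker string, not a Response."""
    await asyncio.sleep(0.01)
    return f"checked:{text}"


async def check_batch_sketch(texts: list[str], max_concurrency: int = 2) -> list[str]:
    semaphore = asyncio.Semaphore(max_concurrency)
    running = 0

    async def bounded(text: str) -> str:
        nonlocal running
        global peak_running
        async with semaphore:  # at most max_concurrency tasks pass this point
            running += 1
            peak_running = max(peak_running, running)
            try:
                return await check_one(text)
            finally:
                running -= 1

    # gather() returns results in the order the awaitables were passed in.
    return await asyncio.gather(*(bounded(t) for t in texts))


batch = asyncio.run(check_batch_sketch(["a", "b", "c", "d", "e"], max_concurrency=2))
```

The semaphore caps how many checks run at once, while gather keeps the output aligned with the input list regardless of completion order.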
get_pos_tags()
Gets the most likely POS tag sequence for text or pre-segmented words.
Input text to tag (optional if words is provided).
Pre-segmented words (optional if text is provided).
Returns: list[str], one POS tag per word.
cache_stats()
Returns unified cache statistics from all components (provider, joint tagger, semantic checker, Viterbi).
Returns: dict[str, Any]
close()
Closes underlying resources (database connections, model sessions). Idempotent. Also called automatically when using the context manager.
Properties
| Property | Type | Description |
|---|---|---|
| symspell | SymSpell \| None | Access SymSpell instance for direct suggestion lookups |
| context_checker | NgramContextChecker \| None | Access N-gram context checker |
| syllable_rule_validator | SyllableRuleValidator \| None | Access syllable rule validator |
| ner_model | Any \| None | Access NER model instance |
| name_heuristic | NameHeuristic \| None | Access proper noun detection |
| semantic_checker | SemanticChecker \| None | Access semantic checker |
| phonetic_hasher | PhoneticHasher \| None | Access phonetic similarity hasher |
Convenience Function: check_text()
A one-call function for quick spell checking without manually constructing a SpellChecker.
Myanmar text to check.
Validation level: "syllable" or "word".
Optional path to a SQLite dictionary database. When None, uses the default database lookup.
Returns: Response
Raises:
- MissingDatabaseError if no database is available
- ValueError if level is not "syllable" or "word"
This creates a new SpellChecker instance per call. For repeated use, create a SpellChecker instance directly for better performance.
ActionType and Error Classification
The ActionType enum classifies the recommended action for each detected error. Every Error object exposes an .action property that returns one of these values.
- AUTO_FIX: Deterministic, high-confidence structural repairs (Zawgyi encoding, particle typos, medial confusion, medial order errors, medial compatibility error, ha-htoe confusion, broken virama/stacking, incomplete stacking, missing asat, leading vowel-e, vowel after asat, duplicate punctuation)
- INFORM: Advisory errors (colloquial variants, colloquial info) or any error with confidence below 0.60
- SUGGEST: Everything else, shown to user for confirmation
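The three rules above can be read as a small decision function. The sketch below is one possible reading, not the library's code: the enum values, the error-kind identifiers, and the choice to let the 0.60 confidence floor take precedence over structural repairs are all assumptions.

```python
from enum import Enum


class ActionSketch(Enum):
    """Illustrative mirror of the three action names; values are assumed."""
    AUTO_FIX = "auto_fix"
    SUGGEST = "suggest"
    INFORM = "inform"


# Hypothetical identifiers standing in for a few of the categories listed above.
STRUCTURAL_KINDS = {"zawgyi_encoding", "medial_order", "missing_asat"}
ADVISORY_KINDS = {"colloquial_variant", "colloquial_info"}

INFORM_THRESHOLD = 0.60  # confidence floor taken from the INFORM rule


def classify(kind: str, confidence: float) -> ActionSketch:
    # Advisory kinds and low-confidence errors are only reported.
    if kind in ADVISORY_KINDS or confidence < INFORM_THRESHOLD:
        return ActionSketch.INFORM
    # Deterministic structural repairs can be applied automatically.
    if kind in STRUCTURAL_KINDS:
        return ActionSketch.AUTO_FIX
    # Everything else is shown to the user for confirmation.
    return ActionSketch.SUGGEST
```

Under this reading, a structural error reported with confidence 0.5 would be classified INFORM rather than AUTO_FIX; whether the library resolves that overlap the same way is not stated in the text above.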
Internationalization (i18n)
Error messages can be localized. The library supports English ("en") and Myanmar ("my").
Class: Response
The check method returns a Response object.
Original input text.
Text with top suggestions applied automatically.
True if any errors were found.
Validation level used ("syllable" or "word").
List of error objects found in the text.
Processing metadata including processing_time, error counts, and validation statistics.
Serialization
Both Response and all Error subclasses (SyllableError, WordError, ContextError, GrammarError) provide to_dict() and to_json() methods for easy serialization.
to_dict() -> dict[str, Any]
Converts the object to a plain dictionary. For Response, all nested Error objects are also converted.
to_json(indent: int = 2) -> str
Converts the object to a JSON string with Myanmar Unicode preserved (ensure_ascii=False). Set indent=None for compact output.
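The serialization behaviour described here can be approximated with dataclasses and the standard json module. DemoError and DemoResponse, along with their field names, are hypothetical stand-ins rather than the library's classes; only the ensure_ascii=False and indent behaviour follows the description above.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DemoError:
    text: str
    position: int
    suggestions: list[str] = field(default_factory=list)


@dataclass
class DemoResponse:
    text: str
    has_errors: bool
    errors: list[DemoError] = field(default_factory=list)

    def to_dict(self) -> dict:
        # asdict() recurses into nested dataclasses, so errors become plain dicts.
        return asdict(self)

    def to_json(self, indent=2) -> str:
        # ensure_ascii=False keeps Myanmar characters instead of \uXXXX escapes.
        return json.dumps(self.to_dict(), ensure_ascii=False, indent=indent)


response = DemoResponse(
    text="မြန်မာ",
    has_errors=True,
    errors=[DemoError(text="မြန်မာ", position=0, suggestions=["မြန်မာစာ"])],
)
pretty = response.to_json()              # indented, human-readable
compact = response.to_json(indent=None)  # single line
```

Compact output suits logs and network payloads, while the default indentation is easier to read when inspecting results by hand.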