Use SpellCheckerBuilder for convenient construction, or inject dependencies directly via the constructor for advanced use cases.

Class: SpellChecker

Initialization

The recommended way to initialize SpellChecker is via the SpellCheckerBuilder.
from myspellchecker.core import SpellCheckerBuilder, ConfigPresets

# Quick start (uses default config and provider)
checker = SpellCheckerBuilder().build()

# Optimized for speed
checker = (
    SpellCheckerBuilder()
    .with_config(ConfigPresets.FAST)
    .build()
)

# Custom configuration with multiple features
checker = (
    SpellCheckerBuilder()
    .with_phonetic(True)
    .with_context_checking(True)
    .with_ner(True)
    .with_max_suggestions(5)
    .with_max_edit_distance(2)
    .build()
)
The constructor SpellChecker(config, segmenter, provider, syllable_validator, word_validator, context_validator, factory) remains available for advanced users who need direct dependency injection, though it is less convenient than the builder. All parameters are optional and default to None.
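The relationship between a fluent builder and a constructor with optional dependencies can be illustrated with a small stand-in. This is only a sketch: CheckerSketch, CheckerSketchBuilder, FakeProvider, with_provider, and the default values are hypothetical names invented for the example and are not part of myspellchecker.

```python
class FakeProvider:
    """Hypothetical in-memory dictionary provider, used only in this sketch."""

    def __init__(self, words):
        self._words = set(words)

    def is_valid(self, word):
        return word in self._words


class CheckerSketch:
    """Stand-in with optional constructor dependencies (not the real SpellChecker)."""

    def __init__(self, config=None, provider=None):
        # Each dependency falls back to a default when None is passed
        self.config = config if config is not None else {"max_suggestions": 5}
        self.provider = provider if provider is not None else FakeProvider([])

    def is_known(self, word):
        return self.provider.is_valid(word)


class CheckerSketchBuilder:
    """Collects options through chained calls, then constructs the object."""

    def __init__(self):
        self._config = {}
        self._provider = None

    def with_max_suggestions(self, n):
        self._config["max_suggestions"] = n
        return self  # returning self is what enables method chaining

    def with_provider(self, provider):  # hypothetical method name
        self._provider = provider
        return self

    def build(self):
        # An empty option set is passed as None so the constructor default applies
        return CheckerSketch(config=dict(self._config) or None, provider=self._provider)


built = CheckerSketchBuilder().with_max_suggestions(3).build()
injected = CheckerSketch(provider=FakeProvider(["hello"]))  # direct injection
```

Direct injection is mostly useful in tests, where a fake dependency replaces a database-backed one; the builder covers the common case with less ceremony.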

Factory Methods

Convenience class methods for common configurations:
from myspellchecker import SpellChecker

# Balanced performance/accuracy (equivalent to SpellCheckerBuilder().build())
checker = SpellChecker.create_default()

# Optimized for speed (disables context checking, NER, phonetic)
checker = SpellChecker.create_fast()

# Optimized for accuracy (higher edit distance, lower thresholds)
checker = SpellChecker.create_accurate()

# Minimal features (basic syllable validation only)
checker = SpellChecker.create_minimal()

Context Manager

SpellChecker implements the context manager protocol. Use a with statement to ensure resources (database connections, model sessions) are released automatically:
with SpellChecker.create_default() as checker:
    result = checker.check("မြန်မာ")
# Resources released automatically on exit
You can also call checker.close() manually if not using a context manager.

check()

Performs spell checking on the given text.
Parameters:
  • text (str, required): The input Myanmar text to check.
  • level (ValidationLevel, default: ValidationLevel.SYLLABLE): Validation depth. SYLLABLE for fast checks, WORD for full validation including context.
  • use_semantic (bool | None, default: None): Override semantic checking for this call. None uses the config default; True or False forces it on or off.
Returns: Response

segment_and_tag(text: str) -> tuple[list[str], list[str]]

Segments text into words and assigns part-of-speech (POS) tags using the configured method (Joint or Sequential).
Returns:
  • Tuple of (words, tags).
Example:
words, tags = checker.segment_and_tag("မြန်မာနိုင်ငံ")
# words: ['မြန်မာ', 'နိုင်ငံ']
# tags: ['N', 'N']

check_async()

Asynchronous version of check. Runs the CPU-intensive logic in a thread pool to avoid blocking the event loop.
Parameters:
  • text (str, required): The input Myanmar text to check.
  • level (ValidationLevel, default: ValidationLevel.SYLLABLE): Validation depth.
  • use_semantic (bool | None, default: None): Override semantic checking for this call.
Returns: Response
Usage Example:
import asyncio
from myspellchecker import SpellChecker

async def main():
    checker = SpellChecker()
    
    # Run in event loop without blocking
    result = await checker.check_async("မြန်မာ")
    print(result.corrected_text)

asyncio.run(main())
Ideal for:
  • Web APIs (FastAPI/Sanic): Keeps the server responsive while processing text.
  • Concurrent Batching: Processing multiple texts in parallel using asyncio.gather.
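
The thread-pool offloading and asyncio.gather pattern described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's implementation: blocking_check is a hypothetical stand-in for a synchronous, CPU-bound check, and check_in_thread and check_many are names made up for the example.

```python
import asyncio
import time


def blocking_check(text: str) -> str:
    """Hypothetical stand-in for a synchronous, CPU-bound check call."""
    time.sleep(0.05)  # simulates work that would otherwise block the event loop
    return text.strip()


async def check_in_thread(text: str) -> str:
    # asyncio.to_thread runs the blocking function in the default thread pool
    return await asyncio.to_thread(blocking_check, text)


async def check_many(texts: list[str]) -> list[str]:
    # gather runs the calls concurrently and returns results in input order
    return await asyncio.gather(*(check_in_thread(t) for t in texts))


results = asyncio.run(check_many([" a ", " b ", " c "]))
print(results)
```

Because each call waits in a worker thread, the event loop stays free to serve other requests while the checks run.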

check_batch()

Efficiently checks a list of texts sequentially.
Parameters:
  • texts (list[str], required): List of texts to check.
  • level (ValidationLevel, default: ValidationLevel.SYLLABLE): Validation depth applied to all texts.
Returns: list[Response]

check_batch_async()

Asynchronously checks multiple texts with configurable concurrency using a semaphore.
Parameters:
  • texts (list[str], required): List of texts to check.
  • level (ValidationLevel, default: ValidationLevel.SYLLABLE): Validation depth applied to all texts.
  • max_concurrency (int, default: 4): Maximum concurrent operations.
  • use_semantic (bool | None, default: None): Override semantic checking for this batch. True forces semantic checking on, False forces it off, None uses the config default.
Returns: list[Response] (same order as input)
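Semaphore-bounded concurrency with order-preserving results can be sketched as follows. This is a generic asyncio pattern, assumed here for illustration rather than taken from the library's source; bounded_map, fake_check, and the stats dictionary are hypothetical names.

```python
import asyncio


async def bounded_map(items, worker, max_concurrency=4):
    """Run worker(item) for every item with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(item):
        async with semaphore:  # waits here while the limit is reached
            return await worker(item)

    # gather preserves input order regardless of completion order
    return await asyncio.gather(*(run_one(item) for item in items))


stats = {"in_flight": 0, "peak": 0}  # tracks observed concurrency


async def fake_check(text):
    """Hypothetical stand-in for an asynchronous check call."""
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(0.01)
    stats["in_flight"] -= 1
    return text.upper()


batch_results = asyncio.run(
    bounded_map(["a", "b", "c", "d", "e", "f"], fake_check, max_concurrency=2)
)
print(batch_results, stats["peak"])
```

A lower limit reduces memory and CPU pressure at the cost of throughput; the output order does not depend on the limit.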

get_pos_tags()

Gets the most likely POS tag sequence for text or pre-segmented words.
Parameters:
  • text (str, default: ""): Input text to tag (optional if words is provided).
  • words (list[str] | None, default: None): Pre-segmented words (optional if text is provided).
Returns: list[str], one POS tag per word.
tags = checker.get_pos_tags("သူသွားသည်")
# ['N', 'V', 'PPM']

cache_stats()

Returns unified cache statistics from all components (provider, joint tagger, semantic checker, Viterbi).
Returns: dict[str, Any]
stats = checker.cache_stats()
# {'dictionary': {'hits': 1234, 'misses': 56}, 'frequency': {...}, ...}

close()

Closes underlying resources (database connections, model sessions). Idempotent. Also called automatically when using the context manager.
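What an idempotent close() combined with the context manager protocol means in practice can be shown with a small stand-in class. ManagedResource and its attributes are hypothetical and exist only for this sketch; the library's internals may differ.

```python
class ManagedResource:
    """Hypothetical resource holder (not the library's implementation)."""

    def __init__(self):
        self.closed = False
        self.close_calls = 0  # counts how many times cleanup actually ran

    def close(self):
        # Idempotent: the second and later calls return without doing anything
        if self.closed:
            return
        self.closed = True
        self.close_calls += 1

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not suppress exceptions raised inside the block


with ManagedResource() as resource:
    pass  # work with the resource here

resource.close()  # safe to call again after the with block
print(resource.closed, resource.close_calls)
```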

Properties

Property                 Type                          Description
symspell                 SymSpell | None               Access SymSpell instance for direct suggestion lookups
context_checker          NgramContextChecker | None    Access N-gram context checker
syllable_rule_validator  SyllableRuleValidator | None  Access syllable rule validator
ner_model                Any | None                    Access NER model instance
name_heuristic           NameHeuristic | None          Access proper noun detection
semantic_checker         SemanticChecker | None        Access semantic checker
phonetic_hasher          PhoneticHasher | None         Access phonetic similarity hasher

Convenience Function: check_text()

A one-call function for quick spell checking without manually constructing a SpellChecker.
from myspellchecker import check_text

# Quick check with defaults
result = check_text("မြန်မာနိုငံ")
print(result.has_errors)  # True
print(result.corrected_text)

# Specify validation level and database
result = check_text(
    "မြန်မာနိုငံ",
    level="word",
    database_path="./mySpellChecker.db",
)
Parameters:
  • text (str, required): Myanmar text to check.
  • level (str, default: "syllable"): Validation level: "syllable" or "word".
  • database_path (str | None, default: None): Optional path to a SQLite dictionary database. When None, uses the default database lookup.
Returns: Response
Raises:
  • MissingDatabaseError if no database is available
  • ValueError if level is not "syllable" or "word"
check_text() creates a new SpellChecker instance on every call. For repeated checks, create one SpellChecker instance and reuse it for better performance.

ActionType and Error Classification

The ActionType enum classifies the recommended action for each detected error. Every Error object exposes an .action property that returns one of these values.
from myspellchecker import ActionType, classify_action

# ActionType values
ActionType.AUTO_FIX   # Safe to apply silently (e.g., Zawgyi conversion, particle typos)
ActionType.SUGGEST    # Show to user for confirmation (e.g., word errors, context errors)
ActionType.INFORM     # Advisory note only (e.g., colloquial variants)

# Classify manually
action = classify_action(error_type="particle_typo", confidence=0.95)
# ActionType.AUTO_FIX

# Access via Error object
for error in result.errors:
    if error.action == ActionType.AUTO_FIX:
        # Safe to apply automatically
        pass
    elif error.action == ActionType.SUGGEST:
        # Show suggestions to user
        pass
The classification logic:
  • AUTO_FIX: Deterministic, high-confidence structural repairs (Zawgyi encoding, particle typos, medial confusion, medial order errors, medial compatibility errors, ha-htoe confusion, broken virama/stacking, incomplete stacking, missing asat, leading vowel-e, vowel after asat, duplicate punctuation)
  • INFORM: Advisory errors (colloquial variants, colloquial info) or any error with confidence below 0.60
  • SUGGEST: Everything else, shown to user for confirmation
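
The three rules above can be expressed as a small decision function. This is a sketch under stated assumptions, not the library's classify_action: the Action enum and classify_sketch are stand-ins, every error-type string except "particle_typo" is a hypothetical name, and the sketch assumes the advisory and low-confidence checks run before the AUTO_FIX check, an ordering the text above does not specify.

```python
from enum import Enum


class Action(Enum):
    """Stand-in for ActionType; the member values are made up for this sketch."""
    AUTO_FIX = "auto_fix"
    SUGGEST = "suggest"
    INFORM = "inform"


# Only "particle_typo" appears in the documentation; the rest are hypothetical
AUTO_FIX_TYPES = {"particle_typo", "zawgyi_encoding", "medial_confusion"}
ADVISORY_TYPES = {"colloquial_variant", "colloquial_info"}
LOW_CONFIDENCE = 0.60


def classify_sketch(error_type: str, confidence: float) -> Action:
    # Assumption: advisory types and low confidence take precedence over AUTO_FIX
    if error_type in ADVISORY_TYPES or confidence < LOW_CONFIDENCE:
        return Action.INFORM
    if error_type in AUTO_FIX_TYPES:
        return Action.AUTO_FIX
    return Action.SUGGEST  # everything else is shown to the user


print(classify_sketch("particle_typo", 0.95))
print(classify_sketch("word_error", 0.90))
print(classify_sketch("word_error", 0.50))
```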

Internationalization (i18n)

Error messages can be localized. The library supports English ("en") and Myanmar ("my").
from myspellchecker import set_language, get_language, get_message, get_supported_languages

# Check supported languages
print(get_supported_languages())  # ["en", "my"]

# Switch to Myanmar
set_language("my")
print(get_language())  # "my"

# Get localized error message
print(get_message("invalid_syllable"))  # "စာလုံးပေါင်း မမှန်ကန်ပါ"

# Error messages are automatically localized
result = checker.check("...")
for error in result.errors:
    print(error.message)  # Uses current language setting
Language settings are thread-local, so different threads can use different languages concurrently without interference.
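Thread-local state of this kind is usually built on threading.local. The sketch below shows the general mechanism only; set_lang, get_lang, and the "en" fallback are assumptions made for the example and do not describe how the library stores its setting.

```python
import threading

_state = threading.local()  # each thread sees its own attributes on this object
DEFAULT_LANGUAGE = "en"     # assumed fallback for this sketch


def set_lang(code: str) -> None:
    _state.language = code


def get_lang() -> str:
    # Threads that never called set_lang fall back to the default
    return getattr(_state, "language", DEFAULT_LANGUAGE)


seen = {}


def worker(name: str, code: str) -> None:
    set_lang(code)
    seen[name] = get_lang()  # reads back this thread's own value


threads = [
    threading.Thread(target=worker, args=("t1", "my")),
    threading.Thread(target=worker, args=("t2", "en")),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

main_language = get_lang()  # the main thread never called set_lang
print(seen, main_language)
```

One consequence of this design is that a setting made in one thread is not visible in threads started afterwards, so each worker thread needs to set its own language.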

Class: Response

The check method returns a Response object.
Fields:
  • text (str, required): Original input text.
  • corrected_text (str, required): Text with top suggestions applied automatically.
  • has_errors (bool, required): True if any errors were found.
  • level (str, required): Validation level used ("syllable" or "word").
  • errors (list[Error], required): List of error objects found in the text.
  • metadata (dict[str, Any], required): Processing metadata including processing_time, error counts, and validation statistics.

Serialization

Both Response and all Error subclasses (SyllableError, WordError, ContextError, GrammarError) provide to_dict() and to_json() methods for easy serialization.

to_dict() -> dict[str, Any]

Converts the object to a plain dictionary. For Response, all nested Error objects are also converted.

to_json(indent: int = 2) -> str

Converts the object to a JSON string with Myanmar Unicode preserved (ensure_ascii=False). Set indent=None for compact output.
from myspellchecker.core import SpellCheckerBuilder

checker = SpellCheckerBuilder().build()
result = checker.check("မျန်မာ")

# Serialize the full response
data = result.to_dict()
json_str = result.to_json()

# Serialize individual errors
for error in result.errors:
    print(error.to_dict())
    print(error.to_json(indent=None))  # compact JSON
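The effect of ensure_ascii=False and indent=None can be seen with a pair of stand-in dataclasses and the standard json module. ErrorSketch, ResponseSketch, and the position field are hypothetical and only mirror the documented to_dict()/to_json() shape; they are not the library's classes.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ErrorSketch:
    text: str
    position: int  # hypothetical field for illustration

    def to_dict(self):
        return asdict(self)

    def to_json(self, indent=2):
        return json.dumps(self.to_dict(), ensure_ascii=False, indent=indent)


@dataclass
class ResponseSketch:
    text: str
    errors: list = field(default_factory=list)

    def to_dict(self):
        return asdict(self)  # asdict also converts nested dataclasses in lists

    def to_json(self, indent=2):
        return json.dumps(self.to_dict(), ensure_ascii=False, indent=indent)


response = ResponseSketch(text="မျန်မာ", errors=[ErrorSketch(text="မျန်", position=0)])

compact = response.to_json(indent=None)   # single line, Myanmar text kept as-is
pretty = response.to_json()               # multi-line, indented by 2 spaces
escaped = json.dumps(response.to_dict())  # default ensure_ascii=True escapes to \uXXXX

print(compact)
```

Keeping the Unicode characters unescaped makes logs and API payloads readable and noticeably shorter than the escaped form.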