Install the library, build a dictionary from your corpus, and run your first spell check, all in a few lines of Python.
Prerequisite: You must build a dictionary database before spell checking will work. mySpellChecker does not include a bundled dictionary.
Setup
Before checking spelling, you need a dictionary database:
# Build a sample database for testing
myspellchecker build --sample
This creates ./mySpellChecker-default.db in the current directory. For production, build from your own corpus:
myspellchecker build --input your_corpus.txt --output dictionary.db
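Because spell checking cannot work without a database, a script may want to confirm the file exists before constructing a checker. The following is a minimal sketch using only the standard library. `require_database` is a hypothetical helper, not part of mySpellChecker, and its default path simply mirrors the sample build output described above.

```python
from pathlib import Path


def require_database(path="mySpellChecker-default.db"):
    """Return the database path as a Path, or raise if the file is missing.

    Hypothetical helper for illustration; not part of mySpellChecker.
    """
    db_path = Path(path)
    if not db_path.is_file():
        raise FileNotFoundError(
            f"Dictionary database not found at {db_path}. "
            "Build one first, e.g. with: myspellchecker build --sample"
        )
    return db_path
```

Calling this before creating the provider turns a missing database into an early failure with a hint about the build step.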
Your First Spell Check
Quick Check (One-liner)
from myspellchecker import check_text
# One-liner spell check
result = check_text("မြန်မာနိုင်ငံသည်အာရှတွင်ရှိသည်")
print(f"Has errors: {result.has_errors}")
Standard Usage
from myspellchecker import SpellChecker
from myspellchecker.providers import SQLiteProvider
# Create a spell checker instance (requires a built database)
checker = SpellChecker(provider=SQLiteProvider(database_path="mySpellChecker-default.db"))
# Check some text
text = "မြန်မာနိုင်ငံသည်အာရှတွင်ရှိသည်"
result = checker.check(text)
# Examine the result
print(f"Original: {result.text}")
print(f"Has errors: {result.has_errors}")
print(f"Error count: {len(result.errors)}")
Understanding Results
The check() method returns a Response object:
result = checker.check("ကျေးဇူးတင်ပါသည်")
# Access the original text
print(result.text) # "ကျေးဇူးတင်ပါသည်"
# Check if errors exist
if result.has_errors:
    # Iterate through errors
    for error in result.errors:
        print(f"Position: {error.position}")
        print(f"Error text: {error.text}")
        print(f"Error type: {error.error_type}")
        print(f"Suggestions: {error.suggestions}")
        print(f"Confidence: {error.confidence}")
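The error fields above are enough to build a simple auto-correct step. The sketch below is an illustration under stated assumptions, not library behavior: it assumes `error.position` is a zero-based character offset into `result.text`, that `error.text` is the exact erroneous span, and that each suggestion converts cleanly to a string. `apply_top_suggestions` is a hypothetical helper, and the demo uses a stand-in error object so it runs without a database.

```python
from types import SimpleNamespace


def apply_top_suggestions(text, errors):
    """Return text with each error span replaced by its first suggestion."""
    corrected = text
    # Work right to left so earlier offsets stay valid after each replacement.
    for err in sorted(errors, key=lambda e: e.position, reverse=True):
        if not err.suggestions:
            continue
        start = err.position
        end = start + len(err.text)
        # Skip errors whose recorded span does not match the text
        # (this means the offset assumption does not hold).
        if corrected[start:end] != err.text:
            continue
        corrected = corrected[:start] + str(err.suggestions[0]) + corrected[end:]
    return corrected


# Stand-in error object for illustration; real errors come from result.errors.
demo_errors = [SimpleNamespace(position=4, text="wrold", suggestions=["world"])]
print(apply_top_suggestions("the wrold", demo_errors))
```

With a real checker, the call would look like `apply_top_suggestions(result.text, result.errors)`. Blindly accepting the first suggestion can introduce mistakes, so this is best reserved for high-confidence errors.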
Error Types
mySpellChecker identifies several types of errors:
| Error Type | Value | Description | Example |
|---|---|---|---|
| ErrorType.SYLLABLE | invalid_syllable | Invalid syllable structure | ကွြ (“invalid medial order”) → ကြွ (“correct medial order”) |
| ErrorType.WORD | invalid_word | Valid syllables but unknown word | ကျောင်သား (“missing visarga”) → ကျောင်းသား (“student”) |
| ErrorType.GRAMMAR | grammar_error | Syntactic/grammar issues | သွားသည် + တယ် (“formal + colloquial endings mixed”) |
| ErrorType.CONTEXT_PROBABILITY | context_probability | Low probability word sequence | ထမင်းသွား (“rice go”) → ထမင်းစား (“rice eat”) |
from myspellchecker.core.constants import ErrorType
for error in result.errors:
    if error.error_type == ErrorType.SYLLABLE:
        print("Syllable-level error (typo)")
    elif error.error_type == ErrorType.WORD:
        print("Word-level error (unknown word)")
    elif error.error_type == ErrorType.GRAMMAR:
        print("Grammar error (syntactic issue)")
    elif error.error_type == ErrorType.CONTEXT_PROBABILITY:
        print("Context error (unlikely word sequence)")
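When reporting on a whole document, a per-type tally is often more useful than a per-error printout. The sketch below counts errors by type with `collections.Counter`. It assumes `error.error_type` is either an Enum member exposing `.value` (matching the Value column above) or a plain string. `count_by_type` is a hypothetical helper, and the demo uses stand-in objects rather than real checker output.

```python
from collections import Counter
from types import SimpleNamespace


def count_by_type(errors):
    """Count errors per type, keyed by the type's string value."""
    counts = Counter()
    for err in errors:
        # Assumption: error_type is an Enum member (has .value) or a plain string.
        key = getattr(err.error_type, "value", err.error_type)
        counts[key] += 1
    return counts


# Stand-in errors for illustration; real ones come from result.errors.
demo = [
    SimpleNamespace(error_type="invalid_syllable"),
    SimpleNamespace(error_type="invalid_word"),
    SimpleNamespace(error_type="invalid_syllable"),
]
print(count_by_type(demo).most_common())
```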
Validation Levels
Control the depth of checking with validation levels at check time:
from myspellchecker import SpellChecker
from myspellchecker.providers import SQLiteProvider
from myspellchecker.core.constants import ValidationLevel
provider = SQLiteProvider(database_path="mySpellChecker-default.db")
checker = SpellChecker(provider=provider)
# Fast: syllable-only validation (default, catches most errors)
result = checker.check(text, level=ValidationLevel.SYLLABLE)
# Standard: syllable + word validation
result = checker.check(text, level=ValidationLevel.WORD)
Note: Validation level is specified per-check via the level parameter, not in configuration.
Getting Suggestions
Access correction suggestions for errors:
result = checker.check("နိူင်ငံ") # Example with error
for error in result.errors:
    print(f"Error: {error.text}")
    # Get top suggestion
    if error.suggestions:
        print(f"Best suggestion: {error.suggestions[0]}")
    # List up to five suggestions
    for suggestion in error.suggestions[:5]:
        print(f" - {suggestion}")
Batch Processing
Process multiple texts efficiently:
texts = [
    "မြန်မာနိုင်ငံ",
    "ကျေးဇူးတင်ပါသည်",
    "နေကောင်းလား",
]
# Check multiple texts
results = checker.check_batch(texts)
for text, result in zip(texts, results):
    print(f"{text}: {len(result.errors)} errors")
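For very large collections it can help to feed `check_batch` in fixed-size chunks so that only one chunk of results is held at a time. The source does not say whether `check_batch` already bounds memory internally, so treat this as an optional sketch. `chunked` is a hypothetical helper built on the standard library, and the chunk size of 500 in the usage comment is an arbitrary illustration.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Usage sketch (requires a configured `checker`):
# for chunk in chunked(all_texts, 500):
#     results = checker.check_batch(chunk)
```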
Async Processing
For web applications and async workflows:
import asyncio
from myspellchecker import SpellChecker
from myspellchecker.providers import SQLiteProvider
provider = SQLiteProvider(database_path="mySpellChecker-default.db")
checker = SpellChecker(provider=provider)
async def check_texts():
    # Single text
    result = await checker.check_async("မြန်မာနိုင်ငံ")
    print(result.has_errors)
    # Batch async
    texts = ["text1", "text2", "text3"]
    results = await checker.check_batch_async(texts)
    return results
# Run async
results = asyncio.run(check_texts())
Using with FastAPI
from fastapi import FastAPI
from myspellchecker import SpellChecker
from myspellchecker.providers import SQLiteProvider
app = FastAPI()
provider = SQLiteProvider(database_path="mySpellChecker-default.db")
checker = SpellChecker(provider=provider)
@app.post("/check")
async def check_spelling(text: str):
    result = await checker.check_async(text)
    return {
        "has_errors": result.has_errors,
        "error_count": len(result.errors),
        "errors": [
            {
                "position": e.position,
                "text": e.text,
                "suggestions": e.suggestions[:3],
            }
            for e in result.errors
        ],
    }
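If several endpoints return check results, the response-building logic can be pulled into a plain function that is easy to test without a running server. The sketch below mirrors the fields used in the endpoint above. `to_payload` is a hypothetical helper, and wrapping each suggestion in `str()` is a defensive assumption in case suggestions are not plain strings.

```python
import json
from types import SimpleNamespace


def to_payload(result, max_suggestions=3):
    """Convert a check result into a JSON-serializable dict."""
    return {
        "has_errors": result.has_errors,
        "error_count": len(result.errors),
        "errors": [
            {
                "position": e.position,
                "text": e.text,
                # str() guards against suggestion objects that are not plain strings.
                "suggestions": [str(s) for s in e.suggestions[:max_suggestions]],
            }
            for e in result.errors
        ],
    }


# Stand-in result for illustration; a real one comes from checker.check(...).
demo_result = SimpleNamespace(
    has_errors=True,
    errors=[SimpleNamespace(position=0, text="abc", suggestions=["a", "b", "c", "d"])],
)
print(json.dumps(to_payload(demo_result)))
```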
Context Manager Usage
Ensure proper resource cleanup:
from myspellchecker import SpellChecker
from myspellchecker.providers import SQLiteProvider
# Automatic cleanup
provider = SQLiteProvider(database_path="mySpellChecker-default.db")
with SpellChecker(provider=provider) as checker:
    result = checker.check("မြန်မာနိုင်ငံ")
    print(result.has_errors)
# Resources released here
Custom Database
Use your own dictionary database:
from myspellchecker import SpellChecker
from myspellchecker.providers import SQLiteProvider
# Specify database path
provider = SQLiteProvider(database_path="/path/to/my/dictionary.db")
checker = SpellChecker(provider=provider)
Configuration Presets
Use built-in presets for common scenarios:
from myspellchecker import SpellChecker
from myspellchecker.core.config import get_profile
# Fast mode: maximum speed, minimal validation
config = get_profile("fast")
# Production mode: good balance of speed and accuracy
config = get_profile("production")
# Accurate mode: maximum accuracy, all validations
config = get_profile("accurate")
checker = SpellChecker(config=config)
Available profiles: "development", "production", "testing", "fast", "accurate"
CLI Usage
Check text from command line:
# Check a file
myspellchecker check input.txt
# Write results to a file
myspellchecker check input.txt -o results.json
# Read text from standard input
echo "မြန်မာနိုင်ငံ" | myspellchecker check -
# Use specific format
myspellchecker check input.txt --format json
myspellchecker check input.txt --format csv
Colloquial Handling
Control how colloquial (informal) spellings are handled:
from myspellchecker import SpellChecker
from myspellchecker.core.config import SpellCheckerConfig
from myspellchecker.core.config.validation_configs import ValidationConfig
# Lenient mode (default): accept colloquial forms with info note
config = SpellCheckerConfig(
    validation=ValidationConfig(colloquial_strictness="lenient")
)
# Strict mode: flag all colloquial forms as errors
config = SpellCheckerConfig(
    validation=ValidationConfig(colloquial_strictness="strict")
)
checker = SpellChecker(config=config)
Localized Error Messages
Display error messages in Myanmar:
from myspellchecker.core.i18n import set_language
# Switch to Myanmar
set_language("my")
# Now error messages will be in Myanmar
result = checker.check("invalid text")
# Error message: စာလုံးပေါင်း မမှန်ကန်ပါ
Streaming Large Files
Process large files with bounded memory:
from myspellchecker import SpellChecker
from myspellchecker.providers import SQLiteProvider
from myspellchecker.core.streaming import StreamingChecker
provider = SQLiteProvider(database_path="mySpellChecker-default.db")
checker = SpellChecker(provider=provider)
streaming = StreamingChecker(checker)
# Open explicitly as UTF-8 so Myanmar text decodes the same on every platform
with open("large_file.txt", encoding="utf-8") as f:
    for result in streaming.check_stream(f):
        if result.response.has_errors:
            print(f"Line {result.line_number}: {len(result.response.errors)} errors")
Text Normalization
Always normalize input for consistent results:
from myspellchecker.text.normalize import normalize
# Normalize before checking
text = normalize("မြန်မာ")
result = checker.check(text)
Zawgyi Detection
Handle legacy Zawgyi encoding:
from myspellchecker.text.normalize import is_likely_zawgyi, convert_zawgyi_to_unicode
text = "..." # User input
is_zawgyi, confidence = is_likely_zawgyi(text)
if is_zawgyi and confidence > 0.95:
    text = convert_zawgyi_to_unicode(text)
result = checker.check(text)
Error Handling
Handle errors gracefully with specific exception types:
from myspellchecker import SpellChecker
from myspellchecker.providers import SQLiteProvider
from myspellchecker.core.exceptions import (
    DataLoadingError,
    ConfigurationError,
    ValidationError,
)
try:
    provider = SQLiteProvider(database_path="mySpellChecker-default.db")
    checker = SpellChecker(provider=provider)
    result = checker.check(text)
except DataLoadingError as e:
    print(f"Database error: {e}")
except ConfigurationError as e:
    print(f"Configuration error: {e}")
except ValidationError as e:
    print(f"Validation error: {e}")
Summary Table
| Use Case | Method | Trade-off |
|---|---|---|
| Quick check | check_text(text) | Convenient |
| Single text | checker.check(text) | Fast |
| Multiple texts | checker.check_batch(texts) | Faster (batched) |
| Async single | await checker.check_async(text) | Non-blocking |
| Async batch | await checker.check_batch_async(texts) | Non-blocking, batched |
| Large files | streaming.check_stream(file) | Memory-bounded |
| Fast validation | checker.check(text, level=ValidationLevel.SYLLABLE) | Fastest |
| Full validation | checker.check(text, level=ValidationLevel.WORD) | More thorough |
Next Steps