mySpellChecker includes six specialized grammar checkers that target common Myanmar grammatical errors. Each checker focuses on a specific grammar domain:
| Checker | Purpose | Error Types |
| --- | --- | --- |
| AspectChecker | Verb aspect markers | Typos, invalid sequences |
| ClassifierChecker | Numeral classifiers | Typos, agreement errors |
| CompoundChecker | Compound words | Typos, malformed compounds |
| MergedWordChecker | Merged word detection | Segmenter merge errors |
| NegationChecker | Negation patterns | Typos, missing endings |
| RegisterChecker | Formal/colloquial register | Mixed register usage |

AspectChecker

Validates Myanmar verb aspect markers that modify verbs to express temporal, modal, and aspectual meanings.

Aspect Categories

| Category | Markers | Meaning | Example |
| --- | --- | --- | --- |
| Completion | ပြီ, ပြီး | Action completed | သွားပြီ (went) |
| Progressive | နေ | Ongoing action | စားနေ (eating) |
| Habitual | တတ် | Habitual action | စားတတ် (eats habitually) |
| Resultative | ထား | Maintained state | ရေးထား (have written) |
| Directional | လာ, သွား | Motion direction | ပြန်လာ (come back) |
| Desiderative | ချင် | Desire/want | လာချင် (want to come) |
| Potential | နိုင်, ရ | Ability/possibility | လုပ်နိုင် (can do) |
| Immediate | လိုက် | Following action | လိုက်သွား (follow and go) |
| Experiential | ဖူး | Past experience | ရေးဖူး (have written before) |
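
The category table maps naturally onto a lookup structure. The following is a minimal sketch, assuming a plain dictionary keyed by marker. `ASPECT_MARKERS` and `aspect_categories` are hypothetical names used for illustration only; they are not part of the mySpellChecker API, and the library's internal data may be organized differently.

```python
# Hypothetical lookup table transcribed from the category table above.
# The real AspectChecker's internal data is not shown in this document.
ASPECT_MARKERS = {
    "ပြီ": "completion",
    "ပြီး": "completion",
    "နေ": "progressive",
    "တတ်": "habitual",
    "ထား": "resultative",
    "လာ": "directional",
    "သွား": "directional",
    "ချင်": "desiderative",
    "နိုင်": "potential",
    "ရ": "potential",
    "လိုက်": "immediate",
    "ဖူး": "experiential",
}


def aspect_categories(words):
    """Return (word, category) pairs for every token that is a known marker."""
    return [(w, ASPECT_MARKERS[w]) for w in words if w in ASPECT_MARKERS]


# "eat" + desiderative marker + sentence ending
found = aspect_categories(["စား", "ချင်", "တယ်"])
print(found)
```

A pure lookup like this cannot tell a marker from a main verb (သွား is both "go" and a directional marker), which is presumably why the checker also validates sequences rather than single tokens.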

Usage

from myspellchecker.grammar.checkers.aspect import AspectChecker, get_aspect_checker

# Create checker
checker = AspectChecker()
# Or use singleton
checker = get_aspect_checker()

# Check if word is an aspect marker
checker.is_aspect_marker("ပြီ")  # True
checker.is_aspect_marker("စား")  # False

# Check for typos
checker.is_aspect_typo("ပရီ")  # True (typo for ပြီ)
correction = checker.get_typo_correction("ပရီ")  # "ပြီ"

# Get detailed aspect info
info = checker.get_aspect_info("ပြီ")
print(info.category)     # "completion"
print(info.description)  # "Action completed"
print(info.is_final)     # True (typically at phrase end)

# Validate aspect sequences
errors = checker.validate_sequence(["စား", "ပြီး", "သွား"])
for error in errors:
    print(f"{error.text}: {error.reason}")

Detect Aspect Patterns

# Detect aspect marker patterns in text
patterns = checker.detect_aspect_patterns(["လုပ်", "ပြီး", "သွား", "မယ်"])

for pattern in patterns:
    print(f"Markers: {pattern.markers}")
    print(f"Categories: {pattern.categories}")
    print(f"Valid: {pattern.is_valid}")
    print(f"Confidence: {pattern.confidence}")

Analyze Verb Phrases

# Comprehensive verb phrase analysis
result = checker.analyze_verb_phrase(["စား", "ချင်", "တယ်"])
print(result["markers"])  # List of AspectInfo
print(result["errors"])   # List of AspectError
print(result["score"])    # Validity score 0.0-1.0

ClassifierChecker

Validates Myanmar numeral + classifier patterns. Like Chinese and Japanese, Myanmar uses numeral classifiers when counting nouns.

Pattern: Numeral + Classifier + Noun

| Numeral | Classifier | Noun | Meaning |
| --- | --- | --- | --- |
| သုံး | ယောက် | (person) | 3 people |
| ငါး | ကောင် | (animal) | 5 animals |
| နှစ် | အုပ် | (book) | 2 books |
| တစ် | လုံး | (round object) | 1 (round object) |
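
Classifier-noun agreement can be pictured as two lookups that must land in the same category. The sketch below assumes two small hand-written maps. The names `CLASSIFIER_CATEGORY`, `NOUN_CATEGORY`, and `expected_classifier` are hypothetical, and the category labels "books" and "round objects" are assumptions, so this is not the library's implementation.

```python
# Hypothetical sample data; the library's real tables are not shown here.
CLASSIFIER_CATEGORY = {
    "ယောက်": "people",
    "ကောင်": "animals",
    "အုပ်": "books",          # assumed category label
    "လုံး": "round objects",  # assumed category label
}
NOUN_CATEGORY = {"လူ": "people", "ခွေး": "animals", "စာအုပ်": "books"}


def expected_classifier(classifier, noun):
    """Return None if the pair agrees or is unknown, else the classifier to use."""
    noun_cat = NOUN_CATEGORY.get(noun)
    cls_cat = CLASSIFIER_CATEGORY.get(classifier)
    if noun_cat is None or cls_cat is None or noun_cat == cls_cat:
        return None
    # Find the first classifier whose category matches the noun.
    for candidate, category in CLASSIFIER_CATEGORY.items():
        if category == noun_cat:
            return candidate
    return None


mismatch = expected_classifier("ယောက်", "ခွေး")  # people classifier with "dog"
agrees = expected_classifier("ကောင်", "ခွေး")    # animal classifier with "dog"
print(mismatch, agrees)
```

Returning `None` for unknown words keeps the check conservative: a noun missing from the table is never reported as an agreement error.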

Usage

from myspellchecker.grammar.checkers.classifier import (
    ClassifierChecker,
    get_classifier_checker,
    is_classifier,
    is_numeral
)

checker = ClassifierChecker()

# Check if word is a numeral
is_numeral("သုံး")  # True (three)
is_numeral("၃")     # True (digit 3)

# Check if word is a classifier
is_classifier("ယောက်")  # True (classifier for people)
is_classifier("လူ")     # False (just a noun)

# Get classifier category
category = checker.get_classifier_category("ကောင်")  # "animals"

# Check for classifier typos
typo_result = checker.check_classifier_typo("ယေက်")
if typo_result:
    correction, confidence = typo_result
    print(f"Correction: {correction}")  # "ယောက်"

# Validate classifier usage
errors = checker.validate_sequence(["သုံး", "ယေက်", "ရှိ"])
for error in errors:
    print(f"{error.word} → {error.suggestion}")

Classifier-Noun Agreement

# Get compatible classifiers for a noun
classifiers = checker.get_compatible_classifiers("ခွေး")  # ["ကောင်"]

# Suggest classifier for a noun
suggestion = checker.suggest_classifier("လူ")
if suggestion:
    classifier, category = suggestion
    print(f"Use {classifier} ({category})")  # ယောက် (people)

# Check agreement
error = checker.check_agreement("သုံး", "ယောက်", "ခွေး")
if error:
    print(error.reason)  # "ခွေး should use ကောင် not ယောက်"

CompoundChecker

Detects and validates Myanmar compound word formations.

Compound Types

| Type | Pattern | Example | Result |
| --- | --- | --- | --- |
| Noun-Noun | N + N | ပန်း + ခြံ | ပန်းခြံ (flower garden) |
| Verb-Verb | V + V | စား + သောက် | စားသောက် (dine) |
| Reduplication | X + X | ဖြေး → | ဖြေးဖြေး (slowly) |
| Affixed | Prefix + Root | အ + လုပ် | အလုပ် (work) |
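
Of the four types, reduplication (X + X) is the one that can be recognized from the string alone. A minimal sketch, assuming the word is a straight doubling of its first half at the code point level; `reduplication_base` is a hypothetical helper, not the library's `get_reduplication_base`.

```python
def reduplication_base(word):
    """Return the repeated half if `word` is exactly X + X, else None."""
    half, remainder = divmod(len(word), 2)
    if remainder or half == 0:
        return None  # odd length or empty string cannot be X + X
    first, second = word[:half], word[half:]
    return first if first == second else None


word = "ဖြေးဖြေး"
base = reduplication_base(word)            # the repeated half
not_redup = reduplication_base("ပန်းခြံ")   # a noun-noun compound, not X + X
print(base, not_redup)
```

The comparison works on code points, so it assumes both halves are encoded identically; text with inconsistent encoding order would need normalizing first.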

Usage

from myspellchecker.grammar.checkers.compound import (
    CompoundChecker,
    get_compound_checker,
    is_compound,
    is_reduplication,
    detect_compound
)

checker = CompoundChecker()

# Check if word is a recognized compound
is_compound("ပန်းခြံ")  # True

# Check for reduplication
is_reduplication("ဖြေးဖြေး")  # True
base = checker.get_reduplication_base("ဖြေးဖြေး")  # "ဖြေး"

# Detect compound pattern
info = detect_compound("အလုပ်")
if info:
    print(info.compound_type)  # "affixed"
    print(info.components)     # ["အ", "လုပ်"]
    print(info.pattern)        # "PREFIX(nominalization) + STEM"

Compound Formation

# Check if two words can form a compound
compound = checker.can_form_compound("ပန်း", "ခြံ")
if compound:
    print(f"Forms: {compound}")  # "ပန်းခြံ"

# Get compound completion suggestions
completions = checker.suggest_compound_completion("စား")
for second, full_compound in completions:
    print(f"စား + {second} = {full_compound}")

Analyze Compounds

# Comprehensive compound analysis
result = checker.analyze_word("ပန်းခြံ")
print(result["is_compound"])      # True
print(result["components"])       # ["ပန်း", "ခြံ"]
print(result["has_prefix"])       # False
print(result["is_reduplication"]) # False
print(result["confidence"])       # 0.95

MergedWordChecker

Detects words that the segmenter may have incorrectly merged from a particle + verb sequence into a single compound word.

Problem

Myanmar word segmenters sometimes merge adjacent tokens when the concatenation forms a valid dictionary word:

| Input | Intended | Segmented | Issue |
| --- | --- | --- | --- |
| သူက စားသောကြောင့် | သူ + က + စား + သောကြောင့် | သူ + ကစား + သောကြောင့် | "က" + "စား" merged to "ကစား" (play) |

Detection Strategy

A merged word is flagged ONLY when ALL conditions hold:
  1. The word is in the known ambiguous-merge set (e.g., “ကစား”)
  2. The preceding word is a NOUN or PRONOUN (POS: N, PRON)
  3. The following word is a clause-linking particle or verb-final marker
This three-way evidence requirement prevents false positives on legitimate uses.
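
The three conditions above can be sketched as a single conjunction over a sliding window. This is an illustration under stated assumptions, not the library's code: `AMBIGUOUS_MERGES`, `CLAUSE_MARKERS`, and `find_merged` are hypothetical names, the sets hold only the one example from this page, and the tags "V" and "PART" are assumed (only N and PRON are named in the text).

```python
# Hypothetical sample data: merged form -> intended split.
AMBIGUOUS_MERGES = {"ကစား": ("က", "စား")}
SUBJECT_POS = {"N", "PRON"}            # condition 2: preceding POS
CLAUSE_MARKERS = {"သောကြောင့်"}         # condition 3: sample following marker


def find_merged(words, pos_tags):
    """Return (index, merged_word, split) for tokens meeting all three conditions."""
    hits = []
    for i in range(1, len(words) - 1):
        if (
            words[i] in AMBIGUOUS_MERGES          # 1. known ambiguous merge
            and pos_tags[i - 1] in SUBJECT_POS    # 2. noun or pronoun before
            and words[i + 1] in CLAUSE_MARKERS    # 3. clause marker after
        ):
            hits.append((i, words[i], AMBIGUOUS_MERGES[words[i]]))
    return hits


words = ["သူ", "ကစား", "သောကြောင့်"]
pos_tags = ["PRON", "V", "PART"]  # "V" and "PART" are assumed tag names
hits = find_merged(words, pos_tags)
print(hits)
```

Because the window needs a token on both sides, a candidate at the very start or end of the sequence is never flagged, which matches the conservative intent of the strategy.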

Configuration

The checker uses a conservative confidence of 0.75, because the heuristic cannot be fully certain without semantic understanding.

from myspellchecker.grammar.checkers.merged_word import MergedWordChecker

checker = MergedWordChecker()
# words: the segmented tokens; pos_tags: their POS tags (e.g. N, PRON)
errors = checker.validate_sequence(words, pos_tags)

NegationChecker

Validates Myanmar negation patterns, which combine the prefix မ with a verb and a negative ending.

Negation Patterns

| Pattern | Structure | Example | Meaning |
| --- | --- | --- | --- |
| Standard | မ + verb + ဘူး | မသွားဘူး | don’t go |
| Polite | မ + verb + ပါဘူး | မသွားပါဘူး | politely don’t go |
| Prohibition | မ + verb + နဲ့ | မလုပ်နဲ့ | Don’t do! |
| Formal | မ + verb + ပါ | မရှိပါ | doesn’t exist (formal) |
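
Each row of the table is the same three-slot template with a different ending, so a matcher only needs an ending-to-pattern map. The sketch below assumes already-segmented input with exactly one verb token between မ and the ending; `NEGATIVE_ENDINGS` and `match_negation` are hypothetical names, and the pattern labels are illustrative rather than the library's `pattern_type` values.

```python
# Hypothetical map from negative ending to the pattern name in the table above.
NEGATIVE_ENDINGS = {
    "ဘူး": "standard",
    "ပါဘူး": "polite",
    "နဲ့": "prohibition",
    "ပါ": "formal",
}


def match_negation(words, start):
    """Match the prefix + verb + ending template beginning at index `start`."""
    if start + 2 >= len(words) or words[start] != "မ":
        return None
    verb, ending = words[start + 1], words[start + 2]
    kind = NEGATIVE_ENDINGS.get(ending)
    if kind is None:
        return None  # not a recognized negative ending
    return {"type": kind, "verb": verb, "ending": ending}


pattern = match_negation(["မ", "သွား", "ဘူး"], 0)
no_match = match_negation(["မ", "သွား", "တယ်"], 0)  # affirmative ending
print(pattern, no_match)
```

This sketch does not handle markers that sit between the verb and the ending (for example a desiderative marker), which a fuller matcher would need to skip over.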

Usage

from myspellchecker.grammar.checkers.negation import (
    NegationChecker,
    get_negation_checker,
    is_negation,
    is_negative_ending,
    find_negations
)

checker = NegationChecker()

# Check for negation prefix
is_negation("မသွား")  # True
is_negation("သွား")   # False

# Check negative endings
is_negative_ending("ဘူး")  # True
is_negative_ending("တယ်")  # False

# Check for ending typos
typo_result = checker.check_ending_typo("ဘူ")
if typo_result:
    correction, confidence = typo_result
    print(f"Correction: {correction}")  # "ဘူး"

# Validate negation patterns
errors = checker.validate_sequence(["မ", "သွား", "ဘူ"])
for error in errors:
    print(f"{error.word} → {error.suggestion}")

Detect Negation Patterns

# Detect negation pattern
pattern = checker.detect_negation_pattern(["မ", "သွား", "ဘူး"], 0)
if pattern:
    print(pattern.pattern_type)  # "standard_negative"
    print(pattern.verb)          # "သွား"
    print(pattern.ending)        # "ဘူး"
    print(pattern.register)      # "colloquial"

# Find all negations in text
patterns = find_negations(["သူ", "မ", "သွား", "ဘူး", "ငါ", "မ", "စား", "ချင်", "ဘူး"])
print(len(patterns))  # 2

Register Conversion

# Convert between registers
formal = checker.suggest_correction(["မ", "သွား", "ဘူး"], "formal")
print(formal)  # ["မ", "သွား", "ပါ"]

colloquial = checker.suggest_correction(["မ", "သွား", "ပါ"], "colloquial")
print(colloquial)  # ["မ", "သွား", "ဘူး"]

RegisterChecker

Validates register (formal/colloquial) consistency. Mixing registers is a common stylistic error.

Register Examples

| Register | Subject | Verb Ending | Full Sentence |
| --- | --- | --- | --- |
| Formal | သူသည် | သည် | သူသည် စာအုပ် ဖတ်သည်။ |
| Colloquial | သူ | တယ် | သူ စာအုပ် ဖတ်တယ်။ |
| Mixed (error) | သူသည် | တယ် | သူသည် စာအုပ် ဖတ်တယ်။ |
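
One simple way to picture a mixed-register check is to count register-marked tokens and report the share held by the majority register. The sketch below is an assumption about how such a score could be computed, not a description of how RegisterChecker computes it; `FORMAL`, `COLLOQUIAL`, and `sentence_register` are hypothetical names, and the word sets contain only the two particles from the table.

```python
# Hypothetical one-word lexicons taken from the table above.
FORMAL = {"သည်"}
COLLOQUIAL = {"တယ်"}


def sentence_register(words):
    """Return (register, consistency) based on counts of marked tokens."""
    formal = sum(w in FORMAL for w in words)
    colloquial = sum(w in COLLOQUIAL for w in words)
    marked = formal + colloquial
    if marked == 0:
        return "neutral", 1.0  # nothing register-specific to disagree about
    consistency = max(formal, colloquial) / marked
    if formal and colloquial:
        return "mixed", consistency
    return ("formal" if formal else "colloquial"), consistency


mixed = sentence_register(["သူ", "သည်", "စာအုပ်", "ဖတ်", "တယ်"])
formal_only = sentence_register(["သူ", "သည်", "ဖတ်", "သည်"])
print(mixed, formal_only)
```

Neutral words such as nouns are ignored in the score, so a sentence is judged only by the particles that actually carry register.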

Usage

from myspellchecker.grammar.checkers.register import (
    RegisterChecker,
    get_register_checker,
    get_word_register,
    detect_register,
    validate_register
)

checker = RegisterChecker()

# Get register of a word
info = get_word_register("သည်")
print(info.register)         # "formal"
print(info.colloquial_form)  # "တယ်"

info = get_word_register("တယ်")
print(info.register)      # "colloquial"
print(info.formal_form)   # "သည်"

# Check register type
checker.is_formal("သည်")      # True
checker.is_colloquial("တယ်")  # True
checker.is_neutral("စာအုပ်")  # True

Detect Sentence Register

# Detect predominant register
register, consistency, infos = checker.detect_sentence_register(
    ["သူ", "သည်", "စာအုပ်", "ဖတ်", "တယ်"]
)
print(register)     # "mixed"
print(consistency)  # 0.5 (50% consistent)

# Get consistency score
score = checker.get_register_score(["သူ", "သည်", "ဖတ်", "သည်"])
print(score)  # 1.0 (perfectly consistent)

Validate Register Consistency

# Check for mixed register errors
errors = validate_register(["သူ", "သည်", "စာအုပ်", "ဖတ်", "တယ်"])
for error in errors:
    print(f"{error.word}: {error.reason}")
    print(f"  Detected: {error.detected_register}")
    print(f"  Expected: {error.expected_register}")
    print(f"  Suggestion: {error.suggestion}")

Convert to Consistent Register

# Convert to formal register
formal = checker.suggest_consistent_version(
    ["သူ", "စာအုပ်", "ဖတ်", "တယ်"],
    "formal"
)
print(formal)  # ["သူ", "စာအုပ်", "ဖတ်", "သည်"]

# Convert to colloquial register
# Note: Topic-marking သည် after a noun is dropped in colloquial speech,
# while sentence-final သည် converts to တယ်.
colloquial = checker.suggest_consistent_version(
    ["သူ", "သည်", "စာအုပ်", "ဖတ်", "သည်"],
    "colloquial"
)
print(colloquial)  # ["သူ", "စာအုပ်", "ဖတ်", "တယ်"]

Integration with SpellChecker

All grammar checkers are automatically used when grammar checking is enabled:
from myspellchecker import SpellChecker
from myspellchecker.core.config import SpellCheckerConfig
from myspellchecker.providers import SQLiteProvider

config = SpellCheckerConfig(
    use_rule_based_validation=True  # Enable all grammar checkers
)

provider = SQLiteProvider(database_path="path/to/dictionary.db")
checker = SpellChecker(config=config, provider=provider)
result = checker.check("သူသည် စာအုပ် ဖတ်တယ်။")

# Grammar errors include all checker types
for error in result.errors:
    if hasattr(error, 'error_type'):
        print(f"Type: {error.error_type}")  # aspect_error, register_error, etc.
        print(f"Word: {error.text}")
        print(f"Reason: {error.reason}")

Error Types Summary

| Checker | Error Types |
| --- | --- |
| AspectChecker | aspect_typo, invalid_sequence, incomplete_aspect |
| ClassifierChecker | typo, agreement, missing, invalid_pattern |
| CompoundChecker | compound_typo, invalid_compound, incomplete_reduplication |
| NegationChecker | typo, missing_ending, invalid_pattern |
| RegisterChecker | mixed_register, wrong_register |

See Also