mySpellChecker includes eight specialized grammar checkers that target common Myanmar grammatical errors. Each checker focuses on a specific grammar domain:
| Checker | Purpose | Error Types |
| --- | --- | --- |
| AspectChecker | Verb aspect markers | Typos, invalid sequences |
| ClassifierChecker | Numeral classifiers | Typos, agreement errors |
| CompoundChecker | Compound words | Typos, malformed compounds |
| MergedWordChecker | Merged word detection | Segmenter merge errors |
| NegationChecker | Negation patterns | Typos, missing endings |
| ParticleChecker | Particle context validation | Particle misuse |
| TenseAgreementChecker | Tense-time agreement | Tense mismatch |
| RegisterChecker | 3-way register detection | Mixed register usage |

AspectChecker

Validates Myanmar verb aspect markers that modify verbs to express temporal, modal, and aspectual meanings.

Aspect Categories

| Category | Markers | Meaning | Example |
| --- | --- | --- | --- |
| Completion | ပြီ, ပြီး | Action completed | သွားပြီ (went) |
| Progressive | နေ | Ongoing action | စားနေ (eating) |
| Habitual | တတ် | Habitual action | စားတတ် (eats habitually) |
| Resultative | ထား | Maintained state | ရေးထား (have written) |
| Directional | လာ, သွား | Motion direction | ပြန်လာ (come back) |
| Desiderative | ချင် | Desire/want | လာချင် (want to come) |
| Potential | နိုင်, ရ | Ability/possibility | လုပ်နိုင် (can do) |
| Immediate | လိုက် | Following action | လိုက်သွား (follow and go) |
| Experiential | ဖူး | Past experience | ရေးဖူး (have written before) |

Usage

from myspellchecker.grammar.checkers.aspect import AspectChecker

checker = AspectChecker()

# Check if word is an aspect marker
checker.is_aspect_marker("ပြီ")  # True
checker.is_aspect_marker("စား")  # False

# Check for typos
checker.is_aspect_typo("ပရီ")  # True (typo for ပြီ)
correction = checker.get_typo_correction("ပရီ")  # "ပြီ"

# Get detailed aspect info
info = checker.get_aspect_info("ပြီ")
print(info.category)     # "completion"
print(info.description)  # "Action completed"
print(info.is_final)     # True (typically at phrase end)

# Validate aspect sequences
errors = checker.validate_sequence(["စား", "ပြီး", "သွား"])
for error in errors:
    print(f"{error.text}: {error.reason}")

ClassifierChecker

Validates Myanmar numeral + classifier patterns. Myanmar uses numeral classifiers similar to Chinese/Japanese.

Pattern: Numeral + Classifier + Noun

| Numeral | Classifier | Noun | Meaning |
| --- | --- | --- | --- |
| သုံး | ယောက် | (person) | 3 people |
| ငါး | ကောင် | (animal) | 5 animals |
| နှစ် | အုပ် | (book) | 2 books |
| တစ် | လုံး | (round object) | 1 (round object) |
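
As a rough illustration of the numeral + classifier pairing in the table above, the sketch below scans a token list for a numeral immediately followed by a classifier. The `NUMERALS` and `CLASSIFIERS` dictionaries and the `find_counted_phrases` helper are hypothetical and cover only the four table rows; they are not part of mySpellChecker and say nothing about how `ClassifierChecker` is implemented.

```python
# Hypothetical lookup tables built from the four rows of the table above.
NUMERALS = {"သုံး": 3, "ငါး": 5, "နှစ်": 2, "တစ်": 1}
CLASSIFIERS = {
    "ယောက်": "people",
    "ကောင်": "animals",
    "အုပ်": "books",
    "လုံး": "round objects",
}


def find_counted_phrases(tokens):
    """Return (count, category) for each numeral directly followed by a classifier."""
    phrases = []
    for numeral, classifier in zip(tokens, tokens[1:]):
        if numeral in NUMERALS and classifier in CLASSIFIERS:
            phrases.append((NUMERALS[numeral], CLASSIFIERS[classifier]))
    return phrases


first_numeral = next(iter(NUMERALS))        # သုံး (three)
first_classifier = next(iter(CLASSIFIERS))  # ယောက် (people)
print(find_counted_phrases([first_numeral, first_classifier, "ရှိ"]))
```

A misspelled classifier is simply not found by this lookup, which is why a real checker needs a separate typo step such as the `check_classifier_typo` call shown under Usage.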

Usage

from myspellchecker.grammar.checkers.classifier import (
    ClassifierChecker,
    get_classifier_checker,
    is_classifier,
    is_numeral
)

checker = ClassifierChecker()

# Check if word is a numeral
is_numeral("သုံး")  # True (three)
is_numeral("၃")     # True (digit 3)

# Check if word is a classifier
is_classifier("ယောက်")  # True (classifier for people)
is_classifier("လူ")     # False (just a noun)

# Get classifier category
category = checker.get_classifier_category("ကောင်")  # "animals"

# Check for classifier typos
typo_result = checker.check_classifier_typo("ယေက်")
if typo_result:
    correction, confidence = typo_result
    print(f"Correction: {correction}")  # "ယောက်"

# Validate classifier usage
errors = checker.validate_sequence(["သုံး", "ယေက်", "ရှိ"])
for error in errors:
    print(f"{error.word} -> {error.suggestion}")

Classifier-Noun Agreement

# Get compatible classifiers for a noun
classifiers = checker.get_compatible_classifiers("ခွေး")  # ["ကောင်"]

# Check classifier-noun agreement
error = checker.check_agreement(classifier="ယောက်", noun="ခွေး")
if error:
    print(error.reason)  # ခွေး (dog) should use ကောင် not ယောက်

CompoundChecker

Detects and validates Myanmar compound word formations.

Compound Types

| Type | Pattern | Example | Result |
| --- | --- | --- | --- |
| Noun-Noun | N + N | ပန်း + ခြံ | ပန်းခြံ (flower garden) |
| Verb-Verb | V + V | စား + သောက် | စားသောက် (dine) |
| Reduplication | X + X | ဖြေး | ဖြေးဖြေး (slowly) |
| Affixed | Prefix + Root | အ + လုပ် | အလုပ် (work) |
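
The X + X reduplication pattern in the table can be sketched with a plain string comparison. The `reduplication_base` helper below is hypothetical and only handles exact full reduplication over Unicode code points; it is not the library's `get_reduplication_base`, which may apply further rules.

```python
def reduplication_base(word):
    """Return X if word is exactly X + X, otherwise None."""
    half, remainder = divmod(len(word), 2)
    if half > 0 and remainder == 0 and word[:half] == word[half:]:
        return word[:half]
    return None


base = "ဖြေး"
print(reduplication_base(base + base) == base)  # True
print(reduplication_base(base))                 # None
```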

Usage

from myspellchecker.grammar.checkers.compound import (
    CompoundChecker,
    get_compound_checker,
    is_compound,
    is_reduplication,
)

checker = CompoundChecker()

# Check if word is a recognized compound
is_compound("ပန်းခြံ")  # True

# Check for reduplication
is_reduplication("ဖြေးဖြေး")  # True
base = checker.get_reduplication_base("ဖြေးဖြေး")  # "ဖြေး"

# Detect compound pattern
info = checker.detect_compound_pattern("အလုပ်")
if info:
    print(info.compound_type)  # "affixed"
    print(info.components)     # ["အ", "လုပ်"]
    print(info.pattern)        # "PREFIX(nominalization) + STEM"

Analyze Compounds

# Comprehensive compound analysis
result = checker.analyze_word("ပန်းခြံ")
print(result["is_compound"])      # True
print(result["components"])       # ["ပန်း", "ခြံ"]
print(result["has_prefix"])       # False
print(result["is_reduplication"]) # False
print(result["confidence"])       # 0.95

MergedWordChecker

Detects words that the segmenter may have incorrectly merged from a particle + verb sequence into a single compound word.

Problem

Myanmar word segmenters sometimes merge adjacent tokens when the concatenation forms a valid dictionary word:

| Input | Intended | Segmented | Issue |
| --- | --- | --- | --- |
| သူက စားသောကြောင့် | သူ + က + စား + သောကြောင့် | သူ + ကစား + သောကြောင့် | "က" + "စား" merged to "ကစား" (play) |

Detection Strategy

A merged word is flagged ONLY when ALL conditions hold:
  1. The word is in the known ambiguous-merge set (e.g., “ကစား”)
  2. The preceding word is a NOUN or PRONOUN (POS: N, PRON)
  3. The following word is a clause-linking particle or verb-final marker
This three-way evidence requirement prevents false positives on legitimate uses.
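
The three conditions above can be sketched as a single conjunction over a token window. Everything in this block is an assumption made for illustration: the `AMBIGUOUS_MERGES` and `CLAUSE_MARKERS` sets contain only the one example from this page, the `"PART"` tag is a placeholder, and `find_suspect_merges` is a hypothetical helper rather than the logic inside `MergedWordChecker`.

```python
# Hypothetical data; the real checker's word lists are not shown in this document.
MERGED = "ကစား"
LINKER = "သောကြောင့်"
AMBIGUOUS_MERGES = {MERGED}      # condition 1: known ambiguous merges
NOMINAL_TAGS = {"N", "PRON"}     # condition 2: preceding word is a noun or pronoun
CLAUSE_MARKERS = {LINKER}        # condition 3: following clause-linking particle


def find_suspect_merges(words, pos_tags):
    """Return indices of words for which all three conditions hold."""
    suspects = []
    for i in range(1, len(words) - 1):
        if (
            words[i] in AMBIGUOUS_MERGES
            and pos_tags[i - 1] in NOMINAL_TAGS
            and words[i + 1] in CLAUSE_MARKERS
        ):
            suspects.append(i)
    return suspects


words = ["သူ", MERGED, LINKER]
pos_tags = ["PRON", "V", "PART"]  # "PART" is an illustrative tag
print(find_suspect_merges(words, pos_tags))  # [1]
```

Because the loop needs both a left and a right neighbor, a candidate at the start or end of the sequence is never flagged, which matches the conservative intent described above.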

Configuration

The checker uses a conservative confidence of 0.80 because this heuristic cannot be certain without semantic understanding.

from myspellchecker.grammar.checkers.merged_word import MergedWordChecker

checker = MergedWordChecker()

# Example input; the POS tags are illustrative and would normally come from a tagger
words = ["သူ", "ကစား", "သောကြောင့်"]
pos_tags = ["PRON", "V", "PART"]
errors = checker.validate_sequence(words, pos_tags)

NegationChecker

Validates Myanmar negation patterns. Myanmar negation follows specific structures.

Negation Patterns

| Pattern | Structure | Example | Meaning |
| --- | --- | --- | --- |
| Standard | မ + verb + ဘူး | မသွားဘူး | don't go |
| Polite | မ + verb + ပါဘူး | မသွားပါဘူး | politely don't go |
| Prohibition | မ + verb + နဲ့ | မလုပ်နဲ့ | Don't do! |
| Formal | မ + verb + ပါ | မရှိပါ | doesn't exist (formal) |
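
To make the မ + verb + ending structure concrete, the sketch below classifies a pre-segmented three-token window by its ending. The `ENDINGS` mapping is taken from the table above; `NEG_PREFIX` and `classify_negation` are hypothetical names, and the pattern labels are the table's row names rather than the values returned by the library's `detect_negation_pattern`.

```python
NEG_PREFIX = "မ"

# Ending -> pattern name, following the table above.
ENDINGS = {
    "ဘူး": "standard",
    "ပါဘူး": "polite",
    "နဲ့": "prohibition",
    "ပါ": "formal",
}


def classify_negation(tokens, start=0):
    """Return (pattern, verb, ending) for a prefix + verb + ending window, else None."""
    if len(tokens) - start < 3 or tokens[start] != NEG_PREFIX:
        return None
    verb, ending = tokens[start + 1], tokens[start + 2]
    pattern = ENDINGS.get(ending)
    if pattern is None:
        return None
    return pattern, verb, ending


standard_ending = next(iter(ENDINGS))  # ဘူး
result = classify_negation([NEG_PREFIX, "သွား", standard_ending])
print(result[0])  # standard
```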

Usage

from myspellchecker.grammar.checkers.negation import (
    NegationChecker,
    get_negation_checker,
    is_negative_ending,
)

checker = NegationChecker()

# Check for negation prefix
checker.starts_with_negation("မသွား")  # True
checker.starts_with_negation("သွား")   # False

# Check negative endings
is_negative_ending("ဘူး")  # True
is_negative_ending("တယ်")  # False

# Check for ending typos
typo_result = checker.check_ending_typo("ဘူ")
if typo_result:
    correction, confidence = typo_result
    print(f"Correction: {correction}")  # "ဘူး"

# Validate negation patterns
errors = checker.validate_sequence(["မ", "သွား", "ဘူ"])
for error in errors:
    print(f"{error.word} -> {error.suggestion}")

Detect Negation Patterns

# Detect negation pattern starting at a given index
pattern = checker.detect_negation_pattern(["မ", "သွား", "ဘူး"], 0)
if pattern:
    print(pattern.pattern_type)  # "standard_negative"
    print(pattern.verb)          # "သွား"
    print(pattern.ending)        # "ဘူး"
    print(pattern.register)      # "colloquial"

ParticleChecker

Validates Myanmar particle usage given verb and noun context. Myanmar particles (postpositions) must agree with the verb type and syntactic role of surrounding words.

Common Misuse Patterns

| Pattern | Incorrect | Correct | Explanation |
| --- | --- | --- | --- |
| Motion verb + static locative | ကျောင်းမှာ သွားတယ် | ကျောင်းကို သွားတယ် | Use ကို/သို့ with motion verbs |
| Sequential ပြီ where ပြီး needed | စားပြီ သွားတယ် | စားပြီး သွားတယ် | ပြီး links sequential actions |
| Negation + affirmative ending | မသွားတယ် | မသွားဘူး | Negated sentences need ဘူး |

Features

  • Particle confusion pair detection from YAML rules (particle_contexts.yaml)
  • Verb-particle frame checking — validates verb+particle compatibility
  • POS-tag-aware validation with fallback heuristics when tags unavailable
  • Configurable confidence thresholds via ParticleCheckerConfig

Usage

from myspellchecker.grammar.checkers.particle import ParticleChecker

checker = ParticleChecker()

# Validate particle usage in a word sequence
errors = checker.validate_sequence(
    words=["ကျောင်း", "ကို", "ရှိ", "တယ်"],
    pos_tags=["N", "PPM", "V", "SFP"]  # Optional POS tags
)

for error in errors:
    print(f"{error.text}: {error.reason}")
    print(f"  Suggestion: {error.suggestions}")
    print(f"  Confidence: {error.confidence}")

Singleton Access

from myspellchecker.grammar.checkers.particle import get_particle_checker

# Thread-safe singleton (loads YAML once)
checker = get_particle_checker()

YAML Configuration

Particle rules are defined in rules/particle_contexts.yaml:

particle_confusions:
  - particle: "ကို"
    confused_with: "မှာ"
    context: "static_location"
    description: "Static locative used with motion verb"
    confidence: 0.70

verb_particle_frames:
  - verbs: ["သွား", "လာ", "ပြန်"]
    required_particles: ["ကို", "သို့"]
    incompatible_particles: ["မှာ", "တွင်"]
    note: "Motion verbs require directional particles"
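
One way such a frame rule might be applied is sketched below. The `FRAME` dict mirrors the `verb_particle_frames` entry above as a Python literal so that no YAML parser is needed; `check_frame` is a hypothetical helper that only inspects the token directly before a frame verb, which is an assumption and not a description of how `ParticleChecker` evaluates frames.

```python
# Python mirror of the verb_particle_frames entry shown above.
FRAME = {
    "verbs": ["သွား", "လာ", "ပြန်"],
    "required_particles": ["ကို", "သို့"],
    "incompatible_particles": ["မှာ", "တွင်"],
}


def check_frame(words, frame=FRAME):
    """Return (index, particle, suggestions) for each incompatible particle
    found immediately before one of the frame's verbs."""
    findings = []
    for i in range(1, len(words)):
        if words[i] in frame["verbs"] and words[i - 1] in frame["incompatible_particles"]:
            findings.append((i - 1, words[i - 1], list(frame["required_particles"])))
    return findings


motion_verb = FRAME["verbs"][0]                      # သွား
static_locative = FRAME["incompatible_particles"][0]  # မှာ
findings = check_frame(["ကျောင်း", static_locative, motion_verb, "တယ်"])
print(len(findings))  # 1
```

Each finding carries the required particles as suggestions, which corresponds to the "Use ကို/သို့ with motion verbs" row in the misuse table.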

TenseAgreementChecker

Validates that aspectual particles (sentence-final markers) agree with temporal adverbials in Myanmar sentences. When a temporal adverb indicates a specific tense, the sentence-final particle must match.

Examples

| Status | Sentence | Explanation |
| --- | --- | --- |
| Correct | မနေ့က သွားခဲ့တယ် | yesterday + past marker |
| Incorrect | မနေ့က သွားမယ် | yesterday + future marker |
| Correct | မနက်ဖြန် သွားမယ် | tomorrow + future marker |
| Incorrect | မနက်ဖြန် သွားခဲ့တယ် | tomorrow + past marker |

Usage

from myspellchecker.grammar.checkers.tense_agreement import TenseAgreementChecker

checker = TenseAgreementChecker()

# Validate tense-time agreement
errors = checker.validate_sequence(["မနေ့က", "ကျောင်း", "သွား", "မယ်"])

for error in errors:
    print(f"{error.text}: {error.reason}")
    print(f"  Time adverb: {error.time_adverb}")
    print(f"  Detected tense: {error.detected_tense}")
    print(f"  Suggestion: {error.suggestions}")

Checking Individual Words

# Check if a word is a temporal adverb
checker.is_time_adverb("မနေ့က")  # True
checker.get_adverb_tense("မနေ့က")  # "past"

# Check if a word is an aspect marker
checker.is_aspect_marker("မယ်")  # True
checker.get_marker_tense("မယ်")  # "future"

YAML Configuration

Tense rules are defined in rules/tense_markers.yaml:

tense_agreement_rules:
  past_time_adverbs: ["မနေ့က", "တုန်းက", "အရင်က"]
  future_time_adverbs: ["မနက်ဖြန်", "နောက်နှစ်", "လာမယ့်"]
  past_aspect_markers: ["ခဲ့တယ်", "ခဲ့သည်"]
  future_aspect_markers: ["မယ်", "မည်"]
  incompatible_pairs:
    - time_class: past
      incompatible_aspects: ["မယ်", "မည်"]
      confidence: 0.80
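
The word lists above are enough to sketch a basic agreement check. The block below copies them into a Python dict and compares the tense of the first time adverb with the tense of the last marker. This is a simplification made for illustration: `tense_of` and `find_tense_mismatch` are hypothetical helpers, and the sketch treats any past/future disagreement as a mismatch instead of consulting `incompatible_pairs` and its confidence value.

```python
# Word lists copied from the tense_agreement_rules example above.
RULES = {
    "past_time_adverbs": ["မနေ့က", "တုန်းက", "အရင်က"],
    "future_time_adverbs": ["မနက်ဖြန်", "နောက်နှစ်", "လာမယ့်"],
    "past_aspect_markers": ["ခဲ့တယ်", "ခဲ့သည်"],
    "future_aspect_markers": ["မယ်", "မည်"],
}


def tense_of(word, kind):
    """Return "past" or "future" if word is listed under the given kind, else None."""
    for tense in ("past", "future"):
        if word in RULES[f"{tense}_{kind}"]:
            return tense
    return None


def find_tense_mismatch(words):
    """Return (adverb, marker) when their tenses disagree, otherwise None."""
    adverb = next((w for w in words if tense_of(w, "time_adverbs")), None)
    marker = next((w for w in reversed(words) if tense_of(w, "aspect_markers")), None)
    if adverb is None or marker is None:
        return None
    if tense_of(adverb, "time_adverbs") == tense_of(marker, "aspect_markers"):
        return None
    return adverb, marker


yesterday = RULES["past_time_adverbs"][0]
future_marker = RULES["future_aspect_markers"][0]
mismatch = find_tense_mismatch([yesterday, "ကျောင်း", "သွား", future_marker])
print(mismatch is not None)  # True
```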

Configuration

from myspellchecker.core.config import TenseAgreementCheckerConfig

config = TenseAgreementCheckerConfig(
    default_confidence=0.75,  # Default confidence for tense mismatches
    high_confidence=0.85,     # When both adverb and marker are unambiguous
)
checker = TenseAgreementChecker(checker_config=config)

RegisterChecker

Validates register consistency across three tiers: formal, polite, and colloquial. Myanmar has distinct register markers at the sentence-final position, and mixing registers within a sentence is a stylistic error.

Three-Tier Register System

| Register | Sentence-Final Particles | Pronouns | Use Context |
| --- | --- | --- | --- |
| Formal | သည်, ၏ | သူသည် | Written prose, news, official documents |
| Polite | ပါတယ်, ပါမယ် | | Respectful speech, customer service |
| Colloquial | တယ်, မယ် | သူ, ငါ | Casual conversation, informal writing |

Mixing Severity

| Combination | Severity | Confidence |
| --- | --- | --- |
| Formal + Colloquial | High (strong mismatch) | 0.85 |
| Formal + Polite | Low (formality gap) | 0.65 |
| Polite + Colloquial | Medium | 0.75 |
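
The severity table can be read as a lookup keyed by an unordered pair of registers. The sketch below encodes it that way and reports the strongest mismatch present in a list of per-word register labels. `MIX_CONFIDENCE` and `mixing_confidence` are hypothetical names, and taking the maximum when all three registers appear is an assumption, not documented behavior of `RegisterChecker`.

```python
from itertools import combinations

# Confidence values from the Mixing Severity table above.
MIX_CONFIDENCE = {
    frozenset({"formal", "colloquial"}): 0.85,
    frozenset({"formal", "polite"}): 0.65,
    frozenset({"polite", "colloquial"}): 0.75,
}
MARKED_REGISTERS = {"formal", "polite", "colloquial"}


def mixing_confidence(registers):
    """Return the highest confidence among register pairs present, or None if consistent."""
    present = sorted(set(registers) & MARKED_REGISTERS)  # ignore neutral words
    scores = [MIX_CONFIDENCE[frozenset(pair)] for pair in combinations(present, 2)]
    return max(scores, default=None)


# Per-word labels for a sentence mixing a formal and a colloquial marker.
labels = ["neutral", "formal", "neutral", "neutral", "colloquial"]
print(mixing_confidence(labels))  # 0.85
```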

Usage

from myspellchecker.grammar.checkers.register import RegisterChecker

checker = RegisterChecker()

# Get register of a word
info = checker.get_register("သည်")
print(info.register)         # "formal"

info = checker.get_register("ပါတယ်")
print(info.register)         # "polite"

info = checker.get_register("တယ်")
print(info.register)         # "colloquial"

# Check register type
checker.is_formal("သည်")      # True
checker.is_colloquial("တယ်")  # True
checker.is_neutral("စာအုပ်")  # True

Detect Sentence Register

# Detect the predominant register (returns a 3-way classification)
register, consistency, infos = checker.detect_sentence_register(
    ["သူ", "သည်", "စာအုပ်", "ဖတ်", "တယ်"]
)
print(register)     # "mixed"
print(consistency)  # 0.5 (50% consistent)

Validate Register Consistency

# Check for mixed register errors
errors = checker.validate_sequence(["သူ", "သည်", "စာအုပ်", "ဖတ်", "တယ်"])
for error in errors:
    print(f"{error.text}: {error.reason}")
    print(f"  Detected: {error.detected_register}")
    print(f"  Expected: {error.expected_register}")
    print(f"  Suggestion: {error.suggestion}")

Configuration

from myspellchecker.core.config import RegisterCheckerConfig

config = RegisterCheckerConfig(
    register_mismatch_confidence=0.85,        # Formal + colloquial mixing
    register_formality_gap_confidence=0.65,    # Formal + polite mixing
)
checker = RegisterChecker(register_config=config)

Integration with SpellChecker

All grammar checkers are automatically used when grammar checking is enabled:

from myspellchecker import SpellChecker
from myspellchecker.core.config import SpellCheckerConfig
from myspellchecker.providers import SQLiteProvider

config = SpellCheckerConfig(
    use_rule_based_validation=True  # Enable all grammar checkers
)

provider = SQLiteProvider(database_path="path/to/dictionary.db")
checker = SpellChecker(config=config, provider=provider)
result = checker.check("သူသည် စာအုပ် ဖတ်တယ်။")

# Grammar errors include all checker types
for error in result.errors:
    if hasattr(error, 'error_type'):
        print(f"Type: {error.error_type}")  # aspect_error, register_error, etc.
        print(f"Word: {error.text}")
        print(f"Reason: {error.reason}")

Error Types Summary

| Checker | Error Types |
| --- | --- |
| AspectChecker | aspect_typo, invalid_sequence, incomplete_aspect |
| ClassifierChecker | typo, agreement, missing, invalid_pattern |
| CompoundChecker | compound_typo, invalid_compound, incomplete_reduplication |
| MergedWordChecker | merged_word |
| NegationChecker | typo, missing_ending, invalid_pattern |
| ParticleChecker | particle_misuse |
| TenseAgreementChecker | tense_mismatch |
| RegisterChecker | register_error |
