This section covers environment setup, testing workflows, Cython development, and contribution guidelines for the mySpellChecker project.

Contents

Getting Started

Development Workflow

Contributing

Quick Start

Environment Setup

# Clone repository
git clone https://github.com/thettwe/myspellchecker.git
cd myspellchecker

# Create virtual environment
python3 -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install development dependencies
pip install -e ".[dev]"

# Build Cython extensions
python setup.py build_ext --inplace
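
After the build step, it can help to confirm that the compiled modules are actually importable. The following is a minimal sketch, not part of the project: the helper name `extension_available` is hypothetical, and the dotted module paths are inferred from the Cython Modules table in this guide, so treat them as assumptions.

```python
import importlib.util

# Dotted paths inferred from the Cython Modules table; they are assumptions.
EXTENSIONS = [
    "myspellchecker.text.normalize_c",
    "myspellchecker.algorithms.distance.edit_distance_c",
    "myspellchecker.data_pipeline.batch_processor",
]


def extension_available(module_name: str) -> bool:
    """Return True if the module can be located.

    The target module itself is not imported, but its parent packages are.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec raises ModuleNotFoundError (an ImportError) when a
        # parent package is missing.
        return False


if __name__ == "__main__":
    for name in EXTENSIONS:
        status = "ok" if extension_available(name) else "missing"
        print(f"{name}: {status}")
```

A module reported as missing usually means the build step above was skipped or failed.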

Running Tests

# All tests
pytest tests/

# With coverage
pytest tests/ --cov=myspellchecker

# Specific test file
pytest tests/test_syllable_rules.py

# By marker
pytest tests/ -m unit
pytest tests/ -m integration

Code Quality

# Format code
ruff format .

# Lint code
ruff check .

# Type checking
mypy src/myspellchecker

Project Structure

myspellchecker/
  src/myspellchecker/
    core/
    algorithms/
    commands/
    providers/
    data_pipeline/
    segmenters/
    tokenizers/
    text/
    training/
    grammar/
    rules/
    schemas/
    data/
    utils/
  tests/
    integration/
    e2e/
    fixtures/
    test_*.py
  scripts/

Development Guidelines

Code Style

  • Follow PEP 8, with a maximum line length of 100 characters
  • Use type hints for all public functions
  • Write docstrings for all public APIs
  • Use meaningful variable and function names

Testing

  • Maintain ≥75% code coverage
  • Write unit tests for all new functions
  • Add integration tests for new features
  • Use pytest fixtures for test data

Documentation

  • Update documentation for all changes
  • Include docstrings with examples
  • Add entries to CHANGELOG.md
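
As an illustration of the style and documentation guidelines above (type hints, a docstring, and an example in the docstring), here is a small sketch. The function `strip_zero_width` is a hypothetical helper written for this example and is not the library's normalization routine.

```python
import doctest

# Characters removed by this illustrative helper (an assumption, not project policy).
_ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\ufeff"}


def strip_zero_width(text: str) -> str:
    """Remove zero-width characters from ``text``.

    Args:
        text: Input string, possibly containing zero-width characters.

    Returns:
        The input with U+200B, U+200C, U+200D, and U+FEFF removed.

    Example:
        >>> strip_zero_width("ab\\u200bcd")
        'abcd'
    """
    return "".join(ch for ch in text if ch not in _ZERO_WIDTH)


if __name__ == "__main__":
    # Runs the example embedded in the docstring as a test.
    doctest.testmod()
```

Keeping the example in doctest form means the documentation fails loudly if the behavior changes.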

Git Workflow

# Create feature branch
git checkout -b feature/my-feature

# Make changes and commit
git add .
git commit -m "feat: add my feature"

# Push and create PR
git push origin feature/my-feature

Commit Messages

Follow the Conventional Commits format, starting each message with one of these type prefixes:
  • feat: New feature
  • fix: Bug fix
  • docs: Documentation
  • test: Tests
  • refactor: Refactoring
  • perf: Performance
  • chore: Maintenance
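
The prefixes above can be checked mechanically, for example in a commit-msg hook. This is a minimal sketch under stated assumptions: `is_conventional` is a hypothetical helper, and the optional "(scope)" and "!" parts come from the Conventional Commits specification rather than from this guide.

```python
import re

# Types listed in this guide.
COMMIT_TYPES = ("feat", "fix", "docs", "test", "refactor", "perf", "chore")

# type, optional (scope), optional "!", then ": " and a non-empty description.
_SUBJECT_RE = re.compile(
    r"(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"(\((?P<scope>[^()\s]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>\S.*)"
)


def is_conventional(subject: str) -> bool:
    """Return True if a commit subject line matches the expected format."""
    return _SUBJECT_RE.fullmatch(subject) is not None


if __name__ == "__main__":
    for line in ("feat: add my feature", "Added stuff"):
        print(f"{line!r}: {is_conventional(line)}")
```

Only the first line of the message is checked; the body and footers are left unconstrained.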

Key Components

Core Components

Component            Location               Purpose
SpellChecker         core/spellchecker.py   Main entry point
SpellCheckerBuilder  core/builder.py        Fluent configuration
Validators           core/validators/       Validation layers
Config               core/config/           Configuration package

Algorithms

Algorithm  Location                             Purpose
SymSpell   algorithms/symspell.py               Fast suggestions
N-gram     algorithms/ngram_context_checker.py  Context validation
Viterbi    algorithms/viterbi.pyx               POS tagging

Cython Modules

Module           Location                                 Purpose
normalize_c      text/normalize_c.pyx                     Fast normalization
edit_distance_c  algorithms/distance/edit_distance_c.pyx  Edit distance
batch_processor  data_pipeline/batch_processor.pyx        Parallel processing

Cython Development

Building Extensions

# Rebuild after changes
python setup.py build_ext --inplace

# Clean build (the ** patterns need recursive globbing: zsh, or bash after `shopt -s globstar`)
rm -rf build/ src/myspellchecker/**/*.cpp src/myspellchecker/**/*.so
python setup.py build_ext --inplace

Cython Tips

  1. Profile first: optimize only the hot paths that profiling identifies
  2. Use typed memoryviews for array operations
  3. Release the GIL for parallel operations
  4. Provide pure Python fallbacks for compatibility

Example Cython Pattern

Not all Cython modules use the same import pattern. normalize.py imports its Cython extension unconditionally, with no fallback:
# normalize.py imports Cython extension unconditionally
from myspellchecker.text.normalize_c import (
    remove_zero_width_chars as c_remove_zero_width,
    reorder_myanmar_diacritics as c_reorder_diacritics,
)

viterbi.py, by contrast, falls back to pure Python and records the outcome in a flag:
# viterbi.py uses try/except with a flag
try:
    from myspellchecker.algorithms import viterbi_c
    _HAS_CYTHON_VITERBI = True
except ImportError:
    _HAS_CYTHON_VITERBI = False

Testing Guide

Test Categories

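import pytest
from myspellchecker.core.spellchecker import SpellChecker  # location per the Core Components table
# is_valid_syllable is assumed to be imported from the module under test

# Unit test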
@pytest.mark.unit
def test_syllable_rules():
    assert is_valid_syllable("မြန်")

# Integration test
@pytest.mark.integration
def test_full_pipeline():
    checker = SpellChecker()
    result = checker.check("test text")
    assert result is not None

# Slow test
@pytest.mark.slow
def test_large_corpus():
    # Long-running test
    pass
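
Custom markers such as unit, integration, and slow need to be registered, otherwise pytest warns about unknown marks. The project may already do this in its pytest configuration; the sketch below shows one way to do it from conftest.py, and the marker descriptions are illustrative assumptions.

```python
# conftest.py (sketch): register the custom markers used in this guide.


def pytest_configure(config):
    """Hook called by pytest at startup; adds marker definitions to the config."""
    config.addinivalue_line("markers", "unit: fast, isolated unit tests")
    config.addinivalue_line(
        "markers", "integration: tests that exercise several components together"
    )
    config.addinivalue_line(
        "markers", "slow: long-running tests (deselect with -m 'not slow')"
    )
```

With the markers registered, running pytest with `--strict-markers` turns a misspelled marker into an error instead of a silently ignored label.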

Running Specific Tests

# By marker
pytest -m unit
pytest -m "not slow"

# By name
pytest -k "syllable"

# Single file
pytest tests/test_syllable_rules.py

# Single test
pytest tests/test_syllable_rules.py::test_valid_syllable

Test Fixtures

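# conftest.py
import pytest
from myspellchecker.core.spellchecker import SpellChecker  # location per the Core Components table
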
@pytest.fixture
def spell_checker():
    """Create a SpellChecker instance."""
    return SpellChecker()

@pytest.fixture
def sample_text():
    """Sample Myanmar text."""
    return "မြန်မာစာ"

Debugging

Enable Debug Logging

from myspellchecker.utils.logging_utils import configure_logging
configure_logging(level="DEBUG")

Using Debugger

# Add a breakpoint (on Python 3.7+, the built-in breakpoint() does the same)
import pdb; pdb.set_trace()

# Or use the VS Code debugger

Common Issues

Issue              Debugging Approach
Wrong suggestions  Check SymSpell parameters
Slow performance   Profile with cProfile
Memory issues      Use memory_profiler
Cython errors      Check .pyx compilation

Benchmarking

Benchmark Fixtures

Benchmark datasets are located in tests/fixtures/benchmarks/:
  • pos_gold_standard.json: gold-standard data for evaluating POS tagging accuracy
See the test fixtures for sample evaluation data.
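
The layout of pos_gold_standard.json is not described here, so the following sketch works on in-memory lists of (word, tag) pairs rather than on the file itself. It shows only the accuracy arithmetic; `tagging_accuracy` is a hypothetical helper and the tag names in the example are placeholders.

```python
from typing import Sequence, Tuple

TaggedWord = Tuple[str, str]  # (word, tag)


def tagging_accuracy(
    gold: Sequence[TaggedWord], predicted: Sequence[TaggedWord]
) -> float:
    """Fraction of tokens whose predicted tag equals the gold tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same number of tokens")
    if not gold:
        return 0.0
    correct = sum(1 for (_, g), (_, p) in zip(gold, predicted) if g == p)
    return correct / len(gold)


if __name__ == "__main__":
    gold = [("a", "N"), ("b", "V"), ("c", "N")]
    predicted = [("a", "N"), ("b", "N"), ("c", "N")]
    print(f"accuracy: {tagging_accuracy(gold, predicted):.3f}")
```

The length check assumes both sequences use the same tokenization; if segmentation differs between gold and predicted output, tokens would need to be aligned first.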

Release Process

Version Bump

# Update version in pyproject.toml
# Update CHANGELOG.md
git add .
git commit -m "chore: bump version to X.Y.Z"
git tag vX.Y.Z
git push origin main --tags

Building Package

# Build distribution
python -m build

# Check package
twine check dist/*

# Upload to PyPI
twine upload dist/*
