Contents
Getting Started
- Setup - Development environment setup
- Architecture - System architecture
Development Workflow
- Testing - Running and writing tests
- Cython Development - Working with Cython modules
Contributing
- Contributing Guide - How to contribute
Quick Start
Environment Setup
Running Tests
Code Quality
Project Structure
Development Guidelines
Code Style
- Follow PEP 8 with 100-character line length
- Use type hints for all public functions
- Write docstrings for all public APIs
- Use meaningful variable and function names
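As a minimal sketch of these style rules, the function below carries type hints, a docstring with an example, and a descriptive name. `normalize_whitespace` is a hypothetical helper invented for illustration, not part of the project's API.

```python
def normalize_whitespace(text: str) -> str:
    """Collapse runs of whitespace into single spaces.

    Hypothetical helper, used only to illustrate the style rules.

    Args:
        text: Raw input text.

    Returns:
        The text with leading and trailing whitespace removed and every
        internal run of whitespace replaced by one space.

    Example:
        >>> normalize_whitespace("  hello   world ")
        'hello world'
    """
    # str.split() with no arguments splits on any whitespace run
    # and drops empty strings, so joining with " " normalizes spacing.
    return " ".join(text.split())
```

Keeping the example inside the docstring means it can also be checked with `doctest` if the project chooses to run it.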
Testing
- Maintain ≥75% code coverage
- Write unit tests for all new functions
- Add integration tests for new features
- Use pytest fixtures for test data
Documentation
- Update documentation for all changes
- Include docstrings with examples
- Add entries to CHANGELOG.md
Git Workflow
Commit Messages
Follow conventional commits:
- feat: New feature
- fix: Bug fix
- docs: Documentation
- test: Tests
- refactor: Refactoring
- perf: Performance
- chore: Maintenance
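The prefixes named above can be checked mechanically, for example from a commit-msg hook. The following is a rough sketch, not the project's actual tooling: `is_conventional` is a hypothetical helper, and accepting an optional `(scope)` after the type is an assumption borrowed from the Conventional Commits format.

```python
import re

# Commit types taken from the list in this guide.
COMMIT_TYPES = ("feat", "fix", "docs", "test", "refactor", "perf", "chore")

# "<type>: <description>" or "<type>(<scope>): <description>".
# The optional scope is an assumption, not something this guide requires.
COMMIT_PATTERN = re.compile(
    r"^(?:" + "|".join(COMMIT_TYPES) + r")(?:\([\w\-]+\))?: \S.*"
)


def is_conventional(subject: str) -> bool:
    """Return True if a commit subject line starts with a known type prefix."""
    return COMMIT_PATTERN.match(subject) is not None
```

For instance, `is_conventional("fix: handle empty input")` is accepted, while a subject such as `"Added stuff"` is rejected because it lacks a type prefix.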
Key Components
Core Components
| Component | Location | Purpose |
|---|---|---|
| SpellChecker | core/spellchecker.py | Main entry point |
| SpellCheckerBuilder | core/builder.py | Fluent configuration |
| Validators | core/validators.py | Validation layers |
| Config | core/config/ | Configuration package |
Algorithms
| Algorithm | Location | Purpose |
|---|---|---|
| SymSpell | algorithms/symspell.py | Fast suggestions |
| N-gram | algorithms/ngram_context_checker.py | Context validation |
| Viterbi | algorithms/viterbi.pyx | POS tagging |
Cython Modules
| Module | Location | Purpose |
|---|---|---|
| normalize_c | text/normalize_c.pyx | Fast normalization |
| edit_distance_c | algorithms/distance/edit_distance_c.pyx | Edit distance |
| batch_processor | data_pipeline/batch_processor.pyx | Parallel processing |
Cython Development
Building Extensions
Cython Tips
- Profile first: Only optimize hot paths
- Use typed memoryviews: For array operations
- Release GIL: For parallel operations
- Provide fallbacks: Pure Python for compatibility
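The "provide fallbacks" tip is commonly implemented with a guarded import: try the compiled extension first and define a pure-Python equivalent if it is unavailable. The sketch below assumes an import path and a function name (`edit_distance`) derived from the module table above; both are assumptions and may not match the real module.

```python
try:
    # Assumed import path and function name; adjust to the real module layout.
    from algorithms.distance.edit_distance_c import edit_distance
    USING_CYTHON = True
except ImportError:
    USING_CYTHON = False

    def edit_distance(a: str, b: str) -> int:
        """Pure-Python Levenshtein distance (two-row dynamic programming)."""
        if len(a) < len(b):
            a, b = b, a  # keep the inner row as short as possible
        previous = list(range(len(b) + 1))
        for i, ch_a in enumerate(a, start=1):
            current = [i]
            for j, ch_b in enumerate(b, start=1):
                cost = 0 if ch_a == ch_b else 1
                current.append(min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution
                ))
            previous = current
        return previous[-1]
```

Callers import `edit_distance` from one place and never need to know which implementation is active; `USING_CYTHON` is there only so tests and logs can report it.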
Example Cython Pattern
Testing Guide
Test Categories
Running Specific Tests
Test Fixtures
Debugging
Enable Debug Logging
Using Debugger
Common Issues
| Issue | Debugging Approach |
|---|---|
| Wrong suggestions | Check SymSpell parameters |
| Slow performance | Profile with cProfile |
| Memory issues | Use memory_profiler |
| Cython errors | Check .pyx compilation |
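For the "Slow performance" row, a small wrapper around the standard-library `cProfile` and `pstats` modules is usually enough to find the hot path. This is a generic sketch: `profile_call` and `slow_sum` are hypothetical names, and `slow_sum` stands in for whatever project call is being investigated.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)

    # Render the statistics into a string instead of printing directly.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


# Stand-in workload; replace with the real call being investigated.
def slow_sum(n):
    return sum(i * i for i in range(n))


result, report = profile_call(slow_sum, 100_000)
print(report)
```

Sorting by cumulative time surfaces the call chains that dominate the run, which is usually the right starting point before deciding whether a function is worth moving to Cython.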
Benchmarking
Running Accuracy Tests
Test Fixtures
Test datasets are located in tests/fixtures/benchmarks/:
- pos_gold_standard.json - POS tagging accuracy evaluation