Prerequisites
- Python 3.10+ (3.11 recommended)
- Git
- C++ compiler (for Cython extensions)
  - Linux: gcc or clang
  - macOS: Xcode Command Line Tools
  - Windows: Visual Studio Build Tools
Optional
- OpenMP (for parallel processing)
  - macOS: brew install libomp
  - Linux: Usually pre-installed
- CUDA (for GPU acceleration with transformer models)
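The required prerequisites above can be checked before going further. The following is a minimal sketch, not part of the project: the helper names (`python_ok`, `find_tool`, `report`) and the per-platform compiler executable names are assumptions chosen for illustration. On Windows, `cl` is usually only on PATH inside a Visual Studio developer prompt, so a "MISSING" result there may be a false alarm.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # minimum version from the prerequisites list

# Candidate compiler executables per platform. These names are assumptions
# for illustration, not something the project mandates.
COMPILERS = {
    "linux": ("gcc", "clang"),
    "darwin": ("clang",),
    "win32": ("cl",),
}


def python_ok(version=None):
    """Return True if the given (or running) interpreter meets MIN_PYTHON."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


def find_tool(names):
    """Return the path of the first executable in `names` found on PATH, else None."""
    for name in names:
        path = shutil.which(name)
        if path:
            return path
    return None


def report():
    """Collect a pass/fail flag for each required prerequisite."""
    platform_key = sys.platform if sys.platform in COMPILERS else "linux"
    return {
        "python": python_ok(),
        "git": find_tool(("git",)) is not None,
        "compiler": find_tool(COMPILERS[platform_key]) is not None,
    }


if __name__ == "__main__":
    for name, ok in report().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

The optional items (OpenMP, CUDA) are deliberately left out, since their absence should not block a basic setup.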
Quick Setup
Detailed Setup
Build Cython Extensions
- text/normalize_c.pyx - Text normalization
- algorithms/viterbi.pyx - POS tagging
- algorithms/distance/edit_distance_c.pyx - Levenshtein distance
- data_pipeline/batch_processor.pyx - Parallel batch processing
- data_pipeline/frequency_counter.pyx - Fast frequency calculations
- data_pipeline/ingester_c.pyx - Corpus ingestion
- data_pipeline/repair_c.pyx - Segmentation repair
- data_pipeline/tsv_reader_c.pyx - TSV file reading
- tokenizers/cython/word_segment.pyx - Word segmentation
- tokenizers/cython/mmap_reader.pyx - Memory-mapped file reading
- core/syllable_rules_c.pyx - Syllable rule validation
IDE Setup
VS Code
Recommended extensions:
- Python (Microsoft)
- Pylance
- Cython
.vscode/settings.json:
PyCharm
- Open project folder
- Configure interpreter: venv/bin/python
- Mark src as Sources Root
- Enable Ruff plugin for linting
Environment Variables
Verifying Setup
Run Tests
Check Cython
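One way to confirm that the extensions were compiled is to check that each module resolves to a binary extension file rather than being missing or falling back to pure Python. The sketch below assumes that the .pyx paths listed under Build Cython Extensions are relative to the `myspellchecker` package root and map directly onto dotted module names. That mapping, and the helper names `module_name` and `is_built`, are assumptions rather than documented project behavior.

```python
import importlib.machinery
import importlib.util
from pathlib import PurePosixPath

PACKAGE = "myspellchecker"  # package name as it appears in the import error in this guide

# Paths copied from the Build Cython Extensions list.
PYX_SOURCES = [
    "text/normalize_c.pyx",
    "algorithms/viterbi.pyx",
    "algorithms/distance/edit_distance_c.pyx",
    "data_pipeline/batch_processor.pyx",
    "data_pipeline/frequency_counter.pyx",
    "data_pipeline/ingester_c.pyx",
    "data_pipeline/repair_c.pyx",
    "data_pipeline/tsv_reader_c.pyx",
    "tokenizers/cython/word_segment.pyx",
    "tokenizers/cython/mmap_reader.pyx",
    "core/syllable_rules_c.pyx",
]


def module_name(pyx_path, package=PACKAGE):
    """Map 'text/normalize_c.pyx' to 'myspellchecker.text.normalize_c' (assumed layout)."""
    parts = PurePosixPath(pyx_path).with_suffix("").parts
    return ".".join((package, *parts))


def is_built(name):
    """Return True if `name` resolves to a compiled extension module (.so / .pyd)."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Parent package missing or otherwise not importable.
        return False
    if spec is None or spec.origin is None:
        return False
    return spec.origin.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES))


if __name__ == "__main__":
    for source in PYX_SOURCES:
        name = module_name(source)
        print(f"{'built  ' if is_built(name) else 'MISSING'} {name}")
```

A module reported as MISSING either was never compiled or lives under a different dotted path than this sketch assumes.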
Test Spell Checker
Common Issues
Cython Build Fails
Error: fatal error: Python.h: No such file or directory
Solution: Install Python development headers
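On Debian and Ubuntu the headers are commonly packaged as python3-dev, and on Fedora and RHEL as python3-devel. To see whether the headers are already present for the interpreter in use, a short check along these lines may help. It is a sketch with hypothetical helper names and relies only on the standard `sysconfig` module.

```python
import os
import sysconfig


def python_header_path():
    """Return the path where Python.h is expected for the running interpreter."""
    include_dir = sysconfig.get_paths()["include"]
    return os.path.join(include_dir, "Python.h")


def headers_installed():
    """Return True if Python.h exists at the expected location."""
    return os.path.isfile(python_header_path())


if __name__ == "__main__":
    status = "found" if headers_installed() else "missing"
    print(f"Python.h {status}: {python_header_path()}")
```

Run it with the same interpreter that performs the build, since each Python installation has its own include directory.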
OpenMP Not Found (macOS)
Error: ld: library not found for -lomp
Solution:
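This linker error usually means libomp is either not installed (see brew install libomp under Prerequisites) or installed somewhere the linker does not search. The sketch below asks Homebrew for the libomp prefix and derives the matching compiler and linker flags. It assumes the project's build honors the CPPFLAGS and LDFLAGS environment variables, which is typical for setuptools-based builds but is not confirmed by this guide. The helper names are hypothetical.

```python
import os
import subprocess


def libomp_prefix():
    """Return Homebrew's libomp prefix if it exists on disk, else None."""
    try:
        result = subprocess.run(
            ["brew", "--prefix", "libomp"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # Homebrew is not installed or does not know the formula.
        return None
    prefix = result.stdout.strip()
    # brew can print a prefix for a formula that is not installed,
    # so confirm that the library directory is really there.
    if prefix and os.path.isdir(os.path.join(prefix, "lib")):
        return prefix
    return None


def omp_build_env(prefix):
    """Build the environment variables that point a compiler at libomp."""
    return {
        "CPPFLAGS": f"-I{prefix}/include",
        "LDFLAGS": f"-L{prefix}/lib",
    }


if __name__ == "__main__":
    prefix = libomp_prefix()
    if prefix is None:
        print("libomp not found; install it with: brew install libomp")
    else:
        for key, value in omp_build_env(prefix).items():
            print(f'export {key}="{value}"')
```

The printed export lines can be pasted into the shell before rebuilding the extensions.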
Database Not Found
Error: MissingDatabaseError
Solution:
Import Errors
Error: ModuleNotFoundError: No module named 'myspellchecker'
Solution:
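Two frequent causes are running an interpreter outside the project's virtual environment and not having installed the package into that environment. For a source checkout, an editable install is a common remedy, though that is an assumption about this project's workflow. The sketch below reports which interpreter is running and whether the package is visible to it. The `diagnose` helper is hypothetical.

```python
import importlib.util
import sys


def diagnose(package="myspellchecker"):
    """Summarize the interpreter in use and whether `package` can be found."""
    spec = importlib.util.find_spec(package)
    return {
        "interpreter": sys.executable,
        # In a virtual environment, sys.prefix differs from sys.base_prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        "importable": spec is not None,
        "location": None if spec is None else spec.origin,
    }


if __name__ == "__main__":
    for key, value in diagnose().items():
        print(f"{key}: {value}")
```

If `in_virtualenv` is False, activate the environment first. If it is True but `importable` is False, the package has most likely not been installed into that environment.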
Development Workflow
See Also
- Testing Guide - Running and writing tests
- Contributing - Contribution guidelines
- Cython Guide - Working with Cython