NOTE: PostgreSQL-based corpus migration utilities (e.g.
CorpusMigrator) have been removed; corpus management now uses Lucene services. CI workflows have been switched to manual-only dispatch while CI is reconfigured.
A professional tool for creating and managing comprehensive dictionaries using the LIFT (Lexicon Interchange FormaT) standard. This Flask-based application provides full support for LIFT 0.13 with extensive features designed for lexicographers, linguists, and language documentation specialists.
- Multilingual Support: Every text field supports multiple writing systems simultaneously
- Senses & Subsenses: Organize word meanings hierarchically to capture polysemy and semantic relationships
- Examples & Usage: Rich contextual information with source language examples and translations
- Pronunciation Management: Comprehensive phonetic documentation with IPA transcription, audio files, and TTS integration
- Etymology Tracking: Document word origins and historical development
- Variants & Allomorphs: Document different forms of the same lexeme following SIL Fieldworks approach
- Lexical Relations: Create semantic networks connecting related entries (synonyms, antonyms, hypernyms, etc.)
- Reversals: Essential for bilingual dictionaries - create L2βL1 lookup capability
- Annotations: Editorial workflow and quality control features with review comments and status tracking
- Custom Fields: Extend LIFT to meet specific project needs with FieldWorks-compatible custom fields
- Ranges Editor: Manage controlled vocabularies for grammatical categories, semantic domains, and lexical relations
- Project Setup Wizard: Bootstrap new projects with recommended ranges and configurations
- 91% LIFT 0.13 Compliance: Implements 51 out of 56 LIFT elements, covering all essential features
- Responsive Web Interface: Works seamlessly on desktop and mobile devices
- Advanced Search & Filtering: Powerful search capabilities across all dictionary data
- Import/Export Functionality: Full support for LIFT, Kindle, and Flutter/SQLite formats
- FieldWorks Compatibility: Seamless import/export with FieldWorks/FLEx preserving 91% of elements
- LLM Integration: Example and sense management with AI assistance
- Python 3.8+
- BaseX XML Database (version 9.0+)
- Redis (for caching and performance optimization)
- Java Runtime Environment (for BaseX)
The application requires BaseX and Redis services to run properly. Use the provided scripts to start all required services:
# Start BaseX and Redis services
./start-services.sh
# The script will:
# - Start BaseX server (runs on port 1984)
# - Start Redis (runs on port 6379, optional but recommended)
# - Initialize admin user if needed
# - Check database connectivitygit clone <repository-url>
cd flask-apppython -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activatepip install -r requirements.txtIf you intend to run the Playwright end-to-end tests, install Node dependencies and download the Playwright browsers. Note that the node_modules/ directory is ignored by Git (run npm ci after cloning to populate it).
# Install Node dependencies (deterministic install)
npm ci
# Install Playwright browsers (Chromium, Firefox, WebKit)
# Use --with-deps on Linux to ensure system packages are installed
npx playwright install --with-deps chromium firefox webkitCopy the example environment file and update the settings:
cp .env.example .envEdit .env file with your configuration:
BASEX_HOST=localhost
BASEX_PORT=1984
BASEX_USERNAME=admin
BASEX_PASSWORD=admin
BASEX_DATABASE=dictionary
REDIS_URL=redis://localhost:6379/0
SECRET_KEY=your-secret-key-here# Start BaseX and Redis
./start-services.shpython run.pyThe application will be available at http://localhost:5000
- Go to Import/Export β Import LIFT
- Upload your LIFT file to begin working with your dictionary data
- The application supports LIFT 0.13 format with 91% element coverage
- Click on Entries to view your lexicon
- Click on any entry to open the full editor
- Use the comprehensive editing interface to modify or add new entries
Access the Tools β Ranges Editor to manage controlled vocabularies:
- View and edit grammatical information categories
- Manage semantic domains and lexical relations
- Create custom classification systems
Export functionality is currently under development. Import is fully operational through Import/Export β Import LIFT.
The application implements 51 out of 56 LIFT 0.13 elements (91% compliance), covering all essential and professional features needed for lexicography:
- Entry Elements: 11/12 (92%) - All essential entry components
- Sense Elements: 12/14 (86%) - Complete sense management
- Example Elements: 7/7 (100%) - Full example support
- Pronunciation: 3/3 (100%) - Complete phonetic documentation
- Etymology: 5/5 (100%) - Full etymology tracking
- Custom Fields: 6/7 (86%) - Extensive custom field support
# Import a LIFT file with optional ranges
python -m scripts.import_lift path/to/lift_file.lift [path/to/lift_ranges.lift-ranges]# Export to a LIFT file (currently in development)
python -m scripts.export_lift path/to/output.lift- LIFT (primary format) - Full LIFT 0.13 support for import, limited export
- Kindle - Dictionary format for Kindle devices (development in progress)
- SQLite - Database format (development in progress)
Note: Export functionality is currently limited and under active development. Import functionality is fully operational.
The system provides a comprehensive RESTful API for programmatic access:
- GET /api/entries/ - List entries with pagination and search
- GET /api/entries/{id} - Get a specific entry
- POST /api/entries/ - Create a new entry
- PUT /api/entries/{id} - Update an existing entry
- DELETE /api/entries/{id} - Delete an entry
- GET /api/entries/{id}/related - Get related entries
- GET /api/search/ - Search entries by text across all fields
- GET /api/search/grammatical - Search entries by grammatical information
- GET /api/search/ranges - Get range definitions and controlled vocabularies
- GET /api/search/ranges/{id} - Get values for a specific range
- GET /api/stats - Get dictionary statistics and entry counts
- GET /api/projects - Manage multiple dictionary projects
- GET /api/reports - Generate various dictionary reports
# Run all tests
pytest
# Generate coverage report
pytest --cov=app tests/# Format code with black
black .
# Lint code
flake8
# Type checking
mypy .This application is currently in pre-alpha stage. While the core functionality is operational, some features are still under active development and may not work as expected. The export functionality is currently limited and undergoing improvements.
This project is licensed under the MIT License - see the LICENSE file for details.
For technical questions about LIFT format:
For lexicographic resources and guidance:
- Introduction to Lexicography by Ron Moe
For application-specific support, contact your system administrator or open an issue in the repository.
