Official repository for the paper:
SERN-MIL: Selective Embedding Retrieval and Nuclei-Feature Aggregation Based MIL for Prostate Gleason Grading.
This repository contains the codebase for preprocessing, feature extraction, training, and inference with a
CLAM- and CellViT-dependent pipeline aligned with the paper's methodology.
- WSI preprocessing at 0.50 um/px
- Tissue masking (Otsu + morphology)
- Tiling (4096x4096) and patching (256x256)
- CLAM-compatible tissue embedding extraction
- CellViT-based nuclei feature extraction
- Feature fusion with coordinate alignment
- SER module for selective embedding retrieval
- Region-aware MIL for GP classification and ISUP grading
- Optional survival/BCR head
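
The tiling and coordinate-aligned fusion steps listed above can be sketched in a few lines. This is a minimal illustration, not the repository's implementation: the function names `patch_coords` and `fuse_by_coords` are hypothetical, and it assumes non-overlapping patches, fusion by concatenation, and that patches missing from either feature source are dropped. Only the tile and patch sizes come from the list above.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]

TILE_SIZE = 4096   # tile edge in pixels, from the feature list
PATCH_SIZE = 256   # patch edge in pixels, from the feature list


def patch_coords(tile_x: int, tile_y: int) -> List[Coord]:
    """Top-left (x, y) of every patch in the tile whose top-left is (tile_x, tile_y).

    Assumption: patches are non-overlapping and laid out row by row.
    """
    steps = range(0, TILE_SIZE, PATCH_SIZE)
    return [(tile_x + dx, tile_y + dy) for dy in steps for dx in steps]


def fuse_by_coords(
    tissue: Dict[Coord, List[float]],
    nuclei: Dict[Coord, List[float]],
) -> Dict[Coord, List[float]]:
    """Concatenate tissue and nuclei features for patches present in both inputs.

    Assumption: unmatched patches are dropped rather than zero-padded.
    """
    shared = sorted(tissue.keys() & nuclei.keys())
    return {coord: tissue[coord] + nuclei[coord] for coord in shared}


coords = patch_coords(0, 0)
print(len(coords))  # 256, i.e. a 16 x 16 grid of patches per tile

# Toy feature vectors keyed by patch coordinate (values are made up).
tissue = {(0, 0): [0.1, 0.2], (256, 0): [0.3, 0.4]}
nuclei = {(0, 0): [5.0], (512, 0): [7.0]}
fused = fuse_by_coords(tissue, nuclei)
print(fused)  # {(0, 0): [0.1, 0.2, 5.0]}
```

Keying both feature sets by patch coordinate makes the alignment explicit: a patch contributes to the fused bag only when both extractors produced a vector for the same location.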
configs/ # Pipeline, model, dataset, and experiment configs
docker/ # Dockerfile and compose
docs/ # Architecture and pipeline docs
scripts/ # User-facing entry scripts
src/sern_mil/ # Core package
tests/ # Unit tests
pip install -e .

docker build -f docker/Dockerfile -t sern-mil:latest .
docker run --rm -it -v $PWD:/workspace sern-mil:latest bash

Place your slides in:
data/wsi/
Optionally keep labels/splits in:
data/labels/
data/splits/
make preprocess
make features
make fuse

python -m sern_mil.cli.main preprocess --config configs/default.yaml
python -m sern_mil.cli.main extract-clam --config configs/default.yaml
python -m sern_mil.cli.main extract-cellvit --config configs/default.yaml
python -m sern_mil.cli.main fuse --config configs/default.yaml

Generated manifests:
data_index/preprocess_manifest.jsonl
data_index/clam_features_manifest.jsonl
data_index/cellvit_features_manifest.jsonl
data_index/fused_features_manifest.jsonl
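
The manifests carry a `.jsonl` extension, which conventionally means one JSON object per line. The sketch below shows one way to inspect such a file, assuming the manifests follow that convention. The helper `read_jsonl` is hypothetical and not part of `sern_mil`, and no field names are assumed because the record schema is not documented here.

```python
import json
from pathlib import Path
from typing import Dict, Iterator


def read_jsonl(path: Path) -> Iterator[Dict]:
    """Yield one parsed JSON object per non-empty line of the file."""
    with path.open("r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines
                yield json.loads(line)


# Path taken from the manifest list; the file exists only after the pipeline has run.
manifest = Path("data_index/fused_features_manifest.jsonl")
if manifest.exists():
    records = list(read_jsonl(manifest))
    print(f"{len(records)} records in {manifest}")
    if records:
        # Print the keys of the first record to discover the schema.
        print(sorted(records[0].keys()))
```

Reading lazily with a generator keeps memory use flat when a manifest covers many slides.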
The training pipeline is separate from inference.
make train
python scripts/train_pipeline.py --config configs/training_pipeline.yaml
python -m sern_mil.cli.main train-pipeline --config configs/training_pipeline.yaml

Training summary output:
artifacts/training_run_summary.json
Checkpoints directory:
checkpoints/
python scripts/infer_pipeline.py --config configs/inference_pipeline.yaml --slide data/wsi/sample.svs
python scripts/infer_pipeline.py --config configs/inference_pipeline.yaml --wsis-dir data/wsi
make infer

Inference output:
artifacts/inference_predictions.json
- configs/default.yaml: preprocessing + feature pipeline settings
- configs/training_pipeline.yaml: training-specific settings
- configs/inference_pipeline.yaml: inference-specific settings
- configs/models/sern_mil.yaml: model-level defaults
- configs/experiments/*.yaml: ablation toggles
PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 python -m pytest -q

- Pipeline code is structured for paper-faithful components (SER, NFI, MIL), with clean separation between:
  - preprocessing/feature generation
  - training
  - inference
- For full experimental reproduction, ensure your dataset splits and checkpoints match your study protocol.