@jja725 jja725 commented Jan 22, 2026

Summary

Implements #16 - Background compaction for Lance fragments with comprehensive features:

  • ✅ Manual compaction via compact() method
  • ✅ Optional background compaction with configurable intervals
  • ✅ Comprehensive configuration (thresholds, quiet hours, intervals)
  • ✅ Advanced observability (stats API, metrics, structured logging)

Changes

Rust Core:

  • Added CompactionConfig and CompactionStats types
  • Implemented compact(), should_compact(), compaction_stats() methods
  • Background compaction task with Tokio interval timer
  • Graceful shutdown via Drop implementation
  • Added tokio and tracing dependencies

Python API:

  • PyO3 bindings for all compaction methods
  • High-level API with comprehensive docstrings
  • Configuration parameters in Context.create()

Tests:

  • 10 comprehensive integration tests (all passing ✅)
  • Tests cover manual/background compaction, quiet hours, data integrity

Usage Example

Manual Compaction:
```python
ctx = Context.create("context.lance")
for i in range(100):
    ctx.add("user", f"message {i}")

# compact() returns a metrics dict describing the rewrite
metrics = ctx.compact()
print(f"Removed {metrics['fragments_removed']} fragments")
```

Background Compaction:
```python
ctx = Context.create(
    "context.lance",
    enable_background_compaction=True,
    compaction_interval_secs=300,
    compaction_min_fragments=10,
    quiet_hours=[(22, 6)],  # 10pm-6am
)
```
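
The `(22, 6)` window above wraps past midnight, so a plain `start <= hour < end` comparison would never match it. Below is a minimal sketch of how such a check could work. The helper name `in_quiet_hours` and the half-open `[start, end)` interpretation are assumptions for illustration, not the code in this PR:

```python
from typing import Iterable, Tuple


def in_quiet_hours(hour: int, quiet_hours: Iterable[Tuple[int, int]]) -> bool:
    """Return True if `hour` (0-23) falls inside any (start, end) window.

    Assumption: each window is half-open, [start, end), and may wrap
    past midnight when start > end (e.g. (22, 6) means 22:00-05:59).
    """
    for start, end in quiet_hours:
        if start == end:
            # Assumption: an empty window never matches.
            continue
        if start < end:
            # Same-day window, e.g. (9, 17).
            if start <= hour < end:
                return True
        else:
            # Window wraps past midnight, e.g. (22, 6).
            if hour >= start or hour < end:
                return True
    return False


print(in_quiet_hours(23, [(22, 6)]))  # inside the overnight window
print(in_quiet_hours(12, [(22, 6)]))  # midday, outside the window
```

Under this interpretation, 22:00 is quiet and 06:00 is not.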

Check Status:
```python
stats = ctx.compaction_stats()
print(f"Fragments: {stats['total_fragments']}")
print(f"Last compaction: {stats['last_compaction']}")
```

Test Results

```
10 passed in 5.39s
✅ Manual compaction reduces fragments
✅ Data integrity preserved
✅ Concurrent writes work
✅ Compaction stats accurate
✅ Custom options work
✅ Background compaction triggers
✅ Quiet hours respected
✅ Metrics structure correct
✅ Empty context handled
✅ Multiple compactions work
```

Architecture

  • Hybrid approach: Both manual and optional background compaction
  • Thread-safe: Uses Arc for state management
  • Non-blocking: Background task runs in separate Tokio task
  • Graceful shutdown: Drop implementation aborts background task
  • Lance MVCC: No explicit locking needed, leverages Lance's versioning
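
For readers more familiar with Python than Tokio, the interval-timer task and abort-on-shutdown pattern described above can be sketched with `asyncio`. This is an analogy under stated assumptions, not a port of the Rust code: `BackgroundCompactor`, `compaction_loop`, and the `maybe_compact` callback are hypothetical names, and the explicit `stop()` stands in for the `Drop` implementation.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def compaction_loop(
    interval_secs: float,
    maybe_compact: Callable[[], Awaitable[None]],
) -> None:
    """Wake up every `interval_secs` and run the (hypothetical) callback."""
    while True:
        await asyncio.sleep(interval_secs)
        await maybe_compact()


class BackgroundCompactor:
    """Owns the background task; stop() plays the role of Drop."""

    def __init__(
        self,
        interval_secs: float,
        maybe_compact: Callable[[], Awaitable[None]],
    ) -> None:
        self._interval_secs = interval_secs
        self._maybe_compact = maybe_compact
        self._task: Optional[asyncio.Task] = None

    def start(self) -> None:
        # Must be called from inside a running event loop.
        if self._task is None:
            self._task = asyncio.get_running_loop().create_task(
                compaction_loop(self._interval_secs, self._maybe_compact)
            )

    async def stop(self) -> None:
        if self._task is None:
            return
        self._task.cancel()  # analogous to aborting the Tokio task
        try:
            await self._task
        except asyncio.CancelledError:
            pass  # cancellation is the expected outcome here
        self._task = None
```

The foreground caller is never blocked by the loop, and no callback runs once `stop()` has returned.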

Checklist

  • Tests added and passing
  • Documentation updated
  • Code follows project conventions
  • All files properly formatted

Implements issue lance-format#16 with comprehensive compaction functionality:

**Core Features:**
- Manual compaction via `compact()` method
- Optional background compaction with configurable intervals
- Comprehensive configuration (thresholds, quiet hours, intervals)
- Advanced observability (stats API, metrics, logging)

**Implementation Details:**
- Rust: Added CompactionConfig, CompactionStats types to store.rs
- Rust: Implemented compact(), should_compact(), compaction_stats()
- Rust: Background task with Tokio interval timer and graceful shutdown
- Python: PyO3 bindings for all compaction methods
- Python: High-level API with full docstrings
- Tests: 10 comprehensive tests (all passing)

**Configuration Options:**
- enable_background_compaction: Enable auto-compaction
- compaction_interval_secs: Check interval (default: 300s)
- compaction_min_fragments: Trigger threshold (default: 5)
- compaction_target_rows: Target rows per fragment (default: 1M)
- quiet_hours: Skip compaction during specified hours
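
To show how these options might combine into a trigger decision, here is a minimal Python sketch. The names `CompactionConfigSketch` and `should_compact_sketch` are hypothetical and only mirror the options listed above; the real `CompactionConfig` and `should_compact()` live in the Rust core. Three things are assumptions: background compaction is off by default, the threshold comparison is inclusive, and a running compaction suppresses a new one. The quiet-hours check is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CompactionConfigSketch:
    # Defaults taken from the list above where stated.
    enable_background_compaction: bool = False  # assumption: off by default
    compaction_interval_secs: int = 300
    compaction_min_fragments: int = 5
    compaction_target_rows: int = 1_000_000
    quiet_hours: List[Tuple[int, int]] = field(default_factory=list)


def should_compact_sketch(
    total_fragments: int,
    is_compacting: bool,
    config: CompactionConfigSketch,
) -> bool:
    """Decide whether a compaction run looks worthwhile."""
    if is_compacting:
        # Assumption: never start a second run while one is in flight.
        return False
    # Assumption: the threshold is inclusive (>=).
    return total_fragments >= config.compaction_min_fragments


cfg = CompactionConfigSketch()
print(should_compact_sketch(4, False, cfg))   # below the default threshold of 5
print(should_compact_sketch(12, False, cfg))  # at or above the threshold
```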

**Metrics Returned:**
- fragments_removed/added
- files_removed/added
- is_compacting status
- last_compaction timestamp
- total_compactions count

All tests pass. Documentation updated with usage examples.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>