-
Notifications
You must be signed in to change notification settings - Fork 22
[Issue 130] support all basic datatypes in compacted key encoder(continuation) #175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Issue 130] support all basic datatypes in compacted key encoder(continuation) #175
Conversation
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
rust_decimal is capped at 28 digits and is typically faster than BigDecimal. One option is a hybrid design: a FlussDecimal wrapper that stores small decimals as rust_decimal::Decimal and falls back to BigDecimal for higher precision. That would only pay off if decimals are used heavily on hot paths, and it would noticeably increase implementation and maintenance complexity. For now, the BigDecimal approach is the right trade-off: prioritize correctness and wire-compatibility with Java. The hybrid optimization is something we can revisit later if profiling shows decimals are a real bottleneck. |
|
Once merged, I'll add support to arrow writing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR completes support for all basic data types (DATE, TIME, TIMESTAMP_NTZ, TIMESTAMP_LTZ, DECIMAL) in the CompactedRow format and key encoding path. It migrates from rust_decimal to bigdecimal to support DECIMAL types with up to 38 digits of precision (matching Java's BigDecimal), implements temporal types with appropriate compact/non-compact encoding based on precision, and adds comprehensive test coverage.
Changes:
- Migrated from
rust_decimaltobigdecimalfor arbitrary precision decimal support (up to 38 digits) - Implemented temporal types (DATE, TIME, TIMESTAMP_NTZ, TIMESTAMP_LTZ) with precision-based compact encoding
- Updated all BinaryWriter trait methods to return Result for proper error handling
- Added comprehensive validation and test coverage for new data types
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| crates/fluss/Cargo.toml | Updated dependency from rust_decimal to bigdecimal with serde support |
| crates/fluss/src/row/datum.rs | Added Time type, restructured Timestamp/TimestampLtz with millisecond+nano fields, migrated Decimal to BigDecimal, added validation tests |
| crates/fluss/src/row/mod.rs | Added InternalRow trait methods for decimal, date, time, and timestamp types with GenericRow implementations |
| crates/fluss/src/row/field_getter.rs | Added field getters for new temporal and decimal types, reformatted enum variants |
| crates/fluss/src/row/compacted/compacted_row_writer.rs | Implemented java_decimal_precision helper, added write methods for decimal/temporal types with compact/non-compact encoding logic, updated all methods to return Result |
| crates/fluss/src/row/compacted/compacted_row_reader.rs | Added deserialization support for decimal and temporal types with error handling |
| crates/fluss/src/row/compacted/compacted_row.rs | Updated decoded_row error handling, added getter implementations for new types, comprehensive test coverage |
| crates/fluss/src/row/compacted/compacted_key_writer.rs | Updated delegate macro for new Result-returning method signatures |
| crates/fluss/src/row/binary/binary_writer.rs | Updated trait signatures to return Result, added InnerValueWriter variants with precision/scale validation, comprehensive test coverage |
| crates/fluss/src/row/column.rs | Implemented get_decimal with Arrow integration and precision validation, added temporal type getters, test coverage |
| crates/fluss/src/row/encode/compacted_key_encoder.rs | Updated test to include all new data types with expected byte encodings |
| crates/fluss/src/row/row_decoder.rs | Updated tests to handle Result returns from write methods |
| crates/fluss/src/record/kv/*.rs | Updated tests to handle Result returns from write methods |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
leekeiabstraction
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the PR! Left comments, PTAL!
luoyuxia
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fresh-borzoni Thanks for the pr. Left some comments. PTAL
e87e2bc to
ee1f434
Compare
ee1f434 to
782474e
Compare
|
@luoyuxia @leekeiabstraction PTAL 🙏 |
|
fixed CI/CD |
luoyuxia
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fresh-borzoni Thanks for update. LGTM overall. I just left minor comments. PTAL
leekeiabstraction
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update, left some comments. PTAL!
|
@luoyuxia @leekeiabstraction Thanks for the review! Addressed comments, PTAL 🙏 |
leekeiabstraction
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! TY for your work.
luoyuxia
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fresh-borzoni LGTM! Thanks for the work!
Closes #130
Continuation of #144
Summary
Completes support for all basic data types (DATE, TIME, TIMESTAMP_NTZ, TIMESTAMP_LTZ, DECIMAL) in CompactedRow format and key encoding path
1. Migrated to
bigdecimal(fromrust_decimal)rust_decimalis limited to 28 digitsBigDecimalbehavior: arbitrary precision, HALF_UP rounding, trailing-zero stripping2. Implemented Temporal Types
3. Decimal Encoding