Skip to content

Conversation

@fresh-borzoni
Copy link
Contributor

Closes #130
Continuation of #144

Summary

Completes support for all basic data types (DATE, TIME, TIMESTAMP_NTZ, TIMESTAMP_LTZ, DECIMAL) in CompactedRow format and key encoding path

1. Migrated to bigdecimal (from rust_decimal)

  • Why: Fluss schema supports DECIMAL up to 38 digits; rust_decimal is limited to 28 digits
  • Matches Java's BigDecimal behavior: arbitrary precision, HALF_UP rounding, trailing-zero stripping

2. Implemented Temporal Types

  • DATE: i32 days since epoch
  • TIME: i32 milliseconds since midnight (precision metadata only, not in wire format)
  • TIMESTAMP_NTZ/LTZ: Compact (≤3 precision) vs non-compact (>3 precision) encoding
    • Compact: milliseconds only (i64 varint)
    • Non-compact: milliseconds + nanoOfMillisecond (i64 + i32 varint)

3. Decimal Encoding

  • Compact (precision ≤18): i64 varint
  • Non-compact (precision >18): big-endian byte array
  • Two-step validation: rescale with HALF_UP → validate precision

@fresh-borzoni
Copy link
Contributor Author

rust_decimal is capped at 28 digits and is typically faster than BigDecimal. One option is a hybrid design: a FlussDecimal wrapper that stores small decimals as rust_decimal::Decimal and falls back to BigDecimal for higher precision. That would only pay off if decimals are used heavily on hot paths, and it would noticeably increase implementation and maintenance complexity.

For now, the BigDecimal approach is the right trade-off: prioritize correctness and wire-compatibility with Java. The hybrid optimization is something we can revisit later if profiling shows decimals are a real bottleneck.

@leekeiabstraction @luoyuxia @Kelvinyu1117 PTAL 🙏

@fresh-borzoni
Copy link
Contributor Author

fresh-borzoni commented Jan 18, 2026

Once merged, I'll add support to arrow writing.
Probably in this #167 or in a separate PR.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR completes support for all basic data types (DATE, TIME, TIMESTAMP_NTZ, TIMESTAMP_LTZ, DECIMAL) in the CompactedRow format and key encoding path. It migrates from rust_decimal to bigdecimal to support DECIMAL types with up to 38 digits of precision (matching Java's BigDecimal), implements temporal types with appropriate compact/non-compact encoding based on precision, and adds comprehensive test coverage.

Changes:

  • Migrated from rust_decimal to bigdecimal for arbitrary precision decimal support (up to 38 digits)
  • Implemented temporal types (DATE, TIME, TIMESTAMP_NTZ, TIMESTAMP_LTZ) with precision-based compact encoding
  • Updated all BinaryWriter trait methods to return Result for proper error handling
  • Added comprehensive validation and test coverage for new data types

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
crates/fluss/Cargo.toml Updated dependency from rust_decimal to bigdecimal with serde support
crates/fluss/src/row/datum.rs Added Time type, restructured Timestamp/TimestampLtz with millisecond+nano fields, migrated Decimal to BigDecimal, added validation tests
crates/fluss/src/row/mod.rs Added InternalRow trait methods for decimal, date, time, and timestamp types with GenericRow implementations
crates/fluss/src/row/field_getter.rs Added field getters for new temporal and decimal types, reformatted enum variants
crates/fluss/src/row/compacted/compacted_row_writer.rs Implemented java_decimal_precision helper, added write methods for decimal/temporal types with compact/non-compact encoding logic, updated all methods to return Result
crates/fluss/src/row/compacted/compacted_row_reader.rs Added deserialization support for decimal and temporal types with error handling
crates/fluss/src/row/compacted/compacted_row.rs Updated decoded_row error handling, added getter implementations for new types, comprehensive test coverage
crates/fluss/src/row/compacted/compacted_key_writer.rs Updated delegate macro for new Result-returning method signatures
crates/fluss/src/row/binary/binary_writer.rs Updated trait signatures to return Result, added InnerValueWriter variants with precision/scale validation, comprehensive test coverage
crates/fluss/src/row/column.rs Implemented get_decimal with Arrow integration and precision validation, added temporal type getters, test coverage
crates/fluss/src/row/encode/compacted_key_encoder.rs Updated test to include all new data types with expected byte encodings
crates/fluss/src/row/row_decoder.rs Updated tests to handle Result returns from write methods
crates/fluss/src/record/kv/*.rs Updated tests to handle Result returns from write methods

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

@leekeiabstraction leekeiabstraction left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR! Left comments, PTAL!

Copy link
Contributor

@luoyuxia luoyuxia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fresh-borzoni Thanks for the pr. Left some comments. PTAL

@fresh-borzoni fresh-borzoni force-pushed the issue-130-Support-all-basic-datatypes-in-CompactedKeyEncoder branch 8 times, most recently from e87e2bc to ee1f434 Compare January 18, 2026 17:49
@fresh-borzoni fresh-borzoni force-pushed the issue-130-Support-all-basic-datatypes-in-CompactedKeyEncoder branch from ee1f434 to 782474e Compare January 18, 2026 18:03
@fresh-borzoni
Copy link
Contributor Author

@luoyuxia @leekeiabstraction
Thanks for the review, addressed comments, cleaned a bit as well.

PTAL 🙏

@fresh-borzoni
Copy link
Contributor Author

fixed CI/CD

Copy link
Contributor

@luoyuxia luoyuxia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fresh-borzoni Thanks for update. LGTM overall. I just left minor comments. PTAL

Copy link
Contributor

@leekeiabstraction leekeiabstraction left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update, left some comments. PTAL!

@fresh-borzoni
Copy link
Contributor Author

@luoyuxia @leekeiabstraction Thanks for the review!

Addressed comments, PTAL 🙏

Copy link
Contributor

@leekeiabstraction leekeiabstraction left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! TY for your work.

Copy link
Contributor

@luoyuxia luoyuxia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fresh-borzoni LGTM! Thanks for the work!

@luoyuxia luoyuxia merged commit 350d5a9 into apache:main Jan 19, 2026
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support all basic datatypes in CompactedKeyEncoder

4 participants