Extract u4/u8 zero point directly instead of FP bias #41

Merged: wine99 merged 1 commit into dev_backend_openvino from extract-zp-instead-of-bias on Feb 5, 2026

wine99 (Collaborator) commented on Feb 5, 2026

OpenVINO (OV) dequantizes with a u4/u8 zero-point tensor, whereas GGUF models store FP biases. The current code first extracts an FP bias ov::Tensor from the model and then builds a separate zero-point tensor during model conversion. This PR changes the flow so that the zero-point tensor is created directly during model loading.
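To illustrate the relationship between an FP bias and an unsigned zero point, here is a minimal sketch. It is not the code from this PR. It assumes the GGUF-style block dequantizes as `w = scale * q + bias` and the zero-point form dequantizes as `w = scale * (q - zp)`, which gives `zp = -bias / scale`. The helper name `bias_to_zero_point` and its parameters are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper, not taken from the PR.
// Assumed FP-bias form:     w = scale * q + bias
// Assumed zero-point form:  w = scale * (q - zp)
// Equating the two gives    zp = -bias / scale,
// which is then rounded and clamped to the unsigned range of `bits`.
inline uint8_t bias_to_zero_point(float scale, float bias, int bits) {
    assert(bits == 4 || bits == 8);           // only u4 and u8 are considered
    const float max_q = static_cast<float>((1 << bits) - 1);
    if (scale == 0.0f) {
        return 0;                             // degenerate block: avoid dividing by zero
    }
    const float zp = std::round(-bias / scale);
    // Clamp into [0, max_q] so the result fits the unsigned type.
    return static_cast<uint8_t>(std::min(std::max(zp, 0.0f), max_q));
}
```

For example, under these assumptions `bias_to_zero_point(0.5f, -4.0f, 4)` yields 8. The conversion is exact only when `-bias / scale` is already an integer within range; otherwise rounding and clamping change the dequantized values slightly.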

Built on top of PR #40.

wine99 force-pushed the extract-zp-instead-of-bias branch from 656c43b to ccf727e on February 5, 2026 06:37
wine99 merged commit 907d832 into dev_backend_openvino on Feb 5, 2026
1 check passed