-
Notifications
You must be signed in to change notification settings - Fork 112
optimize unnecessary copy #573
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -40,8 +40,18 @@ def task_done_logger(task: asyncio.Task) -> None: | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| return | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| def get_address(mv: memoryview) -> int: | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| return ctypes.addressof(ctypes.c_char.from_buffer(mv)) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| def get_address(data) -> int: | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| if isinstance(data, memoryview): | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| if not data.readonly: | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| return ctypes.addressof(ctypes.c_char.from_buffer(data)) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| data = data.obj | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| if isinstance(data, bytearray): | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| return ctypes.addressof(ctypes.c_char.from_buffer(data)) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| if isinstance(data, bytes): | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| addr = ctypes.cast(ctypes.c_char_p(data), ctypes.c_void_p).value | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| assert addr is not None | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| return addr | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| raise TypeError(f"expected bytes, bytearray, or memoryview, got {type(data)}") | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Comment on lines
+43
to
+54
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🔴 get_address falls through to TypeError for read-only memoryview backed by non-bytes/bytearray object When Root CauseAt For example, get_address(memoryview(memoryview(b"hello")))
# data.readonly = True → data = data.obj (inner memoryview)
# not bytearray, not bytes → TypeErrorThis can be triggered indirectly through Impact:
Suggested change
Was this helpful? React with 👍 or 👎 to provide feedback. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| T = TypeVar("T") | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -49,19 +49,22 @@ def __init__( | |
| Raises: | ||
| ValueError: If the length of `data` is smaller than the required size. | ||
| """ | ||
| data = memoryview(data).cast("B") | ||
| if isinstance(data, memoryview): | ||
| data = data.obj # type: ignore[assignment] | ||
|
Comment on lines
+52
to
+53
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🔴 AudioFrame.init using memoryview.obj loses slice offset and length information When a sliced memoryview is passed to Root Cause and ImpactThe old code did For example: ba = bytearray(1000)
mv = memoryview(ba)[100:200] # 100 bytes starting at offset 100
# Old: memoryview(mv).cast("B") → len=100, points to ba[100:200] ✓
# New: mv.obj → ba itself, len=1000, points to ba[0:1000] ✗This causes two problems:
Impact: Silent data corruption or out-of-bounds memory access when AudioFrame is constructed from a sliced memoryview. Prompt for agentsWas this helpful? React with 👍 or 👎 to provide feedback. |
||
|
|
||
| if len(data) < num_channels * samples_per_channel * ctypes.sizeof(ctypes.c_int16): | ||
| min_size = num_channels * samples_per_channel * ctypes.sizeof(ctypes.c_int16) | ||
| data_len = len(data) | ||
|
|
||
| if data_len < min_size: | ||
| raise ValueError( | ||
| "data length must be >= num_channels * samples_per_channel * sizeof(int16)" | ||
| ) | ||
|
|
||
| if len(data) % ctypes.sizeof(ctypes.c_int16) != 0: | ||
| if data_len % ctypes.sizeof(ctypes.c_int16) != 0: | ||
| # can happen if data is bigger than needed | ||
| raise ValueError("data length must be a multiple of sizeof(int16)") | ||
|
|
||
| n = len(data) // ctypes.sizeof(ctypes.c_int16) | ||
| self._data = (ctypes.c_int16 * n).from_buffer_copy(data) | ||
| self._data = data | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🔴 AudioFrame with immutable When Root Cause and ImpactPreviously, The bdata = data.data.cast("b") # read-only memoryview if _data is bytes
req.apm_process_stream.data_ptr = get_address(memoryview(bdata))The This is confirmed by the existing test at Impact: Silent data corruption (processed audio not actually written back) or crashes when using AudioProcessingModule with bytes-backed AudioFrames. The test Prompt for agentsWas this helpful? React with 👍 or 👎 to provide feedback. |
||
|
|
||
| self._sample_rate = sample_rate | ||
| self._num_channels = num_channels | ||
|
|
@@ -97,7 +100,7 @@ def _from_owned_info(owned_info: proto_audio.OwnedAudioFrameBuffer) -> "AudioFra | |
|
|
||
| def _proto_info(self) -> proto_audio.AudioFrameBufferInfo: | ||
| audio_info = proto_audio.AudioFrameBufferInfo() | ||
| audio_info.data_ptr = get_address(memoryview(self._data)) | ||
| audio_info.data_ptr = get_address(self._data) | ||
| audio_info.sample_rate = self.sample_rate | ||
| audio_info.num_channels = self.num_channels | ||
| audio_info.samples_per_channel = self.samples_per_channel | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🔴 get_address loses slice offset for readonly sliced memoryviews via .obj unwrapping
When
get_addressreceives a readonly sliced memoryview,data = data.objon line 47 returns the full underlying buffer, discarding the slice offset. The returned address points to the start of the full buffer, not the start of the slice.Root Cause and Impact
The
get_addressfunction handles readonly memoryviews by falling through todata = data.obj, which returns the original underlying object (e.g., abytesobject). For a sliced readonly memoryview likememoryview(b'hello world')[2:6],.objreturns the fullb'hello world', and the address returned is of the start ofb'hello world', not of index 2.This affects
VideoFrame._proto_info()atlivekit-rtc/livekit/rtc/video_frame.py:113andVideoFrame.get_plane()at line 147, whereget_address(self.data)is called withmemoryview(self._data). If a user constructs aVideoFramewith a sliced readonly memoryview (e.g.,VideoFrame(..., data=memoryview(some_bytes)[offset:])), the FFI layer receives the wrong data pointer, leading to incorrect video frame processing or memory access issues.The writable (non-readonly) path correctly handles slices via
ctypes.c_char.from_buffer(data), which respects the memoryview's buffer protocol including offset.Prompt for agents
Was this helpful? React with 👍 or 👎 to provide feedback.