fix(uploads): resolve hash calculation memory crash and add hashing progress #375
Confusion-ymc wants to merge 2 commits into OpenListTeam:main
Conversation
Pull request overview
This PR fixes a critical memory crash issue when calculating hashes for large files (10GB+) and adds progress reporting for the hashing phase. The changes address the "Array buffer allocation failed" error by introducing periodic yielding of the main thread during hash calculation.
Changes:
- Refactored hash calculation from recursive to iterative, with periodic `setTimeout(0)` calls to allow garbage collection
- Added a progress callback to `calculateHash` to report hashing progress
- Introduced a new "hashing" status with a corresponding UI badge and translation
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/pages/home/uploads/util.ts | Refactored calculateHash to use while loop instead of recursion, added onProgress callback, and introduced setTimeout(0) every 10 iterations to yield control and enable GC |
| src/pages/home/uploads/types.ts | Added "hashing" status to Status union type and corresponding "warning" badge color to StatusBadge mapping |
| src/pages/home/uploads/stream.ts | Updated to set "hashing" status before hash calculation and pass progress callback to calculateHash |
| src/pages/home/uploads/form.ts | Updated to set "hashing" status before hash calculation and pass progress callback to calculateHash |
| src/lang/en/home.json | Added "hashing": "Hashing" translation string for the new status |
This has been changed to use a Worker.
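For context on the Worker approach, here is a minimal sketch of how the main thread might wrap such a hashing worker in a Promise, assuming the `progress`/`result`/`error` message shapes shown later in the diff. `WorkerLike` and `hashWithWorker` are illustrative names, not code from this PR; `WorkerLike` models just the subset of the browser Worker API used here so the wrapper can be exercised without a real worker.

```typescript
// Illustrative sketch, not the PR's actual code.
interface WorkerLike {
  onmessage: ((e: { data: any }) => void) | null
  onerror: ((e: { message?: string }) => void) | null
  postMessage(data: any): void
  terminate(): void
}

function hashWithWorker(
  worker: WorkerLike,
  file: unknown,
  onProgress?: (percent: number) => void,
): Promise<{ md5: string; sha1: string; sha256: string }> {
  return new Promise((resolve, reject) => {
    worker.onmessage = (e) => {
      const msg = e.data
      if (msg.type === "progress") {
        // Forward hashing progress to the caller's callback
        onProgress?.(msg.progress)
      } else if (msg.type === "result") {
        worker.terminate()
        resolve(msg.hash)
      } else if (msg.type === "error") {
        worker.terminate()
        reject(new Error(msg.error))
      }
    }
    worker.onerror = (e) => {
      worker.terminate()
      reject(new Error(e.message || "Hash calculation failed"))
    }
    worker.postMessage({ file })
  })
}
```

In the real code the worker would be created with `new Worker(...)`; the Promise wrapper is what lets the upload code simply `await` the hashes while still receiving progress callbacks.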
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
```ts
while (true) {
  const { done, value } = await reader.read()
  if (done) {
    break
  }
  loaded += value.length
  md5Digest.update(value)
  sha1Digest.update(value)
  sha256Digest.update(value)
  self.postMessage({
    type: "progress",
    progress: (loaded / file.size) * 100,
  })
}
```
The PR description states that setTimeout(0) is used to yield control and allow garbage collection, but this is not implemented in the code. The tight while loop could still cause memory issues for very large files (10GB+) as the browser may not have opportunities to perform garbage collection between chunks. Consider adding periodic yielding, for example:
```ts
// After processing each chunk, periodically yield control.
// Track bytes since the last yield: with arbitrary chunk sizes,
// `loaded` is almost never an exact multiple of 10MB, so a plain
// modulo check would rarely fire.
bytesSinceYield += value.length
if (bytesSinceYield >= 10 * 1024 * 1024) { // roughly every 10MB
  bytesSinceYield = 0
  await new Promise((resolve) => setTimeout(resolve, 0))
}
```

This would match the description and help prevent the "Array buffer allocation failed" errors mentioned in the PR description.
```ts
self.postMessage({
  type: "progress",
  progress: (loaded / file.size) * 100,
})
```
Progress updates are sent on every chunk read, which could result in excessive postMessage calls for large files and potentially impact performance. Consider throttling the progress updates to send them at most once every 100-500ms or every certain number of bytes processed (e.g., every 1MB). This would reduce the message overhead while still providing smooth progress updates.
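One way to implement such throttling is a small predicate that fires when either a minimum interval or a minimum byte delta has passed since the last emitted update. This is a sketch under those assumptions; `makeProgressThrottle` is an illustrative name, not anything in the PR.

```typescript
// Illustrative sketch: returns a predicate deciding whether to post a
// progress message. It fires when minIntervalMs have elapsed OR minBytes
// have been processed since the last update that was actually emitted.
function makeProgressThrottle(minIntervalMs = 200, minBytes = 1024 * 1024) {
  let lastTime = -Infinity // so the first call always fires
  let lastLoaded = 0
  return (loaded: number, now: number): boolean => {
    if (now - lastTime >= minIntervalMs || loaded - lastLoaded >= minBytes) {
      lastTime = now
      lastLoaded = loaded
      return true
    }
    return false
  }
}
```

Inside the read loop the worker would then guard the `postMessage` call with something like `if (shouldEmit(loaded, performance.now()))`, plus one unconditional update after the loop so the bar always reaches 100%.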
```ts
}
worker.onerror = (e) => {
  worker.terminate()
  reject(e)
```
The worker.onerror handler rejects with the error event object directly, which may not provide a helpful error message to the user. Consider extracting a meaningful error message from the error event before rejecting, similar to how it's done in the worker's own error handling. For example: reject(new Error(e.message || 'Hash calculation failed'))
```diff
- reject(e)
+ reject(new Error((e && (e as ErrorEvent).message) || "Hash calculation failed"))
```
```ts
import { createMD5, createSHA1, createSHA256 } from "hash-wasm"

self.onmessage = async (e: MessageEvent<{ file: File }>) => {
  const { file } = e.data
  try {
    const md5Digest = await createMD5()
    const sha1Digest = await createSHA1()
    const sha256Digest = await createSHA256()
    const reader = file.stream().getReader()
    let loaded = 0
    while (true) {
      const { done, value } = await reader.read()
      if (done) {
        break
      }
      loaded += value.length
      md5Digest.update(value)
      sha1Digest.update(value)
      sha256Digest.update(value)
      self.postMessage({
        type: "progress",
        progress: (loaded / file.size) * 100,
      })
    }
    const md5 = md5Digest.digest("hex")
    const sha1 = sha1Digest.digest("hex")
    const sha256 = sha256Digest.digest("hex")
    self.postMessage({
      type: "result",
      hash: { md5, sha1, sha256 },
    })
  } catch (error) {
    self.postMessage({
      type: "error",
      error: error instanceof Error ? error.message : String(error),
    })
  }
}
```
Consider adding TypeScript types for the worker message structure to improve type safety. Define interfaces for the message events exchanged between the main thread and worker, such as:
```ts
interface WorkerProgressMessage {
  type: "progress"
  progress: number
}

interface WorkerResultMessage {
  type: "result"
  hash: { md5: string; sha1: string; sha256: string }
}

interface WorkerErrorMessage {
  type: "error"
  error: string
}

type WorkerMessage =
  | WorkerProgressMessage
  | WorkerResultMessage
  | WorkerErrorMessage
```

This would help catch potential issues at compile time and make the code more maintainable.
Description
Fixed the "Array buffer allocation failed" crash caused by hash calculation on large files (10GB+).
Main improvements:
Motivation and Context
Resolves the page freezes and out-of-memory crashes caused by the main thread being blocked for long periods when processing very large files, improving the success rate and user experience of large-file uploads.
How Has This Been Tested?
Checklist
I have read the CONTRIBUTING documentation.
I have formatted the submitted code with `go fmt` or prettier.
I have added appropriate labels to this PR (if I lack permission or the needed label does not exist, I have noted it in the description for a maintainer to handle later).
I have used the "Request review" feature to ask the relevant code authors for review where appropriate.
I have updated the related repositories accordingly (if applicable).