feat: Add file context tracking to skip redundant re-reads (reduces token usage) #10807

roomote · 2026-01-17T20:47:20Z

This PR attempts to address Issue #10653. Feedback and guidance are welcome.

Summary

Implements file content hashing in the FileContextTracker to detect when files are unchanged and still in the conversation context, allowing ReadFileTool to skip redundant re-reads and save tokens.

Changes

FileContextTrackerTypes.ts

Added content_hash field to FileMetadataEntry schema (optional, nullable string)

FileContextTracker.ts

Added computeContentHash(content: string) static method for MD5 hashing
Added computeFileHash(filePath: string) method to compute hash of file on disk
Added isFileUnchangedInContext(filePath: string) method to check if a file can be skipped
Modified trackFileContext() to accept an optional contentHash parameter
Modified addFileToFileContextTracker() to store/invalidate content hashes
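
For illustration, the two hashing helpers listed above might look roughly like the following. This is a minimal sketch, not the code from this PR: it assumes Node's built-in `node:crypto` and `node:fs/promises` modules and writes the helpers as free functions rather than class methods. MD5 serves here only as a cheap change-detection fingerprint, not as a security measure.

```typescript
import { createHash } from "node:crypto"
import { readFile } from "node:fs/promises"

// Hash a string's UTF-8 bytes with MD5 and return the hex digest.
function computeContentHash(content: string): string {
	return createHash("md5").update(content, "utf8").digest("hex")
}

// Hash a file on disk. Returning undefined on failure is an assumption of
// this sketch: it lets callers treat "no hash" as "cannot skip the read".
async function computeFileHash(filePath: string): Promise<string | undefined> {
	try {
		return computeContentHash(await readFile(filePath, "utf-8"))
	} catch {
		return undefined
	}
}
```

Identical content always yields the same digest, so comparing a stored digest with a freshly computed one is enough to tell whether a file changed between reads.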

ReadFileTool.ts

Added check at the start of file processing to detect unchanged files
Returns a brief "file unchanged" status message instead of re-reading the full file
Computes and stores content hash after reading files successfully
Tracks file context with content hash

Tests

Added comprehensive test suite for new FileContextTracker functionality
Updated existing ReadFileTool tests to mock the new isFileUnchangedInContext method

How it works

When a file is read, its content is hashed with MD5 and stored in the context tracker
On subsequent reads, the tool checks if the file is already in context and unchanged
If unchanged, it returns a brief status message instead of re-reading the entire file
The hash is invalidated when the user edits the file (detected via file watchers)
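
The flow above can be sketched with a small in-memory tracker. Everything here is hypothetical and simplified: `SketchTracker`, the `load` callback (standing in for a disk read), and the status message text are illustrative names, not identifiers from this PR.

```typescript
import { createHash } from "node:crypto"

type ReadResult = { skipped: true; message: string } | { skipped: false; content: string }

class SketchTracker {
	// path -> hash of the content that was last placed in context
	private hashes = new Map<string, string>()

	private static hash(content: string): string {
		return createHash("md5").update(content, "utf8").digest("hex")
	}

	// `load` stands in for reading the file from disk.
	read(path: string, load: () => string): ReadResult {
		const content = load()
		const hash = SketchTracker.hash(content)
		if (this.hashes.get(path) === hash) {
			// Same content is already in context: return a short status instead.
			return { skipped: true, message: `File ${path} is unchanged and already in context.` }
		}
		this.hashes.set(path, hash)
		return { skipped: false, content }
	}

	// Called when a file watcher reports a user edit.
	invalidate(path: string): void {
		this.hashes.delete(path)
	}
}
```

Note that the sketch still reads the file in order to hash it. The saving is in tokens sent to the model, not in disk I/O.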

Benefits

Reduces token usage by avoiding redundant file re-reads
Especially useful for large files that are referenced multiple times
Requires no action from the user; the AI model receives a brief status message in place of the repeated file contents

Fixes #10653

Important

This PR adds file content hashing to FileContextTracker to skip redundant reads in ReadFileTool, reducing token usage.

Behavior:
- FileContextTracker now tracks file content changes using MD5 hashes to skip redundant reads.
- ReadFileTool checks if a file is unchanged before reading, returning a status message if so.
Functions:
- computeContentHash(content: string) in FileContextTracker computes MD5 hash of content.
- isFileUnchangedInContext(filePath: string) checks if a file's content is unchanged.
- trackFileContext() and addFileToFileContextTracker() updated to handle content hashes.
Tests:
- Added tests for FileContextTracker in FileContextTracker.spec.ts.
- Updated readFileTool.spec.ts to mock isFileUnchangedInContext method.

This description was created for ea696ce. It will automatically update as commits are pushed.

- Add content_hash field to FileMetadataEntry schema for tracking file content
- Add isFileUnchangedInContext() method to FileContextTracker to detect unchanged files
- Add computeContentHash() and computeFileHash() methods for MD5 hashing
- Modify trackFileContext() to store content hash on read/edit operations
- Update ReadFileTool to skip re-reading files that are unchanged and in context
- Add comprehensive tests for the new functionality
- Update existing ReadFileTool tests to mock new isFileUnchangedInContext method

This reduces token usage by skipping redundant file re-reads when:

1. The file has an active read entry in context tracking
2. The file has not been edited since the last read
3. The file content hash matches the stored hash

Addresses issue #10653
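
The three skip conditions named in this commit message could be expressed as a single predicate over tracked metadata. The sketch below is an assumption-laden illustration: the `SketchEntry` shape and its field names (`recordState`, `lastReadAt`, `lastEditAt`, `contentHash`) are hypothetical and may not match the real FileMetadataEntry schema.

```typescript
// Hypothetical metadata shape; field names are illustrative only.
interface SketchEntry {
	path: string
	recordState: "active" | "stale"
	lastReadAt: number | null // epoch ms of the last tool read
	lastEditAt: number | null // epoch ms of the last known edit
	contentHash?: string | null
}

function isUnchangedInContext(entries: SketchEntry[], path: string, currentHash: string): boolean {
	// 1. The file must have an active read entry.
	const entry = entries.find((e) => e.path === path && e.recordState === "active" && e.lastReadAt !== null)
	if (!entry || entry.lastReadAt === null) return false
	// 2. The file must not have been edited since that read.
	if (entry.lastEditAt !== null && entry.lastEditAt > entry.lastReadAt) return false
	// 3. The stored hash must exist and match the current content hash.
	return typeof entry.contentHash === "string" && entry.contentHash === currentHash
}
```

Treating a missing or null hash as "changed" keeps the check conservative: entries written before the field existed, or entries from truncated reads that never stored a hash, always trigger a full read.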

roomote · 2026-01-17T20:47:48Z

Rooviewer

All previously flagged issues have been addressed. No new issues found.

ReadFileTool.ts (lines 629-648): Dead code, redundant file read, and incorrect hash storage for truncated files. The code always stores the content hash even when files are truncated, which will incorrectly report files as "unchanged" on subsequent reads.

Previous reviews

3323b97: Review #1


roomote · 2026-01-17T20:50:27Z

src/core/tools/ReadFileTool.ts

+					// Compute content hash for skip-redundant-reads optimization
+					// Only store hash for complete reads (not truncated)
+					let contentHash: string | undefined
+					if (safeReadBudget > 0) {
+						const readResult = await readFileWithTokenBudget(fullPath, { budgetTokens: safeReadBudget })
+						// Re-read to get content for hash (already done above, use same result)
+						// Actually we need to use the result.content from above, so we compute hash here
+						// Note: content variable above is already the numbered content, we need raw
+						// For simplicity, compute hash from the result we already have (pre-addLineNumbers)
+					}
+					// For full reads, compute hash from raw content
+					if (safeReadBudget > 0) {
+						try {
+							const rawContent = await fs.readFile(fullPath, "utf-8")
+							contentHash = FileContextTracker.computeContentHash(rawContent)
+						} catch {
+							// If we can't read for hash, that's ok - just don't store hash
+						}
+					}
+					await task.fileContextTracker.trackFileContext(relPath, "read_tool" as RecordSource, contentHash)


This code has several issues: (1) The first if (safeReadBudget > 0) block is dead code - it calls readFileWithTokenBudget but never uses the result. (2) The second block reads the file again with fs.readFile even though the content was already read earlier via readFileWithTokenBudget. (3) Despite the comment saying "Only store hash for complete reads (not truncated)", the code always stores the hash regardless of whether result.complete is false. This means truncated files will incorrectly report as "unchanged" on subsequent reads, even though only a portion is actually in context.

The fix should use the already-read content and only store the hash when the read was complete:

Suggested change

-					// Compute content hash for skip-redundant-reads optimization
-					// Only store hash for complete reads (not truncated)
-					let contentHash: string | undefined
-					if (safeReadBudget > 0) {
-						const readResult = await readFileWithTokenBudget(fullPath, { budgetTokens: safeReadBudget })
-						// Re-read to get content for hash (already done above, use same result)
-						// Actually we need to use the result.content from above, so we compute hash here
-						// Note: content variable above is already the numbered content, we need raw
-						// For simplicity, compute hash from the result we already have (pre-addLineNumbers)
-					}
-					// For full reads, compute hash from raw content
-					if (safeReadBudget > 0) {
-						try {
-							const rawContent = await fs.readFile(fullPath, "utf-8")
-							contentHash = FileContextTracker.computeContentHash(rawContent)
-						} catch {
-							// If we can't read for hash, that's ok - just don't store hash
-						}
-					}
-					await task.fileContextTracker.trackFileContext(relPath, "read_tool" as RecordSource, contentHash)
+					// Compute content hash for skip-redundant-reads optimization
+					// Only store hash for complete reads (not truncated)
+					let contentHash: string | undefined
+					if (safeReadBudget > 0) {
+						const result = await readFileWithTokenBudget(fullPath, { budgetTokens: safeReadBudget })
+						content = addLineNumbers(result.content)
+						if (result.complete) {
+							// Only store hash for complete reads
+							contentHash = FileContextTracker.computeContentHash(result.content)
+						}
+					}
+					await task.fileContextTracker.trackFileContext(relPath, "read_tool" as RecordSource, contentHash)


@roomote Try the suggested fix, if you're game. I appreciate your attentiveness :)

Fixaroo

Fixed the reported issues. All local checks passed.


@roomote do you have the ability to run any e2e tests that exercise file reads that could validate whether this patch produces the expected token savings?

Fixaroo

Good question! I investigated the e2e test infrastructure to assess token validation capabilities:

Current State:

- The read-file.test.ts e2e tests exist but are currently skipped (suite.skip)
- The test infrastructure does support token tracking via TaskCompleted and TaskTokenUsageUpdated events, which emit TokenUsage data including totalTokensIn, totalTokensOut, and contextTokens

Feasibility:
An e2e test to validate token savings could:

1. Read a file, capture token usage
2. Read the same file again in the same conversation
3. Verify the second read returns the "file unchanged" message
4. Compare token counts - the second read should show significantly fewer tokens
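
The comparison in the last step could be sketched as follows. This is a hypothetical helper, not part of the existing e2e infrastructure: it assumes three TokenUsage snapshots are captured (before any read, after the first read, after the second read) and that contextTokens reflects the size of the conversation context at each point. The `maxRatio` threshold is an arbitrary illustrative value.

```typescript
// Only the three field names come from the TokenUsage data described above;
// the interface itself is a stand-in for the real type.
interface TokenUsageSketch {
	totalTokensIn: number
	totalTokensOut: number
	contextTokens: number
}

// How much the context grew between two snapshots.
function contextGrowth(before: TokenUsageSketch, after: TokenUsageSketch): number {
	return after.contextTokens - before.contextTokens
}

// True when the second read added at most `maxRatio` of what the first read added.
function secondReadSavedTokens(
	baseline: TokenUsageSketch,
	afterFirstRead: TokenUsageSketch,
	afterSecondRead: TokenUsageSketch,
	maxRatio = 0.25,
): boolean {
	const firstCost = contextGrowth(baseline, afterFirstRead)
	const secondCost = contextGrowth(afterFirstRead, afterSecondRead)
	return firstCost > 0 && secondCost <= firstCost * maxRatio
}
```

A real test would also need to account for the model's own output and any other tool calls between snapshots, so a ratio check like this is a coarse signal rather than an exact measurement.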

Limitation:
I cannot run e2e tests in this environment since they require a full VSCode extension host. However, the unit tests in FileContextTracker.spec.ts and readFileTool.spec.ts can be run to verify the hashing and skip logic works correctly. Would running those unit tests be helpful for validating the implementation?

@roomote sure, please run those tests. Thanks!

Fixaroo

All tests passed:

FileContextTracker.spec.ts: 22 tests passed

readFileTool.spec.ts: 52 tests passed

- Remove dead code that called readFileWithTokenBudget but never used result
- Remove redundant fs.readFile call (content was already read earlier)
- Only compute and store contentHash when result.complete is true
- Fixes incorrect hash storage for truncated files

github-project-automation bot moved this to New in Roo Code Roadmap Jan 17, 2026

github-project-automation bot added this to Roo Code Roadmap and Roo Code Roadmap Jan 17, 2026

github-project-automation bot moved this to Triage in Roo Code Roadmap Jan 17, 2026

roomote bot mentioned this pull request Jan 17, 2026

[ENHANCEMENT] Add file context tracking to skip redundant re-reads (reduces token usage) #10653

Open


3 participants
