Implement an advanced fuzzy diffing feature for interdiff#156
Open
kerneltoast wants to merge 18 commits intotwaugh:masterfrom
Open
Implement an advanced fuzzy diffing feature for interdiff#156kerneltoast wants to merge 18 commits intotwaugh:masterfrom
kerneltoast wants to merge 18 commits intotwaugh:masterfrom
Conversation
It's difficult to conditionally add additional arguments to the patch execution in apply_patch() because they are placed within a compound literal array. Make the arguments more extensible by creating a local array and an index variable to place the next argument into the array. This way, it's much easier to change the number of arguments provided at runtime.
Remove the superfluous fseeks and simplify the original file creation process by moving relevant fseeks to come right after the file cursor was last modified.
Coloring the newline character results in the terminal cursor becoming colored when the final line in the interdiff is colored. Fix this by not coloring the newline character.
18418f4 to
5401c9f
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #156 +/- ##
==========================================
- Coverage 86.47% 86.17% -0.31%
==========================================
Files 15 15
Lines 8176 8931 +755
Branches 1643 1851 +208
==========================================
+ Hits 7070 7696 +626
- Misses 1106 1235 +129
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Contributor
Author
|
@twaugh Please take a look at this when you can, thanks! |
When an @@ line isn't immediately after the +++ line in patch2, the next line is checked from the top of the loop which tries to search for a +++ line again, even though the +++ was already found. This results in the +++ not being found again and thus a spurious error that patch2 is empty. Fix this by making the patch2 case loop over the next line until either an @@ is found or the patch is exhausted.
5401c9f to
170ca21
Compare
Contributor
Author
|
All checks are passing now with a lot more test coverage added. |
01efe36 to
60a60b3
Compare
Contributor
Author
|
@twaugh I had updated this PR with several fixes and additional tests last week, and everything is finalized beyond a shadow of a doubt at this point. Would love to land this into patchutils! |
This implements a --fuzzy option to make interdiff perform a fuzzy comparison between two diffs. This is very helpful, for example, for comparing a backport patch to its upstream source patch to assist a human reviewer in verifying the correctness of the backport. The fuzzy diffing process is complex and works by: - Generating a new patch file with hunks split up into smaller hunks to separate out multiple deltas (+/- lines) in a single hunk that are spaced apart by context lines, increasing the amount of deltas that can be applied successfully with fuzz - Applying the rewritten p1 patch to p2's original file, and the rewritten p2 patch to p1's original file; the original files aren't ever merged - Relocating patched hunks in only p1's original file to align with their respective locations in the other file, based on the reported line offset printed out by `patch` for each hunk it successfully applied - Squashing unline gaps fewer than max_context*2 lines between hunks in the patched files, to hide unknown contextual information that is irrelevant for comparing the two diffs while also improving hunk alignment between the two patched files - Diffing the two patched files as usual - Rewriting the hunks in the diff output to exclude unlines from the unified diff, even splitting up hunks to remove unlines present in the middle of a hunk, while also adjusting the @@ line to compensate for the change in line offsets - Emitting the rewritten diff output while interleaving rejected hunks from both p1 and p2 in the output in order by line number, with a comment on the @@ line indicating when an emitted hunk is a rejected hunk This also involves working around some bugs in `patch` itself encountered along the way, such as occasionally inaccurate line offsets printed out and spurious fuzzing in certain cases that involve hunks with an unequal number of pre-context and post-context lines. The end result of all of this is a minimal set of real differences in the context lines of each hunk between the user's provided diffs. Even when fuzzing results in a faulty patch, the context differences are shown so there is never a risk of any real deltas getting hidden due to fuzzing. By default, the fuzz factor used is just the default used in `patch`. The fuzz factor can be adjusted by the user via appending =N to `--fuzzy` to specify the maximum number of context lines for `patch` to fuzz.
Fuzzy interdiffs would show how to go from patch2 to patch1, the opposite of how it should be. Fix it so that the output shows how to go from patch1 to patch2.
split_patch_hunks() fails to filter bogus hunks that contain real context lines mixed with unlines. Such hunks are still bogus because they don't have any delta lines. Additionally, the numbers in the @@ line become nonsense when more than one bogus hunk is filtered. Fix both of these issues affecting the bogus hunk filter logic.
So that patch never tries to prompt and ask for something when running interdiff on an interactive terminal.
Fuzzy mode output is incoherent because context differences show up as going from patch1 -> patch2, while delta differences show up in the opposite direction: patch2 -> patch1. This makes it rather impossible to tell which patch file a +/- line originates from, not to mention that the diff itself is totally nonsense in terms of the actual code changed by the patches. Bring interdiff part of the way there to fixing this issue, by eliminating delta differences from the compared files. Note that this makes fuzzy mode only do context diffing, which is fixed in the subsequent commits.
…ed one
When using --fuzzy mode, interdiff splits patch hunks and applies them
with fuzz to maximize successful application. The parse_fuzzed_hunks()
function parses patch's output to create relocation records that track
where hunks were applied and how much they need to be relocated back to
their original intended positions.
The bug was in how the relocation's `new` field was calculated. The code
was storing:
new = lnum - hunk_offs[hnum - 1]
where `lnum` is the line number where patch applied the hunk, and
`hunk_offs` is the offset introduced by splitting the original hunk into
smaller pieces. This calculation gives the "original hunk's intended
line number" before any splitting occurred.
However, fuzzy_relocate_hunks() later searches for hunks in the patched
file by looking for hunks where `hcurr->nstart == rcurr->new`. The hunks
in the patched file are located at their *actual* applied positions
(i.e., `lnum`), not at the adjusted position (`lnum - hunk_offs`). This
mismatch caused the relocation logic to fail to find the target hunk,
triggering the fatal "failed to relocate hunk" error.
The fix changes the relocation record to store:
new = lnum (actual position in patched file)
off = off + split_off (total offset: patch offset + split offset)
This ensures that:
1. Hunks can be found at their actual positions in the patched file
(matching `hcurr->nstart == rcurr->new` succeeds)
2. When relocating, subtracting `off` correctly moves the hunk back to
its original intended position, accounting for both the patch offset
(where patch chose to apply the hunk) and the split offset (the
difference introduced by hunk splitting)
3. Duplicate detection still works correctly by comparing original
intended positions: `lnum - off - split_off == prev->new - prev->off`
simplifies to comparing the same values as before since prev->off now
also includes the split offset
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
This is a slippery slope and tricky to keep track of to ensure that the file string buffer is sufficiently large. Just use dynamic allocations.
Extract the common diff invocation pattern into a reusable run_diff() helper function. This function: - Takes options string as a parameter (caller builds it) - Builds the diff command with the options and diff_opts - Executes diff via xpipe() - Consumes the --- and +++ header lines - Returns a FILE* positioned at the first @@ line, or NULL if empty - Optionally returns the child pid for the caller to waitpid() Refactor output_delta() to use run_diff() instead of inline diff code. This simplifies the function and prepares for additional diff call sites. Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Refactor fuzzy mode to produce clearly separated output sections that distinguish between actual code changes (delta differences) and surrounding context changes (context differences). This makes it much easier for reviewers to understand backport patches by showing: 1. What behavioral changes differ between the patches 2. What contextual adaptations were made for the target codebase 3. Which hunks couldn't be applied and need manual review DELTA DIFFING: The goal is to show code changes present in one patch but not the other, even when patch2 has hunks that fail to apply due to context mismatch. 1. Construct patch1_orig and patch1_new from patch1 2. Split patch2 and apply to patch1_orig with normal fuzz 3. Generate delta_diff = diff(patch1_new, patch1_orig + patch2) 4. Filter delta hunks that are just the inverse of rejected patch2 hunks (see REJECTION FILTERING below) 5. Filter bogus hunks containing unlines via split_patch_hunks() CONTEXT DIFFING: The goal is to show only context differences with delta changes removed. 1. Create a fresh copy of patch1_orig (delta diffing modified tmpp1) 2. Apply delta_diff in reverse to patch2_orig to remove delta changes 3. Parse fuzz offsets and relocate hunks in patch2_orig 4. Generate ctx_diff = diff(patch1_orig, patch2_orig) 5. Filter bogus hunks containing unlines via split_patch_hunks() REJECTION FILTERING: When patch2 has a hunk that was rejected on patch1_orig, and patch1 makes the same change, the delta diff will show a bogus difference (patch1_new has the change but tmpp1 doesn't). Rather than reverse-applying rejected hunks with GNU patch (which is unreliable when multiple similar code patterns exist), delta_diff is post-processed to filter these out: 1. Parse all delta hunks and rejected hunks, extracting +/- lines and context lines with their distances from the nearest change line. 2. For each rejected hunk, find the best matching delta hunk: the rejected hunk's +lines must appear as a contiguous segment in the delta's -lines (and vice versa). Among matches, pick the delta hunk with the highest context score. The score is the sum of matching context line lengths divided by distance from the nearest change, so lines adjacent to changes are weighted most heavily (similar to how GNU patch prioritizes inner context for fuzzy matching). 3. Each rejected hunk is assigned to at most one delta hunk. A single delta hunk may have multiple rejected hunks assigned (when diff merged adjacent rejected changes into one hunk). 4. A delta hunk is filtered only if ALL its change lines are fully covered by assigned rejected hunks, verified by greedy sequential matching. OUTPUT STRUCTURE: Output is organized into distinct sections with 80-column ASCII banners. Delta differences use = borders with * sides, rejected hunks use # borders with ! sides, and context differences use = borders with * sides. Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
In fuzzy mode, files modified by only one patch were output raw without colorization or section headers. Redirect their output into per-section tmpfile accumulators and display them under new ONLY IN PATCH1 / ONLY IN PATCH2 banners alongside the existing delta/context/rejected sections. Suppress the "reverted:" and "unchanged:" labels in fuzzy mode since the section headers already convey the meaning. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When comparing context differences between two patches, diff hunks can contain changes at the top or bottom edge that are exclusively additions or exclusively deletions. These are not real differences -- they are artifacts of one patch having captured more or fewer context lines than the other around the same code change. For example, if patch2 includes 5 lines of context above a change but patch1 only includes 3, the context diff will show those 2 extra lines as additions at the top of a hunk. This is misleading because the patches make the same change; they just differ in how much surrounding code was captured. Add filter_edge_hunks() to detect and handle these spurious edge lines. For each hunk, the first and last context lines partition the body into three regions: top edge, middle, and bottom edge. Each edge is then classified: - Two-sided (has both additions and deletions): a real change, kept as-is - One-sided (exclusively additions or exclusively deletions): spurious, trimmed from the hunk and the @@ header line counts adjusted If no changes remain after trimming (the entire hunk was spurious edges), the hunk is dropped. If all hunks are dropped, the CONTEXT DIFFERENCES section is suppressed entirely. Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update expected output for fuzzy1-9 tests to reflect the new section-based output format and improved hunk matching. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
60a60b3 to
d90325c
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
This implements a --fuzzy option to make interdiff perform a fuzzy
comparison between two diffs. This is very helpful, for example, for
comparing a backport patch to its upstream source patch to assist a human
reviewer in verifying the correctness of the backport.
The fuzzy diffing process is complex and works by:
separate out multiple deltas (+/- lines) in a single hunk that are spaced
apart by context lines, increasing the amount of deltas that can be
applied successfully with fuzz
p2 patch to p1's original file; the original files aren't ever merged
respective locations in the other file, based on the reported line
offset printed out by
patchfor each hunk it successfully appliedpatched files, to hide unknown contextual information that is irrelevant
for comparing the two diffs while also improving hunk alignment between
the two patched files
unified diff, even splitting up hunks to remove unlines present in the
middle of a hunk, while also adjusting the @@ line to compensate for the
change in line offsets
both p1 and p2 in the output in order by line number, with a comment on
the @@ line indicating when an emitted hunk is a rejected hunk
This also involves working around some bugs in
patchitself encounteredalong the way, such as occasionally inaccurate line offsets printed out and
spurious fuzzing in certain cases that involve hunks with an unequal number
of pre-context and post-context lines.
The end result of all of this is a minimal set of real differences in the
context lines of each hunk between the user's provided diffs. Even when
fuzzing results in a faulty patch, the context differences are shown so
there is never a risk of any real deltas getting hidden due to fuzzing.
By default, the fuzz factor used is just the default used in
patch. Thefuzz factor can be adjusted by the user via appending =N to
--fuzzytospecify the maximum number of context lines for
patchto fuzz.Testing
This was tested on several complex Linux kernel patches to compare the backported version of a patch to its original upstream version. This PR also comes with a few basic fuzzy diffing tests integrated into the test infrastructure.