feat: upgrade existing filesystem by upils · Pull Request #264 · canonical/chisel

upils · 2026-01-29T16:24:00Z

Have you signed the CLA?

This commit enables Chisel detecting the target directory contains the result
of a previous execution to then operate an upgrade of the content. This initial
simple implementation has the the limitation that content (files, symlinks) is
systematically replaced, even if identical. As a consequences:

user modifications on chisel-managed content is overridden.
in the context of a OCI image build, this "new" content is identified as
different and thus the new OCI layer contains duplicated files.

Signed-off-by: Paul Mars <paul.mars@canonical.com>

github-actions · 2026-01-29T16:28:45Z

Command	Mean [s]	Min [s]	Max [s]	Relative
`BASE`	13.036 ± 0.119	12.883	13.243	1.00
`HEAD`	13.106 ± 0.154	12.878	13.427	1.01 ± 0.01

letFunny

First review. I am missing a couple of files but I wanted to publish it as quickly as possible so we have more time to iterate. This is looking great Paul, many things are looking dead simple which is always an extremely good sign. I have written some comments about how to make the PR shorter and how to make it, hopefully, even simpler. Let me know what you think.

I see some of the comments are similar to past discussions we had in upils#1. We spent some effort there, so please make sure all the comments were carried over to this one.

internal/fsutil/create.go

letFunny · 2026-02-04T15:19:27Z

cmd/chisel/cmd_cut.go

 		}
 	}

+	mfest, err := slicer.SelectValidManifest(cmd.RootDir, release)


In general our philosophy is that we should be able to cut a release from main at any point in time. That is, if we ever do a bug fix we should be able to release a point version. This is changing how Chisel works in folders with a manifest and it is not the correct behavior (yet) so I would prefer if we could gate this functionality over a environmental variable or a debug command or any other mechanism.

Good point. See a259d1d

internal/fsutil/move.go

internal/manifestutil/manifestutil_test.go

internal/fsutil/remove.go

letFunny · 2026-02-04T16:07:07Z

internal/slicer/slicer.go

 		targetDir = filepath.Join(dir, targetDir)
 	}

+	var originalTargetDir string


A different way of doing it is to extract what used to be the Run function into another one and have Run call into that with different target dir and then upgrade (see extract.go for Run vs extractData). Do you think it will be clearer this way?

I have no strong opinion for one or the other. It is unclear to me what lead extractData to be split into it's own function. Can you share the reason?

It was already like that when I joined :). IMO it is better to split than to carry more data around and "fake" it bypassing the interface by overriding target and copying original.

It is also a cleaner abstraction as it doesn't need the rest of the code and will make the function shorter (even if marginally).

See fad2cb7

internal/slicer/slicer.go

letFunny

The final set of files. Thanks again Paul for working on this.

tests/recut/task.yaml

letFunny · 2026-02-05T14:31:36Z

tests/recut/task.yaml

+  test -s ${rootfs_folder}/etc/debian_version
+  test -f ${rootfs_folder}/chisel/manifest.wall
+
+  # Remove a file from the first slice


Please add a TODO here saying that this is not expected behavior of the final feature. We should highlight in the test that this is only temporary.

I think what you are trying to test is that the content of pre-installed slices gets updated. What I would do is to change the contents and change the manifest to contain the new hash. The test will then verify that the file was updated. This will work now and in the future as it is highlighting the correct behavior.

Yes I wanted to test content was updated. I reworked the test and adopted a different strategy to what you suggested. This is testing that changes in a previously installed slice are reflected in the updated content. I think this will not change with the final upgrade strategy so this test case should be fine. This spread test can be expanded when the final upgrade strategy is implemented.

letFunny · 2026-02-05T14:36:29Z

internal/slicer/slicer_test.go

 		"/dir/file": "file 0644 cc55e2ec {test-package_third}",
 	},
+}, {
+	summary: "Recut removes obsolete paths when selection shrinks",


I think we are pushing this table too much now. This should really be test for the inner function, i.e. the equivalent of extract_data we discus in the other comments (not saying you should call internal function, I am just talking about conceptually).

I think we need a different test table that verifies re-cut and will take different data. You can have a release (maybe with a default release) and two inputs for the slices installed. The test installation will install version one and then call cut again. Compared to the current version, it is less leaky of the report abstraction as we don't need to build it manually, it is more faithful to what the real execution looks like and it is also far simpler to read and write as we are talking about normal chisel releases file contents, not the intricacies of the report with slice attribution.

I expect the file to change substantially so I will not comment on the rest for now.

Reworked as suggested, this is much simpler and closer to TestRun. The added TestRunRecut test is very close to runSlicerTests. I have some ideas to factorize (embed slicerTest in slicerRecutTest, split runSlicerTests in several test helpers) but that might just tie tests together and make the test harder to read. Let me know what you think.

internal/slicer/slicer.go

letFunny · 2026-02-05T14:42:46Z

internal/slicer/slicer.go

+			defer r.Close()
+			mfest, err := manifest.Read(r)
+			if err != nil {
+				return err


This will not work correctly unless you add a custom error for the invalid schema and verify it here, I remember we discussed this a number of times in the other PR. Is there a test with a manifest with a different version that is valid jsonwall? I may have missed it.

Named error added and handled in SelectValidManifest.

letFunny · 2026-02-05T14:46:23Z

internal/slicer/slicer.go

+			if !ok {
+				schemaManifest[mfest.Schema()] = manifestHash{mfestPath, mfestHash}
+			} else if refMfest.hash != mfestHash {
+				return fmt.Errorf("inconsistent manifests: %q and %q", refMfest.path, mfestPath)
+			}
+
+			if selected == nil || manifestutil.CompareSchemas(mfest.Schema(), selected.Schema()) > 0 {
+				selected = mfest
+			}


This is a bit weird, I would remove this logic until we have a need for it (see my other comment). We need to verify schema matches expected and matches hash (if set). This is trying to predict the future, but let's not anticipate small details about code when not needed.

Also, as a side-node, checking the hash for each version is different from:

Consistency with all other manifests with the same schema is verified so the selection is deterministic.

which I read as, verify all manifests of the selected schema. Is that what we want? (we probably don't want to have this discussion and we want to simplify it further instead).

I simplified it and reworked the comment.

upils · 2026-02-11T14:47:49Z

public/manifest/manifest.go

 	mfestSchema := db.Schema()
 	if mfestSchema != Schema {
-		return nil, fmt.Errorf("unknown schema version %q", mfestSchema)
+		return nil, fmt.Errorf("%w %q", ErrUnknownSchema, mfestSchema)


[Note to reviewer]: It looks a bit off to me but I did not want to break compatibility since this error is part of the public API.

We do not have to maintain compatibility on the message contents, no consumer should depend on that. I know about hyrum law and what not. But we are not Go standard library and we cannot fix error messages forever. If we did that then all errors messages are fixed forever, here and in internal.

There are multiple ways to create named errors in Go, this is one of them. I think you can make it work with a nice message.

letFunny

Thank you for this Paul. Most of the comments are minor which should tell you that the code is almost ready, kudos for that. I have several important comments that I think will affect the correctness of the solution so please read those carefully.

Also be sure to capture our discussions in 1:1 in a comment in Github or something so that we don't lose track. I wrote a comment stating a part we discussed that was not here, I think that was all of it.

cmd/chisel/cmd_cut.go

letFunny · 2026-02-12T10:15:55Z

internal/fsutil/create.go

+		// TODO: Detect if existing content is a file. ErrExist is also returned
+		// if a file exists at this path, so returning nil here creates an
+		// inconsistency between our view of the content and the real content on
+		// disk.


Can you state in a comment in the PR that this is a bug that we found while implementing this and will be fixed in another PR? And can you also add the bug for symlinks I told you about the other day in the MM chat?

Last thing, in the comment you wrote can you say explicitly "bug". This is not an enhancement it is a bug that leads to an inconsistency as you correctly point out.

See 10a1744

internal/manifestutil/manifestutil_test.go

letFunny · 2026-02-12T10:18:09Z

internal/slicer/slicer.go

+		if err != nil {
+			return fmt.Errorf("cannot create temporary working directory: %w", err)
+		}
+		targetDir = tmpWorkDir


Nit: My suggestion is to avoid naming originalTargetDir and targetDir and do installOpts.TargetDir = tmpWorkDir here directly.

See 6cf317c

letFunny · 2026-02-12T10:18:31Z

internal/slicer/slicer.go

-	pkgArchive, err := selectPkgArchives(options.Archives, options.Selection)
+	originalTargetDir := targetDir
+	if options.Manifest != nil {
+		tmpWorkDir, err := os.MkdirTemp(targetDir, "chisel-*")


Is it worth it to name it chisel-temp-* or -workdir-... or something like that?

Yes, adding -workdir will make it extra explicit. Note: this directory will only be visible to user if they inspect the target directory during the execution or Chisel panics. But in the future that might change so it is better to have a good name now.

See 0d1103b

internal/slicer/slicer_test.go

letFunny · 2026-02-12T11:23:06Z

internal/slicer/slicer_test.go

+	},
+	noMatch: true,
+}, {
+	summary: "Unknown schema error ignored",


IMPORTANT: I am not seeing any test about what we discussed this the other day so let's make sure we don't lose the context (I may have missed it). I don't think this is correct. I think this should fail for the reasons I stated the other day:

case 1, we find a valid and an optional invalid manifest. This means we upgraded the format and we produced two versions for compatibility.

case 2, we find only invalid manifests. This means the new versions of Chisel didn't produce backwards compatible manifests. You are ignoring the error, meaning old Chisel will do a cut which is a mistake given that there was a previous one. Imagine when rockcraft updates its chiselled base layer.

case 3, we find no manifests. This is clearly a first cut, proceed normally.

We are losing information about the previous cut by grouping case 2 and case 3 into the same one, and to me they are NOT the same.

Lastly, this gives us the option in the future to either break compatibility or not. In the current code we either force old versions to ignore re-cut or we are compatible. I prefer the error.

I missed that indeed. I agree with the proposed strategy, see 46f872c. I may revisit that with a clearer mind.

letFunny · 2026-02-12T11:26:01Z

public/manifest/manifest.go

 	mfestSchema := db.Schema()
 	if mfestSchema != Schema {
-		return nil, fmt.Errorf("unknown schema version %q", mfestSchema)
+		return nil, fmt.Errorf("%w %q", ErrUnknownSchema, mfestSchema)


We do not have to maintain compatibility on the message contents, no consumer should depend on that. I know about hyrum law and what not. But we are not Go standard library and we cannot fix error messages forever. If we did that then all errors messages are fixed forever, here and in internal.

There are multiple ways to create named errors in Go, this is one of them. I think you can make it work with a nice message.

letFunny · 2026-02-12T11:27:47Z

tests/recut/task.yaml

+  test -s ${ROOTFS}/etc/debian_version
+  test -s ${ROOTFS}/etc/issue
+  cat ${ROOTFS}/etc/foo | grep "qux"
+  test -f ${ROOTFS}/chisel/manifest.wall


We could check qux being included in the manifest if you wanted to. I don't have a strong opinion on how extensive the spread tests should be.

upils added 2 commits January 27, 2026 11:28

feat: search, load and validates manifests

491cd0c

Signed-off-by: Paul Mars <paul.mars@canonical.com>

feat: adapt to revised strategy

a6e57f7

upils added 6 commits January 30, 2026 14:26

feat: implement upgrade

5da0450

fix: deletion

5703751

fix: revert inadvertent change

26bb2b3

fix: check slice collecting error

4a14230

fix: simplify upgrade and improve deletion

865c1b8

refactor: cleaning

0343ddc

upils changed the title ~~feat: update existing filesystem~~ feat: upgrade existing filesystem Feb 2, 2026

refactor: refine upgrade

e116424

upils requested a review from letFunny February 2, 2026 14:35

upils marked this pull request as ready for review February 2, 2026 14:35

upils added 8 commits February 3, 2026 09:31

refactor: fsutil handles moving and removing

5b9d408

refactor: improve consistency

7861c8d

style: fix lint error

27a0d22

tests: Move and Remove

5f377b0

tests: FindPathsInRelease

f74265c

tests: more test cases for Move

799373c

tests: add spread test

dbe3f30

tests: fix recut spread test

becfe42

upils mentioned this pull request Feb 4, 2026

feat: search, load and validates manifests upils/chisel#1

Closed

1 task

upils added 5 commits February 4, 2026 09:44

tests: SelectValidManifest

3c994b8

fix: add missing deps

6ebb893

tests: recut feature

ceac217

fix: avoid duplicates in FindPathsInRelease

7e34e32

ci: rerun

e1d4e17

letFunny suggested changes Feb 4, 2026

View reviewed changes

upils added 2 commits February 5, 2026 08:46

fix: apply PR suggestions

c3db9d4

refactor: simplify FindPaths and FindPathsInRelease

e1e9c43

letFunny suggested changes Feb 5, 2026

View reviewed changes

upils added 13 commits February 6, 2026 14:20

refactor: apply PR suggestions

7c17f18

refactor: revert error message change

7caf3ba

fix: gate recut feature behind env var

a259d1d

refactor: split install from Run

fad2cb7

tests: simplify recut spread test

d095953

tests: rework recut spread tests

d0b70cd

refactor: revert now useless changes

83e7af6

refactor: simplify

d217c43

refactor: more cleaning

8a7066b

test: wip rework uprade tests

87d1496

fix: ignore unknown manifest schema error

f8e2bd6

feat: improve upgrade and add tests

6dd701c

tests: simplify TestRunRecut

8523aa5

upils commented Feb 11, 2026

View reviewed changes

upils requested a review from letFunny February 11, 2026 14:49

letFunny suggested changes Feb 12, 2026

View reviewed changes

upils added 13 commits February 12, 2026 13:05

fix: rename recut env var

2ceb21a

docs: add TODO on other file creation bug

10a1744

refactor: simplify targetDir handling in Run

6cf317c

fix: improve workdir name

0d1103b

fix: clarify intent to remove existing content

78e3f51

fix: apply various PR suggestions

d14c056

fix: apply PR suggestions

4217131

fix: apply PR suggestions

a902099

test: improve regex precision

08b3edd

refactor: extract and simplify mkParentAll

cf90f0d

fix: Error when only invalid manifests found

46f872c

fix: lint

2052ee5

fix: wrap add context on upgrade errors

0fbc47a

Conversation

upils commented Jan 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Jan 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

letFunny left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

letFunny left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

letFunny left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

upils commented Jan 29, 2026 •

edited

Loading

github-actions bot commented Jan 29, 2026 •

edited

Loading