Conversation
Signed-off-by: Paul Mars <paul.mars@canonical.com>
|
letFunny
left a comment
There was a problem hiding this comment.
First review. I am missing a couple of files but I wanted to publish it as quickly as possible so we have more time to iterate. This is looking great Paul, many things are looking dead simple which is always an extremely good sign. I have written some comments about how to make the PR shorter and how to make it, hopefully, even simpler. Let me know what you think.
I see some of the comments are similar to past discussions we had in upils#1. We spent some effort there, so please make sure all the comments were carried over to this one.
cmd/chisel/cmd_cut.go
Outdated
| } | ||
| } | ||
|
|
||
| mfest, err := slicer.SelectValidManifest(cmd.RootDir, release) |
There was a problem hiding this comment.
In general our philosophy is that we should be able to cut a release from main at any point in time. That is, if we ever do a bug fix we should be able to release a point version. This is changing how Chisel works in folders with a manifest and it is not the correct behavior (yet) so I would prefer if we could gate this functionality over a environmental variable or a debug command or any other mechanism.
internal/slicer/slicer.go
Outdated
| targetDir = filepath.Join(dir, targetDir) | ||
| } | ||
|
|
||
| var originalTargetDir string |
There was a problem hiding this comment.
A different way of doing it is to extract what used to be the Run function into another one and have Run call into that with different target dir and then upgrade (see extract.go for Run vs extractData). Do you think it will be clearer this way?
There was a problem hiding this comment.
I have no strong opinion for one or the other. It is unclear to me what lead extractData to be split into it's own function. Can you share the reason?
There was a problem hiding this comment.
It was already like that when I joined :). IMO it is better to split than to carry more data around and "fake" it bypassing the interface by overriding target and copying original.
It is also a cleaner abstraction as it doesn't need the rest of the code and will make the function shorter (even if marginally).
letFunny
left a comment
There was a problem hiding this comment.
The final set of files. Thanks again Paul for working on this.
tests/recut/task.yaml
Outdated
| test -s ${rootfs_folder}/etc/debian_version | ||
| test -f ${rootfs_folder}/chisel/manifest.wall | ||
|
|
||
| # Remove a file from the first slice |
There was a problem hiding this comment.
Please add a TODO here saying that this is not expected behavior of the final feature. We should highlight in the test that this is only temporary.
I think what you are trying to test is that the content of pre-installed slices gets updated. What I would do is to change the contents and change the manifest to contain the new hash. The test will then verify that the file was updated. This will work now and in the future as it is highlighting the correct behavior.
There was a problem hiding this comment.
Yes I wanted to test content was updated. I reworked the test and adopted a different strategy to what you suggested. This is testing that changes in a previously installed slice are reflected in the updated content. I think this will not change with the final upgrade strategy so this test case should be fine. This spread test can be expanded when the final upgrade strategy is implemented.
internal/slicer/slicer_test.go
Outdated
| "/dir/file": "file 0644 cc55e2ec {test-package_third}", | ||
| }, | ||
| }, { | ||
| summary: "Recut removes obsolete paths when selection shrinks", |
There was a problem hiding this comment.
I think we are pushing this table too much now. This should really be test for the inner function, i.e. the equivalent of extract_data we discus in the other comments (not saying you should call internal function, I am just talking about conceptually).
I think we need a different test table that verifies re-cut and will take different data. You can have a release (maybe with a default release) and two inputs for the slices installed. The test installation will install version one and then call cut again. Compared to the current version, it is less leaky of the report abstraction as we don't need to build it manually, it is more faithful to what the real execution looks like and it is also far simpler to read and write as we are talking about normal chisel releases file contents, not the intricacies of the report with slice attribution.
There was a problem hiding this comment.
I expect the file to change substantially so I will not comment on the rest for now.
There was a problem hiding this comment.
Reworked as suggested, this is much simpler and closer to TestRun. The added TestRunRecut test is very close to runSlicerTests. I have some ideas to factorize (embed slicerTest in slicerRecutTest, split runSlicerTests in several test helpers) but that might just tie tests together and make the test harder to read. Let me know what you think.
internal/slicer/slicer.go
Outdated
| defer r.Close() | ||
| mfest, err := manifest.Read(r) | ||
| if err != nil { | ||
| return err |
There was a problem hiding this comment.
This will not work correctly unless you add a custom error for the invalid schema and verify it here, I remember we discussed this a number of times in the other PR. Is there a test with a manifest with a different version that is valid jsonwall? I may have missed it.
There was a problem hiding this comment.
Named error added and handled in SelectValidManifest.
internal/slicer/slicer.go
Outdated
| if !ok { | ||
| schemaManifest[mfest.Schema()] = manifestHash{mfestPath, mfestHash} | ||
| } else if refMfest.hash != mfestHash { | ||
| return fmt.Errorf("inconsistent manifests: %q and %q", refMfest.path, mfestPath) | ||
| } | ||
|
|
||
| if selected == nil || manifestutil.CompareSchemas(mfest.Schema(), selected.Schema()) > 0 { | ||
| selected = mfest | ||
| } |
There was a problem hiding this comment.
This is a bit weird, I would remove this logic until we have a need for it (see my other comment). We need to verify schema matches expected and matches hash (if set). This is trying to predict the future, but let's not anticipate small details about code when not needed.
Also, as a side-node, checking the hash for each version is different from:
Consistency with all other manifests with the same schema is verified so the selection is deterministic.
which I read as, verify all manifests of the selected schema. Is that what we want? (we probably don't want to have this discussion and we want to simplify it further instead).
There was a problem hiding this comment.
I simplified it and reworked the comment.
| mfestSchema := db.Schema() | ||
| if mfestSchema != Schema { | ||
| return nil, fmt.Errorf("unknown schema version %q", mfestSchema) | ||
| return nil, fmt.Errorf("%w %q", ErrUnknownSchema, mfestSchema) |
There was a problem hiding this comment.
[Note to reviewer]: It looks a bit off to me but I did not want to break compatibility since this error is part of the public API.
There was a problem hiding this comment.
We do not have to maintain compatibility on the message contents, no consumer should depend on that. I know about hyrum law and what not. But we are not Go standard library and we cannot fix error messages forever. If we did that then all errors messages are fixed forever, here and in internal.
There are multiple ways to create named errors in Go, this is one of them. I think you can make it work with a nice message.
letFunny
left a comment
There was a problem hiding this comment.
Thank you for this Paul. Most of the comments are minor which should tell you that the code is almost ready, kudos for that. I have several important comments that I think will affect the correctness of the solution so please read those carefully.
Also be sure to capture our discussions in 1:1 in a comment in Github or something so that we don't lose track. I wrote a comment stating a part we discussed that was not here, I think that was all of it.
internal/fsutil/create.go
Outdated
| // TODO: Detect if existing content is a file. ErrExist is also returned | ||
| // if a file exists at this path, so returning nil here creates an | ||
| // inconsistency between our view of the content and the real content on | ||
| // disk. |
There was a problem hiding this comment.
Can you state in a comment in the PR that this is a bug that we found while implementing this and will be fixed in another PR? And can you also add the bug for symlinks I told you about the other day in the MM chat?
Last thing, in the comment you wrote can you say explicitly "bug". This is not an enhancement it is a bug that leads to an inconsistency as you correctly point out.
internal/slicer/slicer.go
Outdated
| if err != nil { | ||
| return fmt.Errorf("cannot create temporary working directory: %w", err) | ||
| } | ||
| targetDir = tmpWorkDir |
There was a problem hiding this comment.
Nit: My suggestion is to avoid naming originalTargetDir and targetDir and do installOpts.TargetDir = tmpWorkDir here directly.
internal/slicer/slicer.go
Outdated
| pkgArchive, err := selectPkgArchives(options.Archives, options.Selection) | ||
| originalTargetDir := targetDir | ||
| if options.Manifest != nil { | ||
| tmpWorkDir, err := os.MkdirTemp(targetDir, "chisel-*") |
There was a problem hiding this comment.
Is it worth it to name it chisel-temp-* or -workdir-... or something like that?
There was a problem hiding this comment.
Yes, adding -workdir will make it extra explicit. Note: this directory will only be visible to user if they inspect the target directory during the execution or Chisel panics. But in the future that might change so it is better to have a good name now.
internal/slicer/slicer_test.go
Outdated
| }, | ||
| noMatch: true, | ||
| }, { | ||
| summary: "Unknown schema error ignored", |
There was a problem hiding this comment.
IMPORTANT: I am not seeing any test about what we discussed this the other day so let's make sure we don't lose the context (I may have missed it). I don't think this is correct. I think this should fail for the reasons I stated the other day:
- case 1, we find a valid and an optional invalid manifest. This means we upgraded the format and we produced two versions for compatibility.
- case 2, we find only invalid manifests. This means the new versions of Chisel didn't produce backwards compatible manifests. You are ignoring the error, meaning old Chisel will do a cut which is a mistake given that there was a previous one. Imagine when rockcraft updates its chiselled base layer.
- case 3, we find no manifests. This is clearly a first cut, proceed normally.
We are losing information about the previous cut by grouping case 2 and case 3 into the same one, and to me they are NOT the same.
Lastly, this gives us the option in the future to either break compatibility or not. In the current code we either force old versions to ignore re-cut or we are compatible. I prefer the error.
There was a problem hiding this comment.
I missed that indeed. I agree with the proposed strategy, see 46f872c. I may revisit that with a clearer mind.
| mfestSchema := db.Schema() | ||
| if mfestSchema != Schema { | ||
| return nil, fmt.Errorf("unknown schema version %q", mfestSchema) | ||
| return nil, fmt.Errorf("%w %q", ErrUnknownSchema, mfestSchema) |
There was a problem hiding this comment.
We do not have to maintain compatibility on the message contents, no consumer should depend on that. I know about hyrum law and what not. But we are not Go standard library and we cannot fix error messages forever. If we did that then all errors messages are fixed forever, here and in internal.
There are multiple ways to create named errors in Go, this is one of them. I think you can make it work with a nice message.
| test -s ${ROOTFS}/etc/debian_version | ||
| test -s ${ROOTFS}/etc/issue | ||
| cat ${ROOTFS}/etc/foo | grep "qux" | ||
| test -f ${ROOTFS}/chisel/manifest.wall |
There was a problem hiding this comment.
We could check qux being included in the manifest if you wanted to. I don't have a strong opinion on how extensive the spread tests should be.
This commit enables Chisel detecting the target directory contains the result
of a previous execution to then operate an upgrade of the content. This initial
simple implementation has the the limitation that content (files, symlinks) is
systematically replaced, even if identical. As a consequences:
different and thus the new OCI layer contains duplicated files.