Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Manifest should not be added to index.json #664

Closed
eiffel-fl opened this issue Jan 2, 2024 · 15 comments
Closed

Manifest should not be added to index.json #664

eiffel-fl opened this issue Jan 2, 2024 · 15 comments
Assignees
Milestone

Comments

@eiffel-fl
Copy link
Contributor

Hi!

Manifests are added to index.json:

{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.index.v1+json",
      "digest": "sha256:b0eff7f0ec2a27b8d71fe904cf373fa2002c6d669db54c25a9542cf2cb5f46d1",
      "size": 491,
      "annotations": {
        "org.opencontainers.image.ref.name": "docker.io/library/execs:latest"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:662ca58536d5a03864781eb42870f6712927981383f1dab1449d9adf80067313",
      "size": 581,
      "platform": {
        "architecture": "amd64",
        "os": "linux"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:f61eef5ff800b2e4366c6d2e7311e95ce8dc3cff5c020a3a810429656fd93efb",
      "size": 581,
      "platform": {
        "architecture": "arm64",
        "os": "linux"
      }
    }
  ]
}

This behavior can create troubles, as when the manifests in index.json will not be automatically deleted when the surrounding index is deleted:

francis@pwmachine:~/Codes/kinvolk/inspektor-gadget$ sudo ls /var/lib/ig/oci-store/blobs/sha256                                 main % u=
225a8e697e260684d9451ccf8a622cd5548b0ae5fac6e13c416cd1d85247b7b7  b4be31f07444fb9be6dee8cd1349aab3c780d333b63406be10c5c14c1b39dd37
662ca58536d5a03864781eb42870f6712927981383f1dab1449d9adf80067313  c06a86dad7e582686b53bec9c729b92fc56c838b59cc11d0105c2c20f3158dcd
b0eff7f0ec2a27b8d71fe904cf373fa2002c6d669db54c25a9542cf2cb5f46d1  f61eef5ff800b2e4366c6d2e7311e95ce8dc3cff5c020a3a810429656fd93efb
francis@pwmachine:~/Codes/kinvolk/inspektor-gadget$ sudo -E ./ig image remove execs                                            main % u=
INFO[0000] Experimental features enabled                
Successfully removed execs
francis@pwmachine:~/Codes/kinvolk/inspektor-gadget$ sudo ls /var/lib/ig/oci-store/blobs/sha256                                 main % u=
225a8e697e260684d9451ccf8a622cd5548b0ae5fac6e13c416cd1d85247b7b7  c06a86dad7e582686b53bec9c729b92fc56c838b59cc11d0105c2c20f3158dcd
662ca58536d5a03864781eb42870f6712927981383f1dab1449d9adf80067313  f61eef5ff800b2e4366c6d2e7311e95ce8dc3cff5c020a3a810429656fd93efb
b4be31f07444fb9be6dee8cd1349aab3c780d333b63406be10c5c14c1b39dd37
francis@pwmachine:~/Codes/kinvolk/inspektor-gadget$ sudo cat /var/lib/ig/oci-store/index.json | jq                             main % u=
{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:f61eef5ff800b2e4366c6d2e7311e95ce8dc3cff5c020a3a810429656fd93efb",
      "size": 581,
      "platform": {
        "architecture": "arm64",
        "os": "linux"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:662ca58536d5a03864781eb42870f6712927981383f1dab1449d9adf80067313",
      "size": 581,
      "platform": {
        "architecture": "amd64",
        "os": "linux"
      }
    }
  ]
}

Sadly, this behavior is not straightforward to modify.
So, I open this issue to discuss a proper design.

Best regards.

@eiffel-fl
Copy link
Contributor Author

eiffel-fl commented Jan 3, 2024

Hi!

I took a look at this and was able to avoid storing the manifests in index.json with the following patch:

diff --git a/content/oci/oci.go b/content/oci/oci.go
index 5c584f8..997543c 100644
--- a/content/oci/oci.go
+++ b/content/oci/oci.go
@@ -143,10 +143,6 @@ func (s *Store) Push(ctx context.Context, expected ocispec.Descriptor, reader io
        if err := s.graph.Index(ctx, s.storage, expected); err != nil {
                return err
        }
-       if descriptor.IsManifest(expected) {
-               // tag by digest
-               return s.tag(ctx, expected, expected.Digest.String())
-       }
        return nil
 }
$ sudo -E ./cmd/ig/ig image build -t execs -f gadgets/trace_exec/gadget.yaml gadgets/trace_exec
INFO[0000] Experimental features enabled                
Successfully built docker.io/library/execs:latest@sha256:b0eff7f0ec2a27b8d71fe904cf373fa2002c6d669db54c25a9542cf2cb5f46d1
$ sudo cat /var/lib/ig/oci-store/index.json | jq                            main *% u=
{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.index.v1+json",
      "digest": "sha256:b0eff7f0ec2a27b8d71fe904cf373fa2002c6d669db54c25a9542cf2cb5f46d1",
      "size": 491,
      "annotations": {
        "org.opencontainers.image.ref.name": "docker.io/library/execs:latest"
      }
    }
  ]
}

The code I removed with the above patch was added in 4e58192 which description is the following:

refactor: refactor OCI store to fully support Predecessors() and Resolve() (#385)

  1. Support discovering predecessors that are not tagged
  2. Support resolving a manifest by its digest
  3. Support resolving a blob by its digest

Resolves: #34

So, I guess by removing the above code it breaks resolving a manifest by its digest?
Meanwhile everything is OK with the above patch regarding tests:

root@8db5c3d638c6:/mnt/oras-go# make test
...
coverage: 78.5% of statements
ok      oras.land/oras-go/v2/registry/remote/retry      9.762s  coverage: 78.5% of statements

Moreover, applying the above patch on top of faaa1dd and modifying few stuff on Inspektor Gadget side, I can have everything removed while removing an image:

$ sudo -E ./cmd/ig/ig image build -t execs -f gadgets/trace_exec/gadget.yaml gadgets/trace_exec
INFO[0000] Experimental features enabled                
Successfully built docker.io/library/execs:latest@sha256:b0eff7f0ec2a27b8d71fe904cf373fa2002c6d669db54c25a9542cf2cb5f46d1
$ sudo ls /var/lib/ig/oci-store/blobs/sha256                  francis/rmi-upstream *+%
225a8e697e260684d9451ccf8a622cd5548b0ae5fac6e13c416cd1d85247b7b7  b4be31f07444fb9be6dee8cd1349aab3c780d333b63406be10c5c14c1b39dd37
662ca58536d5a03864781eb42870f6712927981383f1dab1449d9adf80067313  c06a86dad7e582686b53bec9c729b92fc56c838b59cc11d0105c2c20f3158dcd
b0eff7f0ec2a27b8d71fe904cf373fa2002c6d669db54c25a9542cf2cb5f46d1  f61eef5ff800b2e4366c6d2e7311e95ce8dc3cff5c020a3a810429656fd93efb
$ sudo cat /var/lib/ig/oci-store/index.json | jq              francis/rmi-upstream *+%
{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.index.v1+json",
      "digest": "sha256:b0eff7f0ec2a27b8d71fe904cf373fa2002c6d669db54c25a9542cf2cb5f46d1",
      "size": 491,
      "annotations": {
        "org.opencontainers.image.ref.name": "docker.io/library/execs:latest"
      }
    }
  ]
}
$ sudo -E ./cmd/ig/ig run execs                               francis/rmi-upstream *+%
INFO[0000] Experimental features enabled                
RUNTIME.CONTAINERNAME                PID                 PPID                UID                 GID                 RE… COMM            
builder                              48390               28015               0                   0                   0   ls              
^C%                                                                                                                                      $ sudo -E ./cmd/ig/ig image remove execs                      francis/rmi-upstream *+%
INFO[0000] Experimental features enabled                
Successfully removed execs
$ sudo ls /var/lib/ig/oci-store/blobs/sha256                  francis/rmi-upstream *+%
$ sudo cat /var/lib/ig/oci-store/index.json | jq              francis/rmi-upstream *+%
{
  "schemaVersion": 2,
  "manifests": null
}

Am I missing something, or the above patch can really change this behavior without breaking everything?
Your insights on this would be appreciated.

Best regards.

@Wwwsylvia
Copy link
Member

Wwwsylvia commented Jan 4, 2024

So, I guess by removing the above code it breaks resolving a manifest by its digest?

Yes, this is true. Since #385, we record every manifest in index.json in order to support:

  1. Resolving a manifest by its digest
  2. Tracking untagged referrers manifests (these referrers manifests would be unreachable if they are not recorded in index.json)

So if you remove the code like that the above scenarios would be broken.

Meanwhile everything is OK with the above patch regarding tests:

Mmm that shouldn't happen🤔 Can you try again? The diff you made will fail several cases in oci_test.go.

go test ./content/oci
--- FAIL: TestStore_Success (0.00s)
    oci_test.go:144: resolver.Map() = 0, want 1
    oci_test.go:162: Store.Resolve() = {application/octet-stream sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }, want {application/vnd.oci.image.manifest.v1+json sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }
--- FAIL: TestStore_RelativeRoot_Success (0.01s)
    oci_test.go:273: resolver.Map() = 0, want 1
    oci_test.go:291: Store.Resolve() = {application/octet-stream sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }, want {application/vnd.oci.image.manifest.v1+json sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }
--- FAIL: TestStore_DisableAutoSaveIndex (0.00s)
    oci_test.go:614: resolver.Map() = 0, want 1
    oci_test.go:623: Store.Resolve() = {application/octet-stream sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }, want {application/vnd.oci.image.manifest.v1+json sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }
--- FAIL: TestStore_RepeatTag (0.00s)
    oci_test.go:703: len(resolver.Map()) = 0, want 1
    oci_test.go:706: len(index.Manifests) = 0, want 1
    oci_test.go:744: resolver.Map() = 2, want 3
    oci_test.go:747: len(index.Manifests) = 1, want 2
--- FAIL: TestStore_TagByDigest (0.00s)
    oci_test.go:881: len(resolver.Map()) = 0, want 1
    oci_test.go:884: len(index.Manifests) = 0, want 1
    oci_test.go:891: Store.Resolve() = {application/octet-stream sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }, want {application/vnd.oci.image.manifest.v1+json sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }
--- FAIL: TestStore_ExistingStore (0.01s)
    oci_test.go:1217: Store.Resolve() = {application/octet-stream sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }, want {application/vnd.oci.artifact.manifest.v1+json sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }
    oci_test.go:1289: Store.Predecessors(6) = [{application/vnd.oci.image.index.v1+json sha256:87a9ec2f8d3a785cdd39d52df3190bc989be8c526050a283636f3c5304b2ddc0 186 [] map[] [] <nil> }], want [{application/vnd.oci.image.index.v1+json sha256:87a9ec2f8d3a785cdd39d52df3190bc989be8c526050a283636f3c5304b2ddc0 186 [] map[] [] <nil> } {application/vnd.oci.artifact.manifest.v1+json sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }]
    oci_test.go:1289: Store.Predecessors(10) = [], want [{application/vnd.oci.artifact.manifest.v1+json sha256:232a29e9b955c6cb5b3424f50d2d84bb4f5cabffffefb4e82a491a03747c4c98 351 [] map[] [] <nil> }]
    oci_test.go:1289: Store.Predecessors(11) = [], want [{application/vnd.oci.artifact.manifest.v1+json sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }]
    oci_test.go:1289: Store.Predecessors(13) = [], want [{application/vnd.oci.artifact.manifest.v1+json sha256:232a29e9b955c6cb5b3424f50d2d84bb4f5cabffffefb4e82a491a03747c4c98 351 [] map[] [] <nil> }]
--- FAIL: TestReadOnlyStore_DirFS (0.01s)
    readonlyoci_test.go:350: ReadOnlyStore.Resolve() = {application/octet-stream sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }, want {application/vnd.oci.artifact.manifest.v1+json sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }
    readonlyoci_test.go:413: ReadOnlyStore.Predecessors(6) = [{application/vnd.oci.image.index.v1+json sha256:87a9ec2f8d3a785cdd39d52df3190bc989be8c526050a283636f3c5304b2ddc0 186 [] map[] [] <nil> }], want [{application/vnd.oci.image.index.v1+json sha256:87a9ec2f8d3a785cdd39d52df3190bc989be8c526050a283636f3c5304b2ddc0 186 [] map[] [] <nil> } {application/vnd.oci.artifact.manifest.v1+json sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }]
    readonlyoci_test.go:413: ReadOnlyStore.Predecessors(10) = [], want [{application/vnd.oci.artifact.manifest.v1+json sha256:232a29e9b955c6cb5b3424f50d2d84bb4f5cabffffefb4e82a491a03747c4c98 351 [] map[] [] <nil> }]
    readonlyoci_test.go:413: ReadOnlyStore.Predecessors(11) = [], want [{application/vnd.oci.artifact.manifest.v1+json sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }]
    readonlyoci_test.go:413: ReadOnlyStore.Predecessors(13) = [], want [{application/vnd.oci.artifact.manifest.v1+json sha256:232a29e9b955c6cb5b3424f50d2d84bb4f5cabffffefb4e82a491a03747c4c98 351 [] map[] [] <nil> }]
FAIL
FAIL    oras.land/oras-go/v2/content/oci        0.249s
FAIL

@eiffel-fl
Copy link
Contributor Author

Hi!

So, I guess by removing the above code it breaks resolving a manifest by its digest?

Yes, this is true. Since #385, we record every manifest in index.json in order to support:

1. Resolving a manifest by its digest

2. Tracking untagged referrers manifests (these referrers manifests would be unreachable if they are not recorded in `index.json`)

This makes sense!
For 1., I did not take a look at the code deeply, but could we replicate what you have done to resolve layers by their digests but for manifests?
Regarding 2., do we need it now what we have garbage collection?

So if you remove the code like that the above scenarios would be broken.

Meanwhile everything is OK with the above patch regarding tests:

Mmm that shouldn't happen🤔 Can you try again? The diff you made will fail several cases in oci_test.go.

go test ./content/oci
--- FAIL: TestStore_Success (0.00s)
    oci_test.go:144: resolver.Map() = 0, want 1
    oci_test.go:162: Store.Resolve() = {application/octet-stream sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }, want {application/vnd.oci.image.manifest.v1+json sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }
--- FAIL: TestStore_RelativeRoot_Success (0.01s)
    oci_test.go:273: resolver.Map() = 0, want 1
    oci_test.go:291: Store.Resolve() = {application/octet-stream sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }, want {application/vnd.oci.image.manifest.v1+json sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }
--- FAIL: TestStore_DisableAutoSaveIndex (0.00s)
    oci_test.go:614: resolver.Map() = 0, want 1
    oci_test.go:623: Store.Resolve() = {application/octet-stream sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }, want {application/vnd.oci.image.manifest.v1+json sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }
--- FAIL: TestStore_RepeatTag (0.00s)
    oci_test.go:703: len(resolver.Map()) = 0, want 1
    oci_test.go:706: len(index.Manifests) = 0, want 1
    oci_test.go:744: resolver.Map() = 2, want 3
    oci_test.go:747: len(index.Manifests) = 1, want 2
--- FAIL: TestStore_TagByDigest (0.00s)
    oci_test.go:881: len(resolver.Map()) = 0, want 1
    oci_test.go:884: len(index.Manifests) = 0, want 1
    oci_test.go:891: Store.Resolve() = {application/octet-stream sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }, want {application/vnd.oci.image.manifest.v1+json sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }
--- FAIL: TestStore_ExistingStore (0.01s)
    oci_test.go:1217: Store.Resolve() = {application/octet-stream sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }, want {application/vnd.oci.artifact.manifest.v1+json sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }
    oci_test.go:1289: Store.Predecessors(6) = [{application/vnd.oci.image.index.v1+json sha256:87a9ec2f8d3a785cdd39d52df3190bc989be8c526050a283636f3c5304b2ddc0 186 [] map[] [] <nil> }], want [{application/vnd.oci.image.index.v1+json sha256:87a9ec2f8d3a785cdd39d52df3190bc989be8c526050a283636f3c5304b2ddc0 186 [] map[] [] <nil> } {application/vnd.oci.artifact.manifest.v1+json sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }]
    oci_test.go:1289: Store.Predecessors(10) = [], want [{application/vnd.oci.artifact.manifest.v1+json sha256:232a29e9b955c6cb5b3424f50d2d84bb4f5cabffffefb4e82a491a03747c4c98 351 [] map[] [] <nil> }]
    oci_test.go:1289: Store.Predecessors(11) = [], want [{application/vnd.oci.artifact.manifest.v1+json sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }]
    oci_test.go:1289: Store.Predecessors(13) = [], want [{application/vnd.oci.artifact.manifest.v1+json sha256:232a29e9b955c6cb5b3424f50d2d84bb4f5cabffffefb4e82a491a03747c4c98 351 [] map[] [] <nil> }]
--- FAIL: TestReadOnlyStore_DirFS (0.01s)
    readonlyoci_test.go:350: ReadOnlyStore.Resolve() = {application/octet-stream sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }, want {application/vnd.oci.artifact.manifest.v1+json sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }
    readonlyoci_test.go:413: ReadOnlyStore.Predecessors(6) = [{application/vnd.oci.image.index.v1+json sha256:87a9ec2f8d3a785cdd39d52df3190bc989be8c526050a283636f3c5304b2ddc0 186 [] map[] [] <nil> }], want [{application/vnd.oci.image.index.v1+json sha256:87a9ec2f8d3a785cdd39d52df3190bc989be8c526050a283636f3c5304b2ddc0 186 [] map[] [] <nil> } {application/vnd.oci.artifact.manifest.v1+json sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }]
    readonlyoci_test.go:413: ReadOnlyStore.Predecessors(10) = [], want [{application/vnd.oci.artifact.manifest.v1+json sha256:232a29e9b955c6cb5b3424f50d2d84bb4f5cabffffefb4e82a491a03747c4c98 351 [] map[] [] <nil> }]
    readonlyoci_test.go:413: ReadOnlyStore.Predecessors(11) = [], want [{application/vnd.oci.artifact.manifest.v1+json sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }]
    readonlyoci_test.go:413: ReadOnlyStore.Predecessors(13) = [], want [{application/vnd.oci.artifact.manifest.v1+json sha256:232a29e9b955c6cb5b3424f50d2d84bb4f5cabffffefb4e82a491a03747c4c98 351 [] map[] [] <nil> }]
FAIL
FAIL    oras.land/oras-go/v2/content/oci        0.249s
FAIL

Indeed, I got mixed up and I guess I tested without the patch applied 😅:

root@c808884f4989:/mnt# make test
...
ok      oras.land/oras-go/v2/registry/remote/retry      9.786s  coverage: 78.5% of statements
FAIL
make: *** [Makefile:16: test] Error 1

Best regards.

@eiffel-fl
Copy link
Contributor Author

I tweaked a bit the code base to be able to reference manifest by their digest, the code can be found here:
https://github.com/eiffel-fl/oras-go/commits/francis/no-manifests-index/
I am sadly still getting troubles with tests on Predecessors() as, for the moment, I do not understand the link between resolver.Tag() and graph.Predecessors().
I will continue to dig and follow-up with my findings here.

@eiffel-fl
Copy link
Contributor Author

Hi!

I continued my investigation and now I understand why you need to store the manifests in index.json.

I will take the example of TestReadOnlyStore_DirFS.
At first, everything lives in memory and you creates the graph as the same time as you Push(), since Push() calls Index() which creates the graph.
If you tweak the tag(), which is called by Push() for manifest, to remove writing manifests to index.json, with, for example, the following patch:

diff --git a/content/oci/oci.go b/content/oci/oci.go
index 997543c..c737796 100644
--- a/content/oci/oci.go
+++ b/content/oci/oci.go
@@ -429,13 +432,6 @@ func (s *Store) saveIndex() error {
                        tagged.Add(desc.Digest)
                }
        }
-       // 2. Add descriptors that are not associated with any tag
-       for ref, desc := range refMap {
-               if ref == desc.Digest.String() && !tagged.Contains(desc.Digest) {
-                       // skip tagged ones since they have been added in step 1
-                       manifests = append(manifests, deleteAnnotationRefName(desc))
-               }
-       }
 
        s.index.Manifests = manifests
        return s.writeIndexFile()

You do not have troubles, as long as everything resides in memory.
But, in this test, we later call NewFromFS() which reads index.json through loadIndexFile() to create the graph.
Sadly, some links are missing as manifests are not written in index.json, so we cannot Fetch() them later in the code.

So, out of the blue, I think we need to explore the filesystem to create all the links.
Indeed, if we have this index.json:

$ sudo cat /var/lib/ig/oci-store/index.json | jq            francis/no-manifests-index *% u+2-2
{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.index.v1+json",
      "digest": "sha256:b0eff7f0ec2a27b8d71fe904cf373fa2002c6d669db54c25a9542cf2cb5f46d1",
      "size": 491,
      "annotations": {
        "org.opencontainers.image.ref.name": "docker.io/library/execs:latest"
      }
    }
  ]
}

We can find its associated manifests with:

$ sudo cat /var/lib/ig/oci-store/blobs/sha256/b0eff7f0ec2a27b8d71fe904cf373fa2002c6d669db54c25a9542cf2cb5f46d1 | jq
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:662ca58536d5a03864781eb42870f6712927981383f1dab1449d9adf80067313",
      "size": 581,
      "platform": {
        "architecture": "amd64",
        "os": "linux"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:f61eef5ff800b2e4366c6d2e7311e95ce8dc3cff5c020a3a810429656fd93efb",
      "size": 581,
      "platform": {
        "architecture": "arm64",
        "os": "linux"
      }
    }
  ]
}

And so on, we can get the layers with:

$ sudo cat /var/lib/ig/oci-store/blobs/sha256/f61eef5ff800b2e4366c6d2e7311e95ce8dc3cff5c020a3a810429656fd93efb | jq
{
  "schemaVersion": 2,
  "config": {
    "mediaType": "application/vnd.gadget.config.v1+yaml",
    "digest": "sha256:b4be31f07444fb9be6dee8cd1349aab3c780d333b63406be10c5c14c1b39dd37",
    "size": 1853,
    "annotations": {
      "org.opencontainers.image.title": "config.yaml"
    }
  },
  "layers": [
    {
      "mediaType": "application/vnd.gadget.ebpf.program.v1+binary",
      "digest": "sha256:c06a86dad7e582686b53bec9c729b92fc56c838b59cc11d0105c2c20f3158dcd",
      "size": 125232,
      "annotations": {
        "org.opencontainers.image.authors": "TODO: authors",
        "org.opencontainers.image.description": "TODO: description",
        "org.opencontainers.image.title": "program.o"
      }
    }
  ]
}

First of all, is my understanding correct?
Second, what do you think of the proposal to loop through the filesystem? I sadly do not see other solutions but I am a bit afraid of the I/O cost.
Anyway, we are already going through the filesystem when we use NewFromFS(), so maybe we can bear with it.

Best regards.

@eiffel-fl
Copy link
Contributor Author

eiffel-fl commented Jan 8, 2024

OK, I took another look and my conclusion were not really accurate.
We really need to have manifests written in index.json, if they do not have predecessors.
Indeed, without predecessors, if they are not present in index.json, the current code cannot find them.
So, we can go two ways:

  1. We can only write manifests in index.json if they do not have a tagged predecessor. This is what I do in this commit 732d87d:
$ sudo cat /var/lib/ig/oci-store/index.json | jq         francis/fsslower-statfs *% u=
{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.index.v1+json",
      "digest": "sha256:b0eff7f0ec2a27b8d71fe904cf373fa2002c6d669db54c25a9542cf2cb5f46d1",
      "size": 491,
      "annotations": {
        "org.opencontainers.image.ref.name": "docker.io/library/execs:latest"
      }
    }
  ]
}

On the other hand, manifests which do not have predecessors, like

generateArtifactManifest(descs[6], descs[11]) // Blob 12
, are still written, so everything is still correct.

  1. We can go even further and remove the writing of all manifests to index.json.
    To achieve this, we would need to go through all files under /blobs checking if they are manifests.
    If there is a manifest which is currently not in the graph, because it has no predecessors, we would need to add it to the graph as well as all its successors.

What do you think?
Does it sound reasonable for you to first go with 1. in a dedicated PR and then go further with 2.?

@yizha1 yizha1 added this to the v2.4.0 milestone Jan 9, 2024
@Wwwsylvia
Copy link
Member

Hi @eiffel-fl !
The way we handle index.json has a fundamental impact on the entire OCI store, so we need more time to carefully consider the design. We can start with some brainstorming here.

Basically, we need to ensure that:

  • It is possible to discover untagged referrers (since people usually won't tag referrers)
  • Manifests can be resolved by either their tags or digests

Additionally, it is not necessary to record nodes in index.json that:

  • Can be traversed from the roots (i.e., nodes in index.json)
  • Are not tagged

@eiffel-fl
Copy link
Contributor Author

Hi!

Hi @eiffel-fl ! The way we handle index.json has a fundamental impact on the entire OCI store, so we need more time to carefully consider the design. We can start with some brainstorming here.

Indeed, more discussion are needed as this problem is not trivial to solve.

Basically, we need to ensure that:

* It is possible to discover untagged referrers (since people usually won't tag referrers)

I do not have a patch for this at the moment and I will need to think more about the consequences.

* Manifests can be resolved by either their tags or digests

To resolve manifests with their digest, I wrote this patch:
1c01677
At the moment, I do not use this patch in another one and I did not add test, so I cannot guarantee it works.

Additionally, it is not necessary to record nodes in index.json that:

* Can be traversed from the roots (i.e., nodes in `index.json`)

* Are not tagged

Indeed, if there is a path to a manifest, we do not need to store it in index.json, this is achieved by this patch:
732d87d

Best regards.

@Wwwsylvia
Copy link
Member

Hi @eiffel-fl , after an offline discussion, we realized that we still want all manifests to be recorded in index.json.
The roles of index.json are:

  • To serve as the entry point of the graph, as defined in the spec
  • To function as a tagging service,as defined in the spec
  • To maintain a list of manifests, where the manifests may or may not have tags

The idea is that every manifest is a root of an artifact, so it should be in the list of roots (index.json), no matter it is tagged or not.

Since there is no change in the design of index.json, we can still ensure:

  • It is possible to discover untagged referrers (since people usually won't tag referrers)
  • Manifests can be resolved by either their tags or digests

To resolve your original concern:

This behavior can create troubles, as when the manifests in index.json will not be automatically deleted when the surrounding index is deleted:

We re-consider the definition of garbage produced by Delete(): The untagged successor manifests are also garbage.
@wangxiaoxuan273 has sent a PR #680 for this change , can you check if that works for you?

@eiffel-fl
Copy link
Contributor Author

Hi!

Hi @eiffel-fl , after an offline discussion, we realized that we still want all manifests to be recorded in index.json. The roles of index.json are:

* To serve as the entry point of the graph, as defined in the [spec](https://github.com/opencontainers/image-spec/blob/v1.1.0-rc5/image-layout.md#indexjson-file)

* To function as a tagging service,as defined in the [spec](https://github.com/opencontainers/image-spec/blob/v1.1.0-rc5/image-layout.md#indexjson-file)

* To maintain a list of manifests, where the manifests may or may not have tags

The idea is that every manifest is a root of an artifact, so it should be in the list of roots (index.json), no matter it is tagged or not.

With regard to the spec, this definitely makes sense!

Since there is no change in the design of index.json, we can still ensure:

* It is possible to discover untagged referrers (since people usually won't tag referrers)

* Manifests can be resolved by either their tags or digests

To resolve your original concern:

This behavior can create troubles, as when the manifests in index.json will not be automatically deleted when the surrounding index is deleted:

We re-consider the definition of garbage produced by Delete(): The untagged successor manifests are also garbage. @wangxiaoxuan273 has sent a PR #680 for this change , can you check if that works for you?

I will for sure take a look at it 👀!

Best regards.

@eiffel-fl
Copy link
Contributor Author

Closing this issue as:

  1. Manifests have to be present in index.json as they are the root of an artifact.
  2. On deletion, untagged manifests will be removed by feat: mark untagged manifests as garbage during GC and delete #680.

With regard to this:

Manifests can be resolved by either their tags or digests

I wrote this commit:
eiffel-fl@1c01677
Can you please share your insights on it?
If it is useful for you, I will just open a PR.

Wwwsylvia pushed a commit that referenced this issue Jan 23, 2024
Related to #664

Signed-off-by: Xiaoxuan Wang <wangxiaoxuan119@gmail.com>
@Wwwsylvia
Copy link
Member

Wwwsylvia commented Jan 23, 2024

Hi @eiffel-fl , I'm wondering why do you need this change? Using the current implantation, you should be able to resolve a manifest with the correct media type by its digest.

@eiffel-fl
Copy link
Contributor Author

Hi!

Hi @eiffel-fl , I'm wondering why do you need this change? Using the current implantation, you should be able to resolve a manifest with the correct media type by its digest.

I personally do not need this code, all I needed was already perfectly merged in #680.
I will take a deeper look at the code to see which part already resolve manifest by digest, I should have missed it.

Best regards.

@Wwwsylvia
Copy link
Member

@eiffel-fl See

oras-go/content/oci/oci.go

Lines 282 to 294 in 66ea3df

// attempt resolving manifest
desc, err := s.tagResolver.Resolve(ctx, reference)
if err != nil {
if errors.Is(err, errdef.ErrNotFound) {
// attempt resolving blob
return resolveBlob(os.DirFS(s.root), reference)
}
return ocispec.Descriptor{}, err
}
if reference == desc.Digest.String() {
return descriptor.Plain(desc), nil
}

@eiffel-fl
Copy link
Contributor Author

@eiffel-fl See

oras-go/content/oci/oci.go

Lines 282 to 294 in 66ea3df

// attempt resolving manifest
desc, err := s.tagResolver.Resolve(ctx, reference)
if err != nil {
if errors.Is(err, errdef.ErrNotFound) {
// attempt resolving blob
return resolveBlob(os.DirFS(s.root), reference)
}
return ocispec.Descriptor{}, err
}
if reference == desc.Digest.String() {
return descriptor.Plain(desc), nil
}

Indeed! Thank you for sharing :D!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants