Issue #7923

manifest requests do not match advertised checksum under some situations

Added by jsherril@redhat.com 11 months ago. Updated 8 months ago.

Status: CLOSED - CURRENTRELEASE
Priority: High
Assignee:
Sprint/Milestone:
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Platform Release:
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags: Katello
Sprint: Sprint 89
Quarter:

Description

Create a remote and sync:

https://quay.io/
foreman/busybox-test

Then curl the latest manifest:

curl -vv https://pp-katello-ser-nightly-centos7.windhelm.example.com/v2/test_organization-test_product-foremanbusybox/manifests/latest

Get the redirect:

Location: https://pp-katello-ser-nightly-centos7.windhelm.example.com/pulp/container/test_organization-test_product-foremanbusybox/manifests/latest?validate_token=ded5ee1946d5293e716abc51b08ff981989fb78644104e92133dabe535b256e5:bcffb6bd722a286a4baaba5b9841e3fe038035b0140e9dfcadef2fc2db112de0

Curl that:

curl  -vvv https://pp-katello-ser-nightly-centos7.windhelm.example.com/pulp/container/test_organization-test_product-foremanbusybox/manifests/latest?validate_token=ded5ee1946d5293e716abc51b08ff981989fb78644104e92133dabe535b256e5:bcffb6bd722a286a4baaba5b9841e3fe038035b0140e9dfcadef2fc2db112de0

Notice the digest:

< Docker-Content-Digest: sha256:13280b5914050853a87d662c3229d42b61544e36dd4515f06e188835f3407468

However, if you calculate the checksum of the downloaded manifest, it does not match that digest. In fact, every time I download it, the manifest changes.

This prevents pulp2 from syncing from pulp3. Strangely, podman pull still works fine.
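
The check described above can be sketched in a few lines of Python. This is only an illustration: the helper names manifest_digest and digest_matches are hypothetical, and the sketch assumes the advertised value has the sha256:<hex> form shown in the Docker-Content-Digest header and that the digest is taken over the raw response bytes.

```python
import hashlib


def manifest_digest(body: bytes) -> str:
    """Return the sha256 digest of the raw manifest bytes in registry notation."""
    return "sha256:" + hashlib.sha256(body).hexdigest()


def digest_matches(body: bytes, advertised: str) -> bool:
    """Compare the computed digest with the Docker-Content-Digest header value."""
    return manifest_digest(body) == advertised.strip()
```

Feeding it the bytes returned by the second curl call and the header value from the same response would be expected to return True on a healthy registry. In the situation reported here it would presumably return False.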

Associated revisions

Revision 0a362e23 View on GitHub
Added by Lubos Mjachky 9 months ago

Return the digest of an unsigned manifest

closes #7923

Revision 1289587e View on GitHub
Added by Lubos Mjachky 9 months ago

Return a legacy manifest when the conversion was not required ref #7923

History

#1 Updated by jsherril@redhat.com 11 months ago

  • Description updated (diff)

#2 Updated by jsherril@redhat.com 11 months ago

  • Description updated (diff)

#3 Updated by jsherril@redhat.com 11 months ago

  • Priority changed from Normal to High
  • Tags Katello added

#4 Updated by ipanova@redhat.com 11 months ago

The digest changes every time because you are requesting schema1 (I do not see any Accept headers in your request). Pulp3 converts the manifest tagged latest, which is probably schema2, into schema1. Conversion on the fly creates a new digest every time.

Can you provide more info and logs/tracebacks how pulp2 fails when syncing from pulp3?

#5 Updated by dkliban@redhat.com 11 months ago

Here is a theory that I need to test out:

The initial request is sent with the correct headers. However, when pulp 3 sends a redirect to the content app, nectar fails to send those headers when following the redirect.

#6 Updated by ipanova@redhat.com 11 months ago

  • Triaged changed from No to Yes
  • Sprint set to Sprint 87

#7 Updated by jsherril@redhat.com 11 months ago

For blobs, it looks to me like they are coming in with

ACCEPT= */*

Strangely, when we fix our installer to put katello back in front of '/v2/' requests, the problem goes away, even though we are forwarding the Accept header to the api app (and then redirecting the client to the content app directly). I don't see why that would make a difference, since the request to the content app comes straight from the client.

#8 Updated by jsherril@redhat.com 11 months ago

  • Priority changed from High to Normal

This was on pulp-container 2.1.0

#9 Updated by ipanova@redhat.com 10 months ago

  • Sprint deleted (Sprint 87)

Taking off the sprint for now, will come back to it later

#10 Updated by ipanova@redhat.com 10 months ago

I had a realization about where the issue might be and how to reproduce it. Instead of using the pulp2migration box, which has some severe misconfiguration that prevented proper testing, I used 2 boxes: a pulp3-fedora box and a pulp2 migration box.

$ pulp-admin docker repo sync run --repo-id pulp3-repo
+----------------------------------------------------------------------+
                 Synchronizing Repository [pulp3-repo]
+----------------------------------------------------------------------+

This command may be exited via ctrl+c without affecting the request.



Task Failed

The Manifest digest does not match the expected value. The remote feed announced
a digest of
sha256:6e11c15668e7d20d60fe5a790c16b0aedc90b725c7f715aeef8ccc7e22fb7ee6, but the
downloaded digest was
sha256:94c54600f6939911c4ed74fae49c78e00808544803a4a883e65d697a7e89c4d3.

The digest does not match because the headers carry the digest of the non-converted manifest. This is not correct. The digest should be computed over the converted schema payload with the signature stripped out.

The changes will happen in the schema conversion code: https://github.com/pulp/pulp_container/blob/master/pulp_container/app/schema_convert.py#L35 and https://github.com/pulp/pulp_container/blob/master/pulp_container/app/schema_convert.py#L43. I suggest that the Schema2toSchema1Converter.convert method return the signed manifest as well as the digest of the manifest without the signature: https://github.com/pulp/pulp_container/blob/master/pulp_container/app/schema_convert.py#L97
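
As a rough illustration of "digest of the manifest without the signature", here is a minimal Python sketch. It is not the pulp_container implementation. It assumes the signed schema1 manifest follows the libtrust JSON signature layout, where the protected header of a signature carries formatLength and formatTail so that the unsigned payload can be rebuilt as the first formatLength bytes plus the decoded tail. The function names are hypothetical.

```python
import base64
import hashlib
import json


def _b64url_decode(value: str) -> bytes:
    # Assumption: values are unpadded base64url, so restore the padding first.
    return base64.urlsafe_b64decode(value + "=" * (-len(value) % 4))


def unsigned_payload(signed_manifest: bytes) -> bytes:
    """Rebuild the payload that was signed, dropping the signatures block."""
    doc = json.loads(signed_manifest)
    signatures = doc.get("signatures")
    if not signatures:
        # Nothing to strip: the manifest is already unsigned.
        return signed_manifest
    protected = json.loads(_b64url_decode(signatures[0]["protected"]))
    head = signed_manifest[: protected["formatLength"]]
    tail = _b64url_decode(protected["formatTail"])
    return head + tail


def unsigned_digest(signed_manifest: bytes) -> str:
    """sha256 digest computed over the payload without the signature."""
    return "sha256:" + hashlib.sha256(unsigned_payload(signed_manifest)).hexdigest()
```

Under these assumptions, two conversions of the same manifest that differ only in their signature block would yield the same unsigned_digest, which is the stable value a client could verify against the header.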

I made some calls to dockerhub, and here is the proof that the digest sent in the header is calculated on the manifest without the signature. The digest does not change between calls; if it covered the signature, it would change with every call, since the signature has a different fingerprint in every call.

(call1)
$ ./docker-token library/busybox:latest
{'Content-Length': '2735', 'Content-Type': 'application/vnd.docker.distribution.manifest.v1+prettyjws', 'Docker-Content-Digest': 'sha256:af39243ae92c12504f260709da43f1b4bd17a802a86a367ffcd7f4913688d92a', 'Docker-Distribution-Api-Version': 'registry/2.0', 'Etag': '"sha256:af39243ae92c12504f260709da43f1b4bd17a802a86a367ffcd7f4913688d92a"', 'Date': 'Mon, 21 Dec 2020 10:52:14 GMT', 'Strict-Transport-Security': 'max-age=31536000', 'RateLimit-Limit': '200;w=21600', 'RateLimit-Remaining': '199;w=21600'}

 (call2)
[ipanova@fluffy pulp_container]$ ./docker-token library/busybox:latest
{'Content-Length': '2735', 'Content-Type': 'application/vnd.docker.distribution.manifest.v1+prettyjws', 'Docker-Content-Digest': 'sha256:af39243ae92c12504f260709da43f1b4bd17a802a86a367ffcd7f4913688d92a', 'Docker-Distribution-Api-Version': 'registry/2.0', 'Etag': '"sha256:af39243ae92c12504f260709da43f1b4bd17a802a86a367ffcd7f4913688d92a"', 'Date': 'Mon, 21 Dec 2020 10:52:21 GMT', 'Strict-Transport-Security': 'max-age=31536000', 'RateLimit-Limit': '200;w=21600', 'RateLimit-Remaining': '199;w=21600'}

In contrast, we are sending the digest of the original manifest:

(pulp) [vagrant@pulp2-nightly-pulp3-source-centos7 _scripts]$ http HEAD "http://pulp3-source-fedora32.fluffy.example.com/v2/test/manifests/manifest_e" --follow
HTTP/1.1 200 OK
Access-Control-Expose-Headers: Correlation-ID
Allow: GET, PUT, HEAD, OPTIONS
Connection: keep-alive
Content-Length: 942
Correlation-ID: 275fbd944a0c421e9f9e8d114eff74aa
Date: Mon, 21 Dec 2020 11:14:01 GMT
Docker-Content-Digest: sha256:e7300fcf2becf0e60628ee003902f9e4b70b3ea1782f766fd5d45b59a2126f50
Docker-Distribution-Api-Version: registry/2.0
Location: /v2/test/manifests/sha256:e7300fcf2becf0e60628ee003902f9e4b70b3ea1782f766fd5d45b59a2126f50
Server: nginx/1.18.0
X-Frame-Options: SAMEORIGIN

$ http GET ":24817/pulp/api/v3/content/container/tags/?repository_version=/pulp/api/v3/repositories/container/container/78452742-15e0-4580-9ae1-da7e67232d32/versions/1/&name=manifest_e"
HTTP/1.1 200 OK
Access-Control-Expose-Headers: Correlation-ID
Allow: GET, HEAD, OPTIONS
Connection: close
Content-Length: 305
Content-Type: application/json
Correlation-ID: 59c853adcd8148738658dc630d047ccc
Date: Mon, 21 Dec 2020 11:25:02 GMT
Server: gunicorn/20.0.4
Vary: Accept, Cookie
X-Frame-Options: SAMEORIGIN

{
    "count": 1,
    "next": null,
    "previous": null,
    "results": [
        {
            "name": "manifest_e",
            "pulp_created": "2020-12-21T10:41:35.656901Z",
            "pulp_href": "/pulp/api/v3/content/container/tags/a2b6e5b8-ba4d-4d6f-89f3-a0415dc12c57/",
            "tagged_manifest": "/pulp/api/v3/content/container/manifests/d57301ab-0abb-4feb-a032-66ec930e7f84/"
        }
    ]
}

(pulp) [vagrant@pulp3-source-fedora32 _scripts]$ http GET :24817/pulp/api/v3/content/container/manifests/d57301ab-0abb-4feb-a032-66ec930e7f84/?fields=digest
HTTP/1.1 200 OK
Access-Control-Expose-Headers: Correlation-ID
Allow: GET, HEAD, OPTIONS
Connection: close
Content-Length: 84
Content-Type: application/json
Correlation-ID: a3ba96b2f8d04edf970125deee394ea6
Date: Mon, 21 Dec 2020 11:25:24 GMT
Server: gunicorn/20.0.4
Vary: Accept, Cookie
X-Frame-Options: SAMEORIGIN

{
    "digest": "sha256:e7300fcf2becf0e60628ee003902f9e4b70b3ea1782f766fd5d45b59a2126f50"
}

#11 Updated by mdellweg 10 months ago

So this digest is nothing we can know in a head request without performing the actual conversion? That means, as the clients usually perform a head first, we would need to convert things twice regularly. Or we need to turn them into alternate artifacts in the db.

When you do the request to dockerhub twice, do the blobs you receive change? Maybe they store the converted result.

#12 Updated by ipanova@redhat.com 10 months ago

mdellweg wrote:

So this digest is nothing we can know in a head request without performing the actual conversion? That means, as the clients usually perform a head first, we would need to convert things twice regularly. Or we need to turn them into alternate artifacts in the db.

When you do the request to dockerhub twice, do the blobs you receive change? Maybe they store the converted result.

Blobs do not change; the only object that can change is the manifest converted on the fly. A HEAD request is usually made to check on an existing resource, and since the manifest gets converted on the fly, it is questionable to consider it an existing resource. I suggest issuing a 404 if there is no tag that matches the media types listed in the Accept headers sent with the request. If we decide to return the digest of the converted schema instead, this would be the place to adjust the logic: https://github.com/pulp/pulp_container/blob/master/pulp_container/app/registry_api.py#L545 Dockerhub returns a 400 on such a request.

I also noticed that we are not handling tag redirects properly: https://github.com/pulp/pulp_container/blob/master/pulp_container/app/redirects.py#L39 We should also take the media_types into account, as we do in S3 redirects.

$ curl -X GET -H "Accept:application/vnd.docker.distribution.manifest.list.v2+json" "http://pulp3-source-fedora32.fluffy.example.com/v2/test3/manifests/ml_i"  -L
{
   "schemaVersion": 2,
   "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
   "manifests": [
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 735,
         "digest": "sha256:94391db5d7dae06e2e463ca41a0b8b73381817d3ab23d7a52c16db60b89a966e",
         "platform": {
            "architecture": "amd64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 735,
         "digest": "sha256:bdc42bf398edffb7d5cee329d16bae00439fcc7ee963e8089f293018268ffae1",
         "platform": {
            "architecture": "amd64",
            "os": "linux"
         }
      }
   ]
}

(pulp) [vagrant@pulp2-nightly-pulp3-source-centos7 _scripts]$ curl -X GET -H "Accept:application/vnd.docker.distribution.manifest.v1+json" "http://pulp3-source-fedora32.fluffy.example.com/v2/test3/manifests/ml_i"  -L
500 Internal Server Error

#13 Updated by ipanova@redhat.com 10 months ago

  • Priority changed from Normal to High
  • Sprint set to Sprint 88

Podman pull passes because it always sends all the media_types in the Accept headers in the first place, so no conversion logic is involved and the redirects go to the original object.

We need to fix redirects (1) as well as schema conversion (2) to enable pulp3 to pulp3 sync as well as pulp3 to pulp2 sync. We also need to handle HEAD requests for tags (3); my suggestion would be to issue a 404 if there is no tag that matches the media types listed in the Accept headers sent with the request.
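
The 404 suggestion in (3) could look roughly like the Python sketch below. It is a hypothetical illustration, not the registry_api.py code. The names accepted_media_types and head_status are made up. Treating a missing Accept header as a schema1 request follows comment #4, and treating */* as "accept anything" is an assumption of this sketch.

```python
from typing import Optional, Set

SCHEMA1_TYPES = {
    "application/vnd.docker.distribution.manifest.v1+json",
    "application/vnd.docker.distribution.manifest.v1+prettyjws",
}


def accepted_media_types(accept_header: Optional[str]) -> Set[str]:
    """Parse an Accept header into a set of media types, ignoring parameters."""
    if not accept_header:
        # Assumption (see comment #4): no Accept header means schema1.
        return set(SCHEMA1_TYPES)
    types = set()
    for item in accept_header.split(","):
        media_type = item.split(";", 1)[0].strip().lower()
        if media_type:
            types.add(media_type)
    return types


def head_status(stored_media_type: str, accept_header: Optional[str]) -> int:
    """Return 200 if the stored manifest can be served as is, else 404."""
    accepted = accepted_media_types(accept_header)
    if "*/*" in accepted or stored_media_type.lower() in accepted:
        return 200
    return 404
```

With this shape, a HEAD request never has to advertise a digest for a manifest that would only exist after an on-the-fly conversion.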

#14 Updated by lmjachky 10 months ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to lmjachky

#15 Updated by lmjachky 10 months ago

  • Status changed from ASSIGNED to NEW
  • Assignee deleted (lmjachky)

I am unassigning myself because I cannot sync from docker.io due to the pull rate limit.

#16 Updated by ipanova@redhat.com 10 months ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to lmjachky

#17 Updated by rchan 9 months ago

  • Sprint changed from Sprint 88 to Sprint 89

#18 Updated by pulpbot 9 months ago

  • Status changed from ASSIGNED to POST

#19 Updated by Anonymous 9 months ago

  • Status changed from POST to MODIFIED

#20 Updated by ipanova@redhat.com 9 months ago

  • Sprint/Milestone set to 2.3.0

#21 Updated by pulpbot 9 months ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE
