Issue #7923
closed
manifest requests do not match advertised checksum under some situations
Description
Create a remote and sync:
https://quay.io/
foreman/busybox-test
Then curl the latest manifest:
curl -vv https://pp-katello-ser-nightly-centos7.windhelm.example.com/v2/test_organization-test_product-foremanbusybox/manifests/latest
Get the redirect:
Location: https://pp-katello-ser-nightly-centos7.windhelm.example.com/pulp/container/test_organization-test_product-foremanbusybox/manifests/latest?validate_token=ded5ee1946d5293e716abc51b08ff981989fb78644104e92133dabe535b256e5:bcffb6bd722a286a4baaba5b9841e3fe038035b0140e9dfcadef2fc2db112de0
Curl that:
curl -vvv https://pp-katello-ser-nightly-centos7.windhelm.example.com/pulp/container/test_organization-test_product-foremanbusybox/manifests/latest?validate_token=ded5ee1946d5293e716abc51b08ff981989fb78644104e92133dabe535b256e5:bcffb6bd722a286a4baaba5b9841e3fe038035b0140e9dfcadef2fc2db112de0
Notice the digest:
< Docker-Content-Digest: sha256:13280b5914050853a87d662c3229d42b61544e36dd4515f06e188835f3407468
However, if you calculate the checksum of the downloaded manifest yourself, you get a different value. In fact, the manifest changes every time I download it.
This prevents pulp2 from being able to sync from pulp3. Strangely, podman pull still works fine.
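A minimal sketch of the checksum comparison described above, assuming the manifest body has already been downloaded as raw bytes and the value of the Docker-Content-Digest header is at hand. The function name `digest_matches` is hypothetical and not part of Pulp:

```python
import hashlib


def digest_matches(body: bytes, advertised: str) -> bool:
    """Compare the sha256 of the raw response body with the advertised digest."""
    # The header value has the form "<algorithm>:<hex digest>".
    algorithm, _, expected_hex = advertised.partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported digest algorithm: {algorithm!r}")
    return hashlib.sha256(body).hexdigest() == expected_hex
```

The comparison has to run on the bytes exactly as received; re-serializing the JSON before hashing would change the result.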
Updated by jsherril@redhat.com about 4 years ago
- Priority changed from Normal to High
- Tags Katello added
Updated by ipanova@redhat.com about 4 years ago
The digest changes every time because you are requesting schema1 (I do not see any Accept headers in your request).
Pulp3 converts the manifest latest, which is probably schema2, into schema1.
The on-the-fly conversion creates a new digest every time.
Can you provide more info and logs/tracebacks showing how pulp2 fails when syncing from pulp3?
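For illustration, a request that asks for schema2 explicitly could be built as in the sketch below. The host name and repository path are placeholders, and `build_manifest_request` is a hypothetical helper; only the two media type strings are the standard Docker ones:

```python
import urllib.request

# Media types for schema 2 manifests and manifest lists.
SCHEMA2_MEDIA_TYPES = [
    "application/vnd.docker.distribution.manifest.v2+json",
    "application/vnd.docker.distribution.manifest.list.v2+json",
]


def build_manifest_request(url: str) -> urllib.request.Request:
    """Build a GET request that lists the schema 2 media types in Accept."""
    accept = ", ".join(SCHEMA2_MEDIA_TYPES)
    return urllib.request.Request(url, headers={"Accept": accept})


# registry.example.com is a placeholder host, not one from this issue.
request = build_manifest_request(
    "https://registry.example.com/v2/some/repo/manifests/latest"
)
print(request.get_header("Accept"))
```

With curl, the equivalent is passing the same media types through `-H "Accept: ..."`.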
Updated by dkliban@redhat.com about 4 years ago
Here is a theory that I need to test out:
The initial request is sent with the correct headers. However, when pulp 3 sends a redirect to the content app, nectar fails to send those headers when following the redirect.
Updated by ipanova@redhat.com about 4 years ago
- Triaged changed from No to Yes
- Sprint set to Sprint 87
Updated by jsherril@redhat.com about 4 years ago
For blobs, it looks to me like they are coming in with
ACCEPT= */*
Strangely, when we fix our installer to put Katello back in front of '/v2/' requests, the problem goes away, even though we are forwarding the Accept header to the API app (and then redirecting the client to the content app directly). I don't see why that would make a difference, since the request to the content app comes straight from the client.
Updated by jsherril@redhat.com about 4 years ago
- Priority changed from High to Normal
This was on pulp-container 2.1.0
Updated by ipanova@redhat.com about 4 years ago
- Sprint deleted (Sprint 87)
Taking off the sprint for now, will come back to it later
Updated by ipanova@redhat.com about 4 years ago
I had an insight into where the issue might be and how to reproduce it. Instead of using the pulp2migration box, which has some server misconfiguration that prevented proper testing, I used 2 boxes: a pulp3-fedora box and a pulp2 migration box.
$ pulp-admin docker repo sync run --repo-id pulp3-repo
+----------------------------------------------------------------------+
Synchronizing Repository [pulp3-repo]
+----------------------------------------------------------------------+
This command may be exited via ctrl+c without affecting the request.
Task Failed
The Manifest digest does not match the expected value. The remote feed announced
a digest of
sha256:6e11c15668e7d20d60fe5a790c16b0aedc90b725c7f715aeef8ccc7e22fb7ee6, but the
downloaded digest was
sha256:94c54600f6939911c4ed74fae49c78e00808544803a4a883e65d697a7e89c4d3.
The digest does not match because the headers carry the digest of the non-converted manifest. This is not correct. The digest should be computed over the converted schema payload with the signature stripped out.
The changes will happen in the schema_conversion code.
https://github.com/pulp/pulp_container/blob/master/pulp_container/app/schema_convert.py#L35
https://github.com/pulp/pulp_container/blob/master/pulp_container/app/schema_convert.py#L43
I suggest that the Schema2toSchema1Converter.convert method return the signed manifest as well as the digest of the manifest without the signature: https://github.com/pulp/pulp_container/blob/master/pulp_container/app/schema_convert.py#L97
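The idea behind this suggestion can be illustrated with a simplified sketch. It is not the pulp_container implementation: `fake_sign` and `unsigned_digest` are hypothetical names, the random string only stands in for a real JWS signature, and a real implementation must hash the exact payload bytes that were signed rather than a re-serialized dictionary:

```python
import hashlib
import json
import secrets


def fake_sign(manifest: dict) -> dict:
    """Attach a random placeholder signature (stands in for a real JWS signature)."""
    signed = dict(manifest)
    signed["signatures"] = [{"signature": secrets.token_hex(16)}]
    return signed


def unsigned_digest(signed_manifest: dict) -> str:
    """Digest of the manifest with its "signatures" field removed."""
    unsigned = {k: v for k, v in signed_manifest.items() if k != "signatures"}
    payload = json.dumps(unsigned, indent=3, sort_keys=True).encode("utf-8")
    return "sha256:" + hashlib.sha256(payload).hexdigest()
```

Signing the same manifest twice yields two different signed documents, but `unsigned_digest` returns the same value for both, so that value stays stable across requests.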
I made some calls to dockerhub, and here is the proof that the digest sent in the header is calculated on the manifest without the signature. The digest does not change between calls. If it covered the signature, it would change with every call, since the signature has a different fingerprint each time.
(call1)
$ ./docker-token library/busybox:latest
{'Content-Length': '2735', 'Content-Type': 'application/vnd.docker.distribution.manifest.v1+prettyjws', 'Docker-Content-Digest': 'sha256:af39243ae92c12504f260709da43f1b4bd17a802a86a367ffcd7f4913688d92a', 'Docker-Distribution-Api-Version': 'registry/2.0', 'Etag': '"sha256:af39243ae92c12504f260709da43f1b4bd17a802a86a367ffcd7f4913688d92a"', 'Date': 'Mon, 21 Dec 2020 10:52:14 GMT', 'Strict-Transport-Security': 'max-age=31536000', 'RateLimit-Limit': '200;w=21600', 'RateLimit-Remaining': '199;w=21600'}
(call2)
[ipanova@fluffy pulp_container]$ ./docker-token library/busybox:latest
{'Content-Length': '2735', 'Content-Type': 'application/vnd.docker.distribution.manifest.v1+prettyjws', 'Docker-Content-Digest': 'sha256:af39243ae92c12504f260709da43f1b4bd17a802a86a367ffcd7f4913688d92a', 'Docker-Distribution-Api-Version': 'registry/2.0', 'Etag': '"sha256:af39243ae92c12504f260709da43f1b4bd17a802a86a367ffcd7f4913688d92a"', 'Date': 'Mon, 21 Dec 2020 10:52:21 GMT', 'Strict-Transport-Security': 'max-age=31536000', 'RateLimit-Limit': '200;w=21600', 'RateLimit-Remaining': '199;w=21600'}
In contrast, we are sending the digest of the original manifest:
(pulp) [vagrant@pulp2-nightly-pulp3-source-centos7 _scripts]$ http HEAD "http://pulp3-source-fedora32.fluffy.example.com/v2/test/manifests/manifest_e" --follow
HTTP/1.1 200 OK
Access-Control-Expose-Headers: Correlation-ID
Allow: GET, PUT, HEAD, OPTIONS
Connection: keep-alive
Content-Length: 942
Correlation-ID: 275fbd944a0c421e9f9e8d114eff74aa
Date: Mon, 21 Dec 2020 11:14:01 GMT
Docker-Content-Digest: sha256:e7300fcf2becf0e60628ee003902f9e4b70b3ea1782f766fd5d45b59a2126f50
Docker-Distribution-Api-Version: registry/2.0
Location: /v2/test/manifests/sha256:e7300fcf2becf0e60628ee003902f9e4b70b3ea1782f766fd5d45b59a2126f50
Server: nginx/1.18.0
X-Frame-Options: SAMEORIGIN
$ http GET ":24817/pulp/api/v3/content/container/tags/?repository_version=/pulp/api/v3/repositories/container/container/78452742-15e0-4580-9ae1-da7e67232d32/versions/1/&name=manifest_e"
HTTP/1.1 200 OK
Access-Control-Expose-Headers: Correlation-ID
Allow: GET, HEAD, OPTIONS
Connection: close
Content-Length: 305
Content-Type: application/json
Correlation-ID: 59c853adcd8148738658dc630d047ccc
Date: Mon, 21 Dec 2020 11:25:02 GMT
Server: gunicorn/20.0.4
Vary: Accept, Cookie
X-Frame-Options: SAMEORIGIN
{
"count": 1,
"next": null,
"previous": null,
"results": [
{
"name": "manifest_e",
"pulp_created": "2020-12-21T10:41:35.656901Z",
"pulp_href": "/pulp/api/v3/content/container/tags/a2b6e5b8-ba4d-4d6f-89f3-a0415dc12c57/",
"tagged_manifest": "/pulp/api/v3/content/container/manifests/d57301ab-0abb-4feb-a032-66ec930e7f84/"
}
]
}
(pulp) [vagrant@pulp3-source-fedora32 _scripts]$ http GET :24817/pulp/api/v3/content/container/manifests/d57301ab-0abb-4feb-a032-66ec930e7f84/?fields=digest
HTTP/1.1 200 OK
Access-Control-Expose-Headers: Correlation-ID
Allow: GET, HEAD, OPTIONS
Connection: close
Content-Length: 84
Content-Type: application/json
Correlation-ID: a3ba96b2f8d04edf970125deee394ea6
Date: Mon, 21 Dec 2020 11:25:24 GMT
Server: gunicorn/20.0.4
Vary: Accept, Cookie
X-Frame-Options: SAMEORIGIN
{
"digest": "sha256:e7300fcf2becf0e60628ee003902f9e4b70b3ea1782f766fd5d45b59a2126f50"
}
Updated by mdellweg about 4 years ago
So this digest is nothing we can know in a head request without performing the actual conversion? That means, as the clients usually perform a head first, we would need to convert things twice regularly. Or we need to turn them into alternate artifacts in the db.
When you do the request to dockerhub twice, do the blobs you receive change? Maybe they store the converted result.
Updated by ipanova@redhat.com about 4 years ago
mdellweg wrote:
So this digest is nothing we can know in a head request without performing the actual conversion? That means, as the clients usually perform a head first, we would need to convert things twice regularly. Or we need to turn them into alternate artifacts in the db.
When you do the request to dockerhub twice, do the blobs you receive change? Maybe they store the converted result.
Blobs do not change; the only object that can change is the manifest converted on the fly. A HEAD request is usually made to check on an existing resource, and since the manifest gets converted on the fly, it is questionable whether to consider it an existing resource. I suggest issuing a 404 if there is no tag that matches the list of accepted media types sent along with the request. If we decide to return the digest of the converted schema instead, this would be the place to adjust the logic: https://github.com/pulp/pulp_container/blob/master/pulp_container/app/registry_api.py#L545 Dockerhub returns a 400 on such a request.
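A rough sketch of the proposed HEAD behaviour, under the assumption that the media types of the manifests stored for a tag are known up front. `parse_accept` and `head_status` are hypothetical helpers, not functions from registry_api.py, and wildcard entries such as */* are deliberately left unhandled:

```python
from typing import Iterable, Set


def parse_accept(accept_header: str) -> Set[str]:
    """Return the media types listed in an Accept header, dropping parameters."""
    return {
        part.split(";", 1)[0].strip()
        for part in accept_header.split(",")
        if part.strip()
    }


def head_status(available_media_types: Iterable[str], accept_header: str) -> int:
    """Return 200 if a stored manifest has an accepted media type, else 404."""
    available = set(available_media_types)
    accepted = parse_accept(accept_header)
    # No conversion is attempted here: only stored media types can match.
    return 200 if available & accepted else 404
```

Under this sketch, a HEAD request that accepts only schema1 for a tag stored as schema2 gets a 404 instead of a digest that the converted manifest would not match.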
I also noticed that we are not handling tag redirects properly: https://github.com/pulp/pulp_container/blob/master/pulp_container/app/redirects.py#L39 We should also take the media_types into account, as is done in the S3 redirects.
$ curl -X GET -H "Accept:application/vnd.docker.distribution.manifest.list.v2+json" "http://pulp3-source-fedora32.fluffy.example.com/v2/test3/manifests/ml_i" -L
{
"schemaVersion": 2,
"mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
"manifests": [
{
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"size": 735,
"digest": "sha256:94391db5d7dae06e2e463ca41a0b8b73381817d3ab23d7a52c16db60b89a966e",
"platform": {
"architecture": "amd64",
"os": "linux"
}
},
{
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"size": 735,
"digest": "sha256:bdc42bf398edffb7d5cee329d16bae00439fcc7ee963e8089f293018268ffae1",
"platform": {
"architecture": "amd64",
"os": "linux"
}
}
]
}
(pulp) [vagrant@pulp2-nightly-pulp3-source-centos7 _scripts]$ curl -X GET -H "Accept:application/vnd.docker.distribution.manifest.v1+json" "http://pulp3-source-fedora32.fluffy.example.com/v2/test3/manifests/ml_i" -L
500 Internal Server Error
Updated by ipanova@redhat.com about 4 years ago
- Priority changed from Normal to High
- Sprint set to Sprint 88
Podman pull passes because it always sends all the media_types in the Accept header in the first place, so no conversion logic is involved and the redirects go to the original object.
To enable pulp3 to pulp3 sync as well as pulp3 to pulp2 sync, we need to (1) fix redirects, (2) fix schema conversion, and (3) handle HEAD requests for tags. For (3), my suggestion is to issue a 404 if there is no tag that matches the list of accepted media types sent along with the request.
Updated by lmjachky almost 4 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to lmjachky
Updated by lmjachky almost 4 years ago
- Status changed from ASSIGNED to NEW
- Assignee deleted (lmjachky)
I am unassigning myself because I cannot sync from docker.io due to the pull rate limit.
Updated by ipanova@redhat.com almost 4 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to lmjachky
Updated by pulpbot almost 4 years ago
- Status changed from ASSIGNED to POST
Added by Lubos Mjachky almost 4 years ago
Revision 1289587e | View on GitHub
Return a legacy manifest when the conversion was not required ref #7923
Added by Lubos Mjachky almost 4 years ago
Revision 0a362e23 | View on GitHub
Return the digest of an unsigned manifest
closes #7923
Updated by Anonymous almost 4 years ago
- Status changed from POST to MODIFIED
Applied in changeset 0a362e236bcf2d5ecac15afd4d8d5d166f732637.
Updated by pulpbot almost 4 years ago
- Status changed from MODIFIED to CLOSED - CURRENTRELEASE