Project

Profile

Help

Issue #5161

Removing manifest_lists from a repository does not purge all newly unlinked manifests

Added by amacdona@redhat.com 4 months ago. Updated about 2 months ago.

Status:
CLOSED - CURRENTRELEASE
Priority:
Normal
Category:
-
Sprint/Milestone:
-
Start date:
Due date:
Severity:
2. Medium
Version - Docker:
Platform Release:
2.21.0
Blocks Release:
Target Release - Docker:
OS:
Backwards Incompatible:
No
Triaged:
No
Groomed:
No
Sprint Candidate:
No
Tags:
Pulp 2
QA Contact:
Complexity:
Smash Test:
Verified:
No
Verification Required:
No
Sprint:
Sprint 56

Description

To reproduce:
1. Sync busybox
2. Create a new repo foo
3. add all manifest_lists from busybox to foo
4. remove all manifest lists from foo
5. check content unit counts, some manifests (and their referenced blobs) are still in foo

The problem occurs when removing 2 or more manifest_lists at the same time that reference the same manifests.

Manifest List A:
Manifest 1, 2, 3
Manifest LIst B:
Manifest 2, 3, 4

After adding AB then removing AB, Manifests 2, 3 will remain in the repo.

If removing the manifest_lists 1 at a time, no manifests remain in the repo (expected).

The faulty query is here:
https://github.com/pulp/pulp_docker/blob/2-master/plugins/pulp_docker/plugins/importers/importer.py#L501-L502

Unit filter should actually be:
unit_filters={'digest': {'$nin': [list, of, all, manifest_lists, being, removed]}}

This problem was discovered as a discrepency between a fix for https://pulp.plan.io/issues/4549 and the code on 2-master, and will be fixed in the same commit.


Related issues

Related to Docker Support - Issue #4549: Removing docker manifests from a docker repository takes a long time CLOSED - CURRENTRELEASE Actions

Associated revisions

Revision 76f5894b View on GitHub
Added by amacdona@redhat.com 4 months ago

Flatten queries for content unit removal

https://pulp.plan.io/issues/4549

Replace the recursive pattern with a fixed number of larger queries.

Additionally, reorder the content removal to "top down". This will fail
more gracefully; failure leaves orphans (safe) rather than user-facing
unlinked content (unsafe). This requires the additional plugin step of
removing the explicitly given content, which is normally handled by pulp
platform.

A side effect of this change is the correction of a bug that did not
remove shared content, even if all linked content is removed.
https://pulp.plan.io/issues/5161

fixes #5161
fixes #4549

Revision 76f5894b View on GitHub
Added by amacdona@redhat.com 4 months ago

Flatten queries for content unit removal

https://pulp.plan.io/issues/4549

Replace the recursive pattern with a fixed number of larger queries.

Additionally, reorder the content removal to "top down". This will fail
more gracefully; failure leaves orphans (safe) rather than user-facing
unlinked content (unsafe). This requires the additional plugin step of
removing the explicitly given content, which is normally handled by pulp
platform.

A side effect of this change is the correction of a bug that did not
remove shared content, even if all linked content is removed.
https://pulp.plan.io/issues/5161

fixes #5161
fixes #4549

Revision ecba1db4 View on GitHub
Added by bherring 3 months ago

Refactor of test_remove.py with changes from 4549

With the refactor of the docker importer's remove function to
increase performance, content removal needs to be functional verified.
The cases covered with content post-count verification for all units:
1. Remove all manifest_lists sequentially.
2. Remove all manifests sequentially.
3. Remove all blobs sequentially.
4. Remove all manifest_lists batch.
5. Remove all manifests batch.
6. Remove all blobs batch.
7. Remove some non-shared manifest lists.
8. Remove some non-shared manifest.
9. Remove some shared manifests lists and verify shared units are not
recursively removed.
10. Remove some shared manifests and verify shared units are not
recursively removed.
The fixture includes:
  • 2 relatively independent manifest lists (no shared manifests,
    no shared blobs between them)
  • 2 manifest lists that share some (but not all) manifests, and those
    manifest share some (but not all) blobs. This only requires the creation
    of 1 manifest list that shares some content with one of the first
    “independent manifest lists”.
  • 2 relatively independent manifests
  • 2 manifests that share (some but not all) blobs
In order to sync the content, each content unit must be recursively related
to at least 1 tag.

refs #4549 #5161

closes #5181

Revision fbc90740 View on GitHub
Added by amacdona@redhat.com 2 months ago

Flatten queries for content unit removal

https://pulp.plan.io/issues/4549

Replace the recursive pattern with a fixed number of larger queries.

Additionally, reorder the content removal to "top down". This will fail
more gracefully; failure leaves orphans (safe) rather than user-facing
unlinked content (unsafe). This requires the additional plugin step of
removing the explicitly given content, which is normally handled by pulp
platform.

A side effect of this change is the correction of a bug that did not
remove shared content, even if all linked content is removed.
https://pulp.plan.io/issues/5161

fixes #5161
fixes #4549

(cherry picked from commit 76f5894b7c593eafc8498d5215cb7e517cd4624b)

Revision fbc90740 View on GitHub
Added by amacdona@redhat.com 2 months ago

Flatten queries for content unit removal

https://pulp.plan.io/issues/4549

Replace the recursive pattern with a fixed number of larger queries.

Additionally, reorder the content removal to "top down". This will fail
more gracefully; failure leaves orphans (safe) rather than user-facing
unlinked content (unsafe). This requires the additional plugin step of
removing the explicitly given content, which is normally handled by pulp
platform.

A side effect of this change is the correction of a bug that did not
remove shared content, even if all linked content is removed.
https://pulp.plan.io/issues/5161

fixes #5161
fixes #4549

(cherry picked from commit 76f5894b7c593eafc8498d5215cb7e517cd4624b)

History

#1 Updated by amacdona@redhat.com 4 months ago

  • Related to Issue #4549: Removing docker manifests from a docker repository takes a long time added

#2 Updated by amacdona@redhat.com 4 months ago

  • Tags Pulp 2 added

#3 Updated by amacdona@redhat.com 4 months ago

  • Subject changed from Removing manifest_lists from a repository does not purge all manifests to Removing manifest_lists from a repository does not purge all newly unlinked manifests

#4 Updated by amacdona@redhat.com 4 months ago

The same issue occurs for blobs that are referenced by more than one manifest. They are correctly removed if manifests are removed one at a time, but shared blobs remain when deleting the manifests together.

This will also be fixed with #4589

#5 Updated by amacdona@redhat.com 4 months ago

  • Status changed from ASSIGNED to POST

#6 Updated by amacdona@redhat.com 4 months ago

  • Status changed from POST to MODIFIED

#7 Updated by amacdona@redhat.com 4 months ago

  • Description updated (diff)

#9 Updated by dalley 3 months ago

  • Platform Release set to 2.21.0

#11 Updated by dalley about 2 months ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE

Please register to edit this issue

Also available in: Atom PDF