Issue #6581

closed

Pulpcore delete_orphans took too long

Added by iballou over 4 years ago. Updated about 4 years ago.

Status: CLOSED - CURRENTRELEASE
Priority: Normal
Assignee:
Category: -
Sprint/Milestone:
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Version:
Platform Release:
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags:
Sprint:
Quarter:

Description

Recently I ran delete_orphans on pulpcore-3.0.1-2 and it took 15 hours, which is quite a bit longer than it takes with Pulp 2. Pulp 2 took just under 3 minutes.

My Katello setup is as follows:

Repos:

  • 72 BusyBox (Docker)
  • 51 Alpine (Docker)
  • 51 Bash (Docker)
  • 72 Large File (70,000 synced units)
  • 1 Very Large File (150,000 synced units)
  • 5 File with 10 uploaded files each

Content views:

  • 1 Large View with 100 versions:
      • 90 versions with all Docker repos
      • 5 versions with all Docker repos plus Large File
      • 5 versions with only Large File
  • Another view with 10 versions:
      • All Busybox only
  • 100 Little Views with 1 version each:
      • All Docker repos


Files

orphan_profile2.png (816 KB), added by dalley, 09/11/2020 03:43 AM
Actions #1

Updated by fao89 over 4 years ago

  • Triaged changed from No to Yes
  • Sprint set to Sprint 71
Actions #2

Updated by rchan over 4 years ago

  • Sprint changed from Sprint 71 to Sprint 72
Actions #3

Updated by rchan over 4 years ago

  • Sprint changed from Sprint 72 to Sprint 73
Actions #4

Updated by rchan over 4 years ago

  • Sprint changed from Sprint 73 to Sprint 74
Actions #5

Updated by rchan over 4 years ago

  • Sprint changed from Sprint 74 to Sprint 75
Actions #6

Updated by rchan over 4 years ago

  • Sprint changed from Sprint 75 to Sprint 76
Actions #7

Updated by rchan over 4 years ago

  • Sprint changed from Sprint 76 to Sprint 77
Actions #8

Updated by iballou over 4 years ago

When I hit this issue, I had just migrated from Pulp 2. The Pulp 2 to Pulp 3 migration is still something of a tech preview feature, so I think it is safe to leave the migration out of the reproducer. Including it might cause unneeded headaches, considering that the content syncing takes a while too.

Instructions for creating a reproducer environment:

  1. Set up Forklift: https://github.com/theforeman/forklift along with the DNS steps (if you want).

  2. In your Forklift 99-local.yaml, add the following:

centos7-katello-3.15:
  box: centos7-katello-3.15
  cpus: 8
  memory: 32768

  3. Run vagrant up centos7-katello-3.15 and then vagrant ssh centos7-katello-3.15

  4. Follow these steps to improve the Katello server's tuning: https://projects.theforeman.org/issues/29370#note-5

  5. Navigate to https://<your VM's hostname> and make sure the Katello server is up and running

  6. Clone my scripts repo: https://github.com/ianballou/katello-performance-scripts

  7. Run migration_performance_test --setup. It will likely take a number of hours due to the amount of content. Note that I had trouble syncing many repositories at the same time with Pulp 2, so the syncs are mostly serial. You might want to run this overnight.

  8. Run migration_performance_test_tier_2 --setup. This script won't take nearly as long as the first one.

  9. Check the foreman tasks monitor to ensure all the tasks passed successfully. Double check that all the content is properly synced and that all of the content views exist.

 10. To try orphan cleanup, run foreman-rake katello:delete_orphaned_content or just do it through the Pulp 3 API.
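
For the API route mentioned in the last step, here is a minimal Python sketch that triggers the cleanup and times it. It assumes a pulpcore 3.x release of this era, where orphan cleanup is started with DELETE /pulp/api/v3/orphans/ and the response carries the href of the spawned task. The base URL, credentials, and helper names are placeholders and not part of the reproducer, so check the endpoint against the REST API docs for your installed version.

```python
import base64
import json
import time
import urllib.request

# Assumptions (not from the issue): the endpoint path for this pulpcore era
# and the set of terminal task states.
ORPHANS_PATH = "/pulp/api/v3/orphans/"
FINAL_STATES = {"completed", "failed", "canceled", "skipped"}


def http_json(method, url, user, password):
    """Send one basic-auth request and decode the JSON response body."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(url, method=method)
    req.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(req) as resp:
        body = resp.read()
    return json.loads(body) if body else {}


def delete_orphans(base_url, user, password, send=http_json, poll_seconds=5.0):
    """Trigger orphan cleanup, poll the task, return (task, elapsed seconds)."""
    started = time.monotonic()
    task_href = send("DELETE", base_url + ORPHANS_PATH, user, password)["task"]
    while True:
        task = send("GET", base_url + task_href, user, password)
        if task["state"] in FINAL_STATES:
            return task, time.monotonic() - started
        time.sleep(poll_seconds)


# Example call (hypothetical host and credentials):
#   task, elapsed = delete_orphans("https://katello.example.com", "admin", "secret")
#   print(task["state"], f"{elapsed:.0f}s")
```

The `send` parameter is only there so the polling loop can be exercised without a live server; in normal use the default is enough.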

Actions #9

Updated by dalley over 4 years ago

That is excellent information, Ian; it should be extremely helpful in reproducing this.

I had a couple of different theories about why this might be slow, and possibly they're all true at once:

Actions #10

Updated by gerrod over 4 years ago

  • Assignee set to gerrod
Actions #11

Updated by rchan over 4 years ago

  • Sprint changed from Sprint 77 to Sprint 78
Actions #12

Updated by rchan over 4 years ago

  • Sprint changed from Sprint 78 to Sprint 79
Actions #13

Updated by daviddavis over 4 years ago

  • Assignee deleted (gerrod)
Actions #14

Updated by rchan over 4 years ago

  • Sprint changed from Sprint 79 to Sprint 80
Actions #15

Updated by ttereshc over 4 years ago

Just confirming that even on a small setup this takes longer than expected. I had 2 RPM repositories, CentOS 8 BaseOS kickstart and Fedora 32. That's it: no other repositories, no other content. Both repos were synced with the on_demand policy, so only metadata files are on the filesystem. I removed both repos and ran orphan cleanup, which took 3.5 minutes (removing 57230 content units and 13 artifacts).
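
To put those numbers in perspective, the short Python calculation below turns them into a removal rate. It uses only the figures reported in this comment; treating content units and artifacts as equally expensive to remove is a simplifying assumption.

```python
# Figures reported in the comment above.
content_units = 57230
artifacts = 13
elapsed_seconds = 3.5 * 60  # 3.5 minutes

# Simplifying assumption: every removed object costs the same.
removed = content_units + artifacts
rate = removed / elapsed_seconds
ms_per_object = 1000 * elapsed_seconds / removed

print(f"{removed} objects in {elapsed_seconds:.0f} s")
print(f"{rate:.0f} objects/s, {ms_per_object:.2f} ms per object")
```

That works out to roughly 270 objects per second, or a few milliseconds per object.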

Actions #16

Updated by rchan over 4 years ago

  • Sprint changed from Sprint 80 to Sprint 81
Actions #17

Updated by dalley over 4 years ago

  • Status changed from NEW to ASSIGNED
Actions #18

Updated by dalley over 4 years ago

  • Assignee set to dalley
Actions #19

Updated by dalley over 4 years ago

  • Status changed from ASSIGNED to POST
Actions #20

Updated by dalley over 4 years ago

The linked PR improves performance by roughly 40% and reduces memory consumption. I don't think that is enough of an improvement to close this issue, but I also don't see an easy way to improve it further. It looks like django-lifecycle, which is used by RBAC, is incurring some overhead here (65%?), but I'm not sure that can be easily rectified (Brian agreed).

Brian has some ideas about preventing it from blocking other tasks, which should help mitigate the practical issues of it taking a long time, but that's technically a separate issue. I'm going to send this one back to NEW and remove it from the sprint since it was mitigated without a full fix (and without one forthcoming).
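
For anyone wanting to check where the time goes in a cleanup like this, profiling is one way to do it. The sketch below uses Python's standard cProfile on stand-in functions: fake_lifecycle_hook and fake_delete_orphans are hypothetical placeholders that only mimic a loop firing a hook per object, and are not pulpcore or django-lifecycle code.

```python
import cProfile
import io
import pstats


def fake_lifecycle_hook(obj):
    """Hypothetical stand-in for per-object hook overhead."""
    return sum(i * i for i in range(200))


def fake_delete_orphans(objects):
    """Hypothetical stand-in for a cleanup loop that fires a hook per object."""
    deleted = 0
    for obj in objects:
        fake_lifecycle_hook(obj)
        deleted += 1
    return deleted


def profile_cleanup(n_objects=2000, top=5):
    """Run the stand-in cleanup under cProfile; return (deleted, report text)."""
    profiler = cProfile.Profile()
    profiler.enable()
    deleted = fake_delete_orphans(range(n_objects))
    profiler.disable()

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time so the outer loop and its callees appear first.
    stats.sort_stats("cumulative").print_stats(top)
    return deleted, out.getvalue()


# Example:
#   deleted, report = profile_cleanup()
#   print(report)
```

In the report, comparing the cumulative time of the hook against that of the outer loop gives the share of runtime spent in the hook, which is the kind of figure quoted above.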

Actions #21

Updated by dalley over 4 years ago

  • Status changed from POST to NEW
  • Assignee deleted (dalley)
  • Sprint deleted (Sprint 81)

Added by dalley over 4 years ago

Revision d89a488d

Improve performance of orphan cleanup slightly

Also memory consumption.

re: #6581 https://pulp.plan.io/issues/6581

Actions #22

Updated by dalley about 4 years ago

  • Status changed from NEW to MODIFIED
  • Assignee set to dalley
Actions #23

Updated by bmbouter about 4 years ago

  • Sprint/Milestone set to 3.7.0
Actions #24

Updated by pulpbot about 4 years ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE
