Issue #2118

closed

Reduce runtime of file path migration

Added by mhrivnak almost 8 years ago. Updated about 5 years ago.

Status: CLOSED - CURRENTRELEASE
Priority: Normal
Assignee:
Category: -
Sprint/Milestone: -
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Version:
Platform Release: 2.10.0
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: Yes
Tags: Pulp 2
Sprint: Sprint 6
Quarter:

Description

When using a high-latency filesystem, such as NFS, users are finding that the file path migration can take a long time to run. Some have reported run times of over 24 hours. After investigating and building some quick-and-dirty PoCs, I found two options for improving performance.

I tested with 2 repos, about 5k RPMs each, and 20 published copies of one of them. The system had an NFS share mounted from a desktop machine on a 100Mbps link with latency <1ms.

On my setup, these two changes reduced migration time from 54 minutes to 14 minutes.

Remove Pruning

About half the time was spent searching for and deleting empty directories after the files themselves had been moved. There do not seem to be any opportunities to make the pruning faster in Python, but the same operation can be done with a "find" command in a shell, which runs about 20% faster. Removing the pruning from the migrations would let users run it at their leisure, although there is a small risk of the operation interfering with other Pulp operations: if Pulp is otherwise manipulating files and directories in /var/lib/pulp/content/, the command could inadvertently remove a directory out from under something that was about to use it. We may be able to limit the scope of the command to avoid the parts of the filesystem used by Pulp 2.8.
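
For reference, the kind of pruning pass being discussed can be sketched in Python roughly as follows. This is a minimal illustration, not the migration's actual code: the function name `prune_empty_dirs` is hypothetical, and the exact "find" invocation mentioned above is not given in this issue. It walks the tree bottom-up and attempts one `rmdir` per directory, which shows why the cost grows with directory count on a high-latency filesystem.

```python
import os


def prune_empty_dirs(root):
    """Remove empty directories beneath root, deepest first.

    Hypothetical helper for illustration only. Returns the list of
    directories that were removed. The root itself is never removed.
    """
    removed = []
    # topdown=False yields children before their parents, so a directory
    # that only contained empty directories is empty by the time we reach it.
    for dirpath, _dirnames, _filenames in os.walk(root, topdown=False):
        if dirpath == root:
            continue
        try:
            # os.rmdir only succeeds on an empty directory.
            os.rmdir(dirpath)
            removed.append(dirpath)
        except OSError:
            # Not empty, or something else changed it underneath us.
            pass
    return removed
```

Because `os.rmdir` refuses to remove a non-empty directory, a pass like this does not delete content, but it can still race with a process that has just created a directory and not yet written into it, which is the risk described above.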

Introduce Concurrency

For the operations that move files around and fix symlinks, introducing a small number of threads to do the work concurrently yields a large speedup. These operations run roughly twice as fast with 4 threads as with a single thread.
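
As a rough sketch of the idea, assuming the work can be expressed as independent (source, destination) pairs, a small thread pool can overlap the per-file filesystem round trips. The names `move_one` and `move_all` are hypothetical, and this uses the Python 3 standard library rather than whatever the PoC or the merged change actually uses.

```python
import os
import shutil
from concurrent.futures import ThreadPoolExecutor


def move_one(pair):
    """Move a single file, creating the destination directory if needed."""
    src, dst = pair
    # exist_ok=True tolerates another thread creating the same directory.
    os.makedirs(os.path.dirname(dst), exist_ok=True)
    shutil.move(src, dst)
    return dst


def move_all(pairs, workers=4):
    """Move every (src, dst) pair using a small pool of worker threads.

    Threads help here because each move mostly waits on filesystem I/O,
    so several moves can be in flight at once.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order and re-raises the first
        # exception from a worker when its result is consumed.
        return list(pool.map(move_one, pairs))
```

The default of 4 workers mirrors the thread count tested above; the best value for a given NFS server is something to measure rather than assume.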

Both of these can be accomplished in a fairly short amount of time. I have working PoCs, and the code change is small. The changes can be made in the platform alone, without touching the plugins.

Actions #1

Updated by mhrivnak almost 8 years ago

  • Assignee set to mhrivnak
Actions #2

Updated by mhrivnak almost 8 years ago

  • Status changed from NEW to ASSIGNED
  • Sprint/Milestone set to 24
  • Sprint Candidate changed from No to Yes

Adding to the sprint per request from @jalberts

Actions #3

Updated by mhrivnak over 7 years ago

  • Status changed from ASSIGNED to POST
Actions #4

Updated by jortel@redhat.com over 7 years ago

The team decided it would be more appropriate to include this in a Y release. Align to master.

Added by mhrivnak over 7 years ago

Revision caddb227 | View on GitHub

Speeds up the unit file path migrations

Removes the empty directory purge phase of the 2.8 migrations, which was taking some users many hours when done over NFS.

Introduces multi-threaded concurrency for the bulk of the migration's work.

https://pulp.plan.io/issues/2118 fixes #2118

Actions #5

Updated by mhrivnak over 7 years ago

Actions #6

Updated by mhrivnak over 7 years ago

  • Status changed from POST to MODIFIED
  • % Done changed from 0 to 100
Actions #7

Updated by mhrivnak over 7 years ago

  • Platform Release set to 2.10.0
Actions #8

Updated by dkliban@redhat.com over 7 years ago

  • Tracker changed from Refactor to Issue
  • Severity set to 2. Medium
  • Triaged set to No
Actions #9

Updated by semyers over 7 years ago

  • Status changed from MODIFIED to 5
Actions #10

Updated by amacdona@redhat.com over 7 years ago

  • Triaged changed from No to Yes
Actions #11

Updated by semyers over 7 years ago

  • Status changed from 5 to CLOSED - CURRENTRELEASE
Actions #12

Updated by bmbouter about 6 years ago

  • Sprint set to Sprint 6
Actions #13

Updated by bmbouter about 6 years ago

  • Sprint/Milestone deleted (24)
Actions #14

Updated by bmbouter about 5 years ago

  • Tags Pulp 2 added
