Issue #6111

Re-migrations take nearly as long as initial migrations

Added by dalley almost 2 years ago. Updated over 1 year ago.

Status: CLOSED - CURRENTRELEASE
Priority: Low
Assignee:
Sprint/Milestone:
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Platform Release:
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags: Katello
Sprint:
Quarter:

Description

With a large repository, re-migrations take a very long time.

<jsherrill> i was also planning on testing re-migration time with that repo
<jsherrill> and that seemed really really slow
<jsherrill> but i wasn't sure if that was related to the fact that not all of the units migrated properly
<jsherrill> so was waiting for it to be fixed to test for sure
<jsherrill> like it seemed like it took ~an hour to re-migrate  it
<jsherrill> i'm sure there is a lot of low-hanging fruit performance wise
<jsherrill> and i'm less concerned about initial migration time
migration_perf2.svg (37.6 KB) dalley, 02/07/2020 06:10 PM

Related issues

Related to Migration Plugin - Task #6156: Improve performance by cutting way down on ProgressReport updating (CLOSED - NOTABUG)

Associated revisions

Revision c430b4c8 View on GitHub
Added by dalley over 1 year ago

Don't iterate all lazy content every time we re-migrate

Push some of our Python-level checks into database query logic to significantly reduce the # of individual queries we make and the amount of content we need to iterate over / prefetch for.

closes: #6111 https://pulp.plan.io/issues/6111
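
The idea in the commit message above, moving a per-unit check out of Python and into the query itself, can be sketched roughly as follows. This is not the code from revision c430b4c8: the `lazy_unit` table, the `migrated` column, and both helper functions are hypothetical, and the standard-library `sqlite3` module stands in for the real database layer so the sketch runs on its own.

```python
import sqlite3


def build_db(n_units=1000, n_unmigrated=10):
    """Create an in-memory table of hypothetical lazy content units."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE lazy_unit (id INTEGER PRIMARY KEY, migrated INTEGER NOT NULL)"
    )
    # Only the first n_unmigrated units still need work on a re-migration.
    rows = [(i, 0 if i < n_unmigrated else 1) for i in range(n_units)]
    conn.executemany("INSERT INTO lazy_unit (id, migrated) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def pending_python_side(conn):
    """Fetch every row, then decide in Python which ones need migrating."""
    fetched = conn.execute(
        "SELECT id, migrated FROM lazy_unit ORDER BY id"
    ).fetchall()
    pending = [unit_id for unit_id, migrated in fetched if not migrated]
    return pending, len(fetched)


def pending_database_side(conn):
    """Let the database apply the check, so only relevant rows come back."""
    fetched = conn.execute(
        "SELECT id FROM lazy_unit WHERE migrated = 0 ORDER BY id"
    ).fetchall()
    pending = [unit_id for (unit_id,) in fetched]
    return pending, len(fetched)


conn = build_db()
py_pending, py_rows = pending_python_side(conn)
db_pending, db_rows = pending_database_side(conn)
print(f"Python-side check iterated {py_rows} rows")
print(f"Database-side check iterated {db_rows} rows")
```

Both functions return the same pending units; the difference is how many rows have to be pulled out of the database and iterated to find them, which is the cost a re-migration pays when most content has already been migrated.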

History

#1 Updated by dalley almost 2 years ago

  • Description updated (diff)

#2 Updated by dalley almost 2 years ago

Here's a flamegraph sampled from a few minutes of one of my migration runs, if it helps. Open it in Firefox.

Captured with:

pip install py-spy
sudo env "PATH=$PATH" py-spy record --pid 20183 --output ../migration_perf2.svg

I'm going to guess that we're saving the progress bars too much, just as we did with the other plugins. We're calling pb.increment() in a loop for each content unit in a couple of places, which hugely inflates the # of DB queries.
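
To illustrate the guess above (it is only a guess in this comment, not a confirmed cause), here is a minimal sketch of how per-unit progress updates compare with batched ones. `CountingProgressReport` is a hypothetical stand-in that counts how often it would write to the database; it is not the real progress report class, and `increase_by` and `batch_size` are names chosen for illustration.

```python
class CountingProgressReport:
    """Hypothetical progress report that counts simulated database writes."""

    def __init__(self):
        self.done = 0
        self.saves = 0

    def save(self):
        # In a real model this would be one UPDATE query.
        self.saves += 1

    def increment(self):
        self.done += 1
        self.save()

    def increase_by(self, count):
        self.done += count
        self.save()


def migrate_per_unit(units, pb):
    """Update progress once per content unit: one write per unit."""
    for _ in units:
        pb.increment()


def migrate_batched(units, pb, batch_size=100):
    """Accumulate progress locally and write once per batch."""
    pending = 0
    for _ in units:
        pending += 1
        if pending >= batch_size:
            pb.increase_by(pending)
            pending = 0
    if pending:
        # Flush the final partial batch so the count stays exact.
        pb.increase_by(pending)


per_unit = CountingProgressReport()
migrate_per_unit(range(1050), per_unit)

batched = CountingProgressReport()
migrate_batched(range(1050), batched)

print(f"per-unit: done={per_unit.done}, saves={per_unit.saves}")
print(f"batched:  done={batched.done}, saves={batched.saves}")
```

Both runs end with the same `done` count; only the number of simulated writes differs, at the cost of progress being reported in coarser steps.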

#3 Updated by ipanova@redhat.com almost 2 years ago

  • Related to Task #6156: Improve performance by cutting way down on ProgressReport updating added

#4 Updated by ttereshc almost 2 years ago

  • Triaged changed from No to Yes

#5 Updated by ggainey over 1 year ago

  • Priority changed from Normal to Low

#6 Updated by ggainey over 1 year ago

  • Tags Katello added
  • Tags deleted (Katello-P3)

#7 Updated by ttereshc over 1 year ago

  • Sprint/Milestone set to 0.2.0

#8 Updated by ttereshc over 1 year ago

needs to be retested

#9 Updated by dalley over 1 year ago

  • Status changed from NEW to POST
  • Assignee set to dalley

#10 Updated by dalley over 1 year ago

  • Status changed from POST to MODIFIED

#11 Updated by ttereshc over 1 year ago

  • Sprint/Milestone deleted (0.2.0)

#12 Updated by ttereshc over 1 year ago

  • Sprint/Milestone set to 0.3.0

#13 Updated by ttereshc over 1 year ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE
