Task #2466

closed

Remove unnecessary `deepcopy` calls for sync

Added by ttereshc over 7 years ago. Updated almost 5 years ago.

Status:
CLOSED - CURRENTRELEASE
Priority:
Normal
Assignee:
Sprint/Milestone:
-
Start date:
Due date:
% Done:

100%

Estimated time:
Platform Release:
2.11.1
Groomed:
No
Sprint Candidate:
No
Tags:
Pulp 2
Sprint:
Sprint 12
Quarter:

Description

Profiling the sync operation identified the following possible performance improvements:

  1. do not add units to the catalog or re-associate them when it is not necessary, to reduce the number of writes to the db
  2. avoid additional writes to the db when no new errata collections were introduced (all errata will still be updated on every sync)
  3. do not use `deepcopy` during primary.xml processing
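As an illustration of item 1, skipping re-association can be pictured as a filter over unit keys: only units not yet associated with the repo need a write. This is a minimal sketch and not the Pulp implementation; `units_to_associate` and the tuple-shaped unit keys are hypothetical names chosen for the example.

```python
def units_to_associate(remote_unit_keys, associated_unit_keys):
    """Return the unit keys that still need an association write.

    Hypothetical helper: unit keys are modelled as hashable tuples.
    """
    already = set(associated_unit_keys)  # O(1) membership checks
    return [key for key in remote_unit_keys if key not in already]


# Units advertised by the remote repo vs. units already associated locally.
remote = [("foo", "1.0", "x86_64"), ("bar", "2.1", "noarch"), ("baz", "0.3", "x86_64")]
associated = [("foo", "1.0", "x86_64"), ("bar", "2.1", "noarch")]

# Only the missing unit would trigger a db write.
pending = units_to_associate(remote, associated)
```

When nothing changed upstream, the list is empty and no association writes are issued at all.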

The first two are addressed by issue #2457.
Results depend on the setup, but in one case, a repo with 14K rpms and 3.5K errata, sync took ~20% less time than before with the improvements above: roughly 5% from item 1, another 5% from item 2, and 10% from item 3 (dropping `deepcopy`).
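The change itself is not shown in this issue, so the following is only a sketch of the general pattern, assuming `deepcopy` was used to clone a template structure for every package element. Building a fresh dict per package gives the same isolation without the recursive copy. The sample XML is simplified (real primary.xml uses namespaces), and `new_package_info` and `parse_packages` are hypothetical names.

```python
import io
import xml.etree.ElementTree as ET

# Simplified stand-in for primary.xml (no namespaces, two packages).
SAMPLE = b"""<metadata>
  <package type="rpm"><name>foo</name><arch>x86_64</arch></package>
  <package type="rpm"><name>bar</name><arch>noarch</arch></package>
</metadata>"""


def new_package_info():
    # A fresh dict (and fresh nested list) on every call, so nothing is
    # shared between packages and no deepcopy of a template is needed.
    return {"name": None, "arch": None, "files": []}


def parse_packages(stream):
    packages = []
    # iterparse yields each element once its closing tag has been read.
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag != "package":
            continue
        info = new_package_info()
        info["name"] = elem.findtext("name")
        info["arch"] = elem.findtext("arch")
        packages.append(info)
        elem.clear()  # drop the element's children once extracted
    return packages


packages = parse_packages(io.BytesIO(SAMPLE))
```

With tens of thousands of packages per repo, a per-element `deepcopy` is repeated that many times, which is consistent with the share of sync time reported above.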

NOTE: When one triggers sync via API call or CLI, the following happens:

  1. Sync task is scheduled and later retrieved from a queue by worker.
  2. Sync task is executed.
  3. By default auto-publish is enabled, so in most cases Publish task is scheduled and later retrieved from a queue by worker.
  4. Publish task is executed.

All the improvements described in this issue concern only step 2, the execution of the sync task.

Related issues

Related to RPM Support - Issue #2457: When syncing do not associate units that are already associated to the repo (CLOSED - CURRENTRELEASE, ttereshc)
Actions #1

Updated by ttereshc over 7 years ago

  • Related to Issue #2457: When syncing do not associate units that are already associated to the repo added
Actions #2

Updated by ttereshc over 7 years ago

  • Status changed from ASSIGNED to POST
Actions #3

Updated by mhrivnak over 7 years ago

  • Sprint/Milestone set to 30

Added by ttereshc over 7 years ago

Revision 95139912 | View on GitHub

Stop using deepcopy in primary.xml processing to speed up sync

closes #2466 https://pulp.plan.io/issues/2466

Actions #4

Updated by ttereshc over 7 years ago

  • Status changed from POST to MODIFIED
  • % Done changed from 0 to 100
Actions #5

Updated by ttereshc over 7 years ago

  • Platform Release changed from 2.8.7 to 2.10.4
Actions #6

Updated by semyers about 7 years ago

  • Platform Release changed from 2.10.4 to 2.11.1
Actions #7

Updated by semyers about 7 years ago

  • Status changed from MODIFIED to 5
Actions #8

Updated by semyers about 7 years ago

  • Status changed from 5 to CLOSED - CURRENTRELEASE
Actions #10

Updated by bmbouter about 6 years ago

  • Sprint set to Sprint 12
Actions #11

Updated by bmbouter about 6 years ago

  • Sprint/Milestone deleted (30)
Actions #12

Updated by bmbouter almost 5 years ago

  • Tags Pulp 2 added
