Issue #9243

closed

30-50% re-sync performance regression due to touch() of content and artifacts during sync

Added by dalley over 3 years ago. Updated over 3 years ago.

Status: CLOSED - CURRENTRELEASE
Priority: Normal
Category: -
Sprint/Milestone:
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Version:
Platform Release:
OS:
Triaged: No
Groomed: No
Sprint Candidate: No
Tags: Performance
Sprint:
Quarter:

Description

"Touching" existing content and artifacts individually has slowed re-syncs by 30-50%, depending on the size of the repo, because it performs (potentially) tens of thousands of extra queries. Copy is likely also severely impacted, but I did not test it.

Bulk touch functionality was added to pulpcore by [0], but we still have to make use of it to eliminate the slowdown.
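To illustrate why the per-object pattern costs so much, here is a minimal sketch that uses the standard-library `sqlite3` module rather than pulpcore itself. The `content` table, the `last_touched` column, and the function names are all hypothetical stand-ins, not pulpcore's schema or API. The point is only the difference in query count between one UPDATE per row and one UPDATE per chunk of ids.

```python
import sqlite3
from datetime import datetime, timezone


def make_db(n):
    """Create an in-memory table standing in for content rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE content (id INTEGER PRIMARY KEY, last_touched TEXT)")
    conn.executemany(
        "INSERT INTO content (id, last_touched) VALUES (?, NULL)",
        [(i,) for i in range(1, n + 1)],
    )
    conn.commit()
    return conn


def touch_one_by_one(conn, ids):
    """Issue one UPDATE per row, the pattern described as slow. Returns the query count."""
    queries = 0
    for pk in ids:
        now = datetime.now(timezone.utc).isoformat()
        conn.execute("UPDATE content SET last_touched = ? WHERE id = ?", (now, pk))
        queries += 1
    conn.commit()
    return queries


def touch_bulk(conn, ids, chunk_size=500):
    """Issue one UPDATE per chunk of ids. Returns the query count."""
    ids = list(ids)
    now = datetime.now(timezone.utc).isoformat()
    queries = 0
    # Chunking keeps the number of bound parameters per statement small.
    for start in range(0, len(ids), chunk_size):
        chunk = ids[start:start + chunk_size]
        placeholders = ",".join("?" * len(chunk))
        conn.execute(
            f"UPDATE content SET last_touched = ? WHERE id IN ({placeholders})",
            [now, *chunk],
        )
        queries += 1
    conn.commit()
    return queries


if __name__ == "__main__":
    conn = make_db(1200)
    print("per-row queries:", touch_one_by_one(conn, range(1, 1201)))
    print("bulk queries:", touch_bulk(conn, range(1, 1201)))
```

With tens of thousands of rows, the first function issues tens of thousands of statements while the second issues a few dozen, which is the kind of gap this issue is about.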

See the following spots where touch() is called in a loop:

Additionally, it would be good to eliminate the extra query here: https://github.com/pulp/pulpcore/blob/feede7bb08b1e3107766ce534433cb7867fe52bd/pulpcore/content/handler.py#L715

As this was a significant performance regression introduced in 3.14, we should consider backporting the fixes. The backport will need to include [0] as well, which will require a small portion of the refactor introduced in [1].

[0] https://pulp.plan.io/issues/9234

[1] https://github.com/pulp/pulpcore/pull/1528/files#diff-81f6a78175bb93934b6beff952646d3ca1ef3731f1ff14492d4ec77bfc3fdf82R195


Related issues

Related to Pulp - Story #9234: As a user, I want to `touch` content and artifacts in bulk (CLOSED - CURRENTRELEASE, mdellweg)
Related to Pulp - Issue #9266: repository 'modify' touches all the content being added one by one (CLOSED - CURRENTRELEASE, lmjachky)
Copied to Pulp - Backport #9264: Backport #9243 "30-50% re-sync performance regression due to touch() of content and artifacts during sync" to 3.14.z (CLOSED - CURRENTRELEASE)

Actions #1

Updated by dalley over 3 years ago

  • Related to Story #9234: As a user, I want to `touch` content and artifacts in bulk added
Actions #2

Updated by dalley over 3 years ago

  • Description updated (diff)
Actions #3

Updated by dalley over 3 years ago

  • Tracker changed from Task to Issue
  • Severity set to 2. Medium
  • Triaged set to No
Actions #4

Updated by dalley over 3 years ago

  • Description updated (diff)
Actions #5

Updated by pulpbot over 3 years ago

  • Status changed from NEW to POST
Actions #6

Updated by dalley over 3 years ago

  • Assignee set to dkliban@redhat.com
  • Sprint/Milestone set to 3.15.0
Actions #7

Updated by dalley over 3 years ago

  • Copied to Backport #9264: Backport #9243 "30-50% re-sync performance regression due to touch() of content and artifacts during sync" to 3.14.z added
Actions #8

Updated by dkliban@redhat.com over 3 years ago

  • Related to Issue #9266: repository 'modify' touches all the content being added one by one added

Added by dkliban@redhat.com over 3 years ago

Revision 7a41c18b

Use bulk touch() when processing Artifacts and Content

This patch only addresses the inefficiencies of the Stages API. Another patch is needed to address the inefficiency of the repository version modify operation.

closes: #9243

Actions #9

Updated by dkliban@redhat.com over 3 years ago

  • Status changed from POST to MODIFIED
Actions #10

Updated by pulpbot over 3 years ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE
