Issue #3814

closed

File Support - Issue #3770: Pulp 3 is about 2x slower than pulp 2 in syncing a large file repo

RepositoryVersion's add_content and remove_content do not perform bulk operations

Added by bmbouter over 6 years ago. Updated over 4 years ago.

Status: CLOSED - CURRENTRELEASE
Priority: Normal
Category: -
Severity: 2. Medium
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags: Performance

Description

Motivation

A cProfile report shows that a lot of time is spent in RepositoryVersion.add_content(), which interacts with the database one unit at a time. This is slow. We need to change the interface so that add_content and remove_content perform bulk operations.

Solution

1. Create a test in Python that adds/removes X content units to/from a repo version
2. Benchmark the test
3. Update add_content and remove_content to support lists of content
4. Benchmark the change

I'll probably use different values of X, starting with 1,000 and increasing by factors of 10.
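The benchmark plan above could be sketched roughly as follows. This is only an illustration: FakeRepoVersion is a hypothetical in-memory stand-in, not Pulp's RepositoryVersion, and the benchmark() helper and its return shape are assumptions made for this example. A real test would build content units and a repository version in the database instead.

```python
import time


class FakeRepoVersion:
    """Hypothetical in-memory stand-in for a repository version (not Pulp code)."""

    def __init__(self):
        self.content = set()

    def add_content(self, units):
        # Accepts a whole collection of units at once.
        self.content.update(units)

    def remove_content(self, units):
        self.content.difference_update(units)


def benchmark(sizes=(1_000, 10_000, 100_000)):
    """Time add_content/remove_content for each X in sizes.

    Returns a list of tuples:
    (x, units_after_add, units_after_remove, add_seconds, remove_seconds)
    """
    results = []
    for x in sizes:
        version = FakeRepoVersion()
        units = list(range(x))

        start = time.perf_counter()
        version.add_content(units)
        add_seconds = time.perf_counter() - start
        units_after_add = len(version.content)

        start = time.perf_counter()
        version.remove_content(units)
        remove_seconds = time.perf_counter() - start

        results.append(
            (x, units_after_add, len(version.content), add_seconds, remove_seconds)
        )
    return results


if __name__ == "__main__":
    for x, _, _, add_s, remove_s in benchmark():
        print(f"X={x}: add {add_s:.4f}s, remove {remove_s:.4f}s")
```

Running the same harness before and after the change (steps 2 and 4) would give directly comparable numbers for each value of X.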

Actions #1

Updated by CodeHeeler over 6 years ago

  • Triaged changed from No to Yes
Actions #2

Updated by daviddavis over 6 years ago

  • Description updated (diff)
  • Status changed from NEW to ASSIGNED
  • Assignee set to daviddavis
Actions #3

Updated by daviddavis over 6 years ago

I modified add_content() and remove_content() to accept querysets. Here are the initial results for adding/removing 1000 content units to/from a repo version:

add_content currently: 43.9s
add_content with bulk_create: 4.3s

remove_content currently: 44.4s
remove_content with a queryset: 0.5s
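To illustrate why the bulk approach is faster, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the Django ORM. It is not the code from the PR: the repo_content table and all function names are hypothetical. The per-unit version issues one statement and one commit per unit, while the bulk versions send a single batched insert (analogous to bulk_create) or a single DELETE that selects rows by condition (analogous to operating on a queryset).

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical association table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE repo_content (version_id INTEGER, content_id INTEGER)"
    )
    return conn


def add_one_by_one(conn, version_id, content_ids):
    # Unit-by-unit: one INSERT and one commit per content unit.
    for cid in content_ids:
        conn.execute(
            "INSERT INTO repo_content VALUES (?, ?)", (version_id, cid)
        )
        conn.commit()


def add_bulk(conn, version_id, content_ids):
    # Bulk: one batched insert and a single commit.
    conn.executemany(
        "INSERT INTO repo_content VALUES (?, ?)",
        [(version_id, cid) for cid in content_ids],
    )
    conn.commit()


def remove_bulk(conn, version_id, low, high):
    # Bulk: a single DELETE covering every matching row.
    conn.execute(
        "DELETE FROM repo_content "
        "WHERE version_id = ? AND content_id BETWEEN ? AND ?",
        (version_id, low, high),
    )
    conn.commit()


def count(conn, version_id):
    row = conn.execute(
        "SELECT COUNT(*) FROM repo_content WHERE version_id = ?",
        (version_id,),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = make_db()
    add_bulk(conn, 1, range(1000))
    remove_bulk(conn, 1, 0, 499)
    print(count(conn, 1))
```

Absolute timings with an in-memory SQLite database will differ from the figures above, which were measured against Pulp's own database.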

Actions #4

Updated by daviddavis over 6 years ago

  • Status changed from ASSIGNED to POST

Went ahead and opened a PR with the performance improvements:

https://github.com/pulp/pulp/pull/3548

Added by daviddavis over 6 years ago

Revision 9bfc50d9 | View on GitHub

Using querysets for add/remove_content methods

fixes #3814 https://pulp.plan.io/issues/3814

Actions #5

Updated by daviddavis over 6 years ago

  • Status changed from POST to MODIFIED
Actions #6

Updated by daviddavis over 5 years ago

  • Sprint/Milestone set to 3.0.0
Actions #7

Updated by bmbouter over 5 years ago

  • Tags deleted (Pulp 3)
Actions #8

Updated by bmbouter almost 5 years ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE
Actions #9

Updated by bmbouter over 4 years ago

  • Tags Performance added
  • Tags deleted (Sync Performance)
