Issue #3814
closed
File Support - Issue #3770: Pulp 3 is about 2x slower than pulp 2 in syncing a large file repo
RepositoryVersion's add_content and remove_content do not perform bulk operations
Description
Motivation
A cProfile report shows that a lot of time is being spent in RepositoryVersion.add_content(), which interacts with the database one unit at a time. This is slow. We need to improve the interface so that add_content and remove_content perform bulk operations.
Solution
1. Create a test in python that adds/removes X number of content units to a repo version
2. Benchmark the test
3. Update add_content and remove_content to support lists of content
4. Benchmark the change
I'll probably use different values of X, starting with 1,000 and increasing by factors of 10.
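The benchmark plan above could be sketched roughly as follows. This is only an illustration, not the test written for this issue: it uses the standard-library sqlite3 module as a stand-in for Pulp's database layer, and the repo_content table and the add_one_by_one, add_in_bulk, and benchmark helpers are hypothetical names. An in-memory SQLite database has no network round trips, so the gap it shows will likely be smaller than what a real Pulp deployment sees.

```python
import sqlite3
import time


def make_db():
    """Create an in-memory database with a hypothetical membership table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE repo_content (version_id INTEGER, content_id INTEGER)")
    return conn


def add_one_by_one(conn, version_id, content_ids):
    """Unit-by-unit pattern: one INSERT statement per content unit."""
    for cid in content_ids:
        conn.execute("INSERT INTO repo_content VALUES (?, ?)", (version_id, cid))
    conn.commit()


def add_in_bulk(conn, version_id, content_ids):
    """Bulk pattern: hand the whole batch to the driver in one call."""
    conn.executemany(
        "INSERT INTO repo_content VALUES (?, ?)",
        [(version_id, cid) for cid in content_ids],
    )
    conn.commit()


def benchmark(add_func, x):
    """Return (seconds, rows stored) for adding x units with add_func."""
    conn = make_db()
    start = time.perf_counter()
    add_func(conn, 1, range(x))
    elapsed = time.perf_counter() - start
    (count,) = conn.execute("SELECT COUNT(*) FROM repo_content").fetchone()
    conn.close()
    return elapsed, count


if __name__ == "__main__":
    # X starts at 1,000 and grows by factors of 10, as proposed above.
    for x in (1_000, 10_000, 100_000):
        for func in (add_one_by_one, add_in_bulk):
            seconds, count = benchmark(func, x)
            print(f"{func.__name__:>15} X={x:>7}: {seconds:.4f}s ({count} rows)")
```

Running the same harness before and after changing the implementation covers steps 2 and 4 with identical inputs.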
Updated by daviddavis over 6 years ago
- Description updated (diff)
- Status changed from NEW to ASSIGNED
- Assignee set to daviddavis
Updated by daviddavis over 6 years ago
I modified add_content() and remove_content() to accept querysets. Here are the initial results for adding/removing 1,000 content units to/from a repo version:
add_content currently: 43.9s
add_content with bulk_create: 4.3s
remove_content currently: 44.4s
remove_content with a queryset: 0.5s
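To illustrate why passing a queryset can be so much faster for removal, here is a minimal sketch of a set-based removal, again with sqlite3 standing in for the real database. The schema (content, repo_content, a version_removed column) and the remove_content_by_kind helper are assumptions made for this example and are not taken from the PR. The point is that the selection of units stays inside the database as a subquery, so a single statement replaces one statement per unit.

```python
import sqlite3


def make_repo(conn, total):
    """Populate a hypothetical schema with `total` units, half of kind 'file'."""
    conn.executescript(
        """
        CREATE TABLE content (id INTEGER PRIMARY KEY, kind TEXT);
        CREATE TABLE repo_content (content_id INTEGER, version_removed INTEGER);
        """
    )
    units = [(i, "file" if i % 2 == 0 else "iso") for i in range(1, total + 1)]
    conn.executemany("INSERT INTO content VALUES (?, ?)", units)
    # Every unit starts out present in the repo (version_removed is NULL).
    conn.executemany(
        "INSERT INTO repo_content VALUES (?, NULL)", [(i,) for i, _ in units]
    )
    conn.commit()


def remove_content_by_kind(conn, version_id, kind):
    """Mark all matching units as removed with a single UPDATE statement."""
    cur = conn.execute(
        """
        UPDATE repo_content
           SET version_removed = ?
         WHERE version_removed IS NULL
           AND content_id IN (SELECT id FROM content WHERE kind = ?)
        """,
        (version_id, kind),
    )
    conn.commit()
    return cur.rowcount


def count_present(conn):
    """Count units that are still part of the repo."""
    (n,) = conn.execute(
        "SELECT COUNT(*) FROM repo_content WHERE version_removed IS NULL"
    ).fetchone()
    return n


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    make_repo(conn, 1000)
    removed = remove_content_by_kind(conn, version_id=2, kind="file")
    print(f"removed {removed} units, {count_present(conn)} still present")
```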
Updated by daviddavis over 6 years ago
- Status changed from ASSIGNED to POST
Went ahead and opened a PR with the performance improvements:
Added by daviddavis over 6 years ago
Revision 9bfc50d9 | View on GitHub
Using querysets for add/remove_content methods
Updated by daviddavis over 6 years ago
- Status changed from POST to MODIFIED
Applied in changeset pulp|9bfc50d90a24c9d0ac4a93f5718187515b947058.
Updated by bmbouter almost 5 years ago
- Status changed from MODIFIED to CLOSED - CURRENTRELEASE
Updated by bmbouter over 4 years ago
- Tags Performance added
- Tags deleted (Sync Performance)
Using querysets for add/remove_content methods
fixes #3814 https://pulp.plan.io/issues/3814