Issue #4247
closed
Improve performance of uploading ISO
Description
Add a search criterion that filters out unneeded units, to improve the performance of the "find_repo_content_units" call in pulp_rpm/plugins/importers/iso/importer.py.
We hit a serious performance problem when uploading units (ISOs) to a repository as the data volume grew larger and larger. After applying the change[1], performance improved. In addition, some of our internal functional tests were run against the change, and they pass. We use Pulp 2.15.
Updated by Zhiming about 6 years ago
Steps to reproduce the issue:
1. Prepare test data: upload 30,000 small ISOs to Pulp.
2. Upload one more ISO to Pulp and record the elapsed time. (It takes around 16 seconds in my test environment. After applying the fix, it takes around 2-3 seconds.)
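For anyone repeating the measurement in step 2, a minimal timing sketch follows. The helper time_call and the placeholder fake_upload are hypothetical names introduced here for illustration; they are not part of Pulp. Replace fake_upload with whatever call or command performs the upload in your environment.

```python
from timeit import default_timer  # available on both Python 2.7 and Python 3
import time


def time_call(func, *args, **kwargs):
    """Return (result, elapsed_seconds) for a single call to func."""
    start = default_timer()
    result = func(*args, **kwargs)
    return result, default_timer() - start


def fake_upload(path):
    # Hypothetical placeholder for the real upload; it only sleeps briefly.
    time.sleep(0.01)
    return path


result, elapsed = time_call(fake_upload, "test.iso")
print("upload of %s took %.3f s" % (result, elapsed))
```

Running the same measurement before and after the fix, with the same 30,000 ISOs already in the repository, keeps the two numbers comparable.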
Updated by Zhiming about 6 years ago
Analysis of the performance issue.
Consider the following piece of code from “pulp/server/controllers/repository.py”:
def find_repo_content_units(..., repo_content_unit_q=None, ...):
    ......
    qs = model.RepositoryContentUnit.objects(q_obj=repo_content_unit_q,
                                             repo_id=repository.repo_id)
    ......
    for repo_content_unit in qs:
        id_set = type_map.setdefault(repo_content_unit.unit_type_id, set())
        id_set.add(repo_content_unit.unit_id)
        content_unit_set = content_units.setdefault(repo_content_unit.unit_type_id, dict())
        content_unit_set[repo_content_unit.unit_id] = repo_content_unit
    ......
Because the parameter “repo_content_unit_q” is “None” (the caller does not pass a value for it), "qs = model.RepositoryContentUnit.objects(q_obj=repo_content_unit_q, repo_id=repository.repo_id)" fetches every record of the repo from repo_content_units in MongoDB and then copies them into Python dictionaries and sets.
Profiling shows that this piece takes more than 80% of the total time of uploading a unit (there are more than 30,000 units in the repo). The main reason is that the result set “qs” is too large. Worse, “qs” keeps growing as more units are uploaded, so performance keeps getting worse.
So passing a search criterion[1] as "repo_content_unit_q" to filter out the units that are not needed improves performance.
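The effect can be illustrated with a small in-memory model of the loop above. This is only a sketch under stated assumptions: Association, build_unit_maps and unit_filter are hypothetical names standing in for the repo_content_units records, the loop in find_repo_content_units, and repo_content_unit_q. It is not the actual change[1], and in Pulp the filter would be applied by MongoDB rather than in Python.

```python
from collections import namedtuple

# Hypothetical stand-in for one record in repo_content_units.
Association = namedtuple("Association", ["repo_id", "unit_type_id", "unit_id"])


def build_unit_maps(associations, repo_id, unit_filter=None):
    """Mimic the map-building loop over an in-memory list of records.

    unit_filter plays the role of repo_content_unit_q: when it is None,
    every association of the repository is copied into the maps.
    """
    type_map = {}
    content_units = {}
    visited = 0
    for assoc in associations:
        if assoc.repo_id != repo_id:
            continue
        if unit_filter is not None and not unit_filter(assoc):
            continue
        visited += 1
        type_map.setdefault(assoc.unit_type_id, set()).add(assoc.unit_id)
        content_units.setdefault(assoc.unit_type_id, {})[assoc.unit_id] = assoc
    return type_map, content_units, visited


# 30,000 existing ISO associations, as in the reproduction steps.
rows = [Association("repo-a", "iso", "unit-%d" % i) for i in range(30000)]

# No filter: all 30,000 records end up in the maps.
_, _, visited_all = build_unit_maps(rows, "repo-a")

# With a filter: only the units relevant to the current upload are kept.
wanted = {"unit-29999"}
type_map, _, visited_few = build_unit_maps(
    rows, "repo-a", unit_filter=lambda a: a.unit_id in wanted
)
print(visited_all, visited_few)
```

In this model the unfiltered call handles every record in the repository, while the filtered call handles only the requested ones, which mirrors why a narrower "qs" stops the cost from growing with repository size.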
Updated by ttereshc about 6 years ago
- Status changed from NEW to POST
- Triaged changed from No to Yes
Added by Zhiming about 6 years ago
Updated by bmbouter almost 6 years ago
- Status changed from POST to CLOSED - WONTFIX
Pulp 2 is approaching maintenance mode, and this Pulp 2 ticket is not being actively worked on. As such, it is being closed as WONTFIX. Pulp 2 is still accepting contributions though, so if you want to contribute a fix for this ticket, please reopen or comment on it. If you don't have permissions to reopen this ticket, or you want to discuss an issue, please reach out via the developer mailing list.
Improve performance of uploading ISO
Add a search criterion that filters out unneeded units, to improve the performance of "find_repo_content_units".
ref #4247 https://pulp.plan.io/issues/4247