Issue #7564
closedExcess number of suboptimal queries when querying collections list
Description
Request to /pulp/api/v3/content/ansible/collection_versions
(see Request) produces 6 + 2N SQL queries, where N is a number of collection version in response.
This is highly inefficient and may result in a high request latency in production environments. For Automation Hub it results in 1.8-4.2 seconds per request.
Queries list includes 2 queries that runs SELECT DISTINCT
over all fields and includes explicit list of UUIDs of all certified collections in the IN
clause.
Request:¶
GET /pulp/api/v3/content/ansible/collection_versions/?is_highest=True&offset=0&limit=12&exclude_fields=docs_blob&certification=certified&deprecated=false
Queries:¶
- Queries list:
https://gist.github.com/cutwater/70f6adbe68d88cec0df14141d508f5cf
- Explain analyze of query executed against the production database:
To be added later in comments.
Versions tested:¶
pulpcore==3.5.0 + pulp-ansible==0.2.0b15
pulpcore==3.7.0 + pulp-ansible==0.4.0
Actions required:¶
- Eliminate 2N queries.
- Optimize
SELECT DISTINCT
or remove. - Remove explicit list of all certified collections in the
IN
clause. Replace with a subquery if needed.
Files
Updated by fao89 about 4 years ago
the distinct comes from MultipleChoiceFilter: https://django-filter.readthedocs.io/en/stable/ref/filters.html?highlight=distinct#multiplechoicefilter
Updated by osapryki about 4 years ago
This is a partial dump of SQL query generated on CI environment which has 5290 collection version records in the database.
Updated by osapryki about 4 years ago
The previous dump is generated on API call to /api/automation-hub/v3/collections/
which calls pulp_ansible.app.galaxy.v3.CollectionViewSet
Updated by osapryki about 4 years ago
Updated by fao89 almost 4 years ago
- Status changed from NEW to CLOSED - DUPLICATE
addressed on #8012