Actions
Issue #4505
closedSlow syncs on large repositories
Start date:
Due date:
Estimated time:
Severity:
2. Medium
Version:
Platform Release:
OS:
Triaged:
Yes
Groomed:
No
Sprint Candidate:
No
Tags:
Sprint:
Sprint 50
Quarter:
Description
When syncing a large repository like http://mirror.math.princeton.edu/pub/fedora/linux/releases/28/, the sync is very slow. After 40 minutes only ~5000 rpms have been processed as seen by the associate progress report. (this was with an on_demand repo)
I turned on slow sql logging and see:
- cat /var/lib/pgsql/data/pg_log/postgresql-Tue.log | grep duration | cut -c 1-100
LOG: parameter "log_min_duration_statement" changed to "200"
LOG: duration: 36385.807 ms statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id
LOG: duration: 36158.991 ms statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id
LOG: duration: 37753.231 ms statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id
LOG: duration: 38638.415 ms statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id
LOG: duration: 41236.294 ms statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id
LOG: duration: 41611.244 ms statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id
LOG: duration: 41071.438 ms statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id
LOG: duration: 41803.944 ms statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id
LOG: duration: 41617.086 ms statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id
LOG: duration: 41983.358 ms statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id
LOG: duration: 44623.680 ms statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id
LOG: duration: 44442.117 ms statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id
These are popping up about every minute and are likely a big source of the problem. Each one is very large:
cat /var/lib/pgsql/data/pg_log/postgresql-Tue.log | grep duration | tail -n1| cut -c 1-1000
LOG: duration: 44922.590 ms statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id" = 2 WHERE ("pulp_app_repositorycontent"."content_id" IN (SELECT U0."content_ptr_id" FROM "rpm_package" U0 WHERE ((U0."arch" = 'x86_64' AND U0."epoch" = '0' AND U0."name" = '0ad' AND U0."release" = '5.fc28' AND U0."version" = '0.0.22' AND NOT (U0."content_ptr_id" = 51)) OR (U0."arch" = 'noarch' AND U0."epoch" = '0' AND U0."name" = '0ad-data' AND U0."release" = '2.fc28' AND U0."version" = '0.0.22' AND NOT (U0."content_ptr_id" = 52)) OR (U0."arch" = 'x86_64' AND U0."epoch" = '0' AND U0."name" = '0install' AND U0."release" = '1.fc27' AND U0."version" = '2.12.1' AND NOT (U0."content_ptr_id" = 53)) OR (U0."arch" = 'x86_64' AND U0."epoch" = '0' AND U0."name" = '0xFFFF' AND U0."release" = '15.fc26' AND U0."version" = '0.3.9' AND NOT (U0."content_ptr_id" = 54)) OR (U0."arch" = 'x86_64' AND U0."epoch" = '0' AND U0."name" = '2048-cli' AND U0."release" = '5.fc28' AND U0."version" = '0.9.1' AND NOT
cat /var/lib/pgsql/data/pg_log/postgresql-Tue.log | grep duration | tail -n1| wc -c
1147415
I'm not sure if the size of the query is the problem or how it is structured.
Related issues
Actions