Project

Profile

Help

Issue #4505

Slow syncs on large repositories

Added by jsherril@redhat.com 8 months ago. Updated 6 months ago.

Status:
MODIFIED
Priority:
High
Assignee:
Category:
-
Sprint/Milestone:
Start date:
Due date:
Severity:
2. Medium
Version:
Platform Release:
Blocks Release:
OS:
Backwards Incompatible:
No
Triaged:
Yes
Groomed:
No
Sprint Candidate:
No
Tags:
QA Contact:
Complexity:
Smash Test:
Verified:
No
Verification Required:
No
Sprint:
Sprint 50

Description

When syncing a large repository like http://mirror.math.princeton.edu/pub/fedora/linux/releases/28/, the sync is very slow. After 40 minutes only ~5000 rpms have been processed as seen by the associate progress report. (this was with an on_demand repo)

I turned on slow sql logging and see:

  1. cat /var/lib/pgsql/data/pg_log/postgresql-Tue.log | grep duration | cut -c 1-100
    LOG: parameter "log_min_duration_statement" changed to "200"
    LOG: duration: 36385.807 ms statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id
    LOG: duration: 36158.991 ms statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id
    LOG: duration: 37753.231 ms statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id
    LOG: duration: 38638.415 ms statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id
    LOG: duration: 41236.294 ms statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id
    LOG: duration: 41611.244 ms statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id
    LOG: duration: 41071.438 ms statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id
    LOG: duration: 41803.944 ms statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id
    LOG: duration: 41617.086 ms statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id
    LOG: duration: 41983.358 ms statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id
    LOG: duration: 44623.680 ms statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id
    LOG: duration: 44442.117 ms statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id

These are popping up about every minute and are likely a big source of the problem. Each one is very large:

cat /var/lib/pgsql/data/pg_log/postgresql-Tue.log | grep duration |  tail -n1| cut -c 1-1000

LOG:  duration: 44922.590 ms  statement: UPDATE "pulp_app_repositorycontent" SET "version_removed_id" = 2 WHERE ("pulp_app_repositorycontent"."content_id" IN (SELECT U0."content_ptr_id" FROM "rpm_package" U0 WHERE ((U0."arch" = 'x86_64' AND U0."epoch" = '0' AND U0."name" = '0ad' AND U0."release" = '5.fc28' AND U0."version" = '0.0.22' AND NOT (U0."content_ptr_id" = 51)) OR (U0."arch" = 'noarch' AND U0."epoch" = '0' AND U0."name" = '0ad-data' AND U0."release" = '2.fc28' AND U0."version" = '0.0.22' AND NOT (U0."content_ptr_id" = 52)) OR (U0."arch" = 'x86_64' AND U0."epoch" = '0' AND U0."name" = '0install' AND U0."release" = '1.fc27' AND U0."version" = '2.12.1' AND NOT (U0."content_ptr_id" = 53)) OR (U0."arch" = 'x86_64' AND U0."epoch" = '0' AND U0."name" = '0xFFFF' AND U0."release" = '15.fc26' AND U0."version" = '0.3.9' AND NOT (U0."content_ptr_id" = 54)) OR (U0."arch" = 'x86_64' AND U0."epoch" = '0' AND U0."name" = '2048-cli' AND U0."release" = '5.fc28' AND U0."version" = '0.9.1' AND NOT

cat /var/lib/pgsql/data/pg_log/postgresql-Tue.log | grep duration | tail -n1| wc -c
1147415

I'm not sure if the size of the query is the problem or how it is structured.


Related issues

Related to RPM Support - Test #4292: Add Travis job to nightly sync EPEL7 repo CLOSED - COMPLETE Actions

Associated revisions

Revision 802bcc4f View on GitHub
Added by dalley 7 months ago

Fix serious performance issue w/ large syncs

closes #4505
https://pulp.plan.io/issues/4505

History

#1 Updated by daviddavis 8 months ago

  • Sprint/Milestone set to 3.0

#2 Updated by jsherril@redhat.com 8 months ago

In addition, the sync failed after about 2.5 hours with the following:

"description": "stack depth limit exceeded\nHINT:  Increase the configuration parameter \"max_stack_depth\" (currently 2048kB), after ensuring the platform's stack depth limit is adequate.\n", 
"traceback": " File \"/usr/local/lib/pulp/lib64/python3.6/site-packages/rq/worker.py\", line 799, in perform_job\n rv = job.perform()\n File \"/usr/local/lib/pulp/lib64/python3.6/site-packages/rq/job.py\", line 600, in perform\n self._result = self._execute()\n File \"/usr/local/lib/pulp/lib64/python3.6/site-packages/rq/job.py\", line 606, in _execute\n return self.func(*self.args, **self.kwargs)\n File \"/home/vagrant/nfs/pulp_rpm/pulp_rpm/app/tasks/synchronizing.py\", line 67, in synchronize\n dv.create()\n File \"/home/vagrant/devel/pulpcore-plugin/pulpcore/plugin/stages/declarative_version.py\", line 169, in create\n loop.run_until_complete(pipeline)\n File \"/usr/lib64/python3.6/asyncio/base_events.py\", line 468, in run_until_complete\n return future.result()\n File \"/home/vagrant/devel/pulpcore-plugin/pulpcore/plugin/stages/api.py\", line 209, in create_pipeline\n await asyncio.gather(*futures)\n File \"/home/vagrant/devel/pulpcore-plugin/pulpcore/plugin/stages/api.py\", line 43, in call\n await self.run()\n File \"/home/vagrant/devel/pulpcore-plugin/pulpcore/plugin/stages/association_stages.py\", line 130, in run\n self.new_version.remove_content(queryset_to_unassociate)\n File \"/home/vagrant/nfs/pulp/pulpcore/app/models/repository.py\", line 399, in remove_content\n q_set.update(version_removed=self)\n File \"/usr/local/lib/pulp/lib64/python3.6/site-packages/django/db/models/query.py\", line 693, in update\n rows = query.get_compiler(self.db).execute_sql(CURSOR)\n File \"/usr/local/lib/pulp/lib64/python3.6/site-packages/django/db/models/sql/compiler.py\", line 1383, in execute_sql\n cursor = super().execute_sql(result_type)\n File \"/usr/local/lib/pulp/lib64/python3.6/site-packages/django/db/models/sql/compiler.py\", line 1065, in execute_sql\n cursor.execute(sql, params)\n File \"/usr/local/lib/pulp/lib64/python3.6/site-packages/django/db/backends/utils.py\", line 68, in execute\n return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)\n File \"/usr/local/lib/pulp/lib64/python3.6/site-packages/django/db/backends/utils.py\", line 77, in _execute_with_wrappers\n return executor(sql, params, many, context)\n File \"/usr/local/lib/pulp/lib64/python3.6/site-packages/django/db/backends/utils.py\", line 85, in _execute\n return self.cursor.execute(sql, params)\n File \"/usr/local/lib/pulp/lib64/python3.6/site-packages/django/db/utils.py\", line 89, in exit\n raise dj_exc_value.with_traceback(traceback) from exc_value\n File \"/usr/local/lib/pulp/lib64/python3.6/site-packages/django/db/backends/utils.py\", line 85, in _execute\n return self.cursor.execute(sql, params)\n"
},

#3 Updated by CodeHeeler 8 months ago

  • Priority changed from Normal to High
  • Triaged changed from No to Yes
  • Sprint set to Sprint 50

#4 Updated by dkliban@redhat.com 7 months ago

Looking at the models.py0, I don't see any indexes defined for the fields being used in that query. We should probably update the models with indexes for each individual field and the combination of them all together.

[0] https://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/models.py

#5 Updated by kersom 7 months ago

  • Tags Pulp 3 added

#6 Updated by jsherril@redhat.com 7 months ago

I added a bunch of indexes and didn't notice much improvement:

"foobar" btree (name, epoch, version, release, arch, checksum_type, "pkgId", content_ptr_id)

should fit the query, but IANADBA

In addition, it appears that the query seems to increase by about 8K characters at every iteration:

  1. tail /var/lib/pgsql/data/pg_log/postgresql-Mon.log | awk '{ print length }'
    1052291
    1061237
    1069792
    1078413
    1086990
    1095609
    1104472
    1113154
    1121694
    1130256

#7 Updated by dalley 7 months ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to dalley

#8 Updated by dalley 7 months ago

"unique_together" creates an index, so the indexes should already be there.

Here's a reformatted version of that query

UPDATE "pulp_app_repositorycontent" SET "version_removed_id" = 2 
WHERE ("pulp_app_repositorycontent"."content_id" IN 
    (SELECT U0."content_ptr_id" FROM "rpm_package" U0 WHERE (
        (U0."arch" = 'x86_64' AND U0."epoch" = '0' AND U0."name" = '0ad' AND U0."release" = '5.fc28' AND U0."version" = '0.0.22' AND NOT (U0."content_ptr_id" = 51)) OR
        (U0."arch" = 'noarch' AND U0."epoch" = '0' AND U0."name" = '0ad-data' AND U0."release" = '2.fc28' AND U0."version" = '0.0.22' AND NOT (U0."content_ptr_id" = 52)) OR 
        (U0."arch" = 'x86_64' AND U0."epoch" = '0' AND U0."name" = '0install' AND U0."release" = '1.fc27' AND U0."version" = '2.12.1' AND NOT (U0."content_ptr_id" = 53)) OR
        (U0."arch" = 'x86_64' AND U0."epoch" = '0' AND U0."name" = '0xFFFF' AND U0."release" = '15.fc26' AND U0."version" = '0.3.9' AND NOT (U0."content_ptr_id" = 54)) OR
        (U0."arch" = 'x86_64' AND U0."epoch" = '0' AND U0."name" = '2048-cli' AND U0."release" = '5.fc28' AND U0."version" = '0.9.1' AND NOT

#9 Updated by daviddavis 7 months ago

  • Sprint/Milestone deleted (3.0)
  • Tags Pulp 3 RC Blocker added

#10 Updated by dalley 7 months ago

  • Status changed from ASSIGNED to POST

#11 Updated by dalley 7 months ago

  • Status changed from POST to MODIFIED

#12 Updated by daviddavis 6 months ago

  • Sprint/Milestone set to 3.0

#13 Updated by bmbouter 6 months ago

  • Tags deleted (Pulp 3, Pulp 3 RC Blocker)

#14 Updated by kersom 6 months ago

  • Related to Test #4292: Add Travis job to nightly sync EPEL7 repo added

Please register to edit this issue

Also available in: Atom PDF