Issue #9398
closed
No optimized syncs after a change to repo's remote?
Description
My environment:
pulpcore - 3.14.5
pulp-rpm - 3.14.2
RHEL 8
I am fighting a weird performance issue. After a change to a remote that is attached to some repository (for example, when a client certificate is changed), all further sync attempts for that repository involve reloading and parsing the repository metadata, even when no new content is available and no new repository version is created afterwards. This repeats for every subsequent sync attempt of that repo and is a significant resource hog (e.g. RHEL7 repos carry 30K+ packages). A sample task with progress reports looks like this:
{
  "pulp_href": "/pulp/api/v3/tasks/a9ec4ba1-a804-483d-b9ae-02874a8f19ad/",
  "pulp_created": "2021-09-15T08:43:25.974820Z",
  "state": "running",
  "name": "pulp_rpm.app.tasks.synchronizing.synchronize",
  "logging_cid": "4bd0de408bce49f2932790a4c6075e5e",
  "started_at": "2021-09-15T09:03:54.732570Z",
  "finished_at": null,
  "error": null,
  "worker": "/pulp/api/v3/workers/0c1e8308-37a8-43e3-8a9b-fb91ebdc18b6/",
  "parent_task": null,
  "child_tasks": [],
  "task_group": null,
  "progress_reports": [
    {
      "message": "Downloading Metadata Files",
      "code": "sync.downloading.metadata",
      "state": "completed",
      "total": null,
      "done": 11,
      "suffix": null
    },
    {
      "message": "Downloading Artifacts",
      "code": "sync.downloading.artifacts",
      "state": "running",
      "total": null,
      "done": 0,
      "suffix": null
    },
    {
      "message": "Associating Content",
      "code": "associating.content",
      "state": "running",
      "total": null,
      "done": 0,
      "suffix": null
    },
    {
      "message": "Parsed Modulemd",
      "code": "sync.parsing.modulemds",
      "state": "completed",
      "total": 374,
      "done": 374,
      "suffix": null
    },
    {
      "message": "Parsed Modulemd-defaults",
      "code": "sync.parsing.modulemd_defaults",
      "state": "completed",
      "total": 45,
      "done": 45,
      "suffix": null
    },
    {
      "message": "Parsed Packages",
      "code": "sync.parsing.packages",
      "state": "completed",
      "total": 19122,
      "done": 19122,
      "suffix": null
    },
    {
      "message": "Parsed Advisories",
      "code": "sync.parsing.advisories",
      "state": "completed",
      "total": 1566,
      "done": 1566,
      "suffix": null
    },
    {
      "message": "Parsed Comps",
      "code": "sync.parsing.comps",
      "state": "completed",
      "total": 67,
      "done": 67,
      "suffix": null
    }
  ],
  "created_resources": [
    null
  ],
  "reserved_resources_record": [
    "/pulp/api/v3/remotes/rpm/rpm/9b11994b-f7ea-4df3-a7ed-094f4b183a6c/",
    "/pulp/api/v3/repositories/rpm/rpm/c610ead3-8517-47c4-9ed8-d9dc9cc731ee/"
  ]
},
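To tell from a task record whether a sync re-parsed metadata, one can look at the parsing progress reports. The following is a minimal sketch, assuming the task JSON has the shape shown above and assuming that a non-zero `done` count in any `sync.parsing.*` report means the metadata was parsed again. The function name `reparsed_metadata` is made up for illustration and is not part of Pulp.

```python
import json


def reparsed_metadata(task: dict) -> dict:
    """Return {code: done} for every sync.parsing.* report with done > 0."""
    parsed = {}
    for report in task.get("progress_reports", []):
        code = report.get("code") or ""
        done = report.get("done") or 0
        if code.startswith("sync.parsing.") and done > 0:
            parsed[code] = done
    return parsed


# A trimmed-down copy of the task above (only the fields the helper reads).
sample = json.loads("""
{
  "progress_reports": [
    {"code": "sync.downloading.metadata", "state": "completed", "done": 11},
    {"code": "sync.parsing.packages", "state": "completed", "total": 19122, "done": 19122},
    {"code": "sync.parsing.advisories", "state": "completed", "total": 1566, "done": 1566}
  ]
}
""")

print(reparsed_metadata(sample))
# {'sync.parsing.packages': 19122, 'sync.parsing.advisories': 1566}
```

An empty result would suggest the sync was skipped early; a populated one, as here, suggests the full metadata was processed.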
Updated by sskracic@redhat.com over 3 years ago
Forgot to add: I am using mirrored syncing, if that's relevant.
Updated by dalley over 3 years ago
@Sebastian do you think it could be some variant of https://pulp.plan.io/issues/9402, which rejects optimization because the "revision" at the newer remote is older than that of the previous remote?
See: https://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/tasks/synchronizing.py#L332-L339
Updated by dalley over 3 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to dalley
Updated by dalley over 3 years ago
Updated by sskracic@redhat.com over 3 years ago
It all makes sense now. Once there is genuinely new content available, and a new repository version is created, I'm getting optimized syncs afterwards.
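This behaviour would be consistent with an optimization check that compares timestamps. A rough sketch, assuming the full sync is skipped only when the remote was last modified no later than the latest repository version was created. The class, field, and function names below are illustrative and not taken from pulp_rpm.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Remote:
    last_updated: datetime  # assumed: bumped on ANY change, e.g. a new client cert


@dataclass
class RepoVersion:
    created: datetime


def can_skip_sync(remote: Remote, latest_version: RepoVersion) -> bool:
    """Assumed check: the remote must not be newer than the latest version."""
    return remote.last_updated <= latest_version.created


t0 = datetime(2021, 9, 1)
version = RepoVersion(created=t0)

# The certificate is rotated one day after the last repository version.
remote = Remote(last_updated=t0 + timedelta(days=1))
print(can_skip_sync(remote, version))      # False, and stays False on every retry

# New upstream content arrives and a new repository version is created.
new_version = RepoVersion(created=t0 + timedelta(days=2))
print(can_skip_sync(remote, new_version))  # True
```

Under that assumption, a sync that produces no new repository version never moves the comparison forward, which is why every later sync is unoptimized until real content shows up.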
Updated by dalley over 3 years ago
Great! I think you're correct in thinking that it is being over-aggressive. Most properties of the remote do not matter in terms of sync optimization, only a very few such as URL and download policy. But it's hard to keep track of changes to just those few properties, and track successful vs. unsuccessful syncs, etc.
Would you say that this is still an issue for you even if it's not an "issue" per se? We could leave it open but not immediately prioritize it.
Updated by sskracic@redhat.com over 3 years ago
Strangely enough, it became an issue for a RHUI deployment. The thing is, the entitlement certificate for RHUI (which is supplied in the client_cert of the remote) is often revoked, sometimes on a daily basis, and is then replaced by a fresh one, obtained through a cron job. The remotes are then updated accordingly. Now a sync of 1200+ repos, instead of taking a minute or two when there is no new content, takes several hours and drains significant network (as all metadata is mirrored) and CPU resources. Six hours later, the story repeats all over again.
I would suggest a less stringent approach, or maybe a configuration parameter.
Updated by dalley over 3 years ago
> The thing is, entitlement certificate for RHUI (which is supplied in the client_cert of the remote) is often revoked, sometimes on a daily basis, and is then replaced by a fresh one, obtained through a cron job.
Ok, yup, I see how that could be an issue.
Updated by dalley over 3 years ago
As per discussion with Katello, they might also be subject to this problem (with frequent cert rotations) and simply haven't noticed it yet.
The long-term fix will likely require modelling changes. A potential short-term fix for RHUI only would be a downstream, RHUI-specific patch that eliminates this particular check, but the cost/benefit of that should be negotiated with the build team. This would work only because RHUI never modifies the download policy or URL of existing remotes, which we cannot assume in the general case.
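One possible shape for such a modelling change, sketched under the assumption that only the URL and download policy affect sync results: record a fingerprint of just those fields after each successful sync and compare it on the next one. `SYNC_RELEVANT_FIELDS` and `sync_fingerprint` are hypothetical names, and the remote is represented as a plain dict rather than a Pulp model.

```python
import hashlib
import json

# Assumption: these are the only remote fields that influence what a sync produces.
SYNC_RELEVANT_FIELDS = ("url", "policy")


def sync_fingerprint(remote: dict) -> str:
    """Hash only the sync-relevant fields, ignoring e.g. client_cert."""
    relevant = {field: remote.get(field) for field in SYNC_RELEVANT_FIELDS}
    payload = json.dumps(relevant, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


before = {"url": "https://cdn.example.com/repo/", "policy": "on_demand", "client_cert": "OLD"}
after_cert = dict(before, client_cert="NEW")
after_url = dict(before, url="https://mirror.example.com/repo/")

print(sync_fingerprint(before) == sync_fingerprint(after_cert))  # True
print(sync_fingerprint(before) == sync_fingerprint(after_url))   # False
```

With a stored fingerprint, a certificate rotation would leave the optimization intact, while a URL or policy change would still force a full sync.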
Updated by dalley over 3 years ago
- Status changed from ASSIGNED to NEW
- Assignee deleted (dalley)
Updated by dalley over 3 years ago
- Triaged changed from No to Yes
- Quarter set to Q4-2021
Updated by dalley about 3 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to dalley
Updated by rchan about 3 years ago
- Sprint changed from Sprint 106 to Sprint 107
Updated by pulpbot about 3 years ago
- Status changed from ASSIGNED to POST
Added by dalley about 3 years ago
Updated by dalley about 3 years ago
- Status changed from POST to MODIFIED
Applied in changeset 6550a7448c5458852c31790089124647bdfa1db2.
Updated by pulpbot about 3 years ago
- Status changed from MODIFIED to CLOSED - CURRENTRELEASE
Fix sync optimization when remote changes
closes: #9398 https://pulp.plan.io/issues/9398