Issue #9398

closed

No optimized syncs after a change to repo's remote?

Added by sskracic@redhat.com about 1 year ago. Updated 12 months ago.

Status: CLOSED - CURRENTRELEASE
Priority: Normal
Assignee:
Sprint/Milestone:
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Version: Master
Platform Release:
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags:
Sprint: Sprint 107
Quarter: Q4-2021

Description

My environment:

pulpcore - 3.14.5
pulp-rpm - 3.14.2
RHEL 8

I am fighting a weird performance issue. After a change to a remote that is attached to a repository (for example, when a client certificate is changed), every further sync attempt for that repository reloads and parses the repository metadata, even when no new content is available and no new repository version is created afterwards. This repeats on every subsequent sync of that repo and is a significant resource hog (e.g. RHEL7 repos carry 30K+ packages). A sample task with progress reports looks like this:

      {
            "pulp_href": "/pulp/api/v3/tasks/a9ec4ba1-a804-483d-b9ae-02874a8f19ad/",
            "pulp_created": "2021-09-15T08:43:25.974820Z",
            "state": "running",
            "name": "pulp_rpm.app.tasks.synchronizing.synchronize",
            "logging_cid": "4bd0de408bce49f2932790a4c6075e5e",
            "started_at": "2021-09-15T09:03:54.732570Z",
            "finished_at": null,
            "error": null,
            "worker": "/pulp/api/v3/workers/0c1e8308-37a8-43e3-8a9b-fb91ebdc18b6/",
            "parent_task": null,
            "child_tasks": [],
            "task_group": null,
            "progress_reports": [
                {
                    "message": "Downloading Metadata Files",
                    "code": "sync.downloading.metadata",
                    "state": "completed",
                    "total": null,
                    "done": 11,
                    "suffix": null
                },
                {
                    "message": "Downloading Artifacts",
                    "code": "sync.downloading.artifacts",
                    "state": "running",
                    "total": null,
                    "done": 0,
                    "suffix": null
                },
                {
                    "message": "Associating Content",
                    "code": "associating.content",
                    "state": "running",
                    "total": null,
                    "done": 0,
                    "suffix": null
                },
                {
                    "message": "Parsed Modulemd",
                    "code": "sync.parsing.modulemds",
                    "state": "completed",
                    "total": 374,
                    "done": 374,
                    "suffix": null
                },
                {
                    "message": "Parsed Modulemd-defaults",
                    "code": "sync.parsing.modulemd_defaults",
                    "state": "completed",
                    "total": 45,
                    "done": 45,
                    "suffix": null
                },
                {
                    "message": "Parsed Packages",
                    "code": "sync.parsing.packages",
                    "state": "completed",
                    "total": 19122,
                    "done": 19122,
                    "suffix": null
                },
                {
                    "message": "Parsed Advisories",
                    "code": "sync.parsing.advisories",
                    "state": "completed",
                    "total": 1566,
                    "done": 1566,
                    "suffix": null
                },
                {
                    "message": "Parsed Comps",
                    "code": "sync.parsing.comps",
                    "state": "completed",
                    "total": 67,
                    "done": 67,
                    "suffix": null
                }
            ],
            "created_resources": [
                null
            ],
            "reserved_resources_record": [
                "/pulp/api/v3/remotes/rpm/rpm/9b11994b-f7ea-4df3-a7ed-094f4b183a6c/",
                "/pulp/api/v3/repositories/rpm/rpm/c610ead3-8517-47c4-9ed8-d9dc9cc731ee/"
            ]
        },

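For anyone wanting to spot affected syncs from the task API output, here is a minimal sketch that sums the "done" counters of the parsing progress reports. It assumes that a sync which parsed any metadata items was not optimized; the helper names (`parsed_item_count`, `looks_unoptimized`) are made up for illustration and are not part of Pulp.

```python
import json

# Prefix shared by the parsing progress report codes in the sample task.
PARSING_PREFIX = "sync.parsing."


def parsed_item_count(task: dict) -> int:
    """Sum the 'done' counters of all parsing progress reports in a task."""
    return sum(
        report.get("done") or 0
        for report in task.get("progress_reports", [])
        if report.get("code", "").startswith(PARSING_PREFIX)
    )


def looks_unoptimized(task: dict) -> bool:
    """Heuristic (an assumption): any parsed item means the sync was not skipped."""
    return parsed_item_count(task) > 0


# Trimmed copy of the sample task, keeping only the fields used here.
sample_task = json.loads("""
{
  "name": "pulp_rpm.app.tasks.synchronizing.synchronize",
  "progress_reports": [
    {"code": "sync.downloading.metadata", "done": 11},
    {"code": "sync.parsing.modulemds", "done": 374},
    {"code": "sync.parsing.modulemd_defaults", "done": 45},
    {"code": "sync.parsing.packages", "done": 19122},
    {"code": "sync.parsing.advisories", "done": 1566},
    {"code": "sync.parsing.comps", "done": 67}
  ]
}
""")

print(parsed_item_count(sample_task))
print(looks_unoptimized(sample_task))
```

For the sample task this counts 21174 parsed items, so the heuristic flags it as a full (non-optimized) sync.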

Actions #1

Updated by sskracic@redhat.com about 1 year ago

I forgot to add that I am using mirrored syncing, in case that's relevant.

Actions #2

Updated by dalley about 1 year ago

@Sebastian do you think it could be some variant of https://pulp.plan.io/issues/9402, which rejects optimization because the "revision" at the newer remote is older than that of the previous remote?

See: https://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/tasks/synchronizing.py#L332-L339

Actions #3

Updated by dalley about 1 year ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to dalley
Actions #5

Updated by sskracic@redhat.com about 1 year ago

It all makes sense now. Once there is genuinely new content available, and a new repository version is created, I'm getting optimized syncs afterwards.

Actions #6

Updated by dalley about 1 year ago

Great! I think you're correct in thinking that it is being over-aggressive. Most properties of the remote do not matter in terms of sync optimization, only a few, such as the URL and download policy. But it's hard to keep track of changes to just those few properties, and to track successful vs. unsuccessful syncs, etc.

Would you say that this is still an issue for you even if it's not an "issue" per se? We could leave it open but not immediately prioritize it.
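To illustrate the difference between the two approaches, here is a rough sketch. It is not the pulp_rpm implementation: `RemoteState`, `SyncDetails` and both functions are hypothetical, and the "strict" rule is only an assumption that matches the behaviour reported in this issue (optimization comes back once a new repository version is created). It also covers only the remote-change part of the decision, not the checks on the upstream metadata itself.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class RemoteState:
    """Hypothetical, simplified stand-in for a remote (not the Pulp model)."""
    url: str
    policy: str          # download policy, e.g. "immediate" or "on_demand"
    client_cert: str
    last_updated: datetime


@dataclass(frozen=True)
class SyncDetails:
    """Hypothetical record of the remote properties used by the last successful sync."""
    url: str
    policy: str


def strict_can_optimize(remote: RemoteState, last_version_created: datetime) -> bool:
    """Assumed over-aggressive rule: any remote update made after the latest
    repository version was created disables optimization."""
    return remote.last_updated <= last_version_created


def relaxed_can_optimize(remote: RemoteState, last_sync: Optional[SyncDetails]) -> bool:
    """Relaxed rule: only a changed URL or download policy disables optimization."""
    if last_sync is None:
        return False  # no successful sync recorded yet
    return (remote.url, remote.policy) == (last_sync.url, last_sync.policy)


# A certificate rotation six hours after the latest repository version was created.
version_created = datetime(2021, 9, 1, 12, 0)
remote = RemoteState(
    url="https://cdn.example.com/repo/",  # placeholder URL
    policy="on_demand",
    client_cert="rotated-cert",
    last_updated=version_created + timedelta(hours=6),
)
last_sync = SyncDetails(url="https://cdn.example.com/repo/", policy="on_demand")

print(strict_can_optimize(remote, version_created))  # False
print(relaxed_can_optimize(remote, last_sync))       # True
```

The relaxed variant needs somewhere to store what the last successful sync used, which is the bookkeeping problem mentioned above.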

Actions #7

Updated by sskracic@redhat.com about 1 year ago

Strangely enough, it became an issue for a RHUI deployment. The thing is, the entitlement certificate for RHUI (which is supplied in the client_cert of the remote) is often revoked, sometimes on a daily basis, and is then replaced by a fresh one obtained through a cron job. The remotes are then updated accordingly. Now a sync of 1200+ repos, instead of taking a minute or two when there is no new content, takes several hours and drains significant network (as all metadata is mirrored) and CPU resources. Six hours later, the story repeats all over again.

I would suggest a less stringent approach, or maybe a configuration parameter.

Actions #8

Updated by dalley about 1 year ago

> The thing is, the entitlement certificate for RHUI (which is supplied in the client_cert of the remote) is often revoked, sometimes on a daily basis, and is then replaced by a fresh one obtained through a cron job.

Ok, yup, I see how that could be an issue.

Actions #9

Updated by dalley about 1 year ago

As per discussion with Katello, they might also be subject to this problem (with frequent cert rotations) and simply haven't noticed it yet.

The long-term fix will likely require modelling changes. A potential short term fix for RHUI only would be to have a downstream patch specific to RHUI that eliminates this specific check, but the cost/benefit of that should be negotiated with the build team. This would work only because RHUI does not ever modify the download policy or URL of existing remotes, which we cannot assume in the general case.

Actions #10

Updated by dalley about 1 year ago

  • Status changed from ASSIGNED to NEW
  • Assignee deleted (dalley)
Actions #11

Updated by dalley about 1 year ago

  • Triaged changed from No to Yes
  • Quarter set to Q4-2021
Actions #12

Updated by dalley about 1 year ago

  • Sprint set to Sprint 105
Actions #13

Updated by dalley about 1 year ago

  • Sprint deleted (Sprint 105)
Actions #14

Updated by dalley about 1 year ago

  • Sprint set to Sprint 106
Actions #15

Updated by dalley about 1 year ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to dalley
Actions #17

Updated by rchan about 1 year ago

  • Sprint changed from Sprint 106 to Sprint 107
Actions #18

Updated by pulpbot about 1 year ago

  • Status changed from ASSIGNED to POST
Actions #19

Updated by dalley about 1 year ago

  • Sprint/Milestone set to 3.16.0

Added by dalley about 1 year ago

Revision 6550a744

Fix sync optimization when remote changes

closes: #9398 https://pulp.plan.io/issues/9398

Actions #20

Updated by dalley about 1 year ago

  • Status changed from POST to MODIFIED
Actions #21

Updated by pulpbot about 1 year ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE
