Issue #9398

No optimized syncs after a change to repo's remote?

Added by sskracic@redhat.com 1 day ago. Updated about 1 hour ago.

Status: NEW
Priority: Normal
Assignee: -
Sprint/Milestone: -
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Version: Master
Platform Release:
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags:
Sprint:
Quarter: Q4-2021

Description

My environment:

pulpcore - 3.14.5
pulp-rpm - 3.14.2
RHEL 8

I am fighting a weird performance issue. After a change to a remote that is attached to a repository (for example, when a client certificate is changed), every subsequent sync attempt for that repository involves re-downloading and re-parsing the repository metadata, even when no new content is available and no new repository version is created afterwards. This happens on every subsequent sync of that repo and is a significant resource hog (e.g., RHEL 7 repos carry 30K+ packages). A sample task with progress reports looks like this:

      {
            "pulp_href": "/pulp/api/v3/tasks/a9ec4ba1-a804-483d-b9ae-02874a8f19ad/",
            "pulp_created": "2021-09-15T08:43:25.974820Z",
            "state": "running",
            "name": "pulp_rpm.app.tasks.synchronizing.synchronize",
            "logging_cid": "4bd0de408bce49f2932790a4c6075e5e",
            "started_at": "2021-09-15T09:03:54.732570Z",
            "finished_at": null,
            "error": null,
            "worker": "/pulp/api/v3/workers/0c1e8308-37a8-43e3-8a9b-fb91ebdc18b6/",
            "parent_task": null,
            "child_tasks": [],
            "task_group": null,
            "progress_reports": [
                {
                    "message": "Downloading Metadata Files",
                    "code": "sync.downloading.metadata",
                    "state": "completed",
                    "total": null,
                    "done": 11,
                    "suffix": null
                },
                {
                    "message": "Downloading Artifacts",
                    "code": "sync.downloading.artifacts",
                    "state": "running",
                    "total": null,
                    "done": 0,
                    "suffix": null
                },
                {
                    "message": "Associating Content",
                    "code": "associating.content",
                    "state": "running",
                    "total": null,
                    "done": 0,
                    "suffix": null
                },
                {
                    "message": "Parsed Modulemd",
                    "code": "sync.parsing.modulemds",
                    "state": "completed",
                    "total": 374,
                    "done": 374,
                    "suffix": null
                },
                {
                    "message": "Parsed Modulemd-defaults",
                    "code": "sync.parsing.modulemd_defaults",
                    "state": "completed",
                    "total": 45,
                    "done": 45,
                    "suffix": null
                },
                {
                    "message": "Parsed Packages",
                    "code": "sync.parsing.packages",
                    "state": "completed",
                    "total": 19122,
                    "done": 19122,
                    "suffix": null
                },
                {
                    "message": "Parsed Advisories",
                    "code": "sync.parsing.advisories",
                    "state": "completed",
                    "total": 1566,
                    "done": 1566,
                    "suffix": null
                },
                {
                    "message": "Parsed Comps",
                    "code": "sync.parsing.comps",
                    "state": "completed",
                    "total": 67,
                    "done": 67,
                    "suffix": null
                }
            ],
            "created_resources": [
                null
            ],
            "reserved_resources_record": [
                "/pulp/api/v3/remotes/rpm/rpm/9b11994b-f7ea-4df3-a7ed-094f4b183a6c/",
                "/pulp/api/v3/repositories/rpm/rpm/c610ead3-8517-47c4-9ed8-d9dc9cc731ee/"
            ]
        },
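
For reference, this is roughly how the behaviour can be reproduced against the REST API. This is a hedged sketch, not taken from my actual scripts: the base URL, credentials, and certificate path are placeholders, and the hrefs are the ones from the task above.

    # Reproduction sketch (placeholders throughout): rotate the remote's
    # client_cert, re-run a mirrored sync, then dump the task's progress reports.
    import time
    import requests

    BASE = "https://pulp.example.com"          # placeholder
    AUTH = ("admin", "password")               # placeholder
    REMOTE = "/pulp/api/v3/remotes/rpm/rpm/9b11994b-f7ea-4df3-a7ed-094f4b183a6c/"
    REPO = "/pulp/api/v3/repositories/rpm/rpm/c610ead3-8517-47c4-9ed8-d9dc9cc731ee/"

    def wait(task_href):
        """Poll a task until it reaches a final state and return its body."""
        while True:
            task = requests.get(BASE + task_href, auth=AUTH).json()
            if task["state"] in ("completed", "failed", "canceled"):
                return task
            time.sleep(2)

    # 1. Rotate the client certificate on the remote (remote updates are async
    #    in pulpcore, so the response carries a task href).
    new_cert = open("/etc/pki/rhui/client.crt").read()   # placeholder path
    resp = requests.patch(BASE + REMOTE, json={"client_cert": new_cert}, auth=AUTH)
    wait(resp.json()["task"])

    # 2. Trigger a mirrored sync even though upstream content has not changed.
    resp = requests.post(BASE + REPO + "sync/", json={"mirror": True}, auth=AUTH)
    task = wait(resp.json()["task"])

    # 3. Expected: an optimized (skipped) sync. Observed: full metadata download
    #    and parsing, yet no new repository version among created_resources.
    for report in task["progress_reports"]:
        print(report["code"], report["state"], report["done"])
    print("created_resources:", task["created_resources"])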


History

#1 Updated by sskracic@redhat.com 1 day ago

Forgot to add that I am using mirrored syncing, in case that's relevant.

#2 Updated by dalley about 11 hours ago

@Sebastian do you think it could be some variant of https://pulp.plan.io/issues/9402, which rejects optimization because the "revision" at the newer remote is older than that of the previous remote?

See: https://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/tasks/synchronizing.py#L332-L339
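
For readers without the source handy, the guard at those lines is roughly of the following shape. This is a paraphrase with illustrative names, not the actual pulp_rpm code; the point is that any edit to the remote, including a mere certificate swap, makes the first condition fail.

    # Rough paraphrase (illustrative names only) of the kind of check that
    # decides whether a sync may be skipped.
    def may_skip_sync(repo_version, remote, new_revision, last_revision):
        # Any edit to the remote (URL, policy, or merely a rotated client_cert)
        # bumps remote.pulp_last_updated past the last repository version, so
        # this condition fails and a full metadata download/parse is forced.
        remote_unchanged = remote.pulp_last_updated <= repo_version.pulp_created
        # Guard against upstream "going backwards": an older repomd revision on
        # the new remote also rejects optimization (see #9402).
        revision_not_older = int(new_revision) >= int(last_revision)
        return remote_unchanged and revision_not_older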

#3 Updated by dalley about 11 hours ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to dalley

#5 Updated by sskracic@redhat.com about 9 hours ago

It all makes sense now. Once genuinely new content is available and a new repository version is created, I get optimized syncs afterwards.

#6 Updated by dalley about 4 hours ago

Great! I think you're correct in thinking that it is being over-aggressive. Most properties of the remote do not matter for sync optimization; only a few, such as the URL and download policy, do. But it's hard to keep track of changes to just those properties, to track successful vs. unsuccessful syncs, etc.

Would you say that this is still an issue for you even if it's not an "issue" per se? We could leave it open but not immediately prioritize it.
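
To illustrate the kind of modelling change that might help (purely a sketch of one possible approach, with made-up names, not an agreed design): record a fingerprint of only the sync-relevant remote fields and compare that, instead of relying on the remote's last-updated timestamp.

    # Hypothetical sketch: hash only the remote fields that affect what a sync
    # would fetch, store the digest with the last sync, and treat the remote as
    # "unchanged" when the digest matches.
    import hashlib
    import json

    SYNC_RELEVANT_FIELDS = ("url", "policy")  # certs, proxies, timeouts excluded

    def remote_fingerprint(remote):
        relevant = {f: getattr(remote, f) for f in SYNC_RELEVANT_FIELDS}
        return hashlib.sha256(
            json.dumps(relevant, sort_keys=True).encode()
        ).hexdigest()

    def remote_effectively_unchanged(remote, last_sync_fingerprint):
        # Rotating client_cert/client_key would not change this digest, so an
        # optimized (skipped) sync would still be allowed after a cert swap.
        return remote_fingerprint(remote) == last_sync_fingerprint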

#7 Updated by sskracic@redhat.com about 3 hours ago

Strangely enough, it became an issue for our RHUI deployment. The thing is, the entitlement certificate for RHUI (which is supplied in the client_cert of the remote) is often revoked, sometimes on a daily basis, and is then replaced by a fresh one obtained through a cron job. The remotes are then updated accordingly. A sync of 1200+ repos, which takes a minute or two when there is no new content, now runs for several hours, draining significant network (as all metadata is mirrored) and CPU resources. Six hours later, the story repeats all over again.

I would suggest a less stringent approach, or maybe a configuration parameter.

#8 Updated by dalley about 3 hours ago

> The thing is, the entitlement certificate for RHUI (which is supplied in the client_cert of the remote) is often revoked, sometimes on a daily basis, and is then replaced by a fresh one obtained through a cron job.

Ok, yup, I see how that could be an issue.

#9 Updated by dalley about 2 hours ago

As per discussion with Katello, they might also be subject to this problem (with frequent cert rotations) and simply haven't noticed it yet.

The long-term fix will likely require modelling changes. A potential short-term fix would be a downstream patch, specific to RHUI, that eliminates this particular check, but the cost/benefit of that should be negotiated with the build team. It would work only because RHUI never modifies the download policy or URL of existing remotes, which we cannot assume in the general case.

#10 Updated by dalley about 2 hours ago

  • Status changed from ASSIGNED to NEW
  • Assignee deleted (dalley)

#11 Updated by dalley about 1 hour ago

  • Triaged changed from No to Yes
  • Quarter set to Q4-2021
