Issue #8981

Syncing mirrorlist based remote fails

Added by cityofships 3 months ago. Updated 2 months ago.

Status: CLOSED - CURRENTRELEASE
Priority: Normal
Assignee:
Sprint/Milestone:
Start date:
Due date:
Estimated time:
Severity: 3. High
Version: Master
Platform Release:
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags:
Sprint: Sprint 100
Quarter:

Description

Steps to reproduce: create a remote with a mirrorlist URL and try to sync it. The sync fails; see this paste: https://paste.centos.org/view/eeff7112


Related issues

Copied to RPM Support - Backport #9026: Backport 8981 (Syncing mirrorlist based remote fails) to 3.13.3 (CLOSED - CURRENTRELEASE)

Associated revisions

Revision 440155ba View on GitHub
Added by ggainey 3 months ago

Taught get_repomd_file() to recognize that some URLs come with parameters.

Necessary to drive the ability to recognize/respond to mirrorlist URLs.

fixes #8981
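
The commit message above says get_repomd_file() had to learn that some URLs carry parameters. The following is a minimal sketch of that idea, not the actual pulp_rpm change: the helper name `append_path_keep_query` and the example.com URLs are hypothetical, and only the standard library's urllib.parse is used. It splits off the query string, appends the relative path to the path component, and then reattaches the query.

```python
from urllib.parse import urlsplit, urlunsplit


def append_path_keep_query(base_url, relative_path):
    """Append relative_path to the path of base_url, preserving any query string.

    Hypothetical helper for illustration only; not the pulp_rpm implementation.
    """
    parts = urlsplit(base_url)
    # Join on exactly one slash, whether or not the base path ends with one.
    new_path = parts.path.rstrip("/") + "/" + relative_path.lstrip("/")
    return urlunsplit(
        (parts.scheme, parts.netloc, new_path, parts.query, parts.fragment)
    )


# URL with parameters: the query stays at the end, after the appended path.
print(append_path_keep_query("https://example.com/repo?token=abc", "repodata/repomd.xml"))
# URL without parameters: behaves like a plain path join.
print(append_path_keep_query("https://example.com/repo/", "repodata/repomd.xml"))
```

With this approach the parameters survive the join instead of ending up in the middle of the path.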

History

#1 Updated by ggainey 3 months ago

Moving reproducer here before the pastebin expires:

 pulp rpm remote create --name epel-mirror --url "https://mirrors.fedoraproject.org/mirrorlist?repo=epel-8&arch=x86_64&infra=stock&content=centos"
 pulp rpm repository create --name epel-mirror --remote epel-mirror --autopublish
 pulp rpm repository sync --name epel-mirror --mirror


failure log:
pulp [375c8cbc96ae40cebff0b113b206dd70]: pulp_rpm.app.tasks.synchronizing:INFO: Synchronizing: repository=epel-mirror remote=epel-mirror
pulp [f79a9dcd7b53489c85d3bd3296cb82dd]: 127.0.0.1 - admin [28/Jun/2021:18:16:10 +0000] "GET /pulp/api/v3/tasks/2fa8ea54-d7f3-4a70-a8fb-ec8574c0d89d/ HTTP/1.1" 200 655 "-" "python-requests/2.25.1"
cr_xml_parser_generic: parsing error '/var/lib/pulp/tmp/17561@pulp2-nightly-pulp3-source-centos7.padre-fedora.example.com/2fa8ea54-d7f3-4a70-a8fb-ec8574c0d89d/tmp6qtbrxs1': Document is empty

pulp [375c8cbc96ae40cebff0b113b206dd70]: rq.worker:ERROR: Traceback (most recent call last):
File "/usr/local/lib/pulp/lib64/python3.6/site-packages/rq/worker.py", line 1013, in perform_job
rv = job.perform()
File "/usr/local/lib/pulp/lib64/python3.6/site-packages/rq/job.py", line 709, in perform
self._result = self._execute()
File "/usr/local/lib/pulp/lib64/python3.6/site-packages/rq/job.py", line 732, in _execute
result = self.func(*self.args, **self.kwargs)
File "/home/vagrant/devel/pulp_rpm/pulp_rpm/app/tasks/synchronizing.py", line 340, in synchronize
if optimize and is_optimized_sync(repository, remote, remote_url):
File "/home/vagrant/devel/pulp_rpm/pulp_rpm/app/tasks/synchronizing.py", line 265, in is_optimized_sync
repomd = cr.Repomd(repomd_path)
File "/usr/local/lib/pulp/lib64/python3.6/site-packages/createrepo_c/__init__.py", line 155, in __init__
xml_parse_repomd(path, self)
File "/usr/local/lib/pulp/lib64/python3.6/site-packages/createrepo_c/__init__.py", line 346, in xml_parse_repomd
return _createrepo_c.xml_parse_repomd(path, repomdobj, warningcb)
createrepo_c.CreaterepoCError: Parse error '/var/lib/pulp/tmp/17561@pulp2-nightly-pulp3-source-centos7.padre-fedora.example.com/2fa8ea54-d7f3-4a70-a8fb-ec8574c0d89d/tmp6qtbrxs1' at line: 1 (Document is empty
)
pulp [44eeb4dec48c40979879b5971b9e6ff2]: 127.0.0.1 - admin [28/Jun/2021:18:16:11 +0000] "GET /pulp/api/v3/tasks/2fa8ea54-d7f3-4a70-a8fb-ec8574c0d89d/ HTTP/1.1" 200 1943 "-" "python-requests/2.25.1"

#2 Updated by ggainey 3 months ago

The problem was introduced when we fixed the misuse of urljoin in commit fd130b. Workaround patch:

(master) ~/github/Pulp3/pulp_rpm $ git diff
diff --git a/pulp_rpm/app/tasks/synchronizing.py b/pulp_rpm/app/tasks/synchronizing.py
index d6d62e3b..e1ce7337 100644
--- a/pulp_rpm/app/tasks/synchronizing.py
+++ b/pulp_rpm/app/tasks/synchronizing.py
@@ -186,7 +186,9 @@ def get_repomd_file(remote, url):
         pulpcore.plugin.download.DownloadResult: downloaded repomd.xml
 
     """
-    downloader = remote.get_downloader(url=urlpath_sanitize(url, "repodata/repomd.xml"))
+    from urllib.parse import urljoin
+    downloader = remote.get_downloader(url=urljoin(url, "repodata/repomd.xml"))
+    # downloader = remote.get_downloader(url=urlpath_sanitize(url, "repodata/repomd.xml"))
 
     try:
         result = downloader.fetch()
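
For context on why swapping between urljoin and a path-appending helper changes the resulting URL, the sketch below shows how the standard library's urljoin treats different base URLs. It only illustrates urljoin semantics. The example.com URLs are hypothetical, and `naive_append` is a made-up stand-in for a purely string-based join; it is not pulp_rpm's urlpath_sanitize, whose behavior is not shown in this thread.

```python
from urllib.parse import urljoin


def naive_append(base_url, relative_path):
    """Hypothetical string-based join that is unaware of query strings."""
    return base_url.rstrip("/") + "/" + relative_path.lstrip("/")


# Base ends with a slash: the relative path is appended.
with_slash = urljoin("https://example.com/repo/", "repodata/repomd.xml")
# -> https://example.com/repo/repodata/repomd.xml

# Base lacks a trailing slash: the last path segment ("repo") is replaced.
without_slash = urljoin("https://example.com/repo", "repodata/repomd.xml")
# -> https://example.com/repodata/repomd.xml

# Base carries parameters: urljoin resolves the path and drops the base query.
with_query = urljoin("https://example.com/repo/?token=abc", "repodata/repomd.xml")
# -> https://example.com/repo/repodata/repomd.xml

# A string-based join instead glues the path onto the end of the query.
naive_query = naive_append("https://example.com/repo/?token=abc", "repodata/repomd.xml")
# -> https://example.com/repo/?token=abc/repodata/repomd.xml

for url in (with_slash, without_slash, with_query, naive_query):
    print(url)
```

The last case is the kind of malformed URL that a join unaware of parameters can produce.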

#3 Updated by ggainey 3 months ago

Description of what happened here:

#4 Updated by cityofships 3 months ago

I'm guessing the mirrorlist scenario is not covered by any CI tests?

#5 Updated by dalley 3 months ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to ggainey
  • Sprint set to Sprint 99

#6 Updated by ggainey 3 months ago

cityofships wrote:

I'm guessing the mirrorlist scenario is not covered by any CI tests?

Plain mirrorlist URLs were covered, but not mirrorlist URLs with parameters. That case is covered as of https://github.com/pulp/pulp_rpm/pull/2032

#7 Updated by dalley 3 months ago

  • Status changed from ASSIGNED to POST
  • Triaged changed from No to Yes
  • Sprint changed from Sprint 99 to Sprint 100

#9 Updated by ggainey 3 months ago

  • Status changed from POST to MODIFIED

#10 Updated by dalley 3 months ago

  • Copied to Backport #9026: Backport 8981 (Syncing mirrorlist based remote fails) to 3.13.3 added

#11 Updated by dalley 3 months ago

  • Sprint/Milestone set to 3.13.3

#12 Updated by dalley 2 months ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE
