Issue #4424
closed
Pulp does not provide a descriptive error message for RPM repos with invalid metadata
Description
Pulp 3 does not handle RPM repos with invalid metadata.
When syncing a repository that's missing its ``filelists.xml`` file (like the "missing file list" test repo), the task fails with a bare 404 error.
Traceback:
("Task report /pulp/api/v3/tasks/213/ contains a error: {'code': None, "
'\'description\': "404, message=\'Not Found\'", \'traceback\': \' File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py", line 799, '
'in perform_job\\n rv = job.perform()\\n File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/job.py", line 600, in '
'perform\\n self._result = self._execute()\\n File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/job.py", line 606, in '
'_execute\\n return self.func(*self.args, **self.kwargs)\\n File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulp_rpm/app/tasks/synchronizing.py", '
'line 79, in synchronize\\n loop.run_until_complete(pipeline)\\n File '
'"/usr/lib64/python3.7/asyncio/base_events.py", line 584, in '
'run_until_complete\\n return future.result()\\n File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulpcore/plugin/stages/api.py", '
'line 209, in create_pipeline\\n await asyncio.gather(*futures)\\n File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulpcore/plugin/stages/api.py", '
'line 43, in __call__\\n await self.run()\\n File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulp_rpm/app/tasks/synchronizing.py", '
'line 234, in run\\n results = downloader.result()\\n File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulpcore/plugin/download/base.py", '
'line 212, in run\\n return await self._run(extra_data=extra_data)\\n '
'File "/usr/local/lib/pulp/lib64/python3.7/site-packages/backoff/_async.py", '
'line 131, in retry\\n ret = await target(*args, **kwargs)\\n File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulpcore/plugin/download/http.py", '
'line 183, in _run\\n response.raise_for_status()\\n File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/aiohttp/client_reqrep.py", '
"line 942, in raise_for_status\\n headers=self.headers)\\n'}\n"
"Full task report: {'_href': '/pulp/api/v3/tasks/213/', '_created': "
"'2018-12-21T00:58:21.054706Z', 'job_id': "
"'5d967c5e-4903-43e9-a171-9fbad97bae9e', 'state': 'failed', 'name': "
"'pulp_rpm.app.tasks.synchronizing.synchronize', 'started_at': "
"'2018-12-21T00:58:21.130936Z', 'finished_at': '2018-12-21T00:58:21.589230Z', "
'\'non_fatal_errors\': [], \'error\': {\'code\': None, \'description\': "404, '
'message=\'Not Found\'", \'traceback\': \' File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py", line 799, '
'in perform_job\\n rv = job.perform()\\n File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/job.py", line 600, in '
'perform\\n self._result = self._execute()\\n File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/job.py", line 606, in '
'_execute\\n return self.func(*self.args, **self.kwargs)\\n File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulp_rpm/app/tasks/synchronizing.py", '
'line 79, in synchronize\\n loop.run_until_complete(pipeline)\\n File '
'"/usr/lib64/python3.7/asyncio/base_events.py", line 584, in '
'run_until_complete\\n return future.result()\\n File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulpcore/plugin/stages/api.py", '
'line 209, in create_pipeline\\n await asyncio.gather(*futures)\\n File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulpcore/plugin/stages/api.py", '
'line 43, in __call__\\n await self.run()\\n File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulp_rpm/app/tasks/synchronizing.py", '
'line 234, in run\\n results = downloader.result()\\n File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulpcore/plugin/download/base.py", '
'line 212, in run\\n return await self._run(extra_data=extra_data)\\n '
'File "/usr/local/lib/pulp/lib64/python3.7/site-packages/backoff/_async.py", '
'line 131, in retry\\n ret = await target(*args, **kwargs)\\n File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulpcore/plugin/download/http.py", '
'line 183, in _run\\n response.raise_for_status()\\n File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/aiohttp/client_reqrep.py", '
"line 942, in raise_for_status\\n headers=self.headers)\\n'}, 'worker': "
"'/pulp/api/v3/workers/1/', 'parent': None, 'spawned_tasks': [], "
"'progress_reports': [{'message': 'Downloading and Parsing Metadata', "
"'state': 'failed', 'total': 2, 'done': 2, 'suffix': '', 'task': "
"'/pulp/api/v3/tasks/213/'}, {'message': 'Downloading Artifacts', 'state': "
"'canceled', 'total': None, 'done': 0, 'suffix': '', 'task': "
"'/pulp/api/v3/tasks/213/'}, {'message': 'Associating Content', 'state': "
"'canceled', 'total': None, 'done': 0, 'suffix': '', 'task': "
"'/pulp/api/v3/tasks/213/'}], 'created_resources': []}",
{'_created': '2018-12-21T00:58:21.054706Z',
'_href': '/pulp/api/v3/tasks/213/',
'created_resources': [],
'error': {'code': None,
'description': "404, message='Not Found'",
'traceback': ' File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py", '
'line 799, in perform_job\n'
' rv = job.perform()\n'
' File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/job.py", '
'line 600, in perform\n'
' self._result = self._execute()\n'
' File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/job.py", '
'line 606, in _execute\n'
' return self.func(*self.args, **self.kwargs)\n'
' File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulp_rpm/app/tasks/synchronizing.py", '
'line 79, in synchronize\n'
' loop.run_until_complete(pipeline)\n'
' File '
'"/usr/lib64/python3.7/asyncio/base_events.py", line '
'584, in run_until_complete\n'
' return future.result()\n'
' File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulpcore/plugin/stages/api.py", '
'line 209, in create_pipeline\n'
' await asyncio.gather(*futures)\n'
' File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulpcore/plugin/stages/api.py", '
'line 43, in __call__\n'
' await self.run()\n'
' File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulp_rpm/app/tasks/synchronizing.py", '
'line 234, in run\n'
' results = downloader.result()\n'
' File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulpcore/plugin/download/base.py", '
'line 212, in run\n'
' return await self._run(extra_data=extra_data)\n'
' File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/backoff/_async.py", '
'line 131, in retry\n'
' ret = await target(*args, **kwargs)\n'
' File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulpcore/plugin/download/http.py", '
'line 183, in _run\n'
' response.raise_for_status()\n'
' File '
'"/usr/local/lib/pulp/lib64/python3.7/site-packages/aiohttp/client_reqrep.py", '
'line 942, in raise_for_status\n'
' headers=self.headers)\n'},
'finished_at': '2018-12-21T00:58:21.589230Z',
'job_id': '5d967c5e-4903-43e9-a171-9fbad97bae9e',
'name': 'pulp_rpm.app.tasks.synchronizing.synchronize',
'non_fatal_errors': [],
'parent': None,
'progress_reports': [{'done': 2,
'message': 'Downloading and Parsing Metadata',
'state': 'failed',
'suffix': '',
'task': '/pulp/api/v3/tasks/213/',
'total': 2},
{'done': 0,
'message': 'Downloading Artifacts',
'state': 'canceled',
'suffix': '',
'task': '/pulp/api/v3/tasks/213/',
'total': None},
{'done': 0,
'message': 'Associating Content',
'state': 'canceled',
'suffix': '',
'task': '/pulp/api/v3/tasks/213/',
'total': None}],
'spawned_tasks': [],
'started_at': '2018-12-21T00:58:21.130936Z',
'state': 'failed',
'worker': '/pulp/api/v3/workers/1/'})
The same happens with the following test repos:
missing other
missing primary
A descriptive error should be returned.
Pulp 2 fails before any content is synced. Related issue: https://pulp.plan.io/issues/1287
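As a rough illustration of what a descriptive error could look like, here is a minimal sketch that turns a bare HTTP failure into a message naming the metadata file. `describe_metadata_failure`, `KNOWN_TYPES` and the example URL are hypothetical and not part of pulp_rpm; guessing the metadata type from the file name is only a heuristic.

```python
import posixpath
from urllib.parse import urlparse

# Metadata types discussed in this issue.
KNOWN_TYPES = ("primary", "filelists", "other")


def describe_metadata_failure(url, status):
    """Build a readable message for a metadata download that failed.

    Hypothetical helper: guesses the metadata type from the file name.
    """
    filename = posixpath.basename(urlparse(url).path)
    for md_type in KNOWN_TYPES:
        if md_type in filename:
            return (
                f"Could not download '{md_type}' metadata "
                f"({filename}): HTTP {status}. "
                "The repository metadata appears to be incomplete."
            )
    return f"Could not download metadata file {filename}: HTTP {status}."


print(describe_metadata_failure(
    "https://example.com/repo/repodata/abc-filelists.xml.gz", 404))
```

A sync task could attach a message like this to the task's error description instead of only "404, message='Not Found'".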
Updated by ttereshc over 5 years ago
What counts as a valid repo is up for discussion.
Yum/dnf won't complain if a repo has primary.xml only, though some functionality, like file search, will be missing.
There are complaints that Pulp 2 requires filelists. Many RPM repo providers don't generate filelists because of their size.
+1 to handle this case in a better way.
Which way to go (whether to require anything besides primary.xml) is an open question, in my opinion.
Updated by ttereshc over 5 years ago
Currently it's required for a remote RPM repo to have primary.xml, other.xml and filelists.xml.
To my knowledge yum accepts repos with primary.xml only.
DNF is ok with absence of other.xml.
Comment on that topic from Daniel Mach, Sep'18:
DNF could eventually work with primary.xml only assuming there are no file dependencies.
filelists.xml is typically required for deps and necessary for file queries.
At this moment filelists are required by DNF.
others.xml weren't used until recent time (we're adding support for changelogs)
If we try to support primary.xml only, the following should be taken into account:
- during sync, package metadata is taken from primary.xml, other.xml and filelists.xml and is stored in Pulp. It's not taken directly from the package headers (in the case of on_demand that's impossible anyway)
- What to do when the same package appears in different repos, with all repodata in one and with some repodata missing in another?
Example:
- package A is synced from a repo without filelists info
- the same package A is synced from another repo with all the repodata (or the existing remote repo started to publish all repodata)
- how do we handle this case and figure out if we need to "update" (in the DB) the existing RPM with additional metadata?
- how to do the "update" of the existing RPM?
- what are the expectations at publish time? all repodata/partial repodata/mixture of both?
My opinion - Pulp should require all three: primary.xml, other.xml and filelists.xml.
Downside: Pulp will refuse to sync some repos (e.g. GitLab) that yum/dnf will still work with.
Updated by ipanova@redhat.com over 5 years ago
Given that input, +1 to requiring primary, filelists and other.
GitLab repos lack filelists, so in theory dnf would also fail.
Updated by jsherril@redhat.com over 5 years ago
I'm fine with this as well, but could we re-purpose this issue to produce better errors for such cases? I don't see any mention of filelists in the traceback above. A better error explaining that filelists (or other) is missing and is required would reduce user confusion and frustration.
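A minimal sketch of such a check, assuming the required types are primary, filelists and other as proposed above: read repomd.xml and name whichever required metadata types it does not declare. `missing_metadata_types` and `MissingMetadataError` are hypothetical names, not pulp_rpm API. This only covers metadata that repomd.xml does not list; a file that is listed but returns 404 would still need the download error itself to be reported more clearly.

```python
import xml.etree.ElementTree as ET

# XML namespace used by repomd.xml.
REPO_NS = "{http://linux.duke.edu/metadata/repo}"
# Assumption: the three types this thread proposes to require.
REQUIRED_TYPES = ("primary", "filelists", "other")


class MissingMetadataError(Exception):
    """Hypothetical error raised when required repodata is absent."""


def missing_metadata_types(repomd_xml, required=REQUIRED_TYPES):
    """Return the required metadata types not declared in repomd.xml."""
    root = ET.fromstring(repomd_xml)
    present = {data.get("type") for data in root.iter(f"{REPO_NS}data")}
    return [md_type for md_type in required if md_type not in present]


def check_repomd(repomd_xml):
    """Raise a descriptive error if any required metadata is missing."""
    missing = missing_metadata_types(repomd_xml)
    if missing:
        raise MissingMetadataError(
            "Repository is missing required metadata: "
            + ", ".join(f"{name}.xml" for name in missing)
        )
```

Calling `check_repomd` before any downloads start would let the sync fail early with a message that names the missing files.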
Updated by ttereshc over 5 years ago
- Subject changed from Pulp 3 does not handle RPMS repos with invalid metadata to Pulp does not provide a descriptive error message for RPM repos with invalid metadata
- Description updated (diff)
Updated by fao89 almost 5 years ago
- Status changed from NEW to POST
Added by Fabricio Aguiar almost 5 years ago
Updated by Anonymous almost 5 years ago
- Status changed from POST to MODIFIED
Applied in changeset 800e29c72ea8a169639f8c44d1a53f838bc8c50c.
Added by Fabricio Aguiar almost 5 years ago
Revision e896186f | View on GitHub
Descriptive error for RPM with invalid metadata
https://pulp.plan.io/issues/4424 closes #4424
(cherry picked from commit 800e29c72ea8a169639f8c44d1a53f838bc8c50c)
Updated by Anonymous almost 5 years ago
Applied in changeset e896186f5ce603fe0d5605f6b3a27ad329ededc9.
Updated by ttereshc almost 5 years ago
- Status changed from MODIFIED to CLOSED - CURRENTRELEASE