Issue #4424

closed

Pulp does not provide a descriptive error message for RPM repos with invalid metadata

Added by kersom almost 6 years ago. Updated almost 5 years ago.

Status: CLOSED - CURRENTRELEASE
Priority: Normal
Assignee:
Sprint/Milestone: -
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Version:
Platform Release:
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags:
Sprint:
Quarter:

Description

Pulp 3 does not handle RPM repos with invalid metadata.

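Syncing a repository that is missing its ``filelists.xml`` file (for example, the "missing file list" test repo) fails with a generic ``404, message='Not Found'`` error that never names the missing file.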

Traceback:


("Task report /pulp/api/v3/tasks/213/ contains a error: {'code': None, "
 '\'description\': "404, message=\'Not Found\'", \'traceback\': \'  File '
 '"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py", line 799, '
 'in perform_job\\n    rv = job.perform()\\n  File '
 '"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/job.py", line 600, in '
 'perform\\n    self._result = self._execute()\\n  File '
 '"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/job.py", line 606, in '
 '_execute\\n    return self.func(*self.args, **self.kwargs)\\n  File '
 '"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulp_rpm/app/tasks/synchronizing.py", '
 'line 79, in synchronize\\n    loop.run_until_complete(pipeline)\\n  File '
 '"/usr/lib64/python3.7/asyncio/base_events.py", line 584, in '
 'run_until_complete\\n    return future.result()\\n  File '
 '"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulpcore/plugin/stages/api.py", '
 'line 209, in create_pipeline\\n    await asyncio.gather(*futures)\\n  File '
 '"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulpcore/plugin/stages/api.py", '
 'line 43, in __call__\\n    await self.run()\\n  File '
 '"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulp_rpm/app/tasks/synchronizing.py", '
 'line 234, in run\\n    results = downloader.result()\\n  File '
 '"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulpcore/plugin/download/base.py", '
 'line 212, in run\\n    return await self._run(extra_data=extra_data)\\n  '
 'File "/usr/local/lib/pulp/lib64/python3.7/site-packages/backoff/_async.py", '
 'line 131, in retry\\n    ret = await target(*args, **kwargs)\\n  File '
 '"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulpcore/plugin/download/http.py", '
 'line 183, in _run\\n    response.raise_for_status()\\n  File '
 '"/usr/local/lib/pulp/lib64/python3.7/site-packages/aiohttp/client_reqrep.py", '
 "line 942, in raise_for_status\\n    headers=self.headers)\\n'}\n"
 "Full task report: {'_href': '/pulp/api/v3/tasks/213/', '_created': "
 "'2018-12-21T00:58:21.054706Z', 'job_id': "
 "'5d967c5e-4903-43e9-a171-9fbad97bae9e', 'state': 'failed', 'name': "
 "'pulp_rpm.app.tasks.synchronizing.synchronize', 'started_at': "
 "'2018-12-21T00:58:21.130936Z', 'finished_at': '2018-12-21T00:58:21.589230Z', "
 '\'non_fatal_errors\': [], \'error\': {\'code\': None, \'description\': "404, '
 'message=\'Not Found\'", \'traceback\': \'  File '
 '"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py", line 799, '
 'in perform_job\\n    rv = job.perform()\\n  File '
 '"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/job.py", line 600, in '
 'perform\\n    self._result = self._execute()\\n  File '
 '"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/job.py", line 606, in '
 '_execute\\n    return self.func(*self.args, **self.kwargs)\\n  File '
 '"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulp_rpm/app/tasks/synchronizing.py", '
 'line 79, in synchronize\\n    loop.run_until_complete(pipeline)\\n  File '
 '"/usr/lib64/python3.7/asyncio/base_events.py", line 584, in '
 'run_until_complete\\n    return future.result()\\n  File '
 '"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulpcore/plugin/stages/api.py", '
 'line 209, in create_pipeline\\n    await asyncio.gather(*futures)\\n  File '
 '"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulpcore/plugin/stages/api.py", '
 'line 43, in __call__\\n    await self.run()\\n  File '
 '"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulp_rpm/app/tasks/synchronizing.py", '
 'line 234, in run\\n    results = downloader.result()\\n  File '
 '"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulpcore/plugin/download/base.py", '
 'line 212, in run\\n    return await self._run(extra_data=extra_data)\\n  '
 'File "/usr/local/lib/pulp/lib64/python3.7/site-packages/backoff/_async.py", '
 'line 131, in retry\\n    ret = await target(*args, **kwargs)\\n  File '
 '"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulpcore/plugin/download/http.py", '
 'line 183, in _run\\n    response.raise_for_status()\\n  File '
 '"/usr/local/lib/pulp/lib64/python3.7/site-packages/aiohttp/client_reqrep.py", '
 "line 942, in raise_for_status\\n    headers=self.headers)\\n'}, 'worker': "
 "'/pulp/api/v3/workers/1/', 'parent': None, 'spawned_tasks': [], "
 "'progress_reports': [{'message': 'Downloading and Parsing Metadata', "
 "'state': 'failed', 'total': 2, 'done': 2, 'suffix': '', 'task': "
 "'/pulp/api/v3/tasks/213/'}, {'message': 'Downloading Artifacts', 'state': "
 "'canceled', 'total': None, 'done': 0, 'suffix': '', 'task': "
 "'/pulp/api/v3/tasks/213/'}, {'message': 'Associating Content', 'state': "
 "'canceled', 'total': None, 'done': 0, 'suffix': '', 'task': "
 "'/pulp/api/v3/tasks/213/'}], 'created_resources': []}",
 {'_created': '2018-12-21T00:58:21.054706Z',
  '_href': '/pulp/api/v3/tasks/213/',
  'created_resources': [],
  'error': {'code': None,
            'description': "404, message='Not Found'",
            'traceback': '  File '
                         '"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py", '
                         'line 799, in perform_job\n'
                         '    rv = job.perform()\n'
                         '  File '
                         '"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/job.py", '
                         'line 600, in perform\n'
                         '    self._result = self._execute()\n'
                         '  File '
                         '"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/job.py", '
                         'line 606, in _execute\n'
                         '    return self.func(*self.args, **self.kwargs)\n'
                         '  File '
                         '"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulp_rpm/app/tasks/synchronizing.py", '
                         'line 79, in synchronize\n'
                         '    loop.run_until_complete(pipeline)\n'
                         '  File '
                         '"/usr/lib64/python3.7/asyncio/base_events.py", line '
                         '584, in run_until_complete\n'
                         '    return future.result()\n'
                         '  File '
                         '"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulpcore/plugin/stages/api.py", '
                         'line 209, in create_pipeline\n'
                         '    await asyncio.gather(*futures)\n'
                         '  File '
                         '"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulpcore/plugin/stages/api.py", '
                         'line 43, in __call__\n'
                         '    await self.run()\n'
                         '  File '
                         '"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulp_rpm/app/tasks/synchronizing.py", '
                         'line 234, in run\n'
                         '    results = downloader.result()\n'
                         '  File '
                         '"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulpcore/plugin/download/base.py", '
                         'line 212, in run\n'
                         '    return await self._run(extra_data=extra_data)\n'
                         '  File '
                         '"/usr/local/lib/pulp/lib64/python3.7/site-packages/backoff/_async.py", '
                         'line 131, in retry\n'
                         '    ret = await target(*args, **kwargs)\n'
                         '  File '
                         '"/usr/local/lib/pulp/lib64/python3.7/site-packages/pulpcore/plugin/download/http.py", '
                         'line 183, in _run\n'
                         '    response.raise_for_status()\n'
                         '  File '
                         '"/usr/local/lib/pulp/lib64/python3.7/site-packages/aiohttp/client_reqrep.py", '
                         'line 942, in raise_for_status\n'
                         '    headers=self.headers)\n'},
  'finished_at': '2018-12-21T00:58:21.589230Z',
  'job_id': '5d967c5e-4903-43e9-a171-9fbad97bae9e',
  'name': 'pulp_rpm.app.tasks.synchronizing.synchronize',
  'non_fatal_errors': [],
  'parent': None,
  'progress_reports': [{'done': 2,
                        'message': 'Downloading and Parsing Metadata',
                        'state': 'failed',
                        'suffix': '',
                        'task': '/pulp/api/v3/tasks/213/',
                        'total': 2},
                       {'done': 0,
                        'message': 'Downloading Artifacts',
                        'state': 'canceled',
                        'suffix': '',
                        'task': '/pulp/api/v3/tasks/213/',
                        'total': None},
                       {'done': 0,
                        'message': 'Associating Content',
                        'state': 'canceled',
                        'suffix': '',
                        'task': '/pulp/api/v3/tasks/213/',
                        'total': None}],
  'spawned_tasks': [],
  'started_at': '2018-12-21T00:58:21.130936Z',
  'state': 'failed',
  'worker': '/pulp/api/v3/workers/1/'})

The same happens with the following test repos:
- missing other
- missing primary

A descriptive error should be returned.

Pulp 2 fails before any content is synced. Related issue: https://pulp.plan.io/issues/1287

Actions #1

Updated by ttereshc almost 6 years ago

What counts as a valid repo is up for discussion.
Yum/dnf won't complain if a repo has only primary.xml, although some functionality, such as file search, will be missing.
There are complaints that Pulp 2 requires filelists. Many RPM repo providers don't generate filelists because of their size.

+1 to handling this case in a better way.
Which way (requiring anything besides primary.xml or not) is an open question, in my opinion.

Actions #2

Updated by ttereshc almost 6 years ago

  • Triaged changed from No to Yes
Actions #3

Updated by ttereshc over 5 years ago

Currently a remote RPM repo is required to have primary.xml, other.xml and filelists.xml.
To my knowledge, yum accepts repos with primary.xml only.
DNF is OK with the absence of other.xml.
Comment on that topic from Daniel Mach, Sep'18:

DNF could eventually work with primary.xml only assuming there are no file dependencies.
filelists.xml is typically required for deps and necessary for file queries.
At this moment filelists are required by DNF.
others.xml weren't used until recent time (we're adding support for changelogs)

If we try to support primary.xml only, the following should be taken into account:
- During sync, package metadata is taken from primary.xml, other.xml and filelists.xml and is stored in Pulp. It is not taken directly from the package headers (with on_demand that is impossible anyway).
- What should happen when the same package appears in different repos, with all repodata in one and some repodata missing in another?
  Example:
  - package A is synced from a repo without filelists info
  - the same package A is synced from another repo with all the repodata (or the existing remote repo starts publishing all repodata)
- How do we handle this case and figure out whether we need to "update" (in the DB) the existing RPM with additional metadata?
- How do we "update" the existing RPM?
- What are the expectations at publish time: all repodata, partial repodata, or a mixture of both?

My opinion: Pulp should require all three of primary.xml, other.xml and filelists.xml.
Downside: Pulp will refuse to sync some repos (e.g. GitLab) that yum/dnf still work with.
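To make the "require all three" proposal concrete, here is a minimal sketch of such a check. It is not the pulp_rpm implementation: the function name `missing_metadata_types` and the sample document are hypothetical, and it assumes the usual repomd.xml layout in which each metadata file is listed as a `<data type="...">` element in the `http://linux.duke.edu/metadata/repo` namespace.

```python
import xml.etree.ElementTree as ET

# Namespace used by repomd.xml (assumed standard layout).
REPO_NS = "{http://linux.duke.edu/metadata/repo}"

# Metadata types the proposal above would require.
REQUIRED_TYPES = ("primary", "filelists", "other")


def missing_metadata_types(repomd_xml: str) -> list:
    """Return the required metadata types that repomd.xml does not list."""
    root = ET.fromstring(repomd_xml)
    present = {data.get("type") for data in root.iter(f"{REPO_NS}data")}
    return [md_type for md_type in REQUIRED_TYPES if md_type not in present]


# Hypothetical repomd.xml for a repo that publishes no filelists.
SAMPLE = """<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="primary"><location href="repodata/primary.xml.gz"/></data>
  <data type="other"><location href="repodata/other.xml.gz"/></data>
</repomd>"""

if __name__ == "__main__":
    print(missing_metadata_types(SAMPLE))
```

A check like this only covers metadata that repomd.xml does not list at all; a file that is listed but absent on the server would still surface as a download error.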

Actions #4

Updated by ipanova@redhat.com over 5 years ago

With the given input, +1 to requiring primary, filelists and other.

GitLab repos lack filelists, so in theory dnf would also fail.

Actions #5

Updated by bmbouter over 5 years ago

+1 for all the reasons @ipanova also +1'd

Actions #6

Updated by jsherril@redhat.com over 5 years ago

I'm fine with this as well, but could we re-purpose this issue to produce better errors for such cases? I don't see any mention of filelists in the traceback above. A better error explaining that filelists (or other) is missing and is required would reduce user confusion and frustration.
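As an illustration of the kind of message requested here, the sketch below turns a failed metadata download into text that names the missing file. It is only a sketch under assumptions, not the change that was eventually merged: `describe_download_failure` is a hypothetical helper, and guessing the metadata type from the file name is a simplification.

```python
import posixpath
from urllib.parse import urlparse

# Metadata types the discussion above treats as required.
REQUIRED_TYPES = ("primary", "filelists", "other")


def describe_download_failure(status: int, url: str) -> str:
    """Build a user-facing message for a failed download (hypothetical helper)."""
    filename = posixpath.basename(urlparse(url).path)
    # Simplification: infer the metadata type from the file name.
    md_type = next((t for t in REQUIRED_TYPES if t in filename), None)
    if status == 404 and md_type is not None:
        return (
            f"Repository metadata '{md_type}' is missing: {url} returned 404. "
            f"Pulp requires {', '.join(REQUIRED_TYPES)} metadata to sync an RPM repository."
        )
    return f"Download of {url} failed with HTTP status {status}."


if __name__ == "__main__":
    # Hypothetical URL, used only for demonstration.
    print(describe_download_failure(
        404, "https://example.com/repo/repodata/abc123-filelists.xml.gz"))
```

Compared with the bare ``404, message='Not Found'`` in the traceback above, a message of this shape tells the user which file is absent and why the sync cannot continue.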

Actions #7

Updated by ttereshc over 5 years ago

  • Subject changed from Pulp 3 does not handle RPMS repos with invalid metadata to Pulp does not provide a descriptive error message for RPM repos with invalid metadata
  • Description updated (diff)
Actions #8

Updated by bmbouter over 5 years ago

  • Tags deleted (Pulp 3)
Actions #9

Updated by fao89 about 5 years ago

  • Status changed from NEW to POST
Actions #10

Updated by fao89 about 5 years ago

  • Assignee set to fao89

Added by Fabricio Aguiar almost 5 years ago

Revision 800e29c7

Descriptive error for RPM with invalid metadata

https://pulp.plan.io/issues/4424 closes #4424

Actions #11

Updated by Anonymous almost 5 years ago

  • Status changed from POST to MODIFIED

Added by Fabricio Aguiar almost 5 years ago

Revision e896186f

Descriptive error for RPM with invalid metadata

https://pulp.plan.io/issues/4424 closes #4424

(cherry picked from commit 800e29c72ea8a169639f8c44d1a53f838bc8c50c)

Actions #12

Updated by Anonymous almost 5 years ago

Actions #13

Updated by ttereshc almost 5 years ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE
