Issue #8027 (closed)
Microsoft Visual Studio Debian repo fails to synchronize.
Description
Use the following test case to reproduce the issue:
- Variables:
BASE_ADDR=':'
ENTITIES_NAME='visual_studio'
UPSTREAM_URL='http://packages.microsoft.com/repos/vscode/'
DISTRIBUTIONS='stable'
- Commands to trigger the sync:
REPO_HREF=$(http post "${BASE_ADDR}"/pulp/api/v3/repositories/deb/apt/ name="${ENTITIES_NAME}" | jq -r '.pulp_href')
REMOTE_HREF=$(http post "${BASE_ADDR}"/pulp/api/v3/remotes/deb/apt/ name="${ENTITIES_NAME}" distributions="${DISTRIBUTIONS}" url="${UPSTREAM_URL}" | jq -r '.pulp_href')
SYNC_TASK_HREF=$(http post "${BASE_ADDR}${REPO_HREF}"sync/ remote="${REMOTE_HREF}" | jq -r '.task')
- Check the sync state:
http get "${BASE_ADDR}${SYNC_TASK_HREF}" | jq '.state'
Important: Since the upstream repo contains every package in a very large number of versions, you will need about 70 GiB of disk space (and quite some time) to perform the test sync!
The full backtrace:
ValueError: invalid literal for int() with base 10: 'NaN'
Traceback (most recent call last):
File "/usr/local/lib/pulp/lib64/python3.8/site-packages/rq/worker.py", line 975, in perform_job
rv = job.perform()
File "/usr/local/lib/pulp/lib64/python3.8/site-packages/rq/job.py", line 696, in perform
self._result = self._execute()
File "/usr/local/lib/pulp/lib64/python3.8/site-packages/rq/job.py", line 719, in _execute
return self.func(*self.args, **self.kwargs)
File "/home/vagrant/devel/pulp_deb/pulp_deb/app/tasks/synchronizing.py", line 106, in synchronize
DebDeclarativeVersion(first_stage, repository, mirror=mirror).create()
File "/home/vagrant/devel/pulpcore/pulpcore/plugin/stages/declarative_version.py", line 148, in create
loop.run_until_complete(pipeline)
File "/usr/lib64/python3.8/asyncio/base_events.py", line 616, in run_until_complete
return future.result()
File "/home/vagrant/devel/pulpcore/pulpcore/plugin/stages/api.py", line 225, in create_pipeline
await asyncio.gather(*futures)
File "/home/vagrant/devel/pulpcore/pulpcore/plugin/stages/api.py", line 43, in __call__
await self.run()
File "/home/vagrant/devel/pulpcore/pulpcore/plugin/stages/content_stages.py", line 95, in run
d_content.content.save()
File "/home/vagrant/devel/pulpcore/pulpcore/app/models/base.py", line 115, in save
return super().save(*args, **kwargs)
File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django_lifecycle/mixins.py", line 129, in save
save(*args, **kwargs)
File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/base.py", line 743, in save
self.save_base(using=using, force_insert=force_insert,
File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/base.py", line 780, in save_base
updated = self._save_table(
File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/base.py", line 873, in _save_table
result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/base.py", line 910, in _do_insert
return manager._insert([self], fields=fields, return_id=update_pk,
File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/manager.py", line 82, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/query.py", line 1186, in _insert
return query.get_compiler(using=using).execute_sql(return_id)
File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/sql/compiler.py", line 1376, in execute_sql
for sql, params in self.as_sql():
File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django_readonly_field/compiler.py", line 31, in as_sql
return super(ReadonlySQLCompilerMixin, self).as_sql()
File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/sql/compiler.py", line 1318, in as_sql
value_rows = [
File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/sql/compiler.py", line 1319, in <listcomp>
[self.prepare_value(field, self.pre_save_val(field, obj)) for field in fields]
File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/sql/compiler.py", line 1319, in <listcomp>
[self.prepare_value(field, self.pre_save_val(field, obj)) for field in fields]
File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/sql/compiler.py", line 1260, in prepare_value
value = field.get_db_prep_save(value, connection=self.connection)
File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/fields/__init__.py", line 793, in get_db_prep_save
return self.get_db_prep_value(value, connection=connection, prepared=False)
File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/fields/__init__.py", line 788, in get_db_prep_value
value = self.get_prep_value(value)
File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/fields/__init__.py", line 1825, in get_prep_value
return int(value)
ValueError: invalid literal for int() with base 10: 'NaN'
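The last frame of the trace can be reproduced in isolation. The sketch below is not pulp_deb or Django code: `to_db_int` and `describe_failure` are hypothetical names, and `to_db_int` only stands in for the `int(value)` call shown in the final frame. It illustrates why a string such as 'NaN' reaching an integer field aborts the save.

```python
def to_db_int(value):
    """Hypothetical stand-in for the int(value) call in the last traceback frame."""
    return int(value)


def describe_failure(value):
    """Return the error message raised for value, or None if it converts cleanly."""
    try:
        to_db_int(value)
    except ValueError as exc:
        return str(exc)
    return None


print(describe_failure("1024"))  # a well-formed numeric string converts without error
print(describe_failure("NaN"))   # the value from the traceback does not
```

Which package field carried the 'NaN' value is not visible in the trace, so the sketch makes no assumption about it.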
Analysis of the problem
The Packages files of the upstream repo contain Description: fields with illegal newline characters (can be found using the vim search /^ V\@!).
This causes python-debian to interpret everything past the offending newline characters as a new packages paragraph.
Typically this separate packages paragraph will contain only the Homepage: field, which is obviously not valid.
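The splitting behaviour can be illustrated with a small stand-in parser. This is a simplified sketch, not python-debian: `split_paragraphs` is a hypothetical helper, the package data is invented, and it assumes the offending newline shows up as an empty line right after the Description: field.

```python
def split_paragraphs(text):
    """Split deb822-style text into paragraphs on empty lines.

    Simplified stand-in for a real parser: returns a list of dicts
    mapping field names to values, with continuation lines appended.
    """
    paragraphs, current, last_field = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            # An empty line terminates the current paragraph.
            if current:
                paragraphs.append(current)
            current, last_field = {}, None
        elif line[0] in " \t" and last_field is not None:
            # Continuation line: belongs to the previous field.
            current[last_field] += "\n" + line.strip()
        else:
            name, _, value = line.partition(":")
            last_field = name.strip()
            current[last_field] = value.strip()
    if current:
        paragraphs.append(current)
    return paragraphs


# Invented sample: the empty line after Description: cuts the entry in two.
BROKEN_INDEX = (
    "Package: example-editor\n"
    "Version: 1.0.0\n"
    "Architecture: amd64\n"
    "Description: An example editor.\n"
    "\n"
    "Homepage: https://example.invalid/\n"
)

paragraphs = split_paragraphs(BROKEN_INDEX)
for number, paragraph in enumerate(paragraphs, start=1):
    print(number, sorted(paragraph))
```

The second paragraph holds nothing but Homepage:, so it lacks the fields needed to describe a package.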
In Pulp 2, we simply ignored the error from such packages paragraphs, printed a warning, and hoped for the best.
The minimum solution is to reproduce the Pulp 2 workaround for Pulp 3. However, it might be nice to at least print a somewhat more informative log warning. (Ideally something containing the path to the affected package index along with the line number where the problematic packages paragraph is located.)
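A rough sketch of what such a workaround could look like follows. It is not the pulp_deb implementation: `iter_paragraphs`, `valid_paragraphs`, the `MANDATORY_FIELDS` set, the logger name, and the index path in the usage lines are all assumptions made for illustration. Malformed paragraphs are skipped, and the warning carries the index path and the line number where the paragraph starts.

```python
import logging

log = logging.getLogger("packages_index_sketch")

# Assumed minimum set of fields; a real check may require more.
MANDATORY_FIELDS = ("Package", "Version", "Architecture")


def iter_paragraphs(text):
    """Yield (start_line, fields) for each paragraph, with 1-based line numbers."""
    fields, start = {}, None
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            # An empty line ends the current paragraph.
            if fields:
                yield start, fields
            fields, start = {}, None
        elif line[0] not in " \t":
            if start is None:
                start = lineno
            name, _, value = line.partition(":")
            fields[name.strip()] = value.strip()
        # Continuation lines are ignored; only field names matter for this check.
    if fields:
        yield start, fields


def valid_paragraphs(text, index_path):
    """Return well-formed paragraphs; log a warning for each one skipped."""
    kept = []
    for start, fields in iter_paragraphs(text):
        missing = [name for name in MANDATORY_FIELDS if name not in fields]
        if missing:
            log.warning(
                "Skipping malformed paragraph in %s at line %d: missing %s",
                index_path, start, ", ".join(missing),
            )
            continue
        kept.append(fields)
    return kept


# Invented sample data and a hypothetical index path.
INDEX = (
    "Package: example-editor\n"
    "Version: 1.0.0\n"
    "Architecture: amd64\n"
    "Description: An example editor.\n"
    "\n"
    "Homepage: https://example.invalid/\n"
)
kept = valid_paragraphs(INDEX, "dists/stable/main/binary-amd64/Packages")
```

Skipping instead of raising keeps the sync going, at the cost of silently dropping whatever the truncated entry would have contributed, which is why the warning should be specific enough to locate the paragraph.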
Related issues
Updated by quba42 over 3 years ago
- Related to Issue #9333: Ignore package fields with corrupt values added
Updated by quba42 over 3 years ago
- Status changed from NEW to CLOSED - WORKSFORME
With this fix (https://github.com/pulp/pulp_deb/pull/348), the original Pulp 2 whitespace issue is no longer reproducible. I am closing this as CLOSED - WORKSFORME.
If a reproducer is found, this can always be re-opened.