Issue #8027

closed

Microsoft Visual Studio Debian repo fails to synchronize.

Added by quba42 about 4 years ago. Updated about 3 years ago.

Status: CLOSED - WORKSFORME
Priority: Normal
Assignee: -
Sprint/Milestone: -
Severity: 2. Medium
Triaged: Yes
Groomed: No
Sprint Candidate: No

Description

Use the following test case to reproduce the issue:

  1. Variables:
     BASE_ADDR=':'
     ENTITIES_NAME='visual_studio'
     UPSTREAM_URL='http://packages.microsoft.com/repos/vscode/'
     DISTRIBUTIONS='stable'
  2. Commands to trigger the sync:
     REPO_HREF=$(http post "${BASE_ADDR}"/pulp/api/v3/repositories/deb/apt/ name="${ENTITIES_NAME}" | jq -r '.pulp_href')
     REMOTE_HREF=$(http post "${BASE_ADDR}"/pulp/api/v3/remotes/deb/apt/ name="${ENTITIES_NAME}" distributions="${DISTRIBUTIONS}" url="${UPSTREAM_URL}" | jq -r '.pulp_href')
     SYNC_TASK_HREF=$(http post "${BASE_ADDR}${REPO_HREF}"sync/ remote="${REMOTE_HREF}" | jq -r '.task')
  3. Check the sync state:
     http get "${BASE_ADDR}${SYNC_TASK_HREF}" | jq '.state'
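
Because the sync takes a long time, the state check in the last step usually has to be repeated. A minimal polling sketch in Python follows. The helper name wait_for_task, its parameters, and the set of final states are assumptions for illustration and should be checked against your Pulp version. The get_state callable stands in for whatever performs the "http get ... | jq '.state'" call.

```python
import time

# Assumption: task states after which polling can stop.
FINAL_STATES = {"completed", "failed", "canceled", "skipped"}


def wait_for_task(get_state, interval=5.0, timeout=None, sleep=time.sleep):
    """Call get_state() until it reports a final state, then return that state.

    get_state is any zero-argument callable returning the task's current
    'state' string. sleep is injectable so the loop can be tested quickly.
    """
    waited = 0.0
    while True:
        state = get_state()
        if state in FINAL_STATES:
            return state
        if timeout is not None and waited >= timeout:
            raise TimeoutError(f"task still in state {state!r} after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

For this issue, the loop would be expected to end in the failed state, at which point the task's error details carry the traceback.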

Important: Since the upstream repo contains every package in a gazillion versions, you will need about 70GiB of disk space (and quite some time) to perform the test sync!

The full traceback:

ValueError: invalid literal for int() with base 10: 'NaN'
Traceback (most recent call last):
  File "/usr/local/lib/pulp/lib64/python3.8/site-packages/rq/worker.py", line 975, in perform_job
    rv = job.perform()
  File "/usr/local/lib/pulp/lib64/python3.8/site-packages/rq/job.py", line 696, in perform
    self._result = self._execute()
  File "/usr/local/lib/pulp/lib64/python3.8/site-packages/rq/job.py", line 719, in _execute
    return self.func(*self.args, **self.kwargs)
  File "/home/vagrant/devel/pulp_deb/pulp_deb/app/tasks/synchronizing.py", line 106, in synchronize
    DebDeclarativeVersion(first_stage, repository, mirror=mirror).create()
  File "/home/vagrant/devel/pulpcore/pulpcore/plugin/stages/declarative_version.py", line 148, in create
    loop.run_until_complete(pipeline)
  File "/usr/lib64/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/home/vagrant/devel/pulpcore/pulpcore/plugin/stages/api.py", line 225, in create_pipeline
    await asyncio.gather(*futures)
  File "/home/vagrant/devel/pulpcore/pulpcore/plugin/stages/api.py", line 43, in __call__
    await self.run()
  File "/home/vagrant/devel/pulpcore/pulpcore/plugin/stages/content_stages.py", line 95, in run
    d_content.content.save()
  File "/home/vagrant/devel/pulpcore/pulpcore/app/models/base.py", line 115, in save
    return super().save(*args, **kwargs)
  File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django_lifecycle/mixins.py", line 129, in save
    save(*args, **kwargs)
  File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/base.py", line 743, in save
    self.save_base(using=using, force_insert=force_insert,
  File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/base.py", line 780, in save_base
    updated = self._save_table(
  File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/base.py", line 873, in _save_table
    result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
  File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/base.py", line 910, in _do_insert
    return manager._insert([self], fields=fields, return_id=update_pk,
  File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/manager.py", line 82, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/query.py", line 1186, in _insert
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/sql/compiler.py", line 1376, in execute_sql
    for sql, params in self.as_sql():
  File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django_readonly_field/compiler.py", line 31, in as_sql
    return super(ReadonlySQLCompilerMixin, self).as_sql()
  File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/sql/compiler.py", line 1318, in as_sql
    value_rows = [
  File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/sql/compiler.py", line 1319, in <listcomp>
    [self.prepare_value(field, self.pre_save_val(field, obj)) for field in fields]
  File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/sql/compiler.py", line 1319, in <listcomp>
    [self.prepare_value(field, self.pre_save_val(field, obj)) for field in fields]
  File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/sql/compiler.py", line 1260, in prepare_value
    value = field.get_db_prep_save(value, connection=self.connection)
  File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/fields/__init__.py", line 793, in get_db_prep_save
    return self.get_db_prep_value(value, connection=connection, prepared=False)
  File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/fields/__init__.py", line 788, in get_db_prep_value
    value = self.get_prep_value(value)
  File "/usr/local/lib/pulp/lib64/python3.8/site-packages/django/db/models/fields/__init__.py", line 1825, in get_prep_value
    return int(value)
ValueError: invalid literal for int() with base 10: 'NaN'
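
The final frame of the traceback is a bare int() conversion of a string value. The failure can be reproduced in plain Python without Django or Pulp. The helper name to_int is purely illustrative and not part of either code base.

```python
def to_int(value):
    """Mirror the last traceback frame: a plain int() conversion."""
    return int(value)


try:
    to_int("NaN")
except ValueError as exc:
    # Keep the message so it can be compared with the one in the traceback.
    message = str(exc)
    print(message)
```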

Analysis of the problem

The Packages files of the upstream repo contain Description: fields with illegal newline characters (these can be found using the vim search /^ V\@!). This causes python-debian to interpret everything after the offending newline characters as a new packages paragraph. Typically, this separate packages paragraph contains only the Homepage: field, which is obviously not a valid package entry.
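
The effect can be illustrated with a simplified stand-in parser. This is not python-debian's implementation, only a sketch of the blank-line paragraph splitting that control-file formats rely on. It assumes the offending newline shows up as an empty line after the Description: text. The package data and the example.invalid URL are made up.

```python
SAMPLE_INDEX = """\
Package: example
Version: 1.0
Description: first line of the description

Homepage: https://example.invalid/

Package: other
Version: 2.0
Description: a well-formed entry
"""


def split_paragraphs(text):
    """Split control-file style text into a list of field dicts.

    Paragraphs are separated by blank lines. Continuation lines (those
    starting with whitespace) are appended to the preceding field.
    """
    paragraphs, current, last_key = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current paragraph.
            if current:
                paragraphs.append(current)
            current, last_key = {}, None
            continue
        if line[0] in " \t":
            if last_key is not None:
                current[last_key] += "\n" + line.strip()
            continue
        key, _, value = line.partition(":")
        current[key] = value.strip()
        last_key = key
    if current:
        paragraphs.append(current)
    return paragraphs


paragraphs = split_paragraphs(SAMPLE_INDEX)
for fields in paragraphs:
    print(sorted(fields))
```

The second paragraph produced from SAMPLE_INDEX holds only a Homepage field, which matches the stray paragraphs described above.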

In Pulp 2, we simply ignored the error from such packages paragraphs, printed a warning, and hoped for the best.

The minimum solution is to reproduce the Pulp 2 workaround for Pulp 3. However, it would be nice to at least print a somewhat more informative log warning, ideally one containing the path to the affected package index along with the line number where the problematic packages paragraph is located.
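
A rough sketch of what such a workaround could look like, in plain Python. This is not the code from pulp_deb or Pulp 2. The function names, the logger name, and the choice of REQUIRED_FIELDS are all assumptions made for illustration.

```python
import logging

log = logging.getLogger("package_index_sketch")

# Assumption: fields that every valid packages paragraph must carry.
REQUIRED_FIELDS = ("Package", "Version", "Architecture")


def iter_paragraphs(lines):
    """Yield (first_line_number, fields) per blank-line separated paragraph."""
    start, fields = None, {}
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            if fields:
                yield start, fields
            start, fields = None, {}
            continue
        if start is None:
            start = number
        # Continuation lines are ignored; only field names matter here.
        if line[0] not in " \t" and ":" in line:
            key, _, value = line.partition(":")
            fields[key] = value.strip()
    if fields:
        yield start, fields


def valid_paragraphs(index_path, lines):
    """Return the well-formed paragraphs and warn about each skipped one."""
    kept = []
    for start, fields in iter_paragraphs(lines):
        missing = [name for name in REQUIRED_FIELDS if name not in fields]
        if missing:
            log.warning(
                "%s:%d: skipping packages paragraph without %s",
                index_path, start, ", ".join(missing),
            )
            continue
        kept.append(fields)
    return kept
```

The warning format "path:line" is chosen so that the message can be pasted directly into an editor or grep invocation. A real implementation would need to hook into the place where the package index is parsed during sync.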


Related issues

Related to Debian Support - Issue #9333: Ignore package fields with corrupt values (CLOSED - CURRENTRELEASE)
Actions #1

Updated by quba42 over 3 years ago

  • Priority changed from Normal to High
Actions #2

Updated by quba42 over 3 years ago

  • Priority changed from High to Normal
Actions #3

Updated by quba42 over 3 years ago

  • Related to Issue #9333: Ignore package fields with corrupt values added
Actions #4

Updated by quba42 over 3 years ago

  • Status changed from NEW to CLOSED - WORKSFORME

With this fix (https://github.com/pulp/pulp_deb/pull/348), the original Pulp 2 whitespace issue is no longer reproducible. I am closing this as CLOSED - WORKSFORME.

If a reproducer is found, this can always be re-opened.

Actions #5

Updated by quba42 about 3 years ago

  • Sprint/Milestone deleted (Katello)
