Issue #9399

sync error: invalid memory alloc request size

Added by keilr 1 day ago. Updated about 12 hours ago.

Status:
NEW
Priority:
Normal
Assignee:
-
Sprint/Milestone:
-
Start date:
Due date:
Estimated time:
Severity:
2. Medium
Version:
Platform Release:
OS:
Triaged:
No
Groomed:
No
Sprint Candidate:
No
Tags:
Sprint:
Quarter:

Description

pulp version: container image docker.io/pulp/pulp:3.15

remote repository definition:

$ pulp --base-url http://localhost:8080 --username admin --password secret rpm remote show --name "packages-microsoft-com-prod-rhel8"
{
  "pulp_href": "/pulp/api/v3/remotes/rpm/rpm/c455d98d-4c8c-446d-aa4d-dffe568675d6/",
  "pulp_created": "2021-09-14T14:51:03.146140Z",
  "name": "packages-microsoft-com-prod-rhel8",
  "url": "https://packages.microsoft.com/rhel/8/prod/",
  "ca_cert": null,
  "client_cert": null,
  "tls_validation": true,
  "proxy_url": "http://proxy.example.com:8080",
  "pulp_labels": {},
  "pulp_last_updated": "2021-09-14T14:51:03.146171Z",
  "download_concurrency": null,
  "max_retries": null,
  "policy": "immediate",
  "total_timeout": null,
  "connect_timeout": null,
  "sock_connect_timeout": null,
  "sock_read_timeout": null,
  "headers": null,
  "rate_limit": null,
  "sles_auth_token": null
}

error:

$ podman logs --follow pulp
pulp [505c8f64043741a7b8f09eac46fa8331]: pulpcore.tasking.pulpcore_worker:INFO: Task a7a26b28-9251-4da1-82ef-19272b6779f0 failed (invalid memory alloc request size 1073741824)
pulp [505c8f64043741a7b8f09eac46fa8331]: pulpcore.tasking.pulpcore_worker:INFO:   File "/usr/local/lib/python3.8/site-packages/pulpcore/tasking/pulpcore_worker.py", line 323, in _perform_task
    result = func(*args, **kwargs)

  File "/usr/local/lib/python3.8/site-packages/pulp_rpm/app/tasks/synchronizing.py", line 471, in synchronize
    version = dv.create()

  File "/usr/local/lib/python3.8/site-packages/pulpcore/plugin/stages/declarative_version.py", line 151, in create
    loop.run_until_complete(pipeline)

  File "/usr/lib64/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()

  File "/usr/local/lib/python3.8/site-packages/pulpcore/plugin/stages/api.py", line 225, in create_pipeline
    await asyncio.gather(*futures)

  File "/usr/local/lib/python3.8/site-packages/pulpcore/plugin/stages/api.py", line 43, in __call__
    await self.run()

  File "/usr/local/lib/python3.8/site-packages/pulpcore/plugin/stages/content_stages.py", line 174, in run
    await sync_to_async(process_batch)()

  File "/usr/local/lib/python3.8/site-packages/asgiref/sync.py", line 444, in __call__
    ret = await asyncio.wait_for(future, timeout=None)

  File "/usr/lib64/python3.8/asyncio/tasks.py", line 455, in wait_for
    return await fut

  File "/usr/lib64/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)

  File "/usr/local/lib/python3.8/site-packages/asgiref/sync.py", line 486, in thread_handler
    return func(*args, **kwargs)

  File "/usr/local/lib/python3.8/site-packages/pulpcore/plugin/stages/content_stages.py", line 122, in process_batch
    d_content.content.save()

  File "/usr/local/lib/python3.8/site-packages/pulpcore/app/models/base.py", line 149, in save
    return super().save(*args, **kwargs)

  File "/usr/local/lib/python3.8/site-packages/django_lifecycle/mixins.py", line 134, in save
    save(*args, **kwargs)

  File "/usr/local/lib/python3.8/site-packages/django/db/models/base.py", line 726, in save
    self.save_base(using=using, force_insert=force_insert,

  File "/usr/local/lib/python3.8/site-packages/django/db/models/base.py", line 763, in save_base
    updated = self._save_table(

  File "/usr/local/lib/python3.8/site-packages/django/db/models/base.py", line 868, in _save_table
    results = self._do_insert(cls._base_manager, using, fields, returning_fields, raw)

  File "/usr/local/lib/python3.8/site-packages/django/db/models/base.py", line 906, in _do_insert
    return manager._insert(

  File "/usr/local/lib/python3.8/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)

  File "/usr/local/lib/python3.8/site-packages/django/db/models/query.py", line 1270, in _insert
    return query.get_compiler(using=using).execute_sql(returning_fields)

  File "/usr/local/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1416, in execute_sql
    cursor.execute(sql, params)

  File "/usr/local/lib/python3.8/site-packages/django/db/backends/utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)

  File "/usr/local/lib/python3.8/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)

  File "/usr/local/lib/python3.8/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)

  File "/usr/local/lib/python3.8/site-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value

  File "/usr/local/lib/python3.8/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)

All other repo sync jobs work without problems (e.g. the official RHEL 7/8 repos, EPEL, ...).

Is anyone able to reproduce this MS repo issue?


Related issues

Related to RPM Support - Issue #9406: Trivial OOM on sync for a particular Microsoft repo (NEW)

History

#1 Updated by keilr 1 day ago

Free memory shouldn't be an issue. The server has 12 GB of memory, so around 10 GB is available for sync jobs.

#2 Updated by dalley about 13 hours ago

I've never seen this error before. It seems to be a PostgreSQL error that is thrown when a single query or allocation exceeds PostgreSQL's 1 GB size limit, and 1073741824 bytes is exactly 1 GiB, just over that limit.
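
As a quick sanity check on those numbers (PostgreSQL's per-allocation ceiling is a server-side constant of roughly 1 GiB; the exact value below is stated from memory, not taken from this report):

# The allocation size from the error message is exactly 1 GiB (2**30 bytes).
alloc_request = 1073741824
print(alloc_request == 2**30)          # True
print(alloc_request / 1024**3)         # 1.0

# PostgreSQL's MaxAllocSize is, as far as I recall, 1 GiB minus one byte (0x3FFFFFFF),
# so a request of exactly 1 GiB is rejected with "invalid memory alloc request size".
max_alloc_size = 0x3FFFFFFF
print(alloc_request > max_alloc_size)  # True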

It looks like the "filelists" metadata has an absolutely insane expansion factor, presumably because the repo has many copies of a few very large packages.

<data type="filelists">
.... snip ....
<size>7943458</size>
<open-size>1199799539</open-size>
</data>

That is, about 7.6 MB compressed expands to about 1.2 GB decompressed, an expansion factor of roughly 150x.
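
For reference, a quick back-of-the-envelope check of that expansion factor, using the two sizes from the metadata snippet above:

compressed = 7_943_458        # <size>: ~7.6 MB compressed filelists
decompressed = 1_199_799_539  # <open-size>: ~1.2 GB decompressed
print(round(decompressed / compressed))  # ~151x expansion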

I'm guessing the problem is that we batch inserts for efficiency, with a batch size of around 500 IIRC, so all of these packages end up in a single batch, and the resulting insert statement grows too large for PostgreSQL to handle.

Normally this would be perfectly fine, but this particular repo is so "dense" in terms of metadata per package that it is not.
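
A minimal sketch of the batching idea described above, in plain Python (illustrative only; this is not Pulp's actual code and the names are made up): content units are flushed in fixed-size batches, so a metadata-dense repo can turn a single batch into one enormous INSERT.

def batched(items, batch_size=500):
    """Yield successive fixed-size batches from an iterable."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:
        yield batch

# Each batch becomes one bulk INSERT. If every package carries ~2 MB of
# filelists metadata, a 500-package batch approaches PostgreSQL's ~1 GiB
# per-allocation limit even though the package count itself looks harmless.

Lowering the batch size, or capping a batch by total byte size rather than item count, would keep individual statements under that limit.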

#3 Updated by dalley about 13 hours ago

Correction to the above: the batch size is 250, not 500.

Additionally, this repo is a little strange for other reasons. A couple of packages are listed twice with the same NEVRA (name-epoch-version-release-architecture) but different checksums. You're really not supposed to do that, because it's difficult to know which package clients would actually end up installing. The only difference between the two appears to be the filename itself, "blobfuse-1.4.1-RHEL-8.1-x86_64.rpm" vs "blobfuse-1.4.1-RHEL-8.2-x86_64.rpm", and I'm pretty sure that clients like DNF and YUM can't make that distinction.
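
A small sketch of how one could spot such duplicates in parsed repository metadata (the package entries below are illustrative; only the two blobfuse filenames come from the repo in question):

from collections import defaultdict

# Hypothetical parsed entries: (name, epoch, version, release, arch, checksum, filename)
packages = [
    ("blobfuse", "0", "1.4.1", "1", "x86_64", "aaa...", "blobfuse-1.4.1-RHEL-8.1-x86_64.rpm"),
    ("blobfuse", "0", "1.4.1", "1", "x86_64", "bbb...", "blobfuse-1.4.1-RHEL-8.2-x86_64.rpm"),
]

by_nevra = defaultdict(list)
for name, epoch, version, release, arch, checksum, filename in packages:
    by_nevra[(name, epoch, version, release, arch)].append((checksum, filename))

for nevra, entries in by_nevra.items():
    if len({chk for chk, _ in entries}) > 1:
        print("same NEVRA, different checksums:", nevra, entries)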

#4 Updated by dalley about 12 hours ago

  • Related to Issue #9406: Trivial OOM on sync for a particular Microsoft repo added

#5 Updated by dalley about 12 hours ago

I actually can't reproduce this, because syncing this repository causes my VM to run out of memory (it has 9.6 GB available).

That is a problem in and of itself: the memory consumption shouldn't be that high, and it isn't for most repos. I just confirmed that this does not happen for RHEL 7, which is overall a much larger repo, so something weird is going on there as well. Filed #9406 for that.
