Pulp: Issueshttps://pulp.plan.io/https://pulp.plan.io/favicon.ico2022-02-25T09:17:36ZPulp
Planio Packaging - Issue #9678 (CLOSED - NOTABUG): Repository version errors : Path errors found. Paths ...https://pulp.plan.io/issues/96782022-02-25T09:17:36Zatom
<p>Hi,
when syncing a repo in Foreman I get this error, when I set "Mirroring Policy" to "Additive"</p>
<pre><code>Repository version errors : Path errors found. Paths are duplicated: dists/focal-updates/Release.gpg,dists/focal-updates/Release,dists/focal-updates/InRelease
</code></pre>
<p>When I then change Mirror Policy to "Content Only", I first get this error:</p>
<pre><code>Error message: the server returns an error
HTTP status code: 500
Response headers: {"date"=>"Fri, 25 Feb 2022 08:14:29 GMT", "server"=>"gunicorn", "content-type"=>"text/html; charset=UTF-8", "x-frame-options"=>"DENY", "content-length"=>"145", "vary"=>"Cookie", "x-content-type-options"=>"nosniff", "referrer-policy"=>"same-origin", "correlation-id"=>"76e5736f-1368-4192-ac43-a687df982dc0", "access-control-expose-headers"=>"Correlation-ID", "via"=>"1.1 test-foreman.nukular.local", "connection"=>"close"}
Response body:
<!doctype html>
<html lang="en">
<head>
<title>Server Error (500)</title>
</head>
<body>
<h1>Server Error (500)</h1><p></p>
</body>
</html>
</code></pre>
<p>When I then sync again, with the Mirroring Policy still set to "Content Only", the sync succeeds without errors.
When I change back to "Additive", the problem starts again from the beginning.</p>
<p>I am syncing:
<a href="http://archive.ubuntu.com/ubuntu/" class="external">http://archive.ubuntu.com/ubuntu/</a> focal-updates main restricted</p>
<p><strong>My environment:</strong></p>
<p>Centos 7.9
Foreman 3.1
Pulp:</p>
<pre><code>curl -k https://localhost/pulp/api/v3/status/ |jq '.versions'
"component": "core",
"version": "3.16.3"
"component": "rpm",
"version": "3.17.3"
"component": "python",
"version": "3.5.2"
"component": "file",
"version": "1.10.1"
"component": "deb",
"version": "2.16.1"
"component": "container",
"version": "2.9.2"
"component": "certguard",
"version": "1.5.1"
"component": "ansible",
"version": "0.10.1"
</code></pre>
<p>See the attached error details.</p>
<p>Thanks for help</p> Packaging - Issue #9675 (CLOSED - DUPLICATE): pulp.pulp_installer.pulp_common : Run pip-compile t...https://pulp.plan.io/issues/96752022-02-01T17:22:34Zmhsmith
<p>When running an initial install using the Ansible installer version 3.17.0, I get the following error:</p>
<p>fatal: [gazza-boy]: FAILED! => {"changed": false, "cmd": ["/usr/local/lib/pulp/bin/pip-compile"], "delta": "0:00:00.641376", "end": "2022-02-01 13:52:47.438926", "failed_when_result": true, "msg": "non-zero return code", "rc": 1, "start": "2022-02-01 13:52:46.797550", "stderr": "Traceback (most recent call last):\n File "/usr/local/lib/pulp/bin/pip-compile", line 8, in \n sys.exit(cli())\n File "/usr/local/lib/pulp/lib64/python3.8/site-packages/click/core.py", line 1128, in __call__\n return self.main(*args, **kwargs)\n File "/usr/local/lib/pulp/lib64/python3.8/site-packages/click/core.py", line 1053, in main\n rv = self.invoke(ctx)\n File "/usr/local/lib/pulp/lib64/python3.8/site-packages/click/core.py", line 1395, in invoke\n return ctx.invoke(self.callback, **ctx.params)\n File "/usr/local/lib/pulp/lib64/python3.8/site-packages/click/core.py", line 754, in invoke\n return __callback(*args, **kwargs)\n File "/usr/local/lib/pulp/lib64/python3.8/site-packages/click/decorators.py", line 26, in new_func\n return f(get_current_context(), *args, **kwargs)\n File "/usr/local/lib/pulp/lib64/python3.8/site-packages/piptools/scripts/compile.py", line 342, in cli\n repository = PyPIRepository(pip_args, cache_dir=cache_dir)\n File "/usr/local/lib/pulp/lib64/python3.8/site-packages/piptools/repositories/pypi.py", line 106, in __init__\n self._setup_logging()\n File "/usr/local/lib/pulp/lib64/python3.8/site-packages/piptools/repositories/pypi.py", line 455, in _setup_logging\n assert isinstance(handler, logging.StreamHandler)\nAssertionError", "stderr_lines": ["Traceback (most recent call last):", " File "/usr/local/lib/pulp/bin/pip-compile", line 8, in ", " sys.exit(cli())", " File "/usr/local/lib/pulp/lib64/python3.8/site-packages/click/core.py", line 1128, in __call__", " return self.main(*args, **kwargs)", " File "/usr/local/lib/pulp/lib64/python3.8/site-packages/click/core.py", line 1053, in main", " rv = self.invoke(ctx)", " File 
"/usr/local/lib/pulp/lib64/python3.8/site-packages/click/core.py", line 1395, in invoke", " return ctx.invoke(self.callback, **ctx.params)", " File "/usr/local/lib/pulp/lib64/python3.8/site-packages/click/core.py", line 754, in invoke", " return __callback(*args, **kwargs)", " File "/usr/local/lib/pulp/lib64/python3.8/site-packages/click/decorators.py", line 26, in new_func", " return f(get_current_context(), *args, **kwargs)", " File "/usr/local/lib/pulp/lib64/python3.8/site-packages/piptools/scripts/compile.py", line 342, in cli", " repository = PyPIRepository(pip_args, cache_dir=cache_dir)", " File "/usr/local/lib/pulp/lib64/python3.8/site-packages/piptools/repositories/pypi.py", line 106, in __init__", " self._setup_logging()", " File "/usr/local/lib/pulp/lib64/python3.8/site-packages/piptools/repositories/pypi.py", line 455, in _setup_logging", " assert isinstance(handler, logging.StreamHandler)", "AssertionError"], "stdout": "", "stdout_lines": []}</p> Pulp - Issue #9671 (CLOSED - DUPLICATE): Access policies can't be initialized for viewsets that d...https://pulp.plan.io/issues/96712022-01-14T15:17:03Znewswangerd
<p><strong>Ticket moved to GitHub</strong>: "pulp/pulpcore/2077":<a href="https://github.com/pulp/pulpcore/issues/2077" class="external">https://github.com/pulp/pulpcore/issues/2077</a></p>
<hr>
<p>Pulpcore can only initialize access policies in the database for viewsets that are registered in the pulp/api/v3/ router (<a href="https://github.com/pulp/pulpcore/blob/main/pulpcore/app/apps.py#L219" class="external">https://github.com/pulp/pulpcore/blob/main/pulpcore/app/apps.py#L219</a>). Because of this, there is no easy way to initialize access policies in the database for apps that implement APIs outside of pulp/api/v3/ such as galaxy_ng, pulp_ansible and pulp_container.</p> Pulp - Story #9670 (CLOSED - DUPLICATE): In an access policy for repository versions repository p...https://pulp.plan.io/issues/96702022-01-12T17:32:38Zmdellweg
<p><strong>Ticket moved to GitHub</strong>: "pulp/pulpcore/2076":<a href="https://github.com/pulp/pulpcore/issues/2076" class="external">https://github.com/pulp/pulpcore/issues/2076</a></p> Pulp - Issue #9669 (CLOSED - DUPLICATE): S3 URL generation for content artifacts is broken for ap...https://pulp.plan.io/issues/96692022-01-12T15:57:28Zjlsm-se
<p><strong>Ticket moved to GitHub</strong>: "pulp/pulpcore/2075":<a href="https://github.com/pulp/pulpcore/issues/2075" class="external">https://github.com/pulp/pulpcore/issues/2075</a></p>
<hr>
<p>Here's the simplified setup I've used to reproduce this issue:</p>
<pre><code>Pulp server:
CentOS 8
pulp_installer/pulpcore 3.17.2
pulp_deb 2.16
Minio server:
Ubuntu 18.04.6
minio 2022-01-08T03:11:54Z
HTTP load-balancer in front of Minio:
CentOS 8
haproxy 1.8.27 (also tried with nginx and sidekick)
apt client:
Ubuntu 18.04.6
apt 1.16.14
</code></pre>
<p>When I run <code>apt-get update</code> on a client configured to use the Pulp server, Pulp responds with the 302 redirect pointing to the HTTP load-balancer I've set up in front of Minio. So far so good.</p>
<p>The problem is that the redirect URL contains a semicolon as a query separator, which none of the load-balancers I've tried seem to handle correctly (the <code>filename</code> parameter in <code>response-content-disposition</code> seems to get discarded). The apt client always gets a 4XX error (e.g. <code>401 unauthorized</code>).</p>
<p>This seems to happen because content proxies (that is, the load-balancers) strip semicolons from query parameters, <a href="https://www.w3.org/TR/2017/REC-html52-20171214/" class="external">because that is what is recommended by the WHATWG since December 2017</a>, and somewhat recently discovered <a href="https://snyk.io/blog/cache-poisoning-in-popular-open-source-packages/" class="external">cache poisoning attacks</a> seem to have sped up efforts to follow this recommendation among languages and frameworks (see <a href="https://bugs.python.org/issue42967" class="external">CVE-2021-23336</a>).</p>
<p>These two comments in the golang issue tracker helped me come to this conclusion:
<a href="https://github.com/golang/go/issues/25192#issuecomment-385662789" class="external">https://github.com/golang/go/issues/25192#issuecomment-385662789</a>
<a href="https://github.com/golang/go/issues/25192#issuecomment-789799446" class="external">https://github.com/golang/go/issues/25192#issuecomment-789799446</a></p>
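<p>To illustrate the ambiguity, here is a minimal sketch using only Python's standard library. It is not a claim about how haproxy, nginx or sidekick parse URLs; it only shows what any parser that treats <code>;</code> as a separator would see, and how percent-encoding the value avoids the split. The <code>separator</code> argument of <code>parse_qs</code> is assumed to be available (it was added with the CVE-2021-23336 fix), and the <code>Packages</code> filename is just an example value.</p>

```python
from urllib.parse import parse_qs, quote

# Example header value; "Packages" is only a placeholder filename.
disposition = "attachment;filename=Packages"

# Raw semicolon in the query: a parser that splits on ";" sees two parameters.
raw_query = "response-content-disposition=" + disposition
split_view = parse_qs(raw_query, separator=";")

# Percent-encoded value: the same parser now sees a single parameter.
encoded_query = "response-content-disposition=" + quote(disposition, safe="")
intact_view = parse_qs(encoded_query, separator=";")

print(split_view)
print(intact_view)
```

<p>Note that <code>quote(..., safe="")</code> also encodes the <code>=</code> inside the value, which goes a little further than replacing only the semicolon with <code>%3B</code>.</p>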
<p>I've managed to hackishly solve the issue (apt clients can now use my repos!) with the below patch, but I'm not sure if it's actually the correct solution or even the safest since it still involves a semicolon as a query separator. The ideal would maybe be to avoid the semicolon entirely, but I'm not sure if the AWS S3 specs allow for that.</p>
<pre><code class="diff syntaxhl" data-language="diff"><span class="gh">diff --git a/pulpcore/content/handler.py b/pulpcore/content/handler.py
index 1d8e834c6..0db26d1eb 100644
</span><span class="gd">--- a/pulpcore/content/handler.py
</span><span class="gi">+++ b/pulpcore/content/handler.py
</span><span class="p">@@ -773,7 +773,8 @@</span> class Handler:
if settings.DEFAULT_FILE_STORAGE == "pulpcore.app.models.storage.FileSystem":
return FileResponse(os.path.join(settings.MEDIA_ROOT, artifact_name), headers=headers)
elif settings.DEFAULT_FILE_STORAGE == "storages.backends.s3boto3.S3Boto3Storage":
<span class="gd">- content_disposition = f"attachment;filename={content_artifact.relative_path}"
</span><span class="gi">+ content_disposition = f"attachment%3Bfilename={content_artifact.relative_path}"
</span> parameters = {"ResponseContentDisposition": content_disposition}
if headers.get("Content-Type"):
parameters["ResponseContentType"] = headers.get("Content-Type")
</code></pre> Pulp - Issue #9661 (CLOSED - DUPLICATE): Repository version errors : Path errors found. Paths are...https://pulp.plan.io/issues/96612022-01-05T11:48:01Zrgp
<p><strong>Ticket moved to GitHub</strong>: "pulp/pulpcore/2074":<a href="https://github.com/pulp/pulpcore/issues/2074" class="external">https://github.com/pulp/pulpcore/issues/2074</a></p>
<hr>
<p>Hi,
I'm seeing the following errors when syncing ubuntu bionic repositories:</p>
<pre><code># grep duplicated production.log-20220105
2022-01-05T03:01:19 [E|bac|4a143c87] Repository version errors : Path errors found. Paths are duplicated: dists/stable/Release,dists/stable/InRelease,dists/stable/Release.gpg,dists/stable/main/binary-amd64/Packages,dists/stable/main/binary-amd64/Packages.gz,dists/stable/main/binary-amd64/Release
2022-01-05T03:02:09 [E|bac|4a143c87] Repository version errors : Path errors found. Paths are duplicated: dists/stable/Release,dists/stable/InRelease,dists/stable/Release.gpg,dists/stable/main/binary-amd64/Packages.gz,dists/stable/main/binary-amd64/Release,dists/stable/main/binary-amd64/Packages
2022-01-05T03:02:16 [E|bac|4a143c87] Repository version errors : Path errors found. Paths are duplicated: dists/bionic/Release,dists/bionic/InRelease,dists/bionic/Release.gpg,dists/bionic/multiverse/binary-amd64/Release,dists/bionic/multiverse/binary-amd64/Packages,dists/bionic/multiverse/binary-amd64/Packages.gz
2022-01-05T03:02:24 [E|bac|4a143c87] Repository version errors : Path errors found. Paths are duplicated: dists/bionic/InRelease,dists/bionic/Release.gpg,dists/bionic/Release,dists/bionic/main/binary-amd64/Packages,dists/bionic/main/binary-amd64/Packages.gz
2022-01-05T03:06:24 [E|bac|4a143c87] Repository version errors : Path errors found. Paths are duplicated: dists/bionic/main/binary-armhf/Packages,dists/bionic/main/binary-armhf/Packages.gz,dists/bionic/main/binary-arm64/Packages,dists/bionic/main/binary-arm64/Packages.gz,dists/bionic/InRelease,dists/bionic/Release,dists/bionic/Release.gpg,dists/bionic/main/binary-armel/Packages,dists/bionic/main/binary-armel/Packages.gz,dists/bionic/main/binary-amd64/Packages,dists/bionic/main/binary-amd64/Packages.gz,dists/bionic/main/binary-s390x/Packages,dists/bionic/main/binary-s390x/Packages.gz,dists/bionic/main/binary-i386/Packages,dists/bionic/main/binary-i386/Packages.gz,dists/bionic/main/binary-ppc64el/Packages,dists/bionic/main/binary-ppc64el/Packages.gz
2022-01-05T03:06:49 [E|bac|4a143c87] Repository version errors : Path errors found. Paths are duplicated: dists/stable/main/binary-i386/Packages,dists/stable/main/binary-i386/Packages.gz,dists/stable/main/binary-i386/Release,dists/stable/main/binary-aarch64/Packages,dists/stable/main/binary-aarch64/Packages.gz,dists/stable/main/binary-aarch64/Release,dists/stable/InRelease,dists/stable/Release,dists/stable/Release.gpg,dists/stable/main/binary-amd64/Packages,dists/stable/main/binary-amd64/Packages.gz,dists/stable/main/binary-amd64/Release,dists/stable/main/binary-arm64/Packages,dists/stable/main/binary-arm64/Packages.gz,dists/stable/main/binary-arm64/Release
</code></pre>
<p>Not sure if this is related, but I also encountered the following error before these:</p>
<pre><code>deadlock detected
DETAIL: Process 531 waits for ShareLock on transaction 24995797; blocked by process 524.
Process 524 waits for ShareLock on transaction 24995798; blocked by process 531.
HINT: See server log for query details.
CONTEXT: while updating tuple (7388,19) in relation "core_artifact"
</code></pre> Pulp - Issue #9659 (CLOSED - DUPLICATE): Task.error dict does not include a typehttps://pulp.plan.io/issues/96592021-12-23T16:05:35Zdkliban@redhat.com
<p><strong>Ticket moved to GitHub</strong>: "pulp/pulpcore/2073":<a href="https://github.com/pulp/pulpcore/issues/2073" class="external">https://github.com/pulp/pulpcore/issues/2073</a></p>
<hr>
<p><code>MemoryError</code> does not have a message string. So when the error shows up in a Task, the <code>description</code> is an empty string and only a <code>traceback</code> is present. The <code>Task.error</code> dict should include a <code>type</code> key[0].</p>
<pre><code>{
"child_tasks": [],
"created_resources": [],
"error": {
"description": "",
"traceback": " File \"/usr/lib/python3.6/site-packages/pulpcore/tasking/pulpcore_worker.py\", line 317, in _perform_task\n result = func(*args, **kwargs)\n File \"/usr/lib/python3.6/site-packages/pulpcore/app/tasks/importer.py\", line 161, in import_repository_version\n a_result = _import_file(os.path.join(rv_path, filename), res_class, do_raise=False)\n File \"/usr/lib/python3.6/site-packages/pulpcore/app/tasks/importer.py\", line 62, in _import_file\n data = Dataset().load(json_file.read(), format=\"json\")\n File \"/usr/lib/python3.6/site-packages/tablib/core.py\", line 403, in load\n stream = normalize_input(in_stream)\n File \"/usr/lib/python3.6/site-packages/tablib/utils.py\", line 10, in normalize_input\n return StringIO(stream)\n"
},
"finished_at": "2021-12-16T21:10:44.339744Z",
"logging_cid": "5771d8b0-21d3-42b5-b4a9-2d57b29cc756",
"name": "pulpcore.app.tasks.importer.import_repository_version",
"parent_task": "/pulp/api/v3/tasks/79b44b66-2b76-4be8-a29a-cbe9f4122a6e/",
"progress_reports": [
{
"code": "import.repo.version.content",
"done": 0,
"message": "Importing content for Red_Hat_Enterprise_Linux_8_for_x86_64_-_BaseOS_RPMs_8-2128169",
"state": "running",
"suffix": null,
"total": null
}
],
"pulp_created": "2021-12-16T20:50:49.847379Z",
"pulp_href": "/pulp/api/v3/tasks/d24dd8ec-bbfa-4bf8-8950-ac905bb27d30/",
"reserved_resources_record": [
"/pulp/api/v3/repositories/rpm/rpm/3a731cf5-a48f-4bf2-9aae-cd153cac213e/"
],
"started_at": "2021-12-16T20:50:49.891657Z",
"state": "failed",
"task_group": "/pulp/api/v3/task-groups/13d15dcc-0889-4dff-b513-a4e2d580af03/",
"worker": null
}
</code></pre>
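<p>A minimal sketch of what such a dict could look like, assuming a hypothetical helper named <code>exception_to_dict</code> (this is not pulpcore's actual function). It shows that the exception class name stays informative even when the message is empty, as with <code>MemoryError</code>.</p>

```python
import traceback


def exception_to_dict(exc):
    """Hypothetical helper: describe an exception, including its type name."""
    tb = "".join(traceback.format_tb(exc.__traceback__))
    return {
        "type": type(exc).__name__,  # e.g. "MemoryError"
        "description": str(exc),     # may be an empty string
        "traceback": tb,
    }


try:
    raise MemoryError()
except MemoryError as exc:
    error = exception_to_dict(exc)

print(error["type"], repr(error["description"]))
```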
<p>[0] <a href="https://github.com/pulp/pulpcore/blob/8f74f9098a5c2d393e4bfd3835b99df3de5a913a/pulpcore/exceptions/base.py#L44" class="external">https://github.com/pulp/pulpcore/blob/8f74f9098a5c2d393e4bfd3835b99df3de5a913a/pulpcore/exceptions/base.py#L44</a></p> Debian Support - Issue #9658 (CLOSED - DUPLICATE): PackageIndex with empty Fields fails to synchttps://pulp.plan.io/issues/96582021-12-23T13:58:22Zmbucher
<p>Tried to sync HashiCorp Debian repository:</p>
<pre><code class="text syntaxhl" data-language="text">http POST :/pulp/api/v3/remotes/deb/apt/ \
name="HashiCorp bullseye" \
url="https://apt.releases.hashicorp.com/" \
distributions="bullseye" \
architectures="amd64" \
components="main"
</code></pre>
<p>This fails with:</p>
<pre><code>pulp [5b3ec73ca07041f0b572664d3e17f85a]: pulp_deb.app.tasks.synchronizing:DEBUG: Downloading package consul
pulp [5b3ec73ca07041f0b572664d3e17f85a]: pulpcore.tasking.pulpcore_worker:INFO: Task 04dbd5d4-654c-4572-b0e4-285be7e2bcc3 failed ({'section': [ErrorDetail(string='This field may not be blank.', code='blank')], 'priority': [ErrorDetail(string='This field may not be blank.', code='blank')]})
pulp [5b3ec73ca07041f0b572664d3e17f85a]: pulpcore.tasking.pulpcore_worker:INFO: File "/home/vagrant/devel/pulpcore/pulpcore/tasking/pulpcore_worker.py", line 362, in _perform_task
result = func(*args, **kwargs)
File "/home/vagrant/devel/pulp_deb/pulp_deb/app/tasks/synchronizing.py", line 139, in synchronize
DebDeclarativeVersion(first_stage, repository, mirror=mirror).create()
File "/home/vagrant/devel/pulpcore/pulpcore/plugin/stages/declarative_version.py", line 161, in create
loop.run_until_complete(pipeline)
File "/usr/lib64/python3.10/asyncio/base_events.py", line 641, in run_until_complete
return future.result()
File "/home/vagrant/devel/pulpcore/pulpcore/plugin/stages/api.py", line 225, in create_pipeline
await asyncio.gather(*futures)
File "/home/vagrant/devel/pulpcore/pulpcore/plugin/stages/api.py", line 43, in __call__
await self.run()
File "/home/vagrant/devel/pulp_deb/pulp_deb/app/tasks/synchronizing.py", line 491, in run
await asyncio.gather(
File "/home/vagrant/devel/pulp_deb/pulp_deb/app/tasks/synchronizing.py", line 575, in _handle_distribution
await asyncio.gather(*sub_tasks)
File "/home/vagrant/devel/pulp_deb/pulp_deb/app/tasks/synchronizing.py", line 621, in _handle_component
await asyncio.gather(*pending_tasks)
File "/home/vagrant/devel/pulp_deb/pulp_deb/app/tasks/synchronizing.py", line 730, in _handle_package_index
serializer.is_valid(raise_exception=True)
File "/usr/local/lib/pulp/lib64/python3.10/site-packages/rest_framework/serializers.py", line 228, in is_valid
raise ValidationError(self.errors)
</code></pre>
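<p>The validation error reports that <code>section</code> and <code>priority</code> were blank. One possible way to tolerate such input, sketched under the assumption that the parsed package paragraph is available as a plain dict and that the serializer accepts missing optional fields (both are assumptions, and <code>drop_blank_fields</code> is a hypothetical name, not part of pulp_deb), is to drop blank values before validation:</p>

```python
def drop_blank_fields(paragraph):
    """Hypothetical helper: remove keys whose value is empty or whitespace."""
    return {
        key: value
        for key, value in paragraph.items()
        if value is not None and str(value).strip() != ""
    }


# Example paragraph with two fields that are present but empty.
paragraph = {
    "Package": "consul",
    "Version": "1.10.5",
    "Section": "",
    "Priority": "",
    "Architecture": "amd64",
}

cleaned = drop_blank_fields(paragraph)
print(sorted(cleaned))
```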
<p>I assume the reason is that for the mentioned package the PackageIndex file has some fields defined, but empty (in this case <code>Section</code> and <code>Priority</code>):</p>
<pre><code>Package: consul
Version: 1.10.5
Section:
Priority:
Architecture: amd64
Maintainer: HashiCorp
Installed-Size: 102588
Depends: openssl
Homepage: https://github.com/hashicorp/consul
Description: Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
Filename: pool/amd64/main/consul_1.10.5_amd64.deb
SHA1: e9218ec846c5128489cd627ff8104c9aee7665b6
SHA256: 50c2567713573873c609b109a63500ccc750d2aec6a2c962265c657ce2f0a813
Size: 38328086
</code></pre> RPM Support - Issue #9655 (CLOSED - DUPLICATE): requesting lots of packages with a subset of fiel...https://pulp.plan.io/issues/96552021-12-21T17:23:07Zjsherril@redhat.comjsherril@redhat.com
<p><strong>Ticket moved to GitHub</strong>: "pulp/pulp_rpm/2312":<a href="https://github.com/pulp/pulp_rpm/issues/2312" class="external">https://github.com/pulp/pulp_rpm/issues/2312</a></p>
<hr>
<p>When we make a request like this:</p>
<pre><code>/pulp/api/v3/content/rpm/packages/?fields=pulp_href%2Cname%2Cversion%2Crelease%2Carch%2Cepoch%2Csummary%2Cis_modular%2Crpm_sourcerpm%2Clocation_href%2CpkgId&limit=1&offset=1000&repository_version=%2Fpulp%2Fapi%2Fv3%2Frepositories%2Frpm%2Frpm%2Fc6b93206-22bb-4c58-ba07-828c32326330%2Fversions%2F1%2F
</code></pre>
<p>where some subset of fields is requested, we're still seeing all of the fields being loaded from the database. The result is that the query takes much longer than it should.</p>
<p>Testing on the rhel7 repo, it took about 100 seconds on my machine to fetch all 32K packages specifying the fields above. If I comment out the filelists, changelogs, provides/requires from the model and serializers, that drops down to about 33 seconds. This is a huge improvement.</p>
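<p>As a rough sketch of the first step, the requested names can be pulled out of the <code>fields</code> query parameter with the standard library. The helper name <code>requested_fields</code> and the <code>available</code> set are hypothetical, and the URL is shortened from the request above. In Django the resulting list could then be passed to <code>QuerySet.only(*names)</code> so that the remaining columns are deferred; mapping serializer-only fields such as <code>pulp_href</code> to real model columns is an extra step not shown here.</p>

```python
from urllib.parse import parse_qs, urlsplit


def requested_fields(url, available):
    """Hypothetical helper: return the requested field names that are known."""
    query = parse_qs(urlsplit(url).query)
    names = [
        name.strip()
        for value in query.get("fields", [])
        for name in value.split(",")
        if name.strip()
    ]
    # Keep request order, ignore names that are not known fields.
    return [name for name in names if name in available]


url = "/pulp/api/v3/content/rpm/packages/?fields=pulp_href%2Cname%2Cversion&limit=1"
available = {"pulp_href", "name", "version", "changelogs", "files"}
columns = requested_fields(url, available)
print(columns)
```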
<p>Reading this: <a href="https://stackoverflow.com/questions/53319787/how-can-i-select-specific-fields-in-django-rest-framework" class="external">https://stackoverflow.com/questions/53319787/how-can-i-select-specific-fields-in-django-rest-framework</a></p>
<p>it looks like it may be possible to exclude certain fields from the query itself fairly easily.</p> Pulp - Issue #9654 (CLOSED - CURRENTRELEASE): backport 9642 to 3.17.1: Migration 081 was incompat...https://pulp.plan.io/issues/96542021-12-21T15:05:31Zdkliban@redhat.comRPM Support - Issue #9651 (CLOSED - DUPLICATE): Sync creates publication but no new repository ve...https://pulp.plan.io/issues/96512021-12-21T09:37:32Zmgoddard
<p><strong>Ticket moved to GitHub</strong>: "pulp/pulp_rpm/2311":<a href="https://github.com/pulp/pulp_rpm/issues/2311" class="external">https://github.com/pulp/pulp_rpm/issues/2311</a></p>
<hr>
<p>I have a nightly job that uses Ansible Squeezer modules to synchronise, publish and distribute some repositories. Every few days I hit an error like this:</p>
<pre><code>Found multiple matches for publication ({'repository_version': '/pulp/api/v3/repositories/rpm/rpm/b9eb112b-28c3-46bc-a40e-594531b54925/versions/4/
</code></pre>
<p>I have verified that this is the case. There is one publication created after the last successful sync, and another created today for the same version.</p>
<pre><code> pulp rpm publication list --repository-version /pulp/api/v3/repositories/rpm/rpm/b9eb112b-28c3-46bc-a40e-594531b54925/versions/4/
[
{
"pulp_href": "/pulp/api/v3/publications/rpm/rpm/21680308-1fc4-4bea-a5fc-1e3c609533f1/",
"pulp_created": "2021-12-21T02:31:44.243390Z",
"repository_version": "/pulp/api/v3/repositories/rpm/rpm/b9eb112b-28c3-46bc-a40e-594531b54925/versions/4/",
"repository": "/pulp/api/v3/repositories/rpm/rpm/b9eb112b-28c3-46bc-a40e-594531b54925/",
"metadata_checksum_type": "unknown",
"package_checksum_type": "unknown",
"gpgcheck": 0,
"repo_gpgcheck": 1,
"sqlite_metadata": true
},
{
"pulp_href": "/pulp/api/v3/publications/rpm/rpm/4a0c6a75-bb86-4b96-bb6c-6d0f08763847/",
"pulp_created": "2021-12-15T02:31:07.260722Z",
"repository_version": "/pulp/api/v3/repositories/rpm/rpm/b9eb112b-28c3-46bc-a40e-594531b54925/versions/4/",
"repository": "/pulp/api/v3/repositories/rpm/rpm/b9eb112b-28c3-46bc-a40e-594531b54925/",
"metadata_checksum_type": "unknown",
"package_checksum_type": "unknown",
"gpgcheck": 0,
"repo_gpgcheck": 1,
"sqlite_metadata": true
}
]
</code></pre>
<p>I checked the sync task from today, and it completed successfully. However, it lists the new publication as a created resource, but no new repo version.</p>
<pre><code> {
"pulp_href": "/pulp/api/v3/tasks/5e132510-89cb-4224-9966-d1f22d49a4e1/",
"pulp_created": "2021-12-21T02:30:53.723988Z",
"state": "completed",
"name": "pulp_rpm.app.tasks.synchronizing.synchronize",
"logging_cid": "0a4dc729907842aaa5ba9605e418cdd4",
"started_at": "2021-12-21T02:30:53.801940Z",
"finished_at": "2021-12-21T02:31:44.897653Z",
"error": null,
"worker": "/pulp/api/v3/workers/605f92b7-9b71-4039-a3de-0af017d86651/",
"parent_task": null,
"child_tasks": [],
"task_group": null,
"progress_reports": [
{
"message": "Downloading Metadata Files",
"code": "sync.downloading.metadata",
"state": "completed",
"total": null,
"done": 8,
"suffix": null
},
{
"message": "Downloading Artifacts",
"code": "sync.downloading.artifacts",
"state": "completed",
"total": null,
"done": 297,
"suffix": null
},
{
"message": "Associating Content",
"code": "associating.content",
"state": "completed",
"total": null,
"done": 0,
"suffix": null
},
{
"message": "Parsed Packages",
"code": "sync.parsing.packages",
"state": "completed",
"total": null,
"done": 299,
"suffix": null
},
{
"message": "Un-Associating Content",
"code": "unassociating.content",
"state": "completed",
"total": null,
"done": 0,
"suffix": null
}
],
"created_resources": [
"/pulp/api/v3/publications/rpm/rpm/21680308-1fc4-4bea-a5fc-1e3c609533f1/"
],
"reserved_resources_record": [
"/pulp/api/v3/repositories/rpm/rpm/b9eb112b-28c3-46bc-a40e-594531b54925/",
"shared:/pulp/api/v3/remotes/rpm/rpm/7b6bc03e-787e-4266-ba33-425c4f9e540b/"
]
},
</code></pre>
<p>Comparing with another sync task, I see a repository version listed in the created_resources instead.</p>
<p>Here is one of the affected repos:</p>
<pre><code>{
"pulp_href": "/pulp/api/v3/repositories/rpm/rpm/b9eb112b-28c3-46bc-a40e-594531b54925/",
"pulp_created": "2021-11-19T13:21:20.971989Z",
"versions_href": "/pulp/api/v3/repositories/rpm/rpm/b9eb112b-28c3-46bc-a40e-594531b54925/versions/",
"pulp_labels": {},
"latest_version_href": "/pulp/api/v3/repositories/rpm/rpm/b9eb112b-28c3-46bc-a40e-594531b54925/versions/4/",
"name": "CentOS Stream 8 - NFV OpenvSwitch",
"description": null,
"retain_repo_versions": null,
"remote": null,
"autopublish": false,
"metadata_signing_service": null,
"retain_package_versions": 0,
"metadata_checksum_type": null,
"package_checksum_type": null,
"gpgcheck": 0,
"repo_gpgcheck": 0,
"sqlite_metadata": false
}
</code></pre>
<p>And the corresponding remote:</p>
<pre><code> {
"pulp_href": "/pulp/api/v3/remotes/rpm/rpm/7b6bc03e-787e-4266-ba33-425c4f9e540b/",
"pulp_created": "2021-11-19T13:21:41.147452Z",
"name": "CentOS Stream 8 - NFV OpenvSwitch-remote",
"url": "http://mirrorlist.centos.org/?release=8-stream&arch=x86_64&repo=nfv-openvswitch-2",
"ca_cert": null,
"client_cert": null,
"tls_validation": true,
"proxy_url": null,
"pulp_labels": {},
"pulp_last_updated": "2021-11-19T13:21:41.147492Z",
"download_concurrency": null,
"max_retries": null,
"policy": "immediate",
"total_timeout": null,
"connect_timeout": null,
"sock_connect_timeout": null,
"sock_read_timeout": null,
"headers": null,
"rate_limit": null,
"sles_auth_token": null
},
</code></pre>
<p>I'm using <code>policy: immediate</code> and <code>sync_policy: mirror_complete</code> when syncing.</p>
<p>Versions:</p>
<pre><code> {
"component": "core",
"version": "3.16.0"
},
{
"component": "rpm",
"version": "3.16.1"
},
{
"component": "file",
"version": "1.10.1"
},
{
"component": "deb",
"version": "2.16.0"
},
{
"component": "container",
"version": "2.9.0"
},
{
"component": "certguard",
"version": "1.5.1"
}
</code></pre> RPM Support - Issue #9647 (CLOSED - DUPLICATE): ULN-remotes HTTP-Proxy use brokenhttps://pulp.plan.io/issues/96472021-12-20T09:33:07Zmbucher
<p><strong>Ticket moved to GitHub</strong>: "pulp/pulp_rpm/2322":<a href="https://github.com/pulp/pulp_rpm/issues/2322" class="external">https://github.com/pulp/pulp_rpm/issues/2322</a></p>
<hr>
<p>Using an HTTP proxy with ULN remotes does not work.</p>
<pre><code>Cannot connect to host linux-update.oracle.com:443 ssl:default [Connect call failed ('138.1.51.46', 443)
</code></pre> Maven Plugin - Issue #9640 (CLOSED - DUPLICATE): Missing Rest api link in docshttps://pulp.plan.io/issues/96402021-12-15T17:19:40Zwibbit
<p><strong>Ticket moved to GitHub</strong>: "pulp/pulp_maven/54":<a href="https://github.com/pulp/pulp_maven/issues/54" class="external">https://github.com/pulp/pulp_maven/issues/54</a></p>
<hr>
<p><a href="https://docs.pulpproject.org/pulp_maven/" class="external">https://docs.pulpproject.org/pulp_maven/</a></p> Pulp - Story #9635 (CLOSED - DUPLICATE): As a user, I can specify the desired maximum amount of m...https://pulp.plan.io/issues/96352021-12-13T16:23:46Zbmbouterbmbouter@redhat.com
<p><strong>Ticket moved to GitHub</strong>: "pulp/pulpcore/2069":<a href="https://github.com/pulp/pulpcore/issues/2069" class="external">https://github.com/pulp/pulpcore/issues/2069</a></p>
<hr>
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>It would be nice if users could specify a desired maximum amount of RAM to be used during sync. For example, a user can say I only want 1500 MB of RAM to be used max.</p>
<a name="What-is-already-in-place"></a>
<h2 >What is already in place<a href="#What-is-already-in-place" class="wiki-anchor">¶</a></h2>
<p>The stages pipeline restricts memory usage by only allowing 1000 declarative content objects between each stage (so for 8-9 stages that's 8000-9000 declarative content objects). This happens <a href="https://github.com/pulp/pulpcore/blob/main/pulpcore/plugin/stages/api.py#L217" class="external">here</a>.</p>
<p>Interestingly the docstring says this defaults to 100, but it seems to actually be 1000!</p>
<p>Also the stages perform batching, so they will only take in a limited number of items (the batch size). That happens <a href="https://github.com/pulp/pulpcore/blob/main/pulpcore/plugin/stages/api.py#L84" class="external">with minsize</a>.</p>
<a name="Why-this-isnt-enough"></a>
<h2 >Why this isn't enough<a href="#Why-this-isnt-enough" class="wiki-anchor">¶</a></h2>
<p>These are count-based mechanisms and don't correspond to actual MB or GB of memory used. Content units vary a lot in how much memory each DeclarativeContent object takes up.</p>
<p>Another lesser problem is that it doesn't help plugin writers restrict their usage of memory in FirstStage.</p>
<a name="Idea"></a>
<h2 >Idea<a href="#Idea" class="wiki-anchor">¶</a></h2>
<p>Add a new param called <code>max_mb</code> to base Remote, which defaults to None. If specified, the user will be specifying the desired maximum MB used by the syncing process.</p>
<p>Have the queues between the stages, and the batcher implementation, both check the total memory the current process is using and poll with asyncio.sleep() until it goes down. This should keep the maximum amount used by all objects roughly to that number.</p>
<a name="Details"></a>
<h2 >Details<a href="#Details" class="wiki-anchor">¶</a></h2>
<p>Introduce a new <code>MBSizeQueue</code> which is a wrapper around the <code>asyncio.Queue</code> used today. It will have the same <code>put()</code> call, but it will wait while the amount of memory in use is greater than what the remote is configured for.</p>
<p>Then introduce the same memory checking feature in the batcher. I'm not completely sure this second part is needed though.</p>
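<p>A minimal sketch of how such a queue might look, written as a subclass of <code>asyncio.Queue</code> for brevity. The <code>get_used_mb</code> callable and <code>poll_interval</code> are assumptions made for this example: the real implementation would need its own way to measure process memory. To avoid stalling forever, this sketch lets an item through whenever the queue is empty, even if memory is over the limit.</p>

```python
import asyncio


class MBSizeQueue(asyncio.Queue):
    """Sketch: a queue whose put() polls until memory use drops below max_mb."""

    def __init__(self, maxsize=0, *, max_mb=None, get_used_mb=None, poll_interval=0.01):
        super().__init__(maxsize)
        self._max_mb = max_mb
        self._get_used_mb = get_used_mb  # hypothetical callable returning MB in use
        self._poll_interval = poll_interval

    async def put(self, item):
        if self._max_mb is not None and self._get_used_mb is not None:
            # An empty queue always accepts an item, so a single oversized
            # unit can still flow through the pipeline.
            while self._get_used_mb() > self._max_mb and not self.empty():
                await asyncio.sleep(self._poll_interval)
        await super().put(item)


async def demo():
    used = {"mb": 2000}  # simulated memory reading, above the 1500 MB limit
    queue = MBSizeQueue(max_mb=1500, get_used_mb=lambda: used["mb"])
    await queue.put("first")  # queue is empty, so this passes immediately

    async def consumer():
        await asyncio.sleep(0.05)
        item = await queue.get()
        used["mb"] = 1000  # pretend that consuming the item freed memory
        return item

    task = asyncio.create_task(consumer())
    await queue.put("second")  # polls until the consumer has run
    return await task, queue.qsize()


result = asyncio.run(demo())
print(result)
```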
<p>We have to be very careful not to deadlock with this feature. For example, we have to account for the base case where even a single item is larger than the memory desired. Repos in pulp_rpm have had a single unit use more than 1.2G if I remember right, so if someone was syncing with 800 MB and we weren't careful to allow that unit to still flow through the pipeline we'd deadlock.....</p> RPM Support - Task #9633 (CLOSED - DUPLICATE): Use repo priorities in the dependency solverhttps://pulp.plan.io/issues/96332021-12-12T06:09:47Zdalleydalley@redhat.com
<p><strong>Ticket moved to GitHub</strong>: "pulp/pulp_rpm/2309":<a href="https://github.com/pulp/pulp_rpm/issues/2309" class="external">https://github.com/pulp/pulp_rpm/issues/2309</a></p>
<hr>
<p>We ought to be setting repo priorities such that for every set of copies, any matching RPMs present in the same repository are prioritized over ones in other repos.</p>