Pulp: Issueshttps://pulp.plan.io/https://pulp.plan.io/favicon.ico2022-01-12T17:32:38ZPulp
Planio Pulp - Story #9670 (CLOSED - DUPLICATE): In an access policy for repository versions repository p...https://pulp.plan.io/issues/96702022-01-12T17:32:38Zmdellweg
<p><strong>Ticket moved to GitHub</strong>: "pulp/pulpcore/2076":<a href="https://github.com/pulp/pulpcore/issues/2076" class="external">https://github.com/pulp/pulpcore/issues/2076</a></p> Pulp - Story #9635 (CLOSED - DUPLICATE): As a user, I can specify the desired maximum amount of m...https://pulp.plan.io/issues/96352021-12-13T16:23:46Zbmbouterbmbouter@redhat.com
<p><strong>Ticket moved to GitHub</strong>: "pulp/pulpcore/2069":<a href="https://github.com/pulp/pulpcore/issues/2069" class="external">https://github.com/pulp/pulpcore/issues/2069</a></p>
<hr>
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>It would be nice if users could specify a desired maximum amount of RAM to be used during sync. For example, a user could say, "I want at most 1500 MB of RAM to be used."</p>
<a name="What-is-already-in-place"></a>
<h2 >What is already in place<a href="#What-is-already-in-place" class="wiki-anchor">¶</a></h2>
<p>The stages pipeline restricts memory usage by allowing only 1000 declarative content objects between each stage (so for 8-9 stages, that's 8000-9000 declarative content objects). This happens <a href="https://github.com/pulp/pulpcore/blob/main/pulpcore/plugin/stages/api.py#L217" class="external">here</a>.</p>
<p>Interestingly the docstring says this defaults to 100, but it seems to actually be 1000!</p>
<p>The stages also perform batching, so they only take in a limited number of items at a time (the batch size). That happens <a href="https://github.com/pulp/pulpcore/blob/main/pulpcore/plugin/stages/api.py#L84" class="external">with minsize</a>.</p>
<a name="Why-this-isnt-enough"></a>
<h2 >Why this isn't enough<a href="#Why-this-isnt-enough" class="wiki-anchor">¶</a></h2>
<p>These are count-based mechanisms and don't correspond to the actual MB or GB of memory used. Content units vary a lot in how much memory each DeclarativeContent object takes up.</p>
<p>Another lesser problem is that it doesn't help plugin writers restrict their usage of memory in FirstStage.</p>
<a name="Idea"></a>
<h2 >Idea<a href="#Idea" class="wiki-anchor">¶</a></h2>
<p>Add a new param called <code>max_mb</code> to the base Remote, which defaults to None. If specified, it sets the desired maximum number of MB used by the process performing the sync.</p>
<p>Have the queues between the stages, and the batcher implementation, both check the total memory the current process is using and poll with asyncio.sleep() until it goes down. This should keep the maximum amount used by all objects roughly at that number.</p>
<a name="Details"></a>
<h2 >Details<a href="#Details" class="wiki-anchor">¶</a></h2>
<p>Introduce a new <code>MBSizeQueue</code>, a wrapper around the <code>asyncio.Queue</code> used today. It will have the same <code>put()</code> call, but it will wait only if the amount of memory in use is greater than the remote is configured for.</p>
<p>Then introduce the same memory checking feature in the batcher. I'm not completely sure this second part is needed though.</p>
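<p>A minimal sketch of what such a queue could look like, under stated assumptions: it subclasses <code>asyncio.Queue</code> rather than wrapping it, and the <code>get_used_mb</code> callable and <code>poll_interval</code> parameter are hypothetical names introduced here for illustration (a real probe could, for instance, read the process RSS). To avoid stalling forever, this sketch always accepts an item when the queue is empty.</p>
<pre><code>import asyncio


class MBSizeQueue(asyncio.Queue):
    """Illustrative sketch only: put() waits while memory use exceeds max_mb."""

    def __init__(self, maxsize=0, *, max_mb=None, get_used_mb=None, poll_interval=0.1):
        super().__init__(maxsize)
        self.max_mb = max_mb                # None disables the memory check
        self.get_used_mb = get_used_mb      # hypothetical probe returning MB in use
        self.poll_interval = poll_interval  # seconds to sleep between checks

    async def put(self, item):
        if self.max_mb is not None:
            # Poll until memory drops to the limit or below. An empty queue
            # always accepts the item, so one oversized unit cannot block
            # the pipeline indefinitely.
            while self.get_used_mb() > self.max_mb and not self.empty():
                await asyncio.sleep(self.poll_interval)
        await super().put(item)
</code></pre>
<p>Because the check runs only in <code>put()</code>, consumers keep draining the queue while producers wait, which is what lets memory fall again.</p>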
<p>We have to be very careful not to deadlock with this feature. For example, we have to account for the base case where even a single item is larger than the desired memory limit. Repos in pulp_rpm have had a single unit use more than 1.2 GB, if I remember right, so if someone were syncing with a limit of 800 MB and we weren't careful to allow that unit to still flow through the pipeline, we'd deadlock.</p> RPM Support - Task #9633 (CLOSED - DUPLICATE): Use repo priorities in the dependency solverhttps://pulp.plan.io/issues/96332021-12-12T06:09:47Zdalleydalley@redhat.com
<p><strong>Ticket moved to GitHub</strong>: "pulp/pulp_rpm/2309":<a href="https://github.com/pulp/pulp_rpm/issues/2309" class="external">https://github.com/pulp/pulp_rpm/issues/2309</a></p>
<hr>
<p>We ought to be setting repo priorities such that for every set of copies, any matching RPMs present in the same repository are prioritized over ones in other repos.</p> Pulp - Story #9615 (CLOSED - CURRENTRELEASE): Add async sign method for SigningServicehttps://pulp.plan.io/issues/96152021-12-07T20:32:27ZgerrodPulp - Story #9614 (CLOSED - DUPLICATE): As a developer, I can mark a Model as RBAC enabled and h...https://pulp.plan.io/issues/96142021-12-07T19:03:50Zbmbouterbmbouter@redhat.com
<p><strong>Ticket moved to GitHub</strong>: "pulp/pulpcore/2067":<a href="https://github.com/pulp/pulpcore/issues/2067" class="external">https://github.com/pulp/pulpcore/issues/2067</a></p>
<hr>
<p>This is built on <a href="https://pulp.plan.io/issues/9613" class="external">the introduction of <code>with_perm</code></a>.</p>
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>Every time a queryset is constructed that deals with an RBAC-enabled object, we need to ensure that only the objects the user has permission to operate on are available in the queryset results. For example, if I have the <code>core.delete_task</code> permission on some objects but not others, I can't just run <code>Task.objects.all().delete()</code>.</p>
<p>We deal with querysets in so many places that it would be great to have a safety check that tells me whether each queryset has been filtered by permissions in at least some way.</p>
<a name="Proposal"></a>
<h2 >Proposal<a href="#Proposal" class="wiki-anchor">¶</a></h2>
<p>Add an attribute on all models called <code>RBAC_PROTECTED = False</code> and have models opt-in to using this safety feature by setting it to <code>True</code> on their model definition.</p>
<p>Then modify the queryset evaluation to raise an exception if that queryset never had a <code>with_perm</code> call occur. This would be an opt-in, model-by-model safety feature.</p>
<p>In some situations a <code>with_perm</code> call is legitimately not needed. For example, if the viewset queries for all objects and then passes the list of pks to the task in the backend to handle, permissions have already been handled, but the backend queryset has no call to <code>with_perm</code>.</p>
<p>Let's add a queryset method called <code>qs.with_no_perms()</code>. With this I could call <code>Task.objects.with_no_perms().all()</code> and I would not receive the exception even without a call to <code>with_perm</code>.</p>
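<p>The proposed behaviour can be illustrated with a plain-Python toy model. This is not Django or pulpcore code: <code>ToyQuerySet</code>, <code>UnfilteredQuerysetError</code>, and the <code>grants</code> mapping are hypothetical stand-ins, and "evaluation" is modelled simply as iteration.</p>
<pre><code>class UnfilteredQuerysetError(Exception):
    """Raised when a protected queryset is evaluated without a permission decision."""


class ToyQuerySet:
    """Toy stand-in for a queryset; rows are plain ids."""

    def __init__(self, rows, rbac_protected=False, grants=None, perm_checked=False):
        self._rows = list(rows)
        self._rbac_protected = rbac_protected  # mirrors RBAC_PROTECTED on the model
        self._grants = grants or {}            # permission name to set of allowed ids
        self._perm_checked = perm_checked

    def _clone(self, rows=None, perm_checked=None):
        return ToyQuerySet(
            self._rows if rows is None else rows,
            self._rbac_protected,
            self._grants,
            self._perm_checked if perm_checked is None else perm_checked,
        )

    def all(self):
        return self._clone()

    def with_perm(self, perm):
        # Keep only the rows the (implicit) current user holds `perm` on.
        allowed = self._grants.get(perm, set())
        return self._clone([r for r in self._rows if r in allowed], perm_checked=True)

    def with_no_perms(self):
        # Explicit opt-out: the caller states permissions were handled elsewhere.
        return self._clone(perm_checked=True)

    def __iter__(self):
        # Evaluation point: refuse to run if no permission decision was recorded.
        if self._rbac_protected and not self._perm_checked:
            raise UnfilteredQuerysetError("call with_perm() or with_no_perms() first")
        return iter(self._rows)
</code></pre>
<p>In this model, iterating <code>tasks.all()</code> on a protected queryset raises, while <code>tasks.with_perm("core.delete_task")</code> and <code>tasks.with_no_perms().all()</code> both evaluate normally.</p>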
<a name="Special-considerations"></a>
<h2 >Special considerations<a href="#Special-considerations" class="wiki-anchor">¶</a></h2>
<p>There could be situations where a new queryset is made as a new object, e.g. by boolean or set operations. Let's get examples of these kinds of situations right:</p>
<ul>
<li>
<code>qs.all() | qs.with_perm("core.task_show")</code> -> unsafe</li>
<li>
<code>qs.none() | qs.with_perm("core.task_show")</code> -> safe</li>
<li>
<code>qs.none() & qs.with_perm("core.task_show")</code> ??</li>
<li>
<code>qs.all() & qs.with_perm("core.task_show")</code> -> safe</li>
</ul> Pulp - Story #9613 (CLOSED - DUPLICATE): As a developer, I can make permission object filtering c...https://pulp.plan.io/issues/96132021-12-07T18:46:13Zbmbouterbmbouter@redhat.com
<p><strong>Ticket moved to GitHub</strong>: "pulp/pulpcore/2066":<a href="https://github.com/pulp/pulpcore/issues/2066" class="external">https://github.com/pulp/pulpcore/issues/2066</a></p>
<hr>
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>As a developer with the new Roles facilities in pulpcore==3.17, you likely will want to filter by permissions with something like this example taken from <a href="https://github.com/pulp/pulpcore/pull/1721/files" class="external">this PR</a>.</p>
<pre><code>current_user = get_current_authenticated_user()
qs = Task.objects.filter(finished_at__lt=finished_before, state__in=states)
units_deleted, details = get_objects_for_user(current_user, "core.delete_task", qs=qs).delete()
</code></pre>
<p>As you can see, this needs to determine who the current user is, and you can't build the queryset in one go by using chaining.</p>
<a name="Proposal"></a>
<h2 >Proposal<a href="#Proposal" class="wiki-anchor">¶</a></h2>
<p>Introduce a <code>with_perm</code> chainable call on all querysets for Pulp objects. It could be used like this:</p>
<ul>
<li><code>qs.with_perm("core.task_delete")</code></li>
<li><code>qs.with_perm("core.task_delete", "core.task_view")</code></li>
<li><code>qs.with_perms(["core.task_delete", "core.task_view"])</code></li>
<li><code>qs.with_perm("core.task_delete").with_perm("core.task_view")</code></li>
</ul> Pulp - Story #9603 (CLOSED - DUPLICATE): Reclaim disk space without providing a list of repositorieshttps://pulp.plan.io/issues/96032021-12-03T17:22:17Ziballou
<p><strong>Ticket moved to GitHub</strong>: "pulp/pulpcore/2065":<a href="https://github.com/pulp/pulpcore/issues/2065" class="external">https://github.com/pulp/pulpcore/issues/2065</a></p>
<hr>
<p>Related: <a href="https://pulp.plan.io/issues/8459" class="external">https://pulp.plan.io/issues/8459</a></p>
<p>Katello would like to be able to clean out all repositories for a given Pulp installation. This would be useful for smart proxies, since we don't index the repository hrefs. As a workaround, we have to query the repositories API to get all of the repository hrefs.</p>
<p>Perhaps this could be done by passing in an empty array for the repository hrefs.</p> Pulp - Story #9602 (CLOSED - DUPLICATE): Add --name to all pulp processes for better process titlehttps://pulp.plan.io/issues/96022021-12-03T08:51:50Zlzap@redhat.com
<p><strong>Ticket moved to GitHub</strong>: "pulp/pulpcore/2064":<a href="https://github.com/pulp/pulpcore/issues/2064" class="external">https://github.com/pulp/pulpcore/issues/2064</a></p>
<hr>
<p>Hello</p>
<p>To identify gunicorn processes more easily, the setproctitle package must be installed; the processes can then be named so that they are recognizable in programs like top or ps:</p>
<p><a href="https://docs.gunicorn.org/en/stable/faq.html" class="external">https://docs.gunicorn.org/en/stable/faq.html</a>
<a href="https://docs.gunicorn.org/en/stable/settings.html#proc-name" class="external">https://docs.gunicorn.org/en/stable/settings.html#proc-name</a></p>
<p>Then --name can be used in all systemd units for better identification of processes. For example, pulpcore-api is currently listed under a cryptic title along the lines of gunicorn:wsgi. Thanks!</p>
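<p>As an alternative to the command-line flag, the same name can be set in a gunicorn configuration file, which is plain Python. A minimal sketch, assuming a hypothetical <code>gunicorn.conf.py</code> for the pulpcore-api service:</p>
<pre><code># gunicorn.conf.py (hypothetical file for the pulpcore-api service)
# Equivalent to passing --name pulpcore-api on the command line.
# The process title only changes when the setproctitle package is installed.
proc_name = "pulpcore-api"
</code></pre>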
Container Support - Story #9597 (CLOSED - DUPLICATE): As a user, my image signature JSON payload ...https://pulp.plan.io/issues/95972021-12-01T17:49:02Zttereshcttereshc@redhat.com
<p><strong>Ticket moved to GitHub</strong>: "pulp/pulp_container/512":<a href="https://github.com/pulp/pulp_container/issues/512" class="external">https://github.com/pulp/pulp_container/issues/512</a></p>
<hr>
<p>The JSON provided with a signature needs to adhere to the format described in <a href="https://github.com/containers/image/blob/main/docs/containers-signature.5.md#json-processing-and-forward-compatibility" class="external">the containers/image docs</a>.</p>
<p>Proposed solution: use jsonschema to define the format and validate against it.
See <a href="https://github.com/pulp/pulp-2to3-migration/blob/main/pulp_2to3_migration/app/json_schema.py" class="external">an example in the migration plugin</a>.</p> Pulp - Story #9593 (CLOSED - DUPLICATE): As a plugin writer, I want a downloader which streams co...https://pulp.plan.io/issues/95932021-11-30T17:26:53Zttereshcttereshc@redhat.com
<p><strong>Ticket moved to GitHub</strong>: "pulp/pulpcore/2063":<a href="https://github.com/pulp/pulpcore/issues/2063" class="external">https://github.com/pulp/pulpcore/issues/2063</a></p>
<hr>
<p>There are cases when downloaded content does not need to be saved on a filesystem.<br>
Doing so is just additional overhead for some workflows. When the data is very small, used immediately, or saved to the DB, nothing needs to be written to a filesystem.</p>
<p>E.g. Container signatures. They are small (under 1KB) and are parsed and saved in DB.</p> RPM Support - Task #9585 (CLOSED - DUPLICATE): Sub-tree-only sync breaks our model of how syncs a...https://pulp.plan.io/issues/95852021-11-23T21:00:08Zdalleydalley@redhat.com
<p><strong>Ticket moved to GitHub</strong>: "pulp/pulp_rpm/2306":<a href="https://github.com/pulp/pulp_rpm/issues/2306" class="external">https://github.com/pulp/pulp_rpm/issues/2306</a></p>
<hr>
<p>See: <a href="https://pulp.plan.io/issues/9565" class="external">https://pulp.plan.io/issues/9565</a></p>
<p>We haven't considered the possibility of a repository having a .treeinfo but no top-level repodata. We need to investigate whether this breaks anything (for example: autopublish), and potentially add a new test fixture to exercise it.</p> Container Support - Story #9582 (CLOSED - DUPLICATE): As a user when I remove manifest from a rep...https://pulp.plan.io/issues/95822021-11-23T16:45:58Zipanova@redhat.comipanova@redhat.com
<p><strong>Ticket moved to GitHub</strong>: "pulp/pulp_container/511":<a href="https://github.com/pulp/pulp_container/issues/511" class="external">https://github.com/pulp/pulp_container/issues/511</a></p> Container Support - Story #9581 (CLOSED - DUPLICATE): As a user I can audit existing container im...https://pulp.plan.io/issues/95812021-11-23T16:19:33Zipanova@redhat.comipanova@redhat.com
<p><strong>Ticket moved to GitHub</strong>: "pulp/pulp_container/510":<a href="https://github.com/pulp/pulp_container/issues/510" class="external">https://github.com/pulp/pulp_container/issues/510</a></p>
<hr>
<p>There will be a separate API call that will be provided with (1) the signature verification policy config, (2) the public keys to verify against, and (3) the content to verify.</p>
<p>A follow-up question: what should happen if validation fails? There should be some steps that take care of removing tampered content. Should this be automatic or a separate manual call?</p> Pulp - Story #9574 (CLOSED - DUPLICATE): The validate_duplicate_content function should provide m...https://pulp.plan.io/issues/95742021-11-17T10:28:36Zquba42
<p><strong>Ticket moved to GitHub</strong>: "pulp/pulpcore/2062":<a href="https://github.com/pulp/pulpcore/issues/2062" class="external">https://github.com/pulp/pulpcore/issues/2062</a></p>
<hr>
<p>When the validate_duplicate_content function finds illegal duplicate content in a repo version being created, the output is (pulp_deb example):</p>
<p>"Cannot create repository version. More than one deb.package content with the duplicate values for package, version, architecture."</p>
<p>For users to have any chance of debugging this situation, it would be vital for the error to provide them with a list of the offending duplicate units, preferably the pulp_href, so they can go and look at them in detail.</p>
<p>Without this information I just know "I have duplicate units somewhere in the potentially tens of thousands of units in the repo version being created". (Since the repo version is then not created, I can't even go hunting for the duplicate units myself...) Right now, I can't even distinguish a situation where two packages are clashing, from one where all my packages are double (for example).</p>
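<p>A rough sketch of the kind of reporting meant here, in plain Python rather than the actual pulpcore query: units are modelled as dicts, and <code>find_duplicates</code> and <code>format_duplicate_error</code> are hypothetical helper names, not existing pulpcore functions.</p>
<pre><code>from collections import defaultdict


def find_duplicates(units, key_fields):
    """Group units by their uniqueness key; keep only keys seen more than once."""
    groups = defaultdict(list)
    for unit in units:
        key = tuple(unit[field] for field in key_fields)
        groups[key].append(unit["pulp_href"])
    return {key: hrefs for key, hrefs in groups.items() if len(hrefs) > 1}


def format_duplicate_error(content_type, key_fields, duplicates):
    """Build an error message that names every offending unit by pulp_href."""
    lines = [
        "Cannot create repository version. More than one {} content with the "
        "duplicate values for {}.".format(content_type, ", ".join(key_fields))
    ]
    for key, hrefs in sorted(duplicates.items()):
        lines.append("  {}: {}".format(key, ", ".join(hrefs)))
    return "\n".join(lines)
</code></pre>
<p>With output like this, a user could tell at a glance whether two packages clash or whether every package in the version is duplicated.</p>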
<p>User reported backtrace for the error they encountered:</p>
<pre><code>File "/usr/lib/python3.6/site-packages/pulpcore/tasking/pulpcore_worker.py", line 317, in _perform_task
  result = func(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/pulpcore/app/tasks/repository.py", line 219, in add_and_remove
  new_version.add_content(models.Content.objects.filter(pk__in=add_content_units))
File "/usr/lib/python3.6/site-packages/pulpcore/app/models/repository.py", line 963, in __exit__
  repository.finalize_new_version(self)
File "/usr/lib/python3.6/site-packages/pulp_deb/app/models/repository.py", line 57, in finalize_new_version
  validate_repo_version(new_version)
File "/usr/lib/python3.6/site-packages/pulpcore/plugin/repo_version_utils.py", line 137, in validate_repo_version
  validate_duplicate_content(version)
File "/usr/lib/python3.6/site-packages/pulpcore/plugin/repo_version_utils.py", line 108, in validate_duplicate_content
  _("Cannot create repository version. {msg}").format(msg=", ".join(error_messages))
</code></pre> Container Support - Task #9572 (CLOSED - DUPLICATE): Port the RBAC implementation to the pulpcore...https://pulp.plan.io/issues/95722021-11-16T13:48:53Zmdellweg
<p><strong>Ticket moved to GitHub</strong>: "pulp/pulp_container/508":<a href="https://github.com/pulp/pulp_container/issues/508" class="external">https://github.com/pulp/pulp_container/issues/508</a></p>
<hr>
<p>Start with a PoC PR to get the pulpcore PR merged first.</p>
<p>Once that is done, we need to write a data migration that will look for the autogenerated groups and translate them into user_object_roles.
To be discussed: check whether we can be clever with global permissions too.</p>