https://pulp.plan.io/https://pulp.plan.io/favicon.ico2019-07-05T13:43:29ZPulpPulp - Issue #5087: Creating artifact in pulp3 fails for big fileshttps://pulp.plan.io/issues/5087?journal_id=454242019-07-05T13:43:29Zdaviddavis
<ul><li><strong>Project</strong> changed from <i>RPM Support</i> to <i>Pulp</i></li></ul> Pulp - Issue #5087: Creating artifact in pulp3 fails for big fileshttps://pulp.plan.io/issues/5087?journal_id=454272019-07-05T14:27:19Zdaviddavis
<ul><li><strong>Subject</strong> changed from <i>Creating artifact in pulp3 fails for big uploaded files in chunks</i> to <i>Creating artifact in pulp3 fails for big files</i></li></ul><p>Thanks for the excellent bug report. It makes investigating these issues easy.</p>
<p>I looked into why artifact creation is failing for files < 2GB. The reason is that it's taking too long to calculate the checksums. There are 6 checksum types and each one takes about 4-8 seconds from the command line in my test environment. Calculating the digests in Python seems to add about 1-2 seconds. The default timeout in gunicorn is 30 seconds after which you get:</p>
<pre><code>Jul 05 14:21:56 pulp3 gunicorn[13691]: [2019-07-05 14:21:56 +0000] [13691] [CRITICAL] WORKER TIMEOUT (pid:29843)
Jul 05 14:21:57 pulp3 gunicorn[13691]: [2019-07-05 14:21:57 +0000] [30031] [INFO] Booting worker with pid: 30031
</code></pre>
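<p>To illustrate where the time goes, here is a minimal sketch of computing several digests in a single read pass over a file. The six algorithm names and the <code>compute_digests</code> helper are assumptions for illustration; this is not the pulpcore implementation.</p>

```python
import hashlib
import io

# Assumed set of six algorithms; the comment above does not name them.
ALGORITHMS = ("md5", "sha1", "sha224", "sha256", "sha384", "sha512")


def compute_digests(fileobj, algorithms=ALGORITHMS, chunk_size=1024 * 1024):
    """Read fileobj once and feed every chunk to one hasher per algorithm."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    # iter() with a sentinel stops when read() returns an empty bytes object.
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        for hasher in hashers.values():
            hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


# Usage: an in-memory stand-in for a large uploaded file.
digests = compute_digests(io.BytesIO(b"example payload"))
print(digests["sha256"])
```

<p>Reading once avoids re-reading a multi-gigabyte file for each algorithm, but the hashing work still scales with file size, so a large enough file can still exceed a request timeout.</p>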
<p>You can raise this timeout, or you can pass in the checksums when creating the artifact[0]. I think the best solution, though, might be to make artifact creation a background task.</p>
<p>[0] http POST :24817/pulp/api/v3/artifacts/ upload=$UPLOAD sha256=abc...</p> Pulp - Issue #5087: Creating artifact in pulp3 fails for big fileshttps://pulp.plan.io/issues/5087?journal_id=454292019-07-05T14:36:25Zbmbouterbmbouter@redhat.com
<ul></ul><p>+1 to moving this to a task. It's there to allow for long-running workloads like this one.</p> Pulp - Issue #5087: Creating artifact in pulp3 fails for big fileshttps://pulp.plan.io/issues/5087?journal_id=454342019-07-05T14:50:35Zdaviddavis
<ul><li><strong>Related to</strong> <i><a class="issue tracker-1 status-11 priority-6 priority-default closed" href="/issues/4998">Issue #4998</a>: Artifact size is limited to 2 GB</i> added</li></ul> Pulp - Issue #5087: Creating artifact in pulp3 fails for big fileshttps://pulp.plan.io/issues/5087?journal_id=454902019-07-09T15:43:53Zdkliban@redhat.com
<ul></ul><p>We should calculate the checksums of each chunk and then simply add them up at the end. That way the final request can be performed quickly.</p> Pulp - Issue #5087: Creating artifact in pulp3 fails for big fileshttps://pulp.plan.io/issues/5087?journal_id=454922019-07-09T15:55:21Zamacdona@redhat.comaustin@redhat.com
<ul><li><strong>Triaged</strong> changed from <i>No</i> to <i>Yes</i></li><li><strong>Sprint</strong> set to <i>Sprint 55</i></li></ul> Pulp - Issue #5087: Creating artifact in pulp3 fails for big fileshttps://pulp.plan.io/issues/5087?journal_id=455792019-07-11T11:14:47Zdaviddavis
<ul><li><strong>Status</strong> changed from <i>NEW</i> to <i>ASSIGNED</i></li><li><strong>Assignee</strong> set to <i>daviddavis</i></li></ul> Pulp - Issue #5087: Creating artifact in pulp3 fails for big fileshttps://pulp.plan.io/issues/5087?journal_id=456552019-07-12T13:27:46Zdkliban@redhat.com
<ul><li><strong>Sprint</strong> changed from <i>Sprint 55</i> to <i>Sprint 56</i></li></ul> Pulp - Issue #5087: Creating artifact in pulp3 fails for big fileshttps://pulp.plan.io/issues/5087?journal_id=456972019-07-12T17:52:42Zdkliban@redhat.com
<ul></ul><p>The artifact creation API calculates the checksums of the upload as it is being received, so this call can stay synchronous. However, we should make the 'uploads_commit' operation[0] asynchronous. The checksums calculated during that task should then be saved to the db so they can be used for creating an artifact from the upload.</p>
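<p>Calculating a checksum while data is being received relies on hash objects accepting input incrementally. A minimal sketch, assuming chunks arrive in order (the <code>digest_in_chunks</code> helper is hypothetical, not pulpcore code):</p>

```python
import hashlib


def digest_in_chunks(chunks, algorithm="sha256"):
    """Feed ordered chunks into one hasher; its state carries across chunks."""
    hasher = hashlib.new(algorithm)
    for chunk in chunks:
        hasher.update(chunk)
    return hasher.hexdigest()


# Simulate an upload split into fixed-size chunks.
data = b"x" * 1000 + b"y" * 500
chunks = [data[i:i + 256] for i in range(0, len(data), 256)]

# The incremental digest matches the digest of the whole payload.
assert digest_in_chunks(chunks) == hashlib.sha256(data).hexdigest()
```

<p>Note that this works because a single hasher sees every chunk in sequence. Independently computed per-chunk digests cannot be combined into the digest of the whole file for algorithms like SHA-256.</p>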
<p>[0] <a href="https://docs.pulpproject.org/en/3.0/nightly/restapi.html#operation/uploads_commit" class="external">https://docs.pulpproject.org/en/3.0/nightly/restapi.html#operation/uploads_commit</a></p> Pulp - Issue #5087: Creating artifact in pulp3 fails for big fileshttps://pulp.plan.io/issues/5087?journal_id=456992019-07-12T18:07:09Zdaviddavis
<ul></ul><p>The upload commit action only calculates the sha256 checksum. We'd have to duplicate the logic that calculates checksums from artifact creation to upload commit. Why avoid having a background task for artifact creation?</p> Pulp - Issue #5087: Creating artifact in pulp3 fails for big fileshttps://pulp.plan.io/issues/5087?journal_id=457012019-07-12T18:45:48Zdkliban@redhat.com
<ul></ul><p>@daviddavis and I discussed this some more on IRC and here is the plan we came up with:</p>
<p>Make the 'uploads_commit'[0] return a 202 and calculate the checksum of a file in a task. The created_resource of that task will be an Artifact.</p>
<p>Remove the ability of the user to submit an upload href when creating an Artifact with 'artifacts_create'[1].</p>
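<p>From a client's point of view, the plan above means committing an upload yields a task to poll rather than an artifact. A rough sketch of such a poller follows. The <code>state</code> and <code>created_resources</code> field names, the state values, and the injected <code>get_task</code> callable are assumptions for illustration, not a confirmed API.</p>

```python
import time


def wait_for_task(get_task, task_href, poll_interval=1.0, timeout=300.0,
                  sleep=time.sleep):
    """Poll a task until it finishes and return its created resources.

    get_task is a caller-supplied function that fetches the task at
    task_href and returns it as a dict (hypothetical shape, see below).
    """
    deadline = time.monotonic() + timeout
    while True:
        task = get_task(task_href)
        state = task["state"]  # assumed field name
        if state == "completed":  # assumed state value
            return task.get("created_resources", [])
        if state in ("failed", "canceled"):  # assumed state values
            raise RuntimeError(f"task {task_href} ended in state {state!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_href} did not finish in time")
        sleep(poll_interval)


# Usage with a fake task source standing in for HTTP GET requests.
responses = iter([
    {"state": "running"},
    {"state": "completed", "created_resources": ["/pulp/api/v3/artifacts/1/"]},
])
print(wait_for_task(lambda href: next(responses), "/pulp/api/v3/tasks/1/",
                    sleep=lambda seconds: None))
```

<p>Injecting <code>get_task</code> keeps the sketch independent of any particular HTTP client.</p>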
<p>[0] <a href="https://docs.pulpproject.org/en/3.0/nightly/restapi.html#operation/uploads_commit" class="external">https://docs.pulpproject.org/en/3.0/nightly/restapi.html#operation/uploads_commit</a><br>
[1] <a href="https://docs.pulpproject.org/en/3.0/nightly/restapi.html#operation/artifacts_create" class="external">https://docs.pulpproject.org/en/3.0/nightly/restapi.html#operation/artifacts_create</a></p> Pulp - Issue #5087: Creating artifact in pulp3 fails for big fileshttps://pulp.plan.io/issues/5087?journal_id=457212019-07-15T14:37:02Zdaviddavis
<ul><li><strong>Assignee</strong> changed from <i>daviddavis</i> to <i>fao89</i></li></ul> Pulp - Issue #5087: Creating artifact in pulp3 fails for big fileshttps://pulp.plan.io/issues/5087?journal_id=459112019-07-22T21:27:26Zdaviddavis
<ul></ul><p>Regarding the design in <a href="https://pulp.plan.io/issues/5087#note-11" class="external">https://pulp.plan.io/issues/5087#note-11</a>, we have a <code>PUT /uploads/<uuid>/commit/</code> endpoint that dispatches a task that (among other things) creates an artifact. This artifact is set as a created_resource in the task.</p>
<p>The problem is that pulp-smash is not set up to handle such a case currently as it expects an endpoint that creates a resource to use POST[0]. I lean towards keeping it PUT since the main action is to commit the upload and the artifact creation is a side effect.</p>
<p>Looking for feedback.</p>
<p>[0] <a href="https://git.io/fjMjP" class="external">https://git.io/fjMjP</a></p> Pulp - Issue #5087: Creating artifact in pulp3 fails for big fileshttps://pulp.plan.io/issues/5087?journal_id=459122019-07-22T21:39:57Zdkliban@redhat.com
<ul></ul><p>pulp-smash should not drive our design. However, I always associate PUT requests with specific resources. In this case the user is making a request on an action URL for the resource, so doing a POST to <code>/pulp/api/v3/uploads/<id>/commit/</code> seems most appropriate.</p> Pulp - Issue #5087: Creating artifact in pulp3 fails for big fileshttps://pulp.plan.io/issues/5087?journal_id=459732019-07-23T22:37:37Zdaviddavis
<ul><li><strong>Status</strong> changed from <i>ASSIGNED</i> to <i>POST</i></li></ul><p><a href="https://github.com/pulp/pulpcore/pull/227" class="external">https://github.com/pulp/pulpcore/pull/227</a></p> Pulp - Issue #5087: Creating artifact in pulp3 fails for big fileshttps://pulp.plan.io/issues/5087?journal_id=459942019-07-24T16:03:31ZAnonymous
<ul><li><strong>Status</strong> changed from <i>POST</i> to <i>MODIFIED</i></li></ul><p>Applied in changeset <a class="changeset" title="async artifact creation closes #5087" href="https://pulp.plan.io/projects/pulp/repository/pulpcore/revisions/95e513047cfc8a432a6faf9e1ebe868ff5a46091">pulpcore|95e513047cfc8a432a6faf9e1ebe868ff5a46091</a>.</p> Pulp - Issue #5087: Creating artifact in pulp3 fails for big fileshttps://pulp.plan.io/issues/5087?journal_id=505052019-12-13T16:03:27Zbmbouterbmbouter@redhat.com
<ul><li><strong>Sprint/Milestone</strong> set to <i>3.0.0</i></li></ul> Pulp - Issue #5087: Creating artifact in pulp3 fails for big fileshttps://pulp.plan.io/issues/5087?journal_id=509882019-12-13T17:28:52Zbmbouterbmbouter@redhat.com
<ul><li><strong>Status</strong> changed from <i>MODIFIED</i> to <i>CLOSED - CURRENTRELEASE</i></li></ul>