Pulp https://pulp.plan.io/
File Support - Issue #5813: syncing a very large file repository takes a VERY long time https://pulp.plan.io/issues/5813?journal_id=49616 2019-11-26T21:08:38Z jsherril@redhat.com
<ul><li><strong>Tags</strong> <i>Katello-P2</i> added</li></ul>
https://pulp.plan.io/issues/5813?journal_id=49623 2019-11-27T13:31:46Z daviddavis
<ul><li><strong>Sprint</strong> set to <i>Sprint 62</i></li></ul><p>Adding to the sprint to hopefully resolve before 3.0 GA.</p>
https://pulp.plan.io/issues/5813?journal_id=49629 2019-11-27T14:41:55Z daviddavis
<ul></ul><p>I talked to @partha since @jsherrill is out. Sounds like the file repo had 150K small files.</p>
<p>We can probably use the pulp-fixtures script to generate such a repo:</p>
<p><a href="https://github.com/PulpQE/pulp-fixtures/blob/master/file/gen-fixtures.sh" class="external">https://github.com/PulpQE/pulp-fixtures/blob/master/file/gen-fixtures.sh</a></p>
https://pulp.plan.io/issues/5813?journal_id=49711 2019-12-02T19:38:07Z bmbouter (bmbouter@redhat.com)
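<p>For anyone who wants to reproduce the 150K-small-files case mentioned above without the fixtures script, here is a minimal Python sketch. It is not the gen-fixtures.sh script; the function name <code>generate_file_repo</code>, the file naming, and the file size are made up for illustration, and it assumes the manifest format is one <code>relative_path,sha256,size</code> line per file in a file named PULP_MANIFEST.</p>

```python
import hashlib
import os
import tempfile
from pathlib import Path


def generate_file_repo(root: Path, count: int, size: int = 64) -> Path:
    """Write `count` small random files plus a PULP_MANIFEST into `root`.

    Hypothetical helper: assumes each manifest line is
    `relative_path,sha256_hexdigest,size_in_bytes`.
    """
    root.mkdir(parents=True, exist_ok=True)
    manifest_lines = []
    for i in range(count):
        name = f"{i:06d}.iso"
        data = os.urandom(size)  # random payload, so every file has a distinct digest
        (root / name).write_bytes(data)
        digest = hashlib.sha256(data).hexdigest()
        manifest_lines.append(f"{name},{digest},{len(data)}")
    manifest = root / "PULP_MANIFEST"
    manifest.write_text("\n".join(manifest_lines) + "\n")
    return manifest


if __name__ == "__main__":
    # 1,000 files keeps the demo quick; raise count to 150_000 to match the report.
    with tempfile.TemporaryDirectory() as tmp:
        manifest = generate_file_repo(Path(tmp) / "large_file_repo", count=1000)
        print(manifest.read_text().splitlines()[0])
```

<p>Serving the resulting directory over HTTP and pointing a file remote at its PULP_MANIFEST should give a comparable test repo.</p>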
<ul><li><strong>Status</strong> changed from <i>NEW</i> to <i>ASSIGNED</i></li><li><strong>Assignee</strong> set to <i>bmbouter</i></li></ul>
https://pulp.plan.io/issues/5813?journal_id=49732 2019-12-03T15:37:36Z fao89
<ul><li><strong>Triaged</strong> changed from <i>No</i> to <i>Yes</i></li></ul>
https://pulp.plan.io/issues/5813?journal_id=49777 2019-12-05T16:50:43Z bmbouter (bmbouter@redhat.com)
<ul></ul><p>I applied the WIP PR from @fabricio, which adjusts the batch sizes and provides a great speedup. This sync used policy='immediate' against <a href="http://quartet.usersys.redhat.com/pub/fake-repos/very_large_file_150k/" class="external">http://quartet.usersys.redhat.com/pub/fake-repos/very_large_file_150k/</a></p>
<p>It shows a runtime of ~66 minutes even with cProfile recording the run, which is great. We're going to merge that PR, and I think it will resolve this issue.</p>
<pre><code>{
    "pulp_href": "/pulp/api/v3/tasks/a60bd695-8129-406f-8ee5-c13fb3a6b680/",
    "pulp_created": "2019-12-05T15:36:32.725152Z",
    "state": "completed",
    "name": "pulp_file.app.tasks.synchronizing.synchronize",
    "started_at": "2019-12-05T15:36:33.024143Z",
    "finished_at": "2019-12-05T16:42:41.263819Z",
    "error": null,
    "worker": "/pulp/api/v3/workers/5eb9dcb6-7561-4ef2-aa1e-6b9f9468f7d8/",
    "progress_reports": [
        {
            "message": "Downloading Metadata",
            "code": "downloading.metadata",
            "state": "completed",
            "total": null,
            "done": 1,
            "suffix": null
        },
        {
            "message": "Parsing Metadata Lines",
            "code": "parsing.metadata",
            "state": "completed",
            "total": 150001,
            "done": 150001,
            "suffix": null
        },
        {
            "message": "Downloading Artifacts",
            "code": "downloading.artifacts",
            "state": "completed",
            "total": null,
            "done": 150001,
            "suffix": null
        },
        {
            "message": "Associating Content",
            "code": "associating.content",
            "state": "completed",
            "total": null,
            "done": 150001,
            "suffix": null
        }
    ],
    "created_resources": [
        "/pulp/api/v3/repositories/file/file/cbfdbf42-a3cd-4e62-8a28-e8f55957c469/versions/1/"
    ],
    "reserved_resources_record": [
        "/pulp/api/v3/repositories/file/file/cbfdbf42-a3cd-4e62-8a28-e8f55957c469/",
        "/pulp/api/v3/remotes/file/file/03c91c3f-c750-407a-8cd5-7b182e9680c6/"
    ]
}
</code></pre>
https://pulp.plan.io/issues/5813?journal_id=49782 2019-12-05T18:00:12Z bmbouter (bmbouter@redhat.com)
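<p>As a sanity check on the ~66 minute figure, the runtime can be derived from the <code>started_at</code> and <code>finished_at</code> fields of the immediate-policy task output above. This is only a small sketch; <code>task_runtime</code> is a hypothetical helper, and it assumes the timestamps always use the microsecond UTC format shown in that output.</p>

```python
from datetime import datetime, timedelta

# Assumed timestamp layout, taken from the task JSON in this thread.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def task_runtime(task: dict) -> timedelta:
    """Return finished_at - started_at for a task dict (hypothetical helper)."""
    started = datetime.strptime(task["started_at"], TIMESTAMP_FORMAT)
    finished = datetime.strptime(task["finished_at"], TIMESTAMP_FORMAT)
    return finished - started


# Values copied from the policy='immediate' task output.
immediate_task = {
    "started_at": "2019-12-05T15:36:33.024143Z",
    "finished_at": "2019-12-05T16:42:41.263819Z",
}

runtime = task_runtime(immediate_task)
print(f"{runtime.total_seconds() / 60:.1f} minutes")  # prints "66.1 minutes"
```

<p>Dividing the 150001 associated content units by that runtime gives a rough files-per-second number that is easier to compare across runs than wall-clock time.</p>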
<ul></ul><p>Here's a sync with policy=on_demand and cProfile enabled; it took ~26 minutes.</p>
<pre><code>{
    "pulp_href": "/pulp/api/v3/tasks/d51a63a3-f737-4898-a294-44198f378823/",
    "pulp_created": "2019-12-05T17:33:46.837539Z",
    "state": "completed",
    "name": "pulp_file.app.tasks.synchronizing.synchronize",
    "started_at": "2019-12-05T17:33:46.950234Z",
    "finished_at": "2019-12-05T17:59:30.521306Z",
    "error": null,
    "worker": "/pulp/api/v3/workers/cf4b2e6e-3e81-4968-ba14-8dab7edbb6b3/",
    "progress_reports": [
        {
            "message": "Downloading Metadata",
            "code": "downloading.metadata",
            "state": "completed",
            "total": null,
            "done": 1,
            "suffix": null
        },
        {
            "message": "Parsing Metadata Lines",
            "code": "parsing.metadata",
            "state": "completed",
            "total": 150001,
            "done": 150001,
            "suffix": null
        },
        {
            "message": "Downloading Artifacts",
            "code": "downloading.artifacts",
            "state": "completed",
            "total": null,
            "done": 0,
            "suffix": null
        },
        {
            "message": "Associating Content",
            "code": "associating.content",
            "state": "completed",
            "total": null,
            "done": 150001,
            "suffix": null
        }
    ],
    "created_resources": [
        "/pulp/api/v3/repositories/file/file/090ed56e-4f30-450c-829b-9f861397bc21/versions/1/"
    ],
    "reserved_resources_record": [
        "/pulp/api/v3/remotes/file/file/780e65aa-d66b-47ed-98cd-d0b1bf32f18c/",
        "/pulp/api/v3/repositories/file/file/090ed56e-4f30-450c-829b-9f861397bc21/"
    ]
}
</code></pre>
https://pulp.plan.io/issues/5813?journal_id=49784 2019-12-05T18:35:45Z bmbouter (bmbouter@redhat.com)
<ul><li><strong>File</strong> <a href="/attachments/526513">cprofile_150k_repo</a> <a class="icon-only icon-download" title="Download" href="/attachments/download/526513/cprofile_150k_repo">cprofile_150k_repo</a> added</li><li><strong>File</strong> <a href="/attachments/526514">cprofile_150k_repo_on_demand</a> <a class="icon-only icon-download" title="Download" href="/attachments/download/526514/cprofile_150k_repo_on_demand">cprofile_150k_repo_on_demand</a> added</li></ul><p>Adding the cProfile outputs so anyone can analyze them.</p>
https://pulp.plan.io/issues/5813?journal_id=49838 2019-12-06T14:43:55Z rchan
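<p>One way to dig into the attached profiles is Python's standard <code>pstats</code> module. The sketch below assumes the attachments are binary stats dumps as written by <code>cProfile</code> (the thread does not say which format they are in); <code>top_functions</code> is a hypothetical helper, and the demo profiles a throwaway workload so it runs without the attachments.</p>

```python
import cProfile
import os
import pstats
import tempfile


def top_functions(profile_path: str, limit: int = 20) -> pstats.Stats:
    """Print the `limit` most expensive functions by cumulative time.

    Hypothetical helper; expects a binary dump written by cProfile.
    """
    stats = pstats.Stats(profile_path)
    stats.strip_dirs().sort_stats("cumulative").print_stats(limit)
    return stats


if __name__ == "__main__":
    # Stand-in workload so the sketch is runnable on its own.
    profiler = cProfile.Profile()
    profiler.enable()
    sorted(range(100_000), key=lambda n: -n)
    profiler.disable()

    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.prof")
        profiler.dump_stats(path)
        top_functions(path, limit=5)
```

<p>If the assumption about the format holds, passing the path of a downloaded attachment such as cprofile_150k_repo to <code>top_functions</code> should list where the sync spent most of its time.</p>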
<ul><li><strong>Sprint</strong> changed from <i>Sprint 62</i> to <i>Sprint 63</i></li></ul>
https://pulp.plan.io/issues/5813?journal_id=49875 2019-12-06T20:07:07Z bmbouter (bmbouter@redhat.com)
<ul><li><strong>Status</strong> changed from <i>ASSIGNED</i> to <i>MODIFIED</i></li></ul><p>This was fixed by: <a href="https://github.com/pulp/pulpcore/pull/440" class="external">https://github.com/pulp/pulpcore/pull/440</a></p>
https://pulp.plan.io/issues/5813?journal_id=51349 2019-12-13T17:37:45Z bmbouter (bmbouter@redhat.com)
<ul><li><strong>Sprint/Milestone</strong> set to <i>0.1.0</i></li></ul>
https://pulp.plan.io/issues/5813?journal_id=51402 2019-12-13T17:38:15Z bmbouter (bmbouter@redhat.com)
<ul><li><strong>Status</strong> changed from <i>MODIFIED</i> to <i>CLOSED - CURRENTRELEASE</i></li></ul>
https://pulp.plan.io/issues/5813?journal_id=56052 2020-05-08T17:44:28Z ggainey
<ul><li><strong>Tags</strong> <i>Katello</i> added</li><li><strong>Tags</strong> deleted (<del><i>Katello-P2</i></del>)</li></ul>