https://pulp.plan.io/https://pulp.plan.io/favicon.ico2016-08-30T11:12:21ZPulpPulp - Issue #2220: Copying units between repositories hits DocumentTooLarge: BSON document too large, if source repo contains > 345,000 units of same typehttps://pulp.plan.io/issues/2220?journal_id=141012016-08-30T11:12:21Zjluzajluza@redhat.com
<ul></ul><p>All the evil happens here: <a href="https://github.com/pulp/pulp/blob/master/server/pulp/server/managers/repo/unit_association_query.py#L47" class="external">https://github.com/pulp/pulp/blob/master/server/pulp/server/managers/repo/unit_association_query.py#L47</a><br>
and on this specific part <a href="https://github.com/pulp/pulp/blob/master/server/pulp/server/managers/repo/unit_association_query.py#L97" class="external">https://github.com/pulp/pulp/blob/master/server/pulp/server/managers/repo/unit_association_query.py#L97</a><br>
Due to the lack of join functionality in MongoDB, I suppose this won't be easy to solve.<br>
I found only this <a href="http://stackoverflow.com/questions/5681851/mongodb-combine-data-from-multiple-collections-into-one-how" class="external">http://stackoverflow.com/questions/5681851/mongodb-combine-data-from-multiple-collections-into-one-how</a><br>
However I'm not sure if it will be helpful.</p> Pulp - Issue #2220: Copying units between repositories hits DocumentTooLarge: BSON document too large, if source repo contains > 345,000 units of same typehttps://pulp.plan.io/issues/2220?journal_id=141022016-08-30T11:21:57Zrmcgoverrmcgover@redhat.com
<ul></ul><p>I was looking at this spot:</p>
<pre><code>units_cursors = (self._associated_units_by_type_cursor(t, criteria,
                                                       associations_lookup[t].keys())
                 for t in association_unit_types if t in associations_lookup)
</code></pre>
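<p>The key list passed as the last argument in the snippet above grows with the number of units in the repository. As a minimal sketch only (this is not Pulp code), the helpers below show one way to split such a list into fixed-size chunks and run one query per chunk. The names <code>batched</code>, <code>query_in_batches</code> and <code>run_query</code>, and the default batch size, are hypothetical; sorting, skip and limit across batches are not handled here.</p>

```python
from itertools import islice


def batched(keys, batch_size):
    """Yield successive lists of at most batch_size items from keys."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(keys)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def query_in_batches(run_query, keys, batch_size=100000):
    """Call run_query once per batch of keys and chain the results.

    run_query is a hypothetical stand-in for whatever issues the
    per-type unit query: it takes a list of keys and returns an
    iterable of matching units.
    """
    for batch in batched(keys, batch_size):
        for unit in run_query(batch):
            yield unit
```

<p>With 400,000 keys and a batch size of 100,000, <code>batched</code> yields four lists, so each call to <code>run_query</code> receives a bounded key list instead of the full set.</p>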
<p>That associations_lookup[t].keys() argument is the part that is too big; splitting it into batches could work. For example, a repo with 400,000 RPMs could produce 4 unit cursors querying 100,000 unit keys each, instead of the current single unit cursor querying 400,000 keys. It looks to me like the rest of the code would still work, though I'm not certain about skip/limit.</p> Pulp - Issue #2220: Copying units between repositories hits DocumentTooLarge: BSON document too large, if source repo contains > 345,000 units of same typehttps://pulp.plan.io/issues/2220?journal_id=141032016-08-30T14:29:30Zttereshcttereshc@redhat.com
<ul></ul><p>I would test it on a newer version of Pulp because the search for units during copy <a href="https://github.com/pulp/pulp/blob/2.8-release/server/pulp/server/managers/repo/unit_association.py#L240-L242" class="external">changed and it seems like the path you discuss is not used</a> (at least for your example, where you have only RPMs).</p> Pulp - Issue #2220: Copying units between repositories hits DocumentTooLarge: BSON document too large, if source repo contains > 345,000 units of same typehttps://pulp.plan.io/issues/2220?journal_id=141162016-08-30T14:59:25Zamacdona@redhat.comaustin@redhat.com
<ul><li><strong>Priority</strong> changed from <i>Normal</i> to <i>High</i></li><li><strong>Triaged</strong> changed from <i>No</i> to <i>Yes</i></li></ul> Pulp - Issue #2220: Copying units between repositories hits DocumentTooLarge: BSON document too large, if source repo contains > 345,000 units of same typehttps://pulp.plan.io/issues/2220?journal_id=141172016-08-30T15:00:03Zamacdona@redhat.comaustin@redhat.com
<ul></ul><p>It should be verified that this bug still affects more recent Pulps.</p> Pulp - Issue #2220: Copying units between repositories hits DocumentTooLarge: BSON document too large, if source repo contains > 345,000 units of same typehttps://pulp.plan.io/issues/2220?journal_id=141202016-08-30T15:19:04Zjluzajluza@redhat.com
<ul></ul><p><a href="mailto:amacdona@redhat.com" class="email">amacdona@redhat.com</a> wrote:</p>
<blockquote>
<p>It should be verified that this bug still affects more recent Pulps.</p>
</blockquote>
<p>I attached links pointing to the master branch, which contains the related code, but if that's not enough we can write a simple reproducer or try to reproduce it on the latest Pulp.</p> Pulp - Issue #2220: Copying units between repositories hits DocumentTooLarge: BSON document too large, if source repo contains > 345,000 units of same typehttps://pulp.plan.io/issues/2220?journal_id=141912016-09-01T13:42:54Zjcline@redhat.comjcline@redhat.com
<ul><li><strong>Groomed</strong> changed from <i>No</i> to <i>Yes</i></li><li><strong>Sprint Candidate</strong> changed from <i>No</i> to <i>Yes</i></li></ul> Pulp - Issue #2220: Copying units between repositories hits DocumentTooLarge: BSON document too large, if source repo contains > 345,000 units of same typehttps://pulp.plan.io/issues/2220?journal_id=142012016-09-01T15:27:47Zbmbouterbmbouter@redhat.com
<ul><li><strong>Status</strong> changed from <i>NEW</i> to <i>CLOSED - WONTFIX</i></li></ul><p>Pulp 3.0 is switching from MongoDB to PostgreSQL, so this issue is unique to the 2.y line. The original reporter ran into this issue by aggregating a large number of RPMs from many repos into 1 repo and never removing them. The original reporter worked around this issue by removing old RPMs that they no longer needed and moving some of them to other repos. Since their issue is resolved and it won't be an issue with Pulp 3.0, I'm going to close this as wontfix.</p>
<p>If other users experience this issue or want to fix this issue please reopen or comment.</p> Pulp - Issue #2220: Copying units between repositories hits DocumentTooLarge: BSON document too large, if source repo contains > 345,000 units of same typehttps://pulp.plan.io/issues/2220?journal_id=386632019-04-15T20:25:26Zbmbouterbmbouter@redhat.com
<ul><li><strong>Tags</strong> <i>Pulp 2</i> added</li></ul> Pulp - Issue #2220: Copying units between repositories hits DocumentTooLarge: BSON document too large, if source repo contains > 345,000 units of same typehttps://pulp.plan.io/issues/2220?journal_id=517032020-01-07T14:53:05Zipanova@redhat.comipanova@redhat.com
<ul><li><strong>Status</strong> changed from <i>CLOSED - WONTFIX</i> to <i>POST</i></li></ul><p><a href="https://github.com/pulp/pulp/pull/3971" class="external">https://github.com/pulp/pulp/pull/3971</a></p> Pulp - Issue #2220: Copying units between repositories hits DocumentTooLarge: BSON document too large, if source repo contains > 345,000 units of same typehttps://pulp.plan.io/issues/2220?journal_id=517042020-01-07T14:55:56Zipanova@redhat.comipanova@redhat.com
<ul><li><strong>Sprint</strong> set to <i>Sprint 63</i></li></ul> Pulp - Issue #2220: Copying units between repositories hits DocumentTooLarge: BSON document too large, if source repo contains > 345,000 units of same typehttps://pulp.plan.io/issues/2220?journal_id=517992020-01-10T15:23:38Zipanova@redhat.comipanova@redhat.com
<ul><li><strong>Status</strong> changed from <i>POST</i> to <i>MODIFIED</i></li></ul><p>The fix was merged.</p> Pulp - Issue #2220: Copying units between repositories hits DocumentTooLarge: BSON document too large, if source repo contains > 345,000 units of same typehttps://pulp.plan.io/issues/2220?journal_id=522692020-01-22T15:40:47Zipanova@redhat.comipanova@redhat.com
<ul><li><strong>Platform Release</strong> set to <i>2.21.1</i></li></ul> Pulp - Issue #2220: Copying units between repositories hits DocumentTooLarge: BSON document too large, if source repo contains > 345,000 units of same typehttps://pulp.plan.io/issues/2220?journal_id=535182020-02-27T16:38:30Zipanova@redhat.comipanova@redhat.com
<ul><li><strong>Status</strong> changed from <i>MODIFIED</i> to <i>5</i></li></ul> Pulp - Issue #2220: Copying units between repositories hits DocumentTooLarge: BSON document too large, if source repo contains > 345,000 units of same typehttps://pulp.plan.io/issues/2220?journal_id=537352020-03-04T16:46:35Zipanova@redhat.comipanova@redhat.com
<ul><li><strong>Status</strong> changed from <i>5</i> to <i>CLOSED - CURRENTRELEASE</i></li></ul>