Issue #6806

closed

[pulp2] "BSON too large" error when unassociating from large repo

Added by jluza almost 4 years ago. Updated over 3 years ago.

Status: CLOSED - CURRENTRELEASE
Priority: Normal
Assignee: -
Category: -
Sprint/Milestone: -
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Version:
Platform Release: 2.21.3
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags: Pulp 2
Sprint:
Quarter:

Description

When a repository contains too many units (hundreds of thousands) and a user tries to remove content from it, MongoDB fails with a "BSON document too large" error.

Here's the reported traceback:

Apr 28 04:20:11 pulp-03 pulp: pulp.server.async.tasks:INFO: [e8e1b784] Task failed : [e8e1b784-c07e-4df4-a687-df9f858dea77]
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032) Task pulp.server.managers.repo.unit_association.unassociate_by_criteria[e8e1b784-c07e-4df4-a687-df9f858dea77] raised unexpected: DocumentTooLarge('BSON document too large (17039341 bytes) - the connected serversupports BSON document sizes up to 16777216 bytes.',)
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032) Traceback (most recent call last):
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)   File "/usr/lib/python2.7/site-packages/celery/app/trace.py", line 367, in trace_task
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)     R = retval = fun(*args, **kwargs)
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)   File "/usr/lib/python2.7/site-packages/pulp/server/async/tasks.py", line 529, in __call__
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)     return super(Task, self).__call__(*args, **kwargs)
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)   File "/usr/lib/python2.7/site-packages/pulp/server/async/tasks.py", line 107, in __call__
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)     return super(PulpTask, self).__call__(*args, **kwargs)
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)   File "/usr/lib/python2.7/site-packages/celery/app/trace.py", line 622, in __protected_call__
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)     return self.run(*args, **kwargs)
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)   File "/usr/lib/python2.7/site-packages/pulp/server/managers/repo/unit_association.py", line 359, in unassociate_by_criteria
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)     unassociate_units = load_associated_units(repo_id, criteria)
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)   File "/usr/lib/python2.7/site-packages/pulp/server/managers/repo/unit_association.py", line 443, in load_associated_units
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)     associate_us = association_query_manager.get_units(source_repo_id, criteria=criteria)
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)   File "/usr/lib/python2.7/site-packages/pulp/server/managers/repo/unit_association_query.py", line 160, in get_units
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)     return list(units_generator)
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)   File "/usr/lib/python2.7/site-packages/pulp/server/managers/repo/unit_association_query.py", line 530, in _merged_units_unique_units
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)     for unit in associated_units:
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)   File "/usr/lib64/python2.7/site-packages/pymongo/cursor.py", line 1097, in next
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)     if len(self.__data) or self._refresh():
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)   File "/usr/lib64/python2.7/site-packages/pymongo/cursor.py", line 1019, in _refresh
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)     self.__read_concern))
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)   File "/usr/lib64/python2.7/site-packages/pymongo/cursor.py", line 850, in __send_message
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)     **kwargs)
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)   File "/usr/lib64/python2.7/site-packages/pymongo/mongo_client.py", line 794, in _send_message_with_response
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)     exhaust)
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)   File "/usr/lib64/python2.7/site-packages/pymongo/mongo_client.py", line 805, in _reset_on_error
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)     return func(*args, **kwargs)
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)   File "/usr/lib64/python2.7/site-packages/pymongo/server.py", line 119, in send_message_with_response
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)     sock_info.send_message(data, max_doc_size)
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)   File "/usr/lib64/python2.7/site-packages/pymongo/pool.py", line 234, in send_message
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032)     (max_doc_size, self.max_bson_size))
Apr 28 04:20:11 pulp-03 pulp: celery.app.trace:ERROR: [e8e1b784] (43211-12032) DocumentTooLarge: BSON document too large (17039341 bytes) - the connected serversupports BSON document sizes up to 16777216 bytes.
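
For a sense of scale, the sketch below estimates the encoded size of a query shaped like {"_id": {"$in": [...]}} from the BSON layout rules. It assumes unit IDs are 36-character UUID strings and ignores any additional unit filters, so the result is only indicative of where the 16777216-byte limit is crossed. Both helper names are hypothetical and are not part of Pulp or pymongo.

```python
MAX_BSON_SIZE = 16 * 1024 * 1024  # 16777216 bytes, the limit quoted in the error


def estimate_in_query_size(id_count, id_length=36):
    """Estimate the BSON size of {"_id": {"$in": [<id_count string ids>]}}."""
    # Each array element: type byte + decimal index key + NUL
    # + int32 string length + string bytes + NUL.
    elements = sum(
        1 + len(str(index)) + 1 + 4 + id_length + 1
        for index in range(id_count)
    )
    # A BSON document is: int32 total size + elements + NUL terminator.
    array_doc = 4 + elements + 1
    in_doc = 4 + (1 + len("$in") + 1 + array_doc) + 1
    return 4 + (1 + len("_id") + 1 + in_doc) + 1


def max_ids_within_limit(limit=MAX_BSON_SIZE, id_length=36):
    """Largest id_count whose estimated query still fits in `limit` bytes."""
    # Every id costs at least id_length bytes, so this is a safe upper bound.
    low, high = 0, limit // id_length
    while low < high:
        mid = (low + high + 1) // 2
        if estimate_in_query_size(mid, id_length) <= limit:
            low = mid
        else:
            high = mid - 1
    return low
```

Under these assumptions the limit is reached at a few hundred thousand IDs, which matches the repository sizes described in this report.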


Files

bson-too-large-fix.txt.gz (24 KB): Output from 'time python3 -m unittest discover pulp_2_tests.tests.rpm.api_v2 |& tee bson-too-large-fix.txt' (ggainey, 06/01/2020 10:32 PM)
midnightercz-bson-too-large-fix-changed.txt.gz (7.66 KB): Post-change output from 'time python3 -m unittest discover pulp_2_tests.tests.rpm.api_v2 |& tee midnightercz-bson-too-large-fix-changed.txt' (ggainey, 06/02/2020 07:53 PM)

Actions #1

Updated by ipanova@redhat.com almost 4 years ago

  • Project changed from Docker Support to Pulp
  • Status changed from NEW to POST
  • Triaged changed from No to Yes
  • Tags Pulp 2 added

Added by jluza almost 4 years ago

Revision edb85453 | View on GitHub

When unassociating content from a large repository, Pulp connects repo units with content units in two steps: it first queries all units in the repository, then queries content units using the unit filters plus an _id filter containing the IDs returned by the first query. If that ID list is too large, MongoDB fails with a "BSON document too large" error. This commit changes how the database is queried: unit IDs are sent in batches to avoid sending an oversized query. Because the resulting units are yielded from the method, there is no noticeable difference outside of it.

closes #6806
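
The batching idea described in the commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: `iter_batches` and `find_units_in_batches` are hypothetical names, the default batch size of 50000 is an arbitrary choice, and `find` stands in for any callable that takes a MongoDB-style query dict and returns an iterable of documents (for example, a pymongo collection's `find` method).

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of `items` holding at most `batch_size` entries."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def find_units_in_batches(find, unit_ids, unit_filters=None, batch_size=50000):
    """Yield units matching `unit_filters` whose _id is in `unit_ids`.

    One query is issued per batch of IDs, so no single query document
    carries the full ID list.
    """
    for batch in iter_batches(list(unit_ids), batch_size):
        # Copy the caller's filters so they are not mutated, then add
        # the $in clause for this batch only.
        query = dict(unit_filters or {})
        query["_id"] = {"$in": batch}
        for unit in find(query):
            yield unit
```

Because the function is a generator, callers iterate over it the same way they would over a single cursor; only the number of round trips to the database changes.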

Actions #4

Updated by jluza almost 4 years ago

  • Status changed from POST to MODIFIED
Actions #5

Updated by ttereshc almost 4 years ago

  • Platform Release set to 2.21.3

Added by jluza over 3 years ago

Revision d2aebd2d | View on GitHub

When unassociating content from a large repository, Pulp connects repo units with content units in two steps: it first queries all units in the repository, then queries content units using the unit filters plus an _id filter containing the IDs returned by the first query. If that ID list is too large, MongoDB fails with a "BSON document too large" error. This commit changes how the database is queried: unit IDs are sent in batches to avoid sending an oversized query. Because the resulting units are yielded from the method, there is no noticeable difference outside of it.

closes #6806

(cherry picked from commit edb854538e99fd1b2998d95453213ec8def13f71)

Actions #6

Updated by jluza over 3 years ago

Actions #7

Updated by ttereshc over 3 years ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE
