Issue #5058

closed

ISO publish fails with BSON document too large

Added by ipanova@redhat.com almost 5 years ago. Updated about 4 years ago.

Status: CLOSED - CURRENTRELEASE
Priority: Normal
Assignee: -
Category: -
Sprint/Milestone: -
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Version:
Platform Release: 2.21.0
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags: Pulp 2
Sprint:
Quarter:

Description

A scaling issue has been discovered when publishing ISOs via the fast-forward publish path. The publish fails with:

BSON document too large (20946918 bytes) - the connected serversupports BSON document sizes up to 16777216 bytes.

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/pulp/plugins/file/distributor.py", line 181, in publish_repo_fast_forward
    unit_absent_set = publish_conduit.get_units(criteria=criteria)
  File "/usr/lib/python2.7/site-packages/pulp/plugins/conduits/mixins.py", line 173, in get_units
    return do_get_repo_units(self.repo_id, criteria, self.exception_class, as_generator)
  File "/usr/lib/python2.7/site-packages/pulp/plugins/conduits/mixins.py", line 704, in do_get_repo_units
    return list(_transfer_object_generator())
  File "/usr/lib/python2.7/site-packages/pulp/plugins/conduits/mixins.py", line 691, in _transfer_object_generator
    for u in units:
  File "/usr/lib/python2.7/site-packages/pulp/server/managers/repo/unit_association_query.py", line 530, in _merged_units_unique_units
    for unit in associated_units:
  File "/usr/lib64/python2.7/site-packages/pymongo/cursor.py", line 1097, in next
    if len(self.__data) or self._refresh():
  File "/usr/lib64/python2.7/site-packages/pymongo/cursor.py", line 1019, in _refresh
    self.__read_concern))
  File "/usr/lib64/python2.7/site-packages/pymongo/cursor.py", line 850, in __send_message
    **kwargs)
  File "/usr/lib64/python2.7/site-packages/pymongo/mongo_client.py", line 794, in _send_message_with_response
    exhaust)
  File "/usr/lib64/python2.7/site-packages/pymongo/mongo_client.py", line 805, in _reset_on_error
    return func(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/pymongo/server.py", line 119, in send_message_with_response
    sock_info.send_message(data, max_doc_size)
  File "/usr/lib64/python2.7/site-packages/pymongo/pool.py", line 228, in send_message
    (max_doc_size, self.max_bson_size))
DistributorConduitException: BSON document too large (20946918 bytes) - the connected serversupports BSON document sizes up to 16777216 bytes.

As noticed by the Content Delivery team, the problem comes from this code:

/usr/lib/python2.7/site-packages/pulp/plugins/file/distributor.py

                # Copy incremental files into publishing directories
                checksum_absent_set = unit_checksum_set - unit_checksum_old_set
                criteria = UnitAssociationCriteria(
                    unit_filters={'checksum': {"$in": list(checksum_absent_set)}})
                unit_absent_set = publish_conduit.get_units(criteria=criteria)
                for unit in unit_absent_set:
                    links_to_create = self.get_paths_for_unit(unit)
                    self._symlink_unit(build_dir, unit, links_to_create)

There's a limit to how large a single mongo query can be (16777216 bytes, per the error above). All checksums in checksum_absent_set are sent in one $in filter, so if the set contains too many elements, the query in the above code exceeds that limit and the publish crashes. We apparently have enough items in redhat-sigstore to hit this limit.
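
One possible way to stay under the limit is to split the checksum list into batches and run one query per batch. The following is only a sketch of that idea, not the fix that was actually applied. The names `chunked`, `get_units_in_batches`, `fetch_units`, and the default `batch_size` of 10000 are hypothetical. In the distributor, `fetch_units` would presumably wrap `publish_conduit.get_units(criteria=UnitAssociationCriteria(unit_filters=...))`.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def get_units_in_batches(checksums, fetch_units, batch_size=10000):
    """Yield units for `checksums`, issuing one query per batch.

    `fetch_units` is a hypothetical callable that takes a unit_filters
    dict and returns an iterable of units. `batch_size` is an assumed
    value; it should be tuned so that each $in list stays well below
    the server's BSON document size limit.
    """
    for batch in chunked(checksums, batch_size):
        # Each query carries only `batch_size` checksums instead of all.
        unit_filters = {'checksum': {'$in': batch}}
        for unit in fetch_units(unit_filters):
            yield unit
```

With something like this, the loop over unit_absent_set would iterate over `get_units_in_batches(checksum_absent_set, fetch_units)` instead of the result of a single `get_units` call. The trade-off is more round trips to the database in exchange for bounded query size.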
