Issue #2220

closed

Copying units between repositories hits DocumentTooLarge: BSON document too large, if source repo contains > 345,000 units of same type

Added by rmcgover over 5 years ago. Updated about 2 years ago.

Status: CLOSED - CURRENTRELEASE
Priority: High
Assignee: -
Category: -
Sprint/Milestone: -
Start date:
Due date:
Estimated time:
Severity: 3. High
Version: 2.7.1
Platform Release: 2.21.1
OS: RHEL 6
Triaged: Yes
Groomed: Yes
Sprint Candidate: Yes
Tags: Pulp 2
Sprint: Sprint 63
Quarter:

Description

If a repo contains more than approximately 345,000 units of the same type, attempting to associate any units of that type through the API is likely to fail with an error such as:

Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/celery/app/trace.py", line 240, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/lib/python2.6/site-packages/pulp/server/async/tasks.py", line 393, in __call__
    return super(Task, self).__call__(*args, **kwargs)
  File "/usr/lib/python2.6/site-packages/celery/app/trace.py", line 437, in __protected_call__
    return self.run(*args, **kwargs)
  File "/usr/lib/python2.6/site-packages/pulp/server/managers/repo/unit_association.py", line 204, in associate_from_repo
    associate_us = load_associated_units(source_repo_id, criteria)
  File "/usr/lib/python2.6/site-packages/pulp/server/managers/repo/unit_association.py", line 408, in load_associated_units
    associate_us = association_query_manager.get_units(source_repo_id, criteria=criteria)
  File "/usr/lib/python2.6/site-packages/pulp/server/managers/repo/unit_association_query.py", line 205, in get_units
    return list(units_generator)
  File "/usr/lib/python2.6/site-packages/pulp/server/managers/repo/unit_association_query.py", line 572, in _merged_units_unique_units
    for unit in associated_units:
  File "/usr/lib/python2.6/site-packages/pulp/server/managers/repo/unit_association_query.py", line 498, in _units_from_chained_cursors
    for element in cursor:
  File "/usr/lib64/python2.6/site-packages/pymongo/cursor.py", line 1058, in next
    if len(self.__data) or self._refresh():
  File "/usr/lib64/python2.6/site-packages/pymongo/cursor.py", line 1002, in _refresh
    self.__uuid_subtype))
  File "/usr/lib64/python2.6/site-packages/pymongo/cursor.py", line 915, in __send_message
    res = client._send_message_with_response(message, **kwargs)
  File "/usr/lib64/python2.6/site-packages/pymongo/mongo_replica_set_client.py", line 1676, in _send_message_with_response
    response = self.__try_read(member, msg, **kwargs)
  File "/usr/lib64/python2.6/site-packages/pymongo/mongo_replica_set_client.py", line 1561, in __try_read
    return self.__send_and_receive(member, msg, **kwargs)
  File "/usr/lib64/python2.6/site-packages/pymongo/mongo_replica_set_client.py", line 1534, in __send_and_receive
    rqst_id, data = self.__check_bson_size(msg, member.max_bson_size)
  File "/usr/lib64/python2.6/site-packages/pymongo/mongo_replica_set_client.py", line 1469, in __check_bson_size
    (max_doc_size, max_size))
DocumentTooLarge: BSON document too large (16820289 bytes) - the connected server supports BSON document sizes up to 16777216 bytes.

This occurs because the code in the backtrace above generates a single query containing the ID of every unit of the requested type in the source repo. BSON-encoded, such a query exceeds the 16 MB (16777216-byte) document limit once the source repo has more than approximately 345,000 units of that type, e.g.

$ python
Python 2.7.5 (default, Aug  9 2016, 05:27:46) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import bson
>>> len(bson.BSON.encode({"_id":{"$in":["24ec9b1a-d9fa-4f7d-a5d2-71dc6755a7e9"]*345000}}))
16793915
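
The figure above can also be cross-checked without the bson package. The sketch below is not Pulp code; it adds up the encoded size by hand from the BSON layout (each array element is a type byte, the decimal index as a NUL-terminated key, an int32 length and the NUL-terminated string; each document adds a 4-byte length prefix and a trailing NUL). The function names are made up for illustration, and the query shape is assumed to be exactly the one used in the session above.

```python
def bson_string_element_size(key, value):
    """Size in bytes of one BSON string element (key -> UTF-8 string value)."""
    # type byte + key bytes + NUL + int32 length + value bytes + NUL
    return 1 + len(key.encode("utf-8")) + 1 + 4 + len(value.encode("utf-8")) + 1


def estimate_in_query_size(unit_id, count, field="_id"):
    """Estimate the BSON size of {field: {"$in": [unit_id] * count}}."""
    # BSON arrays are documents whose keys are the decimal indexes "0", "1", ...
    array_body = sum(bson_string_element_size(str(i), unit_id) for i in range(count))
    array_doc = 4 + array_body + 1
    # {"$in": <array>}: length prefix + (type byte + key + NUL + array) + NUL
    in_doc = 4 + (1 + len("$in") + 1 + array_doc) + 1
    # {field: <in_doc>}: same framing one level up
    return 4 + (1 + len(field.encode("utf-8")) + 1 + in_doc) + 1


UNIT_ID = "24ec9b1a-d9fa-4f7d-a5d2-71dc6755a7e9"
print(estimate_in_query_size(UNIT_ID, 345000))  # 16793915, same as the session above
```

Under the same assumptions (36-character IDs and no other fields in the query), the estimate crosses the 16777216-byte limit at a little under 345,000 IDs. A real Pulp query may carry additional filters, so the practical threshold may be somewhat lower.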

To reproduce:

  • Create a repo with yum importer, e.g. "all-rpm-content"

  • Create another empty repo with yum importer, e.g. "target"

  • Import many RPMs into "all-rpm-content" (around 346,000 to be safe)

  • POST to:

    /pulp/api/v2/repositories/target/actions/associate/
    {
      "source_repo_id": "all-rpm-content",
      "criteria": {
        "type_ids": ["rpm"],
        "filters": {
          "unit": {
            "filename": "test-rpm.rpm"
          }
        }
      }
    }
    

Expected result: test-rpm.rpm is associated with 'target' repo.

Actual result: association task fails with: DocumentTooLarge: BSON document too large
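
For anyone scripting the reproduction, the POST from the steps above could be built roughly as follows. This is only a sketch: the host name is a placeholder, build_associate_request is a made-up helper, and authentication (which a real Pulp server would require) is left out. The request is constructed but not sent.

```python
import json
from urllib.request import Request


def build_associate_request(base_url, target_repo_id, source_repo_id, filename):
    """Build (but do not send) the associate request from the reproduction steps."""
    body = {
        "source_repo_id": source_repo_id,
        "criteria": {
            "type_ids": ["rpm"],
            "filters": {"unit": {"filename": filename}},
        },
    }
    url = "%s/pulp/api/v2/repositories/%s/actions/associate/" % (
        base_url.rstrip("/"),
        target_repo_id,
    )
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "pulp.example.com" is a placeholder host, not a real server.
request = build_associate_request(
    "https://pulp.example.com", "target", "all-rpm-content", "test-rpm.rpm"
)
print(request.get_method(), request.full_url)
```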

Although this was observed in Pulp 2.7, a review of the Pulp 2.10 code suggests it is likely to hit the same problem.

Actions #1

Updated by jluza over 5 years ago

Actions #2

Updated by rmcgover over 5 years ago

I was looking at this spot:

units_cursors = (
    self._associated_units_by_type_cursor(t, criteria, associations_lookup[t].keys())
    for t in association_unit_types if t in associations_lookup
)

The associations_lookup[t].keys() argument is the part that grows too large; splitting it into batches looks like it could work. For example, a repo with 400,000 RPMs could produce 4 unit cursors querying 100,000 unit keys each, instead of the current single cursor querying all 400,000 keys. It looks to me like the rest of the code would still work, though I am not certain about skip/limit.
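
A minimal sketch of the batching idea, independent of Pulp's actual classes: the helper names are hypothetical, and `find` stands in for whatever callable runs one query and returns an iterable of documents (for example a collection's find method). It is not the fix that was eventually merged.

```python
from itertools import chain, islice


def iter_batches(keys, batch_size):
    """Yield successive lists of at most batch_size keys."""
    iterator = iter(keys)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def find_in_batches(find, unit_ids, batch_size=100000):
    """Chain the results of one $in query per batch of IDs.

    `find` is a hypothetical callable: it takes a query dict and
    returns an iterable of matching documents.
    """
    return chain.from_iterable(
        find({"_id": {"$in": batch}})
        for batch in iter_batches(unit_ids, batch_size)
    )
```

With 400,000 IDs and the default batch size this issues 4 queries of 100,000 IDs each, so each query stays well below the document size limit. Any sort, skip or limit would apply per batch rather than to the merged result, which is why that part would need separate handling.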

Actions #3

Updated by ttereshc over 5 years ago

I would test it on a newer version of Pulp, because the search for units during copy has changed and it seems the code path you discuss is no longer used (at least for your example, where you have only RPMs).

Actions #4

Updated by amacdona@redhat.com over 5 years ago

  • Priority changed from Normal to High
  • Triaged changed from No to Yes
Actions #5

Updated by amacdona@redhat.com over 5 years ago

It should be verified that this bug still affects more recent Pulps.

Actions #6

Updated by jluza over 5 years ago

amacdona@redhat.com wrote:

It should be verified that this bug still affects more recent Pulps.

I attached links pointing to the master branch, which contains the related code, but if that is not enough we can write a simple reproducer or try to reproduce on the latest Pulp.

Actions #9

Updated by jcline@redhat.com over 5 years ago

  • Groomed changed from No to Yes
  • Sprint Candidate changed from No to Yes
Actions #11

Updated by bmbouter over 5 years ago

  • Status changed from NEW to CLOSED - WONTFIX

Pulp 3.0 is switching to PostgreSQL and away from MongoDB, so this issue is unique to the 2.y line. The original reporter ran into this issue by aggregating a large number of RPMs from many repos into one repo and never removing them. They worked around it by removing old RPMs that they no longer needed and moving some of them to other repos. Since their issue is resolved and it won't be an issue with Pulp 3.0, I'm going to close as WONTFIX.

If other users experience this issue or want to fix it, please reopen or comment.

Actions #12

Updated by bmbouter about 3 years ago

  • Tags Pulp 2 added
Actions #13

Updated by ipanova@redhat.com over 2 years ago

  • Status changed from CLOSED - WONTFIX to POST
Actions #14

Updated by ipanova@redhat.com over 2 years ago

  • Sprint set to Sprint 63
Actions #15

Updated by ipanova@redhat.com over 2 years ago

  • Status changed from POST to MODIFIED

The fix was merged.

Actions #16

Updated by ipanova@redhat.com over 2 years ago

  • Platform Release set to 2.21.1
Actions #17

Updated by ipanova@redhat.com about 2 years ago

  • Status changed from MODIFIED to 5
Actions #18

Updated by ipanova@redhat.com about 2 years ago

  • Status changed from 5 to CLOSED - CURRENTRELEASE
