Issue #2620
closed
All RPM repo searches are broken
Description
Let's say that one has created an RPM repository with an ID of "foo". One can search for content units in that RPM repository by making an HTTP POST request to /pulp/api/v2/repositories/foo/search/units/ with a body of at least {"criteria": {}}. (A more typical body is {"criteria": {"type_ids": ["rpm"]}}.) Unfortunately, executing such a search causes Pulp to return an HTTP 500. The following is logged:
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: Unhandled Exception
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) 'utf8' codec can't decode byte 0x9c in position 1: invalid start byte
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) Traceback (most recent call last):
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) File "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 112, in get_response
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) response = wrapped_callback(request, *callback_args, **callback_kwargs)
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) File "/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 69, in view
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) return self.dispatch(request, *args, **kwargs)
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) File "/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 87, in dispatch
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) return handler(request, *args, **kwargs)
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) File "/usr/lib/python2.7/site-packages/pulp/server/webservices/views/decorators.py", line 241, in _auth_decorator
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) return _verify_auth(self, operation, super_user_only, method, *args, **kwargs)
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) File "/usr/lib/python2.7/site-packages/pulp/server/webservices/views/decorators.py", line 195, in _verify_auth
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) value = method(self, *args, **kwargs)
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) File "/usr/lib/python2.7/site-packages/pulp/server/webservices/views/util.py", line 130, in wrapper
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) return func(*args, **kwargs)
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) File "/usr/lib/python2.7/site-packages/pulp/server/webservices/views/search.py", line 127, in post
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) return self._generate_response(query, options, *args, **kwargs)
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) File "/usr/lib/python2.7/site-packages/pulp/server/webservices/views/repositories.py", line 294, in _generate_response
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) return generate_json_response_with_pulp_encoder(units)
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) File "/usr/lib/python2.7/site-packages/pulp/server/webservices/views/util.py", line 52, in generate_json_response
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) json_obj = json.dumps(content, default=default)
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) File "/usr/lib64/python2.7/json/__init__.py", line 250, in dumps
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) sort_keys=sort_keys, **kw).encode(obj)
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) File "/usr/lib64/python2.7/json/encoder.py", line 207, in encode
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) chunks = self.iterencode(o, _one_shot=True)
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) File "/usr/lib64/python2.7/json/encoder.py", line 270, in iterencode
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) return _iterencode(o, 0)
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) UnicodeDecodeError: 'utf8' codec can't decode byte 0x9c in position 1: invalid start byte
A similar error is logged for all RPM repository searches. Triggering this failure is as easy as executing the following script:
pulp-admin rpm repo create --repo-id foo --feed https://repos.fedorapeople.org/pulp/pulp/fixtures/rpm-unsigned/
pulp-admin rpm repo sync run --repo-id foo
pulp-admin rpm repo content rpm --repo-id foo
The final command spits out an error:
$ pulp-admin rpm repo content rpm --repo-id foo
An internal error occurred on the Pulp server:
RequestException: POST request
on /pulp/api/v2/repositories/foo/search/units/ failed with 500 - 'utf8' codec
can't decode byte 0x9c in position 1: invalid start byte
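For reference, the search request behind that CLI call can be sketched as follows. This is a minimal sketch and not Pulp's client code: the base URL and the helper name build_unit_search_request are hypothetical, authentication is omitted, and the request is only constructed, never sent.

```python
import json
from urllib.request import Request


def build_unit_search_request(base_url, repo_id, type_ids=None):
    """Build (but do not send) a unit search request for one repository.

    Hypothetical helper for illustration; only the URL path and the
    body shape come from the issue description.
    """
    criteria = {}
    if type_ids:
        criteria["type_ids"] = list(type_ids)
    body = json.dumps({"criteria": criteria}).encode("utf-8")
    url = "{}/pulp/api/v2/repositories/{}/search/units/".format(
        base_url.rstrip("/"), repo_id
    )
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "https://pulp.example.com" is a placeholder host, not a real server.
request = build_unit_search_request("https://pulp.example.com", "foo", ["rpm"])
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing no type_ids produces the minimal body {"criteria": {}} mentioned in the description.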
This error is present for Pulp 2.12 and 2.13 nightlies on all supported platforms. The error is reproducible both on Jenkins and on my personal systems. Here are the packages installed on one of my systems:
[root@rhel-7-3-pulp-2-12 ~]# rpm -qa | grep -i pulp | sort
pulp-admin-client-2.12.2-0.1.alpha.git.17.8ca5cf2.el7.noarch
pulp-docker-admin-extensions-2.3.1-0.1.alpha.git.5.052c506.el7.noarch
pulp-docker-plugins-2.3.1-0.1.alpha.git.5.052c506.el7.noarch
pulp-ostree-admin-extensions-1.2.1-0.1.alpha.git.27.7f9a84b.el7.noarch
pulp-ostree-plugins-1.2.1-0.1.alpha.git.27.7f9a84b.el7.noarch
pulp-puppet-admin-extensions-2.12.2-0.1.alpha.git.2.f338f5d.el7.noarch
pulp-puppet-plugins-2.12.2-0.1.alpha.git.2.f338f5d.el7.noarch
pulp-python-admin-extensions-2.0.1-0.1.alpha.git.6.8c46f3f.el7.noarch
pulp-python-plugins-2.0.1-0.1.alpha.git.6.8c46f3f.el7.noarch
pulp-rpm-admin-extensions-2.12.2-0.1.alpha.git.19.da51b5f.el7.noarch
pulp-rpm-plugins-2.12.2-0.1.alpha.git.19.da51b5f.el7.noarch
pulp-selinux-2.12.2-0.1.alpha.git.17.8ca5cf2.el7.noarch
pulp-server-2.12.2-0.1.alpha.git.17.8ca5cf2.el7.noarch
python-isodate-0.5.0-4.pulp.el7.noarch
python-kombu-3.0.33-6.pulp.el7.noarch
python-pulp-bindings-2.12.2-0.1.alpha.git.17.8ca5cf2.el7.noarch
python-pulp-client-lib-2.12.2-0.1.alpha.git.17.8ca5cf2.el7.noarch
python-pulp-common-2.12.2-0.1.alpha.git.17.8ca5cf2.el7.noarch
python-pulp-docker-common-2.3.1-0.1.alpha.git.5.052c506.el7.noarch
python-pulp-oid_validation-2.12.2-0.1.alpha.git.17.8ca5cf2.el7.noarch
python-pulp-ostree-common-1.2.1-0.1.alpha.git.27.7f9a84b.el7.noarch
python-pulp-puppet-common-2.12.2-0.1.alpha.git.2.f338f5d.el7.noarch
python-pulp-python-common-2.0.1-0.1.alpha.git.6.8c46f3f.el7.noarch
python-pulp-repoauth-2.12.2-0.1.alpha.git.17.8ca5cf2.el7.noarch
python-pulp-rpm-common-2.12.2-0.1.alpha.git.19.da51b5f.el7.noarch
python-pulp-streamer-2.12.2-0.1.alpha.git.17.8ca5cf2.el7.noarch
Updated by semyers over 7 years ago
Ichimonji10 wrote:
This error is present for Pulp 2.12 and 2.13 on all supported platforms. ...
Do you know if this error is present for both 2.12.0 and 2.12.1, or if it's just 2.12.1?
Updated by Ichimonji10 over 7 years ago
Do you know if this error is present for both 2.12.0 and 2.12.1, or if it's just 2.12.1?
AFAIK, it's only present for the development versions of 2.12.1 and 2.13. I didn't observe it in the 2.12.1 release.
Issue updated to state "2.12 and 2.13 nightlies."
Updated by semyers over 7 years ago
- Priority changed from Normal to High
- Severity changed from 2. Medium to 3. High
Tentatively marking as a blocker for 2.12.z+, and raising the prio/sev each to high accordingly for a release blocker, pending review by the team during triage tomorrow.
Updated by ttereshc over 7 years ago
+1 for blocker
During a search, all unit data is selected from the database, including the gzipped XML snippets stored for RPM/SRPM units. The failure occurs when the json module tries to serialize this data to generate the response.
I have not found any other calls affected by a similar issue.
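The failure mode can be reproduced in isolation. The sketch below assumes the stored snippets are zlib-compressed, which is an assumption on my part but is consistent with the logged error: a zlib stream written at the default compression level starts with the bytes 0x78 0x9c, and 0x9c is not a valid UTF-8 start byte. The XML content is a made-up stand-in. Under Python 2, json.dumps decodes byte strings as UTF-8 implicitly; the sketch performs that decode explicitly so it also runs on Python 3.

```python
import zlib

# Stand-in for one stored XML snippet; the real field contents are not
# shown in this report.
xml_snippet = b'<package type="rpm"><name>bear</name></package>'

# Level 6 is zlib's default level; it is passed explicitly here so the
# two-byte stream header is deterministic.
compressed = zlib.compress(xml_snippet, 6)


def try_decode(blob):
    """Return None if blob is valid UTF-8, else the UnicodeDecodeError."""
    try:
        blob.decode("utf-8")
    except UnicodeDecodeError as error:
        return error
    return None


error = try_decode(compressed)
print(error)  # the decode of the compressed bytes fails
print(try_decode(xml_snippet))  # the plain XML decodes without error
```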
I think there are two possible ways to solve this issue:
- do not return these XML snippets to the user at all
- unzip the data and return it to the user
I am leaning towards the first option. These XML snippets are needed only for publishing, yet during a search we return everything we have in the DB, including data that is meant for internal use only.
This is hacky, but it shows that one way or another we can exclude 'repodata' from the unit data:
diff --git a/server/pulp/server/webservices/views/repositories.py b/server/pulp/server/webservices/views/repositories.py
index e0389c8..6ef4943 100644
--- a/server/pulp/server/webservices/views/repositories.py
+++ b/server/pulp/server/webservices/views/repositories.py
@@ -291,6 +291,7 @@ class RepoUnitSearch(search.SearchView):
         units = manager.get_units(repo_id, criteria=criteria)
         for unit in units:
             content.remap_fields_with_serializer(unit['metadata'])
+            unit['metadata'].pop('repodata', None)
         return generate_json_response_with_pulp_encoder(units)
The potential downside(?) is that we would no longer return the XML snippets as we did before, though I can't come up with a use case where someone would need this data.
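The two options can be compared side by side on a toy unit. This is only a sketch under stated assumptions: the unit layout (a 'metadata' dict holding a 'repodata' dict of named snippets), the zlib compression, and both helper names are hypothetical and are not taken from Pulp's code.

```python
import copy
import json
import zlib


def drop_repodata(unit):
    """Option 1: strip the internal 'repodata' field from the metadata."""
    result = copy.deepcopy(unit)
    result["metadata"].pop("repodata", None)
    return result


def decompress_repodata(unit):
    """Option 2: keep 'repodata', but turn each compressed snippet into text."""
    result = copy.deepcopy(unit)
    metadata = result["metadata"]
    if "repodata" in metadata:
        metadata["repodata"] = {
            name: zlib.decompress(blob).decode("utf-8")
            for name, blob in metadata["repodata"].items()
        }
    return result


# Toy unit; the field names and snippet content are illustrative only.
unit = {
    "metadata": {
        "name": "bear",
        "repodata": {"primary": zlib.compress(b"<package/>")},
    }
}

# Both results are JSON-serializable, but only option 2 keeps the key.
print(json.dumps(drop_repodata(unit), sort_keys=True))
print(json.dumps(decompress_repodata(unit), sort_keys=True))
```

The difference that matters for API consumers is the shape of the response: option 1 removes a key that earlier releases returned, while option 2 keeps the key and only changes how its value is encoded.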
Updated by Ichimonji10 over 7 years ago
During search all the units data from db is selected including gzipped XML snippets for RPM/SRPM units,
This is consistent with Pulp's behaviour. Tests which search for DRPM units are not affected, as I discovered while walking through the failing tests and writing up Pulp Smash #575.
Updated by bizhang over 7 years ago
- Sprint/Milestone set to 34
- Triaged changed from No to Yes
Updated by ipanova@redhat.com over 7 years ago
I confirm that it happens just with rpm and srpm.
Updated by daviddavis over 7 years ago
Could you maybe use projection to not return the field?
https://docs.mongodb.com/manual/tutorial/project-fields-from-query-results/
Regardless, option 1 sounds good to me.
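A projection that excludes a field is just a document mapping the field name to 0. The sketch below builds one; the helper name is hypothetical, and how such a projection would be threaded through Pulp's own query layer is not shown in this thread. With pymongo, the resulting dict would be passed as the second argument to Collection.find.

```python
def exclusion_projection(*field_names):
    """Build a MongoDB projection document that omits the named fields."""
    return {name: 0 for name in field_names}


projection = exclusion_projection("repodata")
print(projection)
# With pymongo this could be used as: collection.find(query, projection)
```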
Updated by jortel@redhat.com over 7 years ago
It seems like not returning the metadata would perform better, but won't that break semantic versioning by altering the data returned by the API?
Updated by semyers over 7 years ago
We chatted about this in IRC and agree that option 1 would probably break semver, leaving option 2. ttereshc has updated comment 8 accordingly.
Updated by ttereshc over 7 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to ttereshc
Updated by semyers over 7 years ago
- Status changed from ASSIGNED to POST
After a lot of investigation, and a process of elimination going through successively less simple solutions, we put together these PRs to fix views affected by this (hopefully) without breaking anything else:
https://github.com/pulp/pulp/pull/2962
https://github.com/pulp/pulp_rpm/pull/1037
Added by ttereshc over 7 years ago
Revision 22efbda4 | View on GitHub
Replace remap_fields with serialize_unit
In order to fix an RPM bug, an opportunity presented itself to slightly improve this interface, effectively renaming remap_fields_with_serializer to serialize_unit_with_serializer, and exposing a new "serialize" method on ModelSerializer that can be overridden in subclasses as desired to help with serializing "exotic" field types, such as BLObs and other things not easily handled with our custom JSON encoder.
Added by ttereshc over 7 years ago
Revision 313fe912 | View on GitHub
Decompress RPM/SRPM metadata in their serializer
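The pattern described by these two commits, an overridable "serialize" hook that a unit-specific subclass uses to decompress its metadata, might look roughly like the sketch below. It does not reproduce the actual change: the class names, the to_json helper, the unit layout, and the use of zlib are all assumptions made for illustration.

```python
import json
import zlib


class BaseSerializer:
    """Hypothetical base class exposing an overridable 'serialize' hook."""

    def serialize(self, unit):
        # Default behaviour: return a shallow copy, fields untouched.
        return dict(unit)


class CompressedRepodataSerializer(BaseSerializer):
    """Hypothetical subclass for units that store compressed snippets."""

    def serialize(self, unit):
        data = super().serialize(unit)
        if "repodata" in data:
            # Replace each compressed blob with its decoded XML text.
            data["repodata"] = {
                name: zlib.decompress(blob).decode("utf-8")
                for name, blob in data["repodata"].items()
            }
        return data


def to_json(unit, serializer):
    """Serialize a unit through the given serializer, then encode as JSON."""
    return json.dumps(serializer.serialize(unit), sort_keys=True)


# Toy unit; field names and snippet content are illustrative only.
rpm_unit = {"name": "bear", "repodata": {"primary": zlib.compress(b"<package/>")}}
print(to_json(rpm_unit, CompressedRepodataSerializer()))
```

Keeping the decompression inside the subclass means the generic view code only has to call the hook and needs no knowledge of which unit types carry binary fields.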
Updated by ttereshc over 7 years ago
- Status changed from POST to MODIFIED
Applied in changeset pulp:pulp|22efbda46d8e3e3f8d6068b6cd9cd353b677e2e2.
Updated by semyers over 7 years ago
We had a good team chat about the "Verification Required" flag on Monday, and decided that the release of 2.12.2 should not be blocked on the verification of this issue.
Updated by Ichimonji10 over 7 years ago
We had a good team chat about the "Verification Required" flag on Monday, and decided that the release of 2.12.2 should not be blocked on the verification of this issue.
Hunh? This issue is already verified. The 2.12.2 release wouldn't be blocked no matter the status of the "Verification Required" flag on this issue.
Also, this issue should definitely block the 2.12.2 release. (But the issue is fixed and has been verified, so there's no need to block.)
Updated by bizhang over 7 years ago
- Status changed from 5 to CLOSED - CURRENTRELEASE
Updated by daviddavis over 7 years ago
- Has duplicate Issue #2753: Cannot search rpm content from the CLI added