Issue #2620

All RPM repo searches are broken

Added by Ichimonji10 10 months ago. Updated 8 months ago.

Status:
CLOSED - CURRENTRELEASE
Priority:
High
Assignee:
Category:
-
Sprint/Milestone:
Severity:
3. High
Version:
Platform Release:
2.12.2
Blocks Release:
2.12.z, 2.13.z
OS:
Backwards Incompatible:
No
Triaged:
Yes
Groomed:
No
Sprint Candidate:
No
Tags:
QA Contact:
Complexity:
Smash Test:
Verified:
Yes
Verification Required:
No

Description

Let's say that one has created an RPM repository with an ID of "foo". One can search for content units in that RPM repository by making an HTTP POST request to /pulp/api/v2/repositories/foo/search/units/ with a body of at least {"criteria": {}}. (A more typical body is {"criteria": {"type_ids": ["rpm"]}}.) Unfortunately, executing such a search causes Pulp to return an HTTP 500. The following is logged:

Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: Unhandled Exception
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) 'utf8' codec can't decode byte 0x9c in position 1: invalid start byte
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) Traceback (most recent call last):
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856)   File "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 112, in get_response
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856)     response = wrapped_callback(request, *callback_args, **callback_kwargs)
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856)   File "/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 69, in view
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856)     return self.dispatch(request, *args, **kwargs)
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856)   File "/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 87, in dispatch
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856)     return handler(request, *args, **kwargs)
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856)   File "/usr/lib/python2.7/site-packages/pulp/server/webservices/views/decorators.py", line 241, in _auth_decorator
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856)     return _verify_auth(self, operation, super_user_only, method, *args, **kwargs)
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856)   File "/usr/lib/python2.7/site-packages/pulp/server/webservices/views/decorators.py", line 195, in _verify_auth
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856)     value = method(self, *args, **kwargs)
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856)   File "/usr/lib/python2.7/site-packages/pulp/server/webservices/views/util.py", line 130, in wrapper
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856)     return func(*args, **kwargs)
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856)   File "/usr/lib/python2.7/site-packages/pulp/server/webservices/views/search.py", line 127, in post
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856)     return self._generate_response(query, options, *args, **kwargs)
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856)   File "/usr/lib/python2.7/site-packages/pulp/server/webservices/views/repositories.py", line 294, in _generate_response
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856)     return generate_json_response_with_pulp_encoder(units)
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856)   File "/usr/lib/python2.7/site-packages/pulp/server/webservices/views/util.py", line 52, in generate_json_response
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856)     json_obj = json.dumps(content, default=default)
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856)   File "/usr/lib64/python2.7/json/__init__.py", line 250, in dumps
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856)     sort_keys=sort_keys, **kw).encode(obj)
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856)   File "/usr/lib64/python2.7/json/encoder.py", line 207, in encode
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856)     chunks = self.iterencode(o, _one_shot=True)
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856)   File "/usr/lib64/python2.7/json/encoder.py", line 270, in iterencode
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856)     return _iterencode(o, 0)
Mar 06 11:33:11 rhel-7-3-pulp-2-12 pulp[3309]: pulp.server.webservices.middleware.exception:ERROR: (3309-85856) UnicodeDecodeError: 'utf8' codec can't decode byte 0x9c in position 1: invalid start byte

A similar error is logged for all RPM repository searches. Triggering this failure is as easy as executing the following script:

pulp-admin rpm repo create --repo-id foo --feed https://repos.fedorapeople.org/pulp/pulp/fixtures/rpm-unsigned/
pulp-admin rpm repo sync run --repo-id foo
pulp-admin rpm repo content rpm --repo-id foo

The final command spits out an error:

$ pulp-admin rpm repo content rpm --repo-id foo
An internal error occurred on the Pulp server:

RequestException: POST request
on /pulp/api/v2/repositories/foo/search/units/ failed with 500 - 'utf8' codec
can't decode byte 0x9c in position 1: invalid start byte
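
For anyone who wants to reproduce this without pulp-admin, the search request described above can be sketched in Python. This is only an illustration: the host name is a placeholder, authentication is left out, and the helper name `build_search_request` is made up for this example. The request is built but not sent.

```python
import json
import urllib.request

# Placeholder values: replace with a real Pulp host and repository ID.
BASE_URL = "https://pulp.example.com"
REPO_ID = "foo"


def build_search_request(base_url, repo_id, type_ids=None):
    """Build (but do not send) a unit search request for one repository."""
    criteria = {}
    if type_ids:
        # Restrict the search to the given content unit types, e.g. ["rpm"].
        criteria["type_ids"] = list(type_ids)
    body = json.dumps({"criteria": criteria}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/pulp/api/v2/repositories/{repo_id}/search/units/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request(BASE_URL, REPO_ID, type_ids=["rpm"])
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending the request (for example with `urllib.request.urlopen`) additionally requires valid credentials for the Pulp server.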

This error is present for Pulp 2.12 and 2.13 nightlies on all supported platforms. The error is reproducible both on Jenkins and on my personal systems. Here are the packages installed on one of my systems:

[root@rhel-7-3-pulp-2-12 ~]# rpm -qa | grep -i pulp | sort
pulp-admin-client-2.12.2-0.1.alpha.git.17.8ca5cf2.el7.noarch
pulp-docker-admin-extensions-2.3.1-0.1.alpha.git.5.052c506.el7.noarch
pulp-docker-plugins-2.3.1-0.1.alpha.git.5.052c506.el7.noarch
pulp-ostree-admin-extensions-1.2.1-0.1.alpha.git.27.7f9a84b.el7.noarch
pulp-ostree-plugins-1.2.1-0.1.alpha.git.27.7f9a84b.el7.noarch
pulp-puppet-admin-extensions-2.12.2-0.1.alpha.git.2.f338f5d.el7.noarch
pulp-puppet-plugins-2.12.2-0.1.alpha.git.2.f338f5d.el7.noarch
pulp-python-admin-extensions-2.0.1-0.1.alpha.git.6.8c46f3f.el7.noarch
pulp-python-plugins-2.0.1-0.1.alpha.git.6.8c46f3f.el7.noarch
pulp-rpm-admin-extensions-2.12.2-0.1.alpha.git.19.da51b5f.el7.noarch
pulp-rpm-plugins-2.12.2-0.1.alpha.git.19.da51b5f.el7.noarch
pulp-selinux-2.12.2-0.1.alpha.git.17.8ca5cf2.el7.noarch
pulp-server-2.12.2-0.1.alpha.git.17.8ca5cf2.el7.noarch
python-isodate-0.5.0-4.pulp.el7.noarch
python-kombu-3.0.33-6.pulp.el7.noarch
python-pulp-bindings-2.12.2-0.1.alpha.git.17.8ca5cf2.el7.noarch
python-pulp-client-lib-2.12.2-0.1.alpha.git.17.8ca5cf2.el7.noarch
python-pulp-common-2.12.2-0.1.alpha.git.17.8ca5cf2.el7.noarch
python-pulp-docker-common-2.3.1-0.1.alpha.git.5.052c506.el7.noarch
python-pulp-oid_validation-2.12.2-0.1.alpha.git.17.8ca5cf2.el7.noarch
python-pulp-ostree-common-1.2.1-0.1.alpha.git.27.7f9a84b.el7.noarch
python-pulp-puppet-common-2.12.2-0.1.alpha.git.2.f338f5d.el7.noarch
python-pulp-python-common-2.0.1-0.1.alpha.git.6.8c46f3f.el7.noarch
python-pulp-repoauth-2.12.2-0.1.alpha.git.17.8ca5cf2.el7.noarch
python-pulp-rpm-common-2.12.2-0.1.alpha.git.19.da51b5f.el7.noarch
python-pulp-streamer-2.12.2-0.1.alpha.git.17.8ca5cf2.el7.noarch

Related issues

Duplicated by Pulp - Issue #2753: Cannot search rpm content from the CLI (CLOSED - DUPLICATE)

Associated revisions

Revision 22efbda4
Added by ttereshc 9 months ago

Replace remap_fields with serialize_unit

In order to fix an RPM bug, an opportunity presented itself to slightly
improve this interface, effectively renaming
`remap_fields_with_serializer` to `serialize_unit_with_serializer`, and
exposing a new "serialize" method on `ModelSerializer` that can be
overridden in subclasses as desired to help with serializing "exotic"
field types, such as BLObs and other things not easily handled with our
custom JSON encoder.

closes #2620
https://pulp.plan.io/issues/2620

Revision 313fe912
Added by ttereshc 9 months ago

Decompress RPM/SRPM metadata in their serializer

re #2620
https://pulp.plan.io/issues/2620

History

#1 Updated by Ichimonji10 10 months ago

  • Description updated (diff)

#2 Updated by Ichimonji10 10 months ago

  • Description updated (diff)

#3 Updated by Ichimonji10 10 months ago

  • Description updated (diff)

#4 Updated by semyers 10 months ago

Ichimonji10 wrote:

This error is present for Pulp 2.12 and 2.13 on all supported platforms. ...

Do you know if this error is present for both 2.12.0 and 2.12.1, or if it's just 2.12.1?

#5 Updated by Ichimonji10 10 months ago

Do you know if this error is present for both 2.12.0 and 2.12.1, or if it's just 2.12.1?

AFAIK, it's only present for the development versions of 2.12.1 and 2.13. I didn't observe it in the 2.12.1 release.

Issue updated to state "2.12 and 2.13 nightlies."

#6 Updated by Ichimonji10 10 months ago

  • Description updated (diff)

#7 Updated by semyers 10 months ago

  • Priority changed from Normal to High
  • Severity changed from 2. Medium to 3. High
  • Blocks Release 2.12.z, 2.13.z added

Tentatively marking as a blocker for 2.12.z+, and raising the prio/sev each to high accordingly for a release blocker, pending review by the team during triage tomorrow.

#8 Updated by ttereshc 10 months ago

+1 for blocker

During a search, all unit data is selected from the database, including the gzipped XML snippets stored for RPM/SRPM units. The failure occurs when the json module tries to serialize this binary data to generate the response.
I have not found any other calls affected by a similar issue.
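
The failure can be reproduced in miniature. The byte 0x9c at position 1 in the logged error matches the second byte of a zlib stream compressed with default settings (0x78 0x9c), so the sketch below assumes the stored snippets are zlib-compressed; the exact storage format is not shown in this thread. Python 2.7's json module tries to decode byte strings as UTF-8 while encoding; the example uses Python 3 and performs that decode explicitly. The XML content is invented.

```python
import zlib

# Invented stand-in for an XML snippet stored with an RPM unit.
xml_snippet = b'<package type="rpm"><name>bear</name></package>'

# Assumption: the snippet is stored zlib-compressed with default settings.
compressed = zlib.compress(xml_snippet)

error = None
try:
    # Treating the compressed bytes as UTF-8 text fails on the second byte.
    compressed.decode("utf-8")
except UnicodeDecodeError as exc:
    error = exc
    print(exc)
```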

I think there are two ways to solve this issue:
- do not return these XML snippets to the user at all
- decompress the data and return it to the user

I am leaning towards the first option. These XML snippets are needed only for publishing, yet during a search we return everything we have in the DB, including data like this that is meant for internal use only.
This is a hacky approach, but it shows that one way or another we can exclude 'repodata' from the unit data:

diff --git a/server/pulp/server/webservices/views/repositories.py b/server/pulp/server/webservices/views/repositories.py
index e0389c8..6ef4943 100644
--- a/server/pulp/server/webservices/views/repositories.py
+++ b/server/pulp/server/webservices/views/repositories.py
@@ -291,6 +291,7 @@ class RepoUnitSearch(search.SearchView):
             units = manager.get_units(repo_id, criteria=criteria)
         for unit in units:
             content.remap_fields_with_serializer(unit['metadata'])
+            unit['metadata'].pop('repodata', None)
         return generate_json_response_with_pulp_encoder(units)

The potential downside(?) is that we would no longer return the XML snippets as we did before, though I can't think of a use case where someone would need this data.
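
For comparison, the second option (decompress the data before returning it) might look roughly like the sketch below. This is not the actual Pulp code: the function name `decompress_repodata`, the shape of the 'repodata' field (a dict of compressed byte strings), and the use of zlib are all assumptions made for illustration.

```python
import json
import zlib


def decompress_repodata(metadata):
    """Return a copy of metadata with compressed 'repodata' values as text."""
    result = dict(metadata)
    repodata = result.get("repodata")
    if isinstance(repodata, dict):
        # Decompress byte values; leave anything else untouched.
        result["repodata"] = {
            key: zlib.decompress(value).decode("utf-8")
            if isinstance(value, bytes)
            else value
            for key, value in repodata.items()
        }
    return result


# Invented unit metadata; the real field layout may differ.
unit_metadata = {
    "name": "bear",
    "repodata": {
        "primary": zlib.compress(b"<package><name>bear</name></package>"),
    },
}

serialized = json.dumps(decompress_repodata(unit_metadata))
print(serialized)
```

With this approach the API keeps returning the snippets, now as plain text that a JSON encoder can handle, and the stored unit data itself is left unmodified.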

#9 Updated by Ichimonji10 10 months ago

During search all the units data from db is selected including gzipped XML snippets for RPM/SRPM units,

This is consistent with Pulp's behaviour. Tests which search for DRPM units are not affected, as I discovered while walking through the failing tests and writing up Pulp Smash #575.

#10 Updated by bizhang 10 months ago

  • Sprint/Milestone set to Sprint 16
  • Triaged changed from No to Yes

#11 Updated by ipanova@redhat.com 10 months ago

I confirm that it happens just with rpm and srpm.

#12 Updated by daviddavis@redhat.com 10 months ago

Could you maybe use projection to not return the field?

https://docs.mongodb.com/manual/tutorial/project-fields-from-query-results/

Regardless, option 1 sounds good to me.

#13 Updated by jortel@redhat.com 10 months ago

Seems like not returning the metadata would perform better, but won't that break semantic versioning by altering the data returned by the API?

#14 Updated by semyers 9 months ago

We chatted about this in IRC and agree that option 1 would probably break semver, leaving option 2. ttereshc has updated comment 8 accordingly.

#15 Updated by ttereshc 9 months ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to ttereshc

#16 Updated by semyers 9 months ago

  • Status changed from ASSIGNED to POST

After a lot of investigation, and a process of elimination going through successively less simple solutions, we put together these PRs to fix views affected by this (hopefully) without breaking anything else:
https://github.com/pulp/pulp/pull/2962
https://github.com/pulp/pulp_rpm/pull/1037

#17 Updated by ttereshc 9 months ago

  • Status changed from POST to MODIFIED

#18 Updated by semyers 9 months ago

  • Platform Release set to 2.12.2

#19 Updated by pthomas@redhat.com 9 months ago

  • Verified changed from No to Yes

#20 Updated by semyers 9 months ago

  • Verification Required changed from No to Yes

#21 Updated by semyers 9 months ago

  • Status changed from MODIFIED to ON_QA

#22 Updated by semyers 9 months ago

  • Verification Required changed from Yes to No

We had a good team chat about the "Verification Required" flag on Monday, and decided that the release of 2.12.2 should not be blocked on the verification of this issue.

#23 Updated by Ichimonji10 9 months ago

We had a good team chat about the "Verification Required" flag on Monday, and decided that the release of 2.12.2 should not be blocked on the verification of this issue.

Hunh? This issue is already verified. The 2.12.2 release wouldn't be blocked no matter the status of the "Verification Required" flag on this issue.

Also, this issue should definitely block the 2.12.2 release. (But the issue is fixed and has been verified, so there's no need to block.)

#25 Updated by bizhang 8 months ago

  • Status changed from ON_QA to CLOSED - CURRENTRELEASE

#26 Updated by daviddavis@redhat.com 7 months ago

  • Duplicated by Issue #2753: Cannot search rpm content from the CLI added
