Issue #1903
closed
RPM import traceback (non-utf-8 metadata slipping through)
Description
Hi,
I'm getting a "special" problem with a few RPMs.
Those RPMs have Finnish alphabet characters in their authors' names in the %description metadata.
`Authors:
Juha Yrj<F6>l<E4> <jyrjola@cc.hut.fi>
Antti Tapaninen <aet@cc.hut.fi>
Timo Ter<E4>s <timo.teras@iki.fi>
Olaf Kirch <okir@suse.de>
%files`
When you run them through iconv from latin to utf-8 it all magically clears up:
`iconv -f iso8859-1 -t utf-8 opensc.spec | grep -A6 Authors
Authors:
Juha Yrjölä <jyrjola@cc.hut.fi>
Antti Tapaninen <aet@cc.hut.fi>
Timo Teräs <timo.teras@iki.fi>
Olaf Kirch <okir@suse.de>
%files`
So my impression is that the SPEC file has utf-8 content (yay) but is in fact stored as latin1 (NOES).
Trying to upload one of these packages triggers a traceback.
We try to mirror all packages in all versions we need, so a re-issue of the package doesn't really solve matters.
(Example to clarify this: Let's assume they're on one of the OS DVDs and we want to mirror them 1:1)
An example package is opensc-0.11.6-5.27.1.x86_64.rpm from SLES11SP2:
It can apparently be obtained via
http://mirror.mes.edu.cu/SLES_11_SP2/CD1/suse/x86_64/opensc-0.11.6-5.27.1.x86_64.rpm
md5: 0c463515b28998ac9966400e5d14588d
The traceback looks like this:
May 06 14:49:18 myhost pulp[27466]: pulp_rpm.plugins.importers.yum.upload:ERROR: (27466-26912) unexpected error occurred importing uploaded file
May 06 14:49:18 myhost pulp[27466]: pulp_rpm.plugins.importers.yum.upload:ERROR: (27466-26912) Traceback (most recent call last):
May 06 14:49:18 myhost pulp[27466]: pulp_rpm.plugins.importers.yum.upload:ERROR: (27466-26912) File "/usr/lib/python2.7/site-packages/pulp_rpm/plugins/importers/yum/upload.py", line 118, in upload
May 06 14:49:18 myhost pulp[27466]: pulp_rpm.plugins.importers.yum.upload:ERROR: (27466-26912) handlers[type_id](repo, type_id, unit_key, metadata, file_path, conduit, config)
May 06 14:49:18 myhost pulp[27466]: pulp_rpm.plugins.importers.yum.upload:ERROR: (27466-26912) File "/usr/lib/python2.7/site-packages/pulp_rpm/plugins/importers/yum/upload.py", line 390, in _handle_package
May 06 14:49:18 myhost pulp[27466]: pulp_rpm.plugins.importers.yum.upload:ERROR: (27466-26912) unit.save_and_import_content(file_path)
May 06 14:49:18 myhost pulp[27466]: pulp_rpm.plugins.importers.yum.upload:ERROR: (27466-26912) File "/usr/lib/python2.7/site-packages/pulp/server/db/model/__init__.py", line 802, in save_and_import_content
May 06 14:49:18 myhost pulp[27466]: pulp_rpm.plugins.importers.yum.upload:ERROR: (27466-26912) self.save()
May 06 14:49:18 myhost pulp[27466]: pulp_rpm.plugins.importers.yum.upload:ERROR: (27466-26912) File "/usr/lib/python2.7/site-packages/mongoengine/document.py", line 324, in save
May 06 14:49:18 myhost pulp[27466]: pulp_rpm.plugins.importers.yum.upload:ERROR: (27466-26912) object_id = collection.save(doc, **write_concern)
May 06 14:49:18 myhost pulp[27466]: pulp_rpm.plugins.importers.yum.upload:ERROR: (27466-26912) File "/usr/lib64/python2.7/site-packages/pymongo/collection.py", line 2180, in save
May 06 14:49:18 myhost pulp[27466]: pulp_rpm.plugins.importers.yum.upload:ERROR: (27466-26912) check_keys, False, manipulate, write_concern)
May 06 14:49:18 myhost pulp[27466]: pulp_rpm.plugins.importers.yum.upload:ERROR: (27466-26912) File "/usr/lib64/python2.7/site-packages/pymongo/collection.py", line 709, in _update
May 06 14:49:18 myhost pulp[27466]: pulp_rpm.plugins.importers.yum.upload:ERROR: (27466-26912) codec_options=self.codec_options).copy()
May 06 14:49:18 myhost pulp[27466]: pulp_rpm.plugins.importers.yum.upload:ERROR: (27466-26912) File "/usr/lib64/python2.7/site-packages/pymongo/pool.py", line 216, in command
May 06 14:49:18 myhost pulp[27466]: pulp_rpm.plugins.importers.yum.upload:ERROR: (27466-26912) self._raise_connection_failure(error)
May 06 14:49:18 myhost pulp[27466]: pulp_rpm.plugins.importers.yum.upload:ERROR: (27466-26912) File "/usr/lib64/python2.7/site-packages/pymongo/pool.py", line 343, in _raise_connection_failure
May 06 14:49:18 myhost pulp[27466]: pulp_rpm.plugins.importers.yum.upload:ERROR: (27466-26912) raise error
May 06 14:49:18 myhost pulp[27466]: pulp_rpm.plugins.importers.yum.upload:ERROR: (27466-26912) InvalidStringData: strings in documents must be valid UTF-8: 'OpenSC provides a set of libraries and utilities to access smart cards.\nIt mainly focuses on cards that support cryptographic operations. It\nfacilitates their use in security applications such as mail encryption,\nauthentication, and digital signature. OpenSC implements the PKCS#11\nAPI. Applications supporting this API, such as Mozilla Firefox and\nThunderbird, can use it. OpenSC implements the PKCS#15 standard and\naims to be compatible with every software that does so, too.\n\nBefore purchasing any cards, please read carefully documentation in\n/usr/share/doc/packages/opensc/wiki/index.html - only some cards are\nsupported. Not only card type matters, but also card version, card OS\nversion and preloaded applet. Only subset of possible operations may be\nsupported for your card. Card initialization may require third party\nproprietary software.\n\n\n\nAuthors:\n--------\n Juha Yrj\xf6l\xe4 <jyrjola@cc.hut.fi>\n Antti Tapaninen <aet@cc.hut.fi>\n Timo Ter\xe4s <timo.teras@iki.fi>\n Olaf Kirch <okir@suse.de>'
May 06 14:49:18 myhost pulp[27466]: py.warnings:WARNING: (27466-26912) /usr/lib/python2.7/site-packages/mongoengine/document.py:367: DeprecationWarning: update is deprecated. Use replace_one, update_one or update_many instead.
May 06 14:49:18 myhost pulp[27466]: py.warnings:WARNING: (27466-26912) upsert=upsert, **write_concern)
May 06 14:49:18 myhost pulp[27466]: py.warnings:WARNING: (27466-26912)
I was pointed to string_to_unicode(data) in pulp_rpm/yum_plugin/util.py. In my test runs, all but two lines go through the utf-8 path; the two lines with the 'umlauts' fail there and (supposedly) go through the iso-8859-1 path.
That seems to work, but later, when self.save() runs, there seem to be demons.
I'm attaching the extracted SPEC, too.
Since the issue occurs with more or less random RPMs that I have no influence on, I'd love to help improve the input handling here so it generally survives them.
Updated by darkfader over 8 years ago
- File opensc.spec added
Attachment: the funny SPECfile!
Btw, for reasons we don't know yet, a rather recent rpmlint on SLES12 doesn't alert on this, while the one on CentOS7 does.
We're trying to track that one down.
Updated by mhrivnak over 8 years ago
During sync, there is some logic to run the XML snippet for the package through a function called "string_to_unicode", which ensures all the text can be saved in our DB. This defends against RPMs with non-standard encodings in metadata. However, no such defense is present in the upload workflow.
For this particular issue, running the string on this line: https://github.com/pulp/pulp_rpm/blob/pulp-rpm-2.8.2-1/plugins/pulp_rpm/plugins/importers/yum/upload.py#L542
... through the "string_to_unicode" function would likely resolve the problem. There may be other metadata fields we should consider running through that function.
Updated by darkfader over 8 years ago
For the record:
https://pulp-rpm.readthedocs.io/en/latest/user-guide/troubleshooting.html refers to a similar case.
A possible solution seems to be "ftfy" from pip, which apparently knows how to detect double-miscodings like this.
Handling mildly broken input in some way other than a traceback would be very desirable, too, since the traceback causes other issues.
(i.e. people will need to exclude "this kind of traceback" from Pulp's monitoring :-)
Updated by darkfader over 8 years ago
Using string_to_unicode on this works.
I'll add a PR tomorrow.
Updated by mhrivnak over 8 years ago
Following up on today's triage discussion, it looks like we've been through this problem at least once before.
For this one, we determined that SUSE had removed the offending package, and we would document that pulp_rpm requires utf-8. We also determined that when createrepo faces such a package, it would fall back to decoding as latin1. (based on a quick test just now, createrepo_c seems to do the same)
https://pulp.plan.io/issues/490
Similar to createrepo's behavior, pulp's sync workflow will try to decode with utf-8, and if that fails, will fall back to latin1. That behavior originates here:
https://bugzilla.redhat.com/show_bug.cgi?id=911650
And was vastly improved here:
https://bugzilla.redhat.com/show_bug.cgi?id=923448
https://github.com/pulp/pulp_rpm/pull/157
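The fallback described above can be sketched roughly as follows. This is a Python 3 illustration under stated assumptions, not the actual pulp_rpm or createrepo code: the function name `decode_with_fallback` and the `ENCODINGS` tuple are made up for this example. The sample bytes are copied from the traceback in this issue.

```python
# Hypothetical sketch of a "utf-8 first, latin-1 as fallback" decoder.
# Names are illustrative only; they are not taken from pulp_rpm.
ENCODINGS = ("utf-8", "iso-8859-1")  # assumed order, mirroring the description


def decode_with_fallback(data: bytes) -> str:
    """Return the text decoded with the first encoding that accepts it."""
    for encoding in ENCODINGS:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue  # try the next encoding in the list
    # Not reachable while latin-1 is in the list, since it accepts every byte.
    raise ValueError("no encoding in ENCODINGS could decode the data")


# Bytes from the traceback in this issue (latin-1 encoded umlauts).
raw = b"Juha Yrj\xf6l\xe4 <jyrjola@cc.hut.fi>"
print(decode_with_fallback(raw))
```

For this particular package the latin-1 guess happens to be the right one, so the name comes out intact; for other source encodings it would not.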
Given all that, I think the easiest resolution is to call this a duplicate of #490. It looks like the same exact problem, and we could happily stick with our previous decision on it.
That said, I'm not sure which is the better user experience. Consider a user trying to manage RPMs they got from somewhere else, which is most of our users. Is there harm in decoding as latin1, even though we know it's not the right encoding? From that user's standpoint, having a few incorrect characters in pulp's "description" field is probably more useful than not having the RPM in pulp at all. Are there negative consequences to that besides a few weird characters in the description? That's the behavior we have right now during sync, so if we decide to be strict, we should probably adjust the sync as well.
However, if we decide to have a policy that is more strict than other tools that work with the same data, such as createrepo*, we need to think carefully about why and explain it well.
Updated by rbarlow over 8 years ago
The attached patch is dangerous. The only reason it works (it doesn't actually work, it just hides the error message) is that latin-1 uses the full 8 bits, and so no codepoint can raise an error since they are all valid (though not correct). The most populous continent of the world does not (and cannot) use latin-1. RPMs encoded from those languages will result in garbage data throughout the RPM. If Pulp's fields are not valuable to the point that we are willing to fill them with garbage data intentionally, then why not just remove the fields?
The sensible approach is to raise an error message for RPMs that do not conform to the standard (UTF-8, which includes ASCII). This allows the user to take appropriate action empowered with a correct error message. Rather than walking away believing that Pulp is garbage and it ate their data, they can instead solve the problem or bring the matter to the vendor of the RPM so they can solve it.
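The point about latin-1 accepting every byte can be checked with a small Python 3 sketch. The choice of Shift_JIS and the sample text are assumptions made purely for illustration; they do not come from any real RPM.

```python
# Every possible byte value decodes under latin-1, so that fallback can never fail.
all_bytes = bytes(range(256))
decoded = all_bytes.decode("latin-1")
assert len(decoded) == 256

# Text in a non-Latin script, encoded as Shift_JIS (an illustrative choice).
original = "\u65e5\u672c\u8a9e"
raw = original.encode("shift_jis")

# The bytes are not valid UTF-8, so a strict decode is rejected...
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# ...while latin-1 "succeeds" and silently produces different text,
# one character per input byte.
garbled = raw.decode("latin-1")
print(utf8_ok, garbled == original, len(original), len(garbled))
```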
Updated by darkfader over 8 years ago
I'm currently traveling but will write something about 'sensible' as soon as I can.
Basically, you can pick between not correct or useless to the main purpose.
Florian
Updated by rbarlow over 8 years ago
Another option is to allow the RPM into Pulp, but set any non-UTF-8 data to None. This way users can still have the package in Pulp, and we don't have garbage data either. Missing data is better than incorrect data, and we can still give warning/error messages to the user about why the data is null.
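A minimal sketch of that option, assuming the metadata arrives as a flat dict of byte strings. The helper name `null_invalid_utf8` and the dict layout are hypothetical and not taken from pulp_rpm.

```python
import logging

log = logging.getLogger(__name__)


def null_invalid_utf8(metadata):
    """Return a copy of metadata with non-UTF-8 byte values replaced by None."""
    cleaned = {}
    for field, value in metadata.items():
        if isinstance(value, bytes):
            try:
                value = value.decode("utf-8")
            except UnicodeDecodeError:
                # Tell the user why the field ends up empty.
                log.warning("Field %r is not valid UTF-8; storing None instead", field)
                value = None
        cleaned[field] = value
    return cleaned


unit = null_invalid_utf8({
    "name": b"opensc",
    "description": b"Authors:\n Juha Yrj\xf6l\xe4 <jyrjola@cc.hut.fi>",
})
print(unit)
```

With this approach the package is still accepted, but the whole description is dropped because of two bytes.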
Updated by mhrivnak over 8 years ago
- Priority changed from Normal to Low
- Severity changed from 2. Medium to 1. Low
- Triaged changed from No to Yes
Updated by darkfader about 8 years ago
Hi,
I kinda forgot about this although probably we lost the functionality again with the last update.
Anyway, the very summarized feedback:
If Pulp removes invalid (wrongly encoded) data, that is "meh" but better than breaking.
Ideally, it should leave a message in that case.
Besides that:
Pulp iirc was made to manage repositories of RPM packages. Not to maintain consistency of a low-criticality metadata field. Given that, yes, let it do whatever it takes to get the RPM repo management job done.
Not uploading an RPM that is manageable by RPM, yum, zypper and createrepo is failing the main job.
If we need to run rpmlint and put that in our release process, we can do that (and likely DO). It's just not related to the most basic use case here.
And breaking that to ensure there's no bad encoding on the field is illogical.
Updated by semyers about 8 years ago
- Groomed changed from No to Yes
darkfader wrote:
Pulp iirc was made to manage repositories of RPM packages. Not to maintain consistency of a low-criticality metadata field.
After discussing this some more, we generally agree. I've opened up a separate (non-blocking) task[0] to document our conclusions, which are basically that "If createrepo_c can make a repo with the RPM, Pulp should be able to sync it". Pulp should try to use utf-8, and if that fails, munge the data by converting to latin1 just like createrepo_c would.
Bonus points to whoever implements this for logging when Pulp encounters this case that the incoming data encoding was invalid so Pulp punted to using latin1.
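The logging idea could look roughly like the sketch below. It is a Python 3 illustration only: the logger name `pulp_rpm.example`, the function `decode_and_log`, and its `context` parameter are all assumptions, not existing Pulp code.

```python
import logging

log = logging.getLogger("pulp_rpm.example")  # hypothetical logger name


def decode_and_log(data: bytes, context: str = "") -> str:
    """Decode as utf-8; on failure, log a warning and punt to latin-1."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first byte that could not be decoded.
        log.warning(
            "Invalid utf-8 in %s at byte offset %d; falling back to latin-1",
            context or "metadata",
            exc.start,
        )
        return data.decode("latin-1")
```

Passing the field name as `context` (for example `decode_and_log(raw, "description")`) would let an admin see which part of which package triggered the fallback.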
Updated by bmbouter about 8 years ago
- Project changed from Pulp to RPM Support
Moving to the RPM tracker.
Updated by bmbouter about 8 years ago
- Status changed from NEW to POST
PR from ulif available: https://github.com/pulp/pulp_rpm/pull/1004
Updated by bmbouter almost 8 years ago
After learning more about unicode, I have a new idea on how to fix this better than falling back to latin-1. tl;dr: we should decode as utf-8 using the 'replace' option: data.decode('utf-8', 'replace').
Consider this example, where we have Pulp try to decode invalid utf-8 encoded data. Note the '\x9a' is not a valid utf-8 encoded unicode code point.
[bmbouter@localhost devel]$ python
Python 2.7.12 (default, Sep 29 2016, 13:30:34)
[GCC 6.2.1 20160916 (Red Hat 6.2.1-2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> utf8_snowman = u"\u2603".encode('utf-8')
>>> print utf8_snowman
☃
>>> my_utf_8_bytes = utf8_snowman + '\x9a'
>>> my_utf_8_bytes.decode('utf-8')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib64/python2.7/encodings/utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x9a in position 3: invalid start byte
>>> my_utf_8_bytes.decode('latin-1')
u'\xe2\x98\x83\x9a'
>>> print(my_utf_8_bytes.decode('latin-1'))
â
You can see the snowman is nowhere to be found even though the snowman symbol was not corrupted. This demonstrates how falling back to latin-1 could cause the entire string to become garbled even though only a single byte is corrupted.
Consider now using the 'replace' option instead of falling back to 'latin-1'.
[bmbouter@localhost devel]$ python
Python 2.7.12 (default, Sep 29 2016, 13:30:34)
[GCC 6.2.1 20160916 (Red Hat 6.2.1-2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> utf8_snowman = u"\u2603".encode('utf-8')
>>> print utf8_snowman
☃
>>> my_utf_8_bytes = utf8_snowman + '\x9a'
>>> my_utf_8_bytes.decode('utf-8')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib64/python2.7/encodings/utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x9a in position 3: invalid start byte
>>> my_utf_8_bytes.decode('utf-8', 'replace')
u'\u2603\ufffd'
>>> print my_utf_8_bytes.decode('utf-8', 'replace')
☃�
>>>
Observe that the snowman is still preserved! Notice also that the corrupted character is replaced by the unicode codepoint representing an unknown symbol.
To accomplish this, we should apply a patch that is roughly:
diff --git a/plugins/pulp_rpm/yum_plugin/util.py b/plugins/pulp_rpm/yum_plugin/util.py
index 2a6acff..1b44007 100644
--- a/plugins/pulp_rpm/yum_plugin/util.py
+++ b/plugins/pulp_rpm/yum_plugin/util.py
@@ -115,12 +115,7 @@ def string_to_unicode(data):
:return: data as a unicode object
:rtype: unicode
"""
- for code in ENCODING_LIST:
- try:
- return data.decode(code)
- except UnicodeError:
- # try others
- continue
+ return data.decode('utf-8', 'replace')
LISTING_FILE_NAME = 'listing'
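For readers on Python 3, the proposed behaviour can be restated as the sketch below. It is an illustration of the 'replace' error handler, not the patched pulp_rpm function itself; the second sample uses bytes from the opensc description in this issue.

```python
def string_to_unicode(data: bytes) -> str:
    """Decode as utf-8, replacing undecodable bytes with U+FFFD."""
    return data.decode("utf-8", "replace")


# Valid utf-8 snowman followed by one stray byte: the snowman survives.
mixed = "\u2603".encode("utf-8") + b"\x9a"
print(ascii(string_to_unicode(mixed)))

# latin-1 bytes from the opensc description: each umlaut becomes U+FFFD.
latin1_name = b"Yrj\xf6l\xe4"
print(ascii(string_to_unicode(latin1_name)))

# For comparison, the latin-1 fallback recovers the umlauts in this case.
print(ascii(latin1_name.decode("latin-1")))
```

The trade-off is visible here: 'replace' never garbles the valid parts of a string, but data that really was latin-1 loses its non-ASCII characters instead of being recovered.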
Updated by dkliban@redhat.com almost 8 years ago
- Sprint/Milestone changed from 31 to 32
Updated by mhrivnak almost 8 years ago
- Status changed from POST to NEW
- Priority changed from Low to Normal
I'm putting this back at NEW since I think we need a wholly different patch, and the PR appears to be abandoned by its author.
Updated by jortel@redhat.com almost 8 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to jortel@redhat.com
Updated by jortel@redhat.com almost 8 years ago
Nice leg work on this bmbouter.
Looks like there are two identical string_to_unicode() functions in the RPM package [0][1]. Any objection to getting rid of both and changing the code to use decode('utf-8', 'replace') directly? According to the documentation, the replace error policy will prevent UnicodeError from being raised. I don't see any value provided by the util functions. One consideration is that getting rid of the functions could break 3rd party plugins using this code. Thoughts?
[0] https://github.com/pulp/pulp_rpm/blob/2.12-dev/plugins/pulp_rpm/yum_plugin/util.py#L104
[1] https://github.com/pulp/pulp_rpm/blob/2.12-dev/plugins/pulp_rpm/plugins/importers/yum/parse/rpm.py#L80
Updated by mhrivnak almost 8 years ago
I think that's a great plan. I don't know of any 3rd-party plugins that import code from this plugin, and we certainly don't make any guarantees at this point about using a plugin as a library.
Updated by jortel@redhat.com almost 8 years ago
- Status changed from ASSIGNED to POST
Added by jortel@redhat.com almost 8 years ago
Updated by jortel@redhat.com almost 8 years ago
- Status changed from POST to MODIFIED
Applied in changeset a4ccac978f90b02516ce2061f0d393ca2f9ec5af.
Updated by semyers almost 8 years ago
- Platform Release changed from 2.13.0 to 2.12.2
woops, should've been 2.12.2
Updated by Ichimonji10 almost 8 years ago
- Status changed from 5 to ASSIGNED
Uploading the RPM listed in the original bug description causes a traceback. Here's a script demonstrating the issue:
wget 'http://mirror.mes.edu.cu/SLES_11_SP2/CD1/suse/x86_64/opensc-0.11.6-5.27.1.x86_64.rpm'
pulp-admin login -u admin
pulp-admin rpm repo create --repo-id foo
pulp-admin rpm repo uploads rpm --repo-id foo --file opensc-0.11.6-5.27.1.x86_64.rpm
Sample output from last step:
[root@fedora-24-pulp-2-12 ~]# pulp-admin rpm repo uploads rpm --repo-id foo --file opensc-0.11.6-5.27.1.x86_64.rpm
+----------------------------------------------------------------------+
Unit Upload
+----------------------------------------------------------------------+
Extracting necessary metadata for each request...
[==================================================] 100%
Analyzing: opensc-0.11.6-5.27.1.x86_64.rpm
... completed
Creating upload requests on the server...
[==================================================] 100%
Initializing: opensc-0.11.6-5.27.1.x86_64.rpm
... completed
Starting upload of selected units. If this process is stopped through ctrl+c,
the uploads will be paused and may be resumed later using the resume command or
canceled entirely using the cancel command.
Uploading: opensc-0.11.6-5.27.1.x86_64.rpm
[==================================================] 100%
503870/503870 bytes
... completed
Importing into the repository...
This command may be exited via ctrl+c without affecting the request.
[\]
Running...
Task Failed
The importer yum_importer indicated a failed response when uploading rpm unit to
repository foo.
Deleting the upload request...
... completed
Output from journalctl:
Mar 13 16:35:10 fedora-24-pulp-2-12 pulp[1228]: pulp_rpm.plugins.importers.yum.upload:ERROR: (1228-25472) unexpected error occurred importing uploaded file: strings in documents must be valid UTF-8: 'OpenSC provides a set of libraries and utilities to access smart cards.\nIt mainly focuses on cards that support cryptographic operations. It\nfacilitates their use in security applications such as mail encryption,\nauthentication, and digital signature. OpenSC implements the PKCS#11\nAPI. Applications supporting this API, such as Mozilla Firefox and\nThunderbird, can use it. OpenSC implements the PKCS#15 standard and\naims to be compatible with every software that does so, too.\n\nBefore purchasing any cards, please read carefully documentation in\n/usr/share/doc/packages/opensc/wiki/index.html - only some cards are\nsupported. Not only card type matters, but also card version, card OS\nversion and preloaded applet. Only subset of possible operations may be\nsupported for your card. Card initialization may require third party\nproprietary software.\n\n\n\nAuthors:\n--------\n Juha Yrj\xf6l\xe4 <jyrjola@cc.hut.fi>\n Antti Tapaninen <aet@cc.hut.fi>\n Timo Ter\xe4s <timo.teras@iki.fi>\n Olaf Kirch <okir@suse.de>'
Mar 13 16:35:10 fedora-24-pulp-2-12 pulp[1228]: pulp_rpm.plugins.importers.yum.upload:ERROR: (1228-25472) Traceback (most recent call last):
Mar 13 16:35:10 fedora-24-pulp-2-12 pulp[1228]: pulp_rpm.plugins.importers.yum.upload:ERROR: (1228-25472) File "/usr/lib/python2.7/site-packages/pulp_rpm/plugins/importers/yum/upload.py", line 118, in upload
Mar 13 16:35:10 fedora-24-pulp-2-12 pulp[1228]: pulp_rpm.plugins.importers.yum.upload:ERROR: (1228-25472) handlers[type_id](repo, type_id, unit_key, metadata, file_path, conduit, config)
Mar 13 16:35:10 fedora-24-pulp-2-12 pulp[1228]: pulp_rpm.plugins.importers.yum.upload:ERROR: (1228-25472) File "/usr/lib/python2.7/site-packages/pulp_rpm/plugins/importers/yum/upload.py", line 434, in _handle_package
Mar 13 16:35:10 fedora-24-pulp-2-12 pulp[1228]: pulp_rpm.plugins.importers.yum.upload:ERROR: (1228-25472) unit.save_and_import_content(file_path)
Mar 13 16:35:10 fedora-24-pulp-2-12 pulp[1228]: pulp_rpm.plugins.importers.yum.upload:ERROR: (1228-25472) File "/usr/lib/python2.7/site-packages/pulp/server/db/model/__init__.py", line 906, in save_and_import_content
Mar 13 16:35:10 fedora-24-pulp-2-12 pulp[1228]: pulp_rpm.plugins.importers.yum.upload:ERROR: (1228-25472) self.save()
Mar 13 16:35:10 fedora-24-pulp-2-12 pulp[1228]: pulp_rpm.plugins.importers.yum.upload:ERROR: (1228-25472) File "/usr/lib/python2.7/site-packages/mongoengine/document.py", line 324, in save
Mar 13 16:35:10 fedora-24-pulp-2-12 pulp[1228]: pulp_rpm.plugins.importers.yum.upload:ERROR: (1228-25472) object_id = collection.save(doc, **write_concern)
Mar 13 16:35:10 fedora-24-pulp-2-12 pulp[1228]: pulp_rpm.plugins.importers.yum.upload:ERROR: (1228-25472) File "/usr/lib64/python2.7/site-packages/pymongo/collection.py", line 2185, in save
Mar 13 16:35:10 fedora-24-pulp-2-12 pulp[1228]: pulp_rpm.plugins.importers.yum.upload:ERROR: (1228-25472) check_keys, False, manipulate, write_concern)
Mar 13 16:35:10 fedora-24-pulp-2-12 pulp[1228]: pulp_rpm.plugins.importers.yum.upload:ERROR: (1228-25472) File "/usr/lib64/python2.7/site-packages/pymongo/collection.py", line 709, in _update
Mar 13 16:35:10 fedora-24-pulp-2-12 pulp[1228]: pulp_rpm.plugins.importers.yum.upload:ERROR: (1228-25472) codec_options=self.codec_options).copy()
Mar 13 16:35:10 fedora-24-pulp-2-12 pulp[1228]: pulp_rpm.plugins.importers.yum.upload:ERROR: (1228-25472) File "/usr/lib64/python2.7/site-packages/pymongo/pool.py", line 214, in command
Mar 13 16:35:10 fedora-24-pulp-2-12 pulp[1228]: pulp_rpm.plugins.importers.yum.upload:ERROR: (1228-25472) self._raise_connection_failure(error)
Mar 13 16:35:10 fedora-24-pulp-2-12 pulp[1228]: pulp_rpm.plugins.importers.yum.upload:ERROR: (1228-25472) File "/usr/lib64/python2.7/site-packages/pymongo/pool.py", line 342, in _raise_connection_failure
Mar 13 16:35:10 fedora-24-pulp-2-12 pulp[1228]: pulp_rpm.plugins.importers.yum.upload:ERROR: (1228-25472) raise error
Mar 13 16:35:10 fedora-24-pulp-2-12 pulp[1228]: pulp_rpm.plugins.importers.yum.upload:ERROR: (1228-25472) InvalidStringData: strings in documents must be valid UTF-8: 'OpenSC provides a set of libraries and utilities to access smart cards.\nIt mainly focuses on cards that support cryptographic operations. It\nfacilitates their use in security applications such as mail encryption,\nauthentication, and digital signature. OpenSC implements the PKCS#11\nAPI. Applications supporting this API, such as Mozilla Firefox and\nThunderbird, can use it. OpenSC implements the PKCS#15 standard and\naims to be compatible with every software that does so, too.\n\nBefore purchasing any cards, please read carefully documentation in\n/usr/share/doc/packages/opensc/wiki/index.html - only some cards are\nsupported. Not only card type matters, but also card version, card OS\nversion and preloaded applet. Only subset of possible operations may be\nsupported for your card. Card initialization may require third party\nproprietary software.\n\n\n\nAuthors:\n--------\n Juha Yrj\xf6l\xe4 <jyrjola@cc.hut.fi>\n Antti Tapaninen <aet@cc.hut.fi>\n Timo Ter\xe4s <timo.teras@iki.fi>\n Olaf Kirch <okir@suse.de>'
Mar 13 16:35:10 fedora-24-pulp-2-12 pulp[1228]: pulp.server.managers.content.upload:ERROR: (1228-25472) Error from the importer while importing uploaded unit to repository [foo]
Mar 13 16:35:10 fedora-24-pulp-2-12 pulp[1228]: pulp.server.managers.content.upload:ERROR: (1228-25472) Traceback (most recent call last):
Mar 13 16:35:10 fedora-24-pulp-2-12 pulp[1228]: pulp.server.managers.content.upload:ERROR: (1228-25472) File "/usr/lib/python2.7/site-packages/pulp/server/managers/content/upload.py", line 223, in import_uploaded_unit
Mar 13 16:35:10 fedora-24-pulp-2-12 pulp[1228]: pulp.server.managers.content.upload:ERROR: (1228-25472) unit_type=unit_type_id, summary=result['summary'], details=result['details']
Mar 13 16:35:10 fedora-24-pulp-2-12 pulp[1228]: pulp.server.managers.content.upload:ERROR: (1228-25472) PulpCodedException: The importer yum_importer indicated a failed response when uploading rpm unit to repository foo.
I observe this error on at least Fedora 24 and Fedora 25. Here are the packages installed on one of my test systems:
[root@fedora-24-pulp-2-12 ~]# rpm -qa | grep -i pulp | sort
pulp-admin-client-2.12.2-0.1.beta.fc24.noarch
pulp-docker-admin-extensions-2.3.0-1.fc24.noarch
pulp-docker-plugins-2.3.0-1.fc24.noarch
pulp-ostree-admin-extensions-1.2.1-0.1.beta.fc24.noarch
pulp-ostree-plugins-1.2.1-0.1.beta.fc24.noarch
pulp-puppet-admin-extensions-2.12.2-0.1.beta.fc24.noarch
pulp-puppet-plugins-2.12.2-0.1.beta.fc24.noarch
pulp-python-admin-extensions-1.1.3-1.fc24.noarch
pulp-python-plugins-1.1.3-1.fc24.noarch
pulp-rpm-admin-extensions-2.12.2-0.1.beta.fc24.noarch
pulp-rpm-plugins-2.12.2-0.1.beta.fc24.noarch
pulp-selinux-2.12.2-0.1.beta.fc24.noarch
pulp-server-2.12.2-0.1.beta.fc24.noarch
python-kombu-3.0.33-6.pulp.fc24.noarch
python-pulp-bindings-2.12.2-0.1.beta.fc24.noarch
python-pulp-client-lib-2.12.2-0.1.beta.fc24.noarch
python-pulp-common-2.12.2-0.1.beta.fc24.noarch
python-pulp-docker-common-2.3.0-1.fc24.noarch
python-pulp-oid_validation-2.12.2-0.1.beta.fc24.noarch
python-pulp-ostree-common-1.2.1-0.1.beta.fc24.noarch
python-pulp-puppet-common-2.12.2-0.1.beta.fc24.noarch
python-pulp-python-common-1.1.3-1.fc24.noarch
python-pulp-repoauth-2.12.2-0.1.beta.fc24.noarch
python-pulp-rpm-common-2.12.2-0.1.beta.fc24.noarch
python-pulp-streamer-2.12.2-0.1.beta.fc24.noarch
Updated by Ichimonji10 almost 8 years ago
Updated by semyers almost 8 years ago
We had a good team chat about the "Verification Required" flag on Monday, and decided that the release of 2.12.2 should not be blocked on the verification of this issue.
Added by ulif almost 8 years ago
Revision f1102a99 | View on GitHub
Fix non-utf8 when found in uploaded RPMs (really).
When uploading RPM packages with non-utf8 metadata, the upload was aborted. A fix already applied apparently did not fix this completely.
Ensures that metadata from uploaded packages can be encoded to utf-8. Where no such encoding is possible, replacement chars are inserted.
Updated by jortel@redhat.com almost 8 years ago
Community member ulif has indicated they will be submitting a PR.
Updated by bmbouter almost 8 years ago
- Status changed from ASSIGNED to POST
- Platform Release deleted (2.12.2)
PR from ulif available at: https://github.com/pulp/pulp_rpm/pull/1040/files
I'm unsetting the platform release since we aren't sure if this will be included in the 2.12.2 release or not since it is being cut today. This should not release
Updated by jortel@redhat.com almost 8 years ago
The original patch only fixed encoding issues in the primary XML fragment but the opensc-0.11.6-5.27.1.x86_64.rpm has invalid utf8 in the changelog which is found in both the others and filelists. PR https://github.com/pulp/pulp_rpm/pull/1040 encodes the entire dictionary.
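Sanitizing a whole metadata dictionary, rather than a single field, can be sketched as a recursive walk. This is not the code from the PR: the helper name `to_text` and the sample layout (a changelog as a list of tuples, a nested files dict) are assumptions made for illustration.

```python
def to_text(value):
    """Recursively decode bytes in nested dicts, lists and tuples.

    Undecodable bytes are replaced with U+FFFD; other values pass through.
    """
    if isinstance(value, bytes):
        return value.decode("utf-8", "replace")
    if isinstance(value, dict):
        return {to_text(key): to_text(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        # Rebuild the same container type with cleaned members.
        return type(value)(to_text(item) for item in value)
    return value


# Hypothetical metadata layout, loosely modelled on an RPM header.
metadata = {
    "description": b"OpenSC provides a set of libraries",
    "changelog": [(1234567890, b"Timo Ter\xe4s", b"- update")],
    "files": {"file": [b"/usr/bin/opensc-tool"]},
}
print(to_text(metadata))
```

Walking the full structure means fields such as the changelog are covered without listing each one explicitly.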
Updated by ulif almost 8 years ago
- Status changed from POST to MODIFIED
Applied in changeset f1102a9993b15a78586110a3e0007045585343b1.
Updated by bizhang over 7 years ago
- Status changed from 5 to CLOSED - CURRENTRELEASE
Updated by ttereshc over 7 years ago
- Related to Issue #2622: Sync fails when non-ASCII characters are present in primary.xml added
Fix non-utf8 when found in uploaded RPMs. closes #1903