Issue #8893

closed

3rd party repository sync fails with 'InvalidStringData: strings in documents must be valid UTF-8'

Added by ggainey almost 3 years ago. Updated over 2 years ago.

Status: MODIFIED
Priority: High
Assignee:
Sprint/Milestone: -
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Version:
Platform Release:
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags: Katello, Pulp 2
Sprint:
Quarter:

Description

ModulemdDefaults are BSON-encoded before being saved into MongoDB because MongoDB (apparently) has restrictions on valid keys which are incompatible with the module data we need to store.

https://github.com/pulp/pulp_rpm/blob/2-master/plugins/pulp_rpm/plugins/importers/yum/repomd/modules.py#L94-L95

However, the serialized BSON strings produced by the encoding function we use are not always valid UTF-8. Presumably it works the vast majority of the time, often enough that the problem went unnoticed.

Only specific permutations of the input data appear to trigger this. If I delete a key from the profiles dictionary, the result is suddenly valid UTF-8. This is presumably why the problem pops into and out of existence.
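One plausible mechanism for this data dependence (an assumption, not something confirmed in this issue) is that BSON embeds little-endian int32 length prefixes in the serialized bytes. While every length stays below 128, those prefix bytes look like ASCII; once a length reaches 0x80 or above, the prefix can contain a byte that is not valid UTF-8 in that position. The sketch below uses a hypothetical hand-rolled encoder for string-only documents, not the pulp_rpm or PyMongo code, and the field names are made up for illustration.

```python
import struct


def encode_string_document(fields):
    """Encode a dict of str -> str as a BSON document (string elements only).

    Illustrative stand-in for a real BSON encoder; it follows the BSON layout
    for type 0x02 (UTF-8 string) elements and nothing else.
    """
    body = b""
    for key, value in fields.items():
        raw = value.encode("utf-8") + b"\x00"         # string bytes plus NUL terminator
        body += b"\x02" + key.encode("utf-8") + b"\x00"  # element type and key name
        body += struct.pack("<i", len(raw)) + raw     # int32 length prefix, then the string
    # The total length counts the 4-byte prefix itself, the body, and the final NUL.
    return struct.pack("<i", len(body) + 5) + body + b"\x00"


def is_valid_utf8(data):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical inputs: the second document only adds one longer value.
small = encode_string_document({"name": "default"})
large = encode_string_document({"name": "default", "profile": "x" * 200})

print(is_valid_utf8(small))  # every length byte is below 0x80
print(is_valid_utf8(large))  # length prefixes now contain bytes >= 0x80
```

Removing the longer entry shrinks the length prefixes back into the ASCII range, which would match the observation that deleting a key makes the output valid again.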

At some point we attempt to save this string to MongoDB, Mongo applies UTF-8 validation to it, and the save fails with the InvalidStringData error quoted in the title.
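A common way to sidestep this kind of failure, sketched here as an assumption and not as the fix adopted for this issue, is to never hand raw binary to a field that is validated as text. One option is to base64-encode the bytes so the stored value is plain ASCII; another is to wrap them in the driver's binary type (for example `bson.binary.Binary` in PyMongo). The helper names and the sample payload below are hypothetical.

```python
import base64


def to_storable_text(raw):
    """Turn arbitrary bytes into an ASCII string that passes UTF-8 validation."""
    return base64.b64encode(raw).decode("ascii")


def from_storable_text(text):
    """Recover the original bytes from the stored ASCII string."""
    return base64.b64decode(text.encode("ascii"))


# Hypothetical payload: starts with 0xED, which is not valid UTF-8 before a NUL byte.
payload = b"\xed\x00\x00\x00\x02name\x00"

stored = to_storable_text(payload)
restored = from_storable_text(stored)
```

The trade-off is size: base64 grows the stored value by roughly a third, and any data already saved in the old form would need to be handled separately.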


Related issues

Related to Migration Plugin - Issue #8982: Support migrating any client systems that have applied the hotfix for 8982 (CLOSED - CURRENTRELEASE, dalley)
