RPM Sync Issue - Duplicate content
Ticket moved to GitHub: "pulp/pulp_rpm/2271":https://github.com/pulp/pulp_rpm/issues/2271
I've hit an odd sync issue with rpm (https://pulp.plan.io/issues/8615).
This is syncing against a pulp2 repo that is populated using pulp-admin's upload facility. I think I may know the cause, though this is conjecture.
Speaking with ttereshc, it's been confirmed that the upload command has the same issue as copying content between repositories: neither performs any kind of de-duplication where the NEVRA is the same but the hash differs.
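To make the condition concrete, here is a minimal sketch of how such duplicates could be spotted in a list of package records. This is not Pulp code: the `Package` record and the `find_conflicting_nevras` helper are hypothetical names used only for illustration, and it assumes the package metadata has already been extracted from the repository by some other means.

```python
from collections import defaultdict
from typing import NamedTuple


class Package(NamedTuple):
    # Hypothetical record; field names are assumptions, not a Pulp model.
    name: str
    epoch: str
    version: str
    release: str
    arch: str
    checksum: str


def find_conflicting_nevras(packages):
    """Return {NEVRA tuple: sorted checksums} for each NEVRA seen with more than one checksum."""
    checksums_by_nevra = defaultdict(set)
    for pkg in packages:
        nevra = (pkg.name, pkg.epoch, pkg.version, pkg.release, pkg.arch)
        checksums_by_nevra[nevra].add(pkg.checksum)
    # Keep only the NEVRAs that map to differing hashes.
    return {
        nevra: sorted(sums)
        for nevra, sums in checksums_by_nevra.items()
        if len(sums) > 1
    }


if __name__ == "__main__":
    sample = [
        Package("foo", "0", "1.0", "1", "x86_64", "aaa"),
        Package("foo", "0", "1.0", "1", "x86_64", "bbb"),
        Package("bar", "0", "2.0", "1", "noarch", "ccc"),
    ]
    for nevra, sums in find_conflicting_nevras(sample).items():
        print(nevra, sums)
```

Any NEVRA this reports is one that upload or copy would have left in the repository twice.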
I worked around this issue by creating a "dummy" repository, copying the content into it, then setting the original repository's feed to the dummy repository and syncing it. That sync engages the de-duplication logic.
A subsequent sync of this repository from Pulp3 worked cleanly.
It strikes me that Pulp3 probably should have been able to deal with this gracefully, but I'm not familiar enough with the sync logic to understand where the core problem was.