Issue #4731

Order of data in PULP_MANIFEST returned by Pulp is different from feed url

Added by kersom over 1 year ago. Updated over 1 year ago.

Status: NEW
Priority: Normal
Assignee: -
Sprint/Milestone: -
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Platform Release:
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags:
Sprint:
Quarter:

Description

The order of entries in the PULP_MANIFEST returned by Pulp differs from the order in the PULP_MANIFEST served by the synced remote's feed URL. Steps to reproduce:

1. Create a repository
2. Create a file remote - https://repos.fedorapeople.org/pulp/pulp/fixtures/file/
3. Sync the file remote
4. Create a file publisher
5. Create a publication
6. Create a distribution from the publication
7. Fetch PULP_MANIFEST

PULP_MANIFEST provided by the feed URL:

'1.iso,cbd1d07a63f8ac122b7adf75658fc22f9754796f8bbcd9395f1bcc00bbc6e2d8,1024\n2.iso,7ab0ad049b044879b03d3bc5acbe4e43c98c359fe52a60475e6611ee55033646,1024\n3.iso,ddc5a9ac99a0cb546cce44be3da447c6e591df8d4860c592f3f1be6e33b66e62,1024\n'

PULP_MANIFEST downloaded from Pulp:

'3.iso, ddc5a9ac99a0cb546cce44be3da447c6e591df8d4860c592f3f1be6e33b66e62, 1024\n2.iso, 7ab0ad049b044879b03d3bc5acbe4e43c98c359fe52a60475e6611ee55033646, 1024\n1.iso, cbd1d07a63f8ac122b7adf75658fc22f9754796f8bbcd9395f1bcc00bbc6e2d8, 1024\n'

See: https://github.com/pulp/pulp_file

This makes it harder to verify the PULP_MANIFEST downloaded from Pulp against the original, because the two files cannot be compared byte for byte.


Related issues

Related to Pulp - Test #4519: Test 500 error while getting published metadata (CLOSED - COMPLETE)

History

#1 Updated by kersom over 1 year ago

  • Project changed from RPM Support to Pulp

#2 Updated by kersom over 1 year ago

  • Related to Test #4519: Test 500 error while getting published metadata added

#3 Updated by ttereshc over 1 year ago

  • Project changed from Pulp to File Support

#4 Updated by ttereshc over 1 year ago

I don't think it's a bug.
In general, it's not safe to rely on the order of metadata.
E.g. RPM packages in primary.xml can be in a different order every time. That depends on how createrepo_c handles it and is not under Pulp's control.

For PULP_MANIFEST: each row has a specific format - data in a certain order separated by commas: relative_path, checksum, size.
The order of rows is not guaranteed to be preserved.
If we decide to publish incrementally at some point in the future (adding metadata to the existing file), there will be no good way to preserve the order, even if we want to.

If this is needed for test purposes: split by newline and sort.

The one inconsistency I can see that is potentially a problem, or at least inconvenient, is that Pulp adds extra spaces between the comma-separated values.
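The "split by newline and sort" suggestion could be sketched roughly as below, using the two manifests quoted in the description. The helper name `parse_manifest` is hypothetical and is not part of pulp_file. The sketch assumes each row has exactly three comma-separated fields and that relative paths contain no commas. It strips whitespace around each field so the extra spaces Pulp emits do not affect the comparison.

```python
def parse_manifest(text):
    """Parse PULP_MANIFEST text into a set of (relative_path, checksum, size) tuples.

    Hypothetical helper for tests; assumes three fields per row and no commas in paths.
    """
    entries = set()
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        # Strip each field so "a, b, c" and "a,b,c" parse to the same tuple.
        relative_path, checksum, size = (field.strip() for field in line.split(","))
        entries.add((relative_path, checksum, int(size)))
    return entries


# The two manifests quoted in the issue description.
feed_manifest = (
    "1.iso,cbd1d07a63f8ac122b7adf75658fc22f9754796f8bbcd9395f1bcc00bbc6e2d8,1024\n"
    "2.iso,7ab0ad049b044879b03d3bc5acbe4e43c98c359fe52a60475e6611ee55033646,1024\n"
    "3.iso,ddc5a9ac99a0cb546cce44be3da447c6e591df8d4860c592f3f1be6e33b66e62,1024\n"
)
pulp_manifest = (
    "3.iso, ddc5a9ac99a0cb546cce44be3da447c6e591df8d4860c592f3f1be6e33b66e62, 1024\n"
    "2.iso, 7ab0ad049b044879b03d3bc5acbe4e43c98c359fe52a60475e6611ee55033646, 1024\n"
    "1.iso, cbd1d07a63f8ac122b7adf75658fc22f9754796f8bbcd9395f1bcc00bbc6e2d8, 1024\n"
)

# Order-insensitive, whitespace-insensitive comparison.
print(parse_manifest(feed_manifest) == parse_manifest(pulp_manifest))  # True
```

Comparing sets rather than sorted lists also ignores duplicate rows; a test that cares about duplicates could collect the tuples in a list and compare `sorted(...)` results instead.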

#5 Updated by daviddavis over 1 year ago

I don't think the ordering is a bug either, but I find it strange that we sort the files deterministically by when they were created [0]. If we sort at all, it should be by filename, but we sort by creation time as a way to eliminate duplicates in the manifest (see [1]).

I agree that the space after commas should be fixed. Also, users shouldn't rely on the ordering of the manifest file--we should probably focus our efforts on #4028 instead.

[0] https://github.com/pulp/pulp_file/blob/dd366601de3ae8741a7f0c2ee8f288f90f74d142/pulp_file/app/tasks/publishing.py#L73
[1] https://pulp.plan.io/issues/4028
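For illustration only, a manifest writer that sorts rows by filename and emits no space after the commas might look like the sketch below. The function name `format_manifest` and the tuple layout are assumptions made for this example and do not reflect the actual pulp_file publishing code linked in [0].

```python
def format_manifest(entries):
    """Render (relative_path, checksum, size) tuples as manifest text.

    Hypothetical sketch: rows are sorted by relative_path and joined with
    bare commas (no trailing space), one row per line.
    """
    rows = sorted(entries, key=lambda entry: entry[0])
    return "".join(f"{path},{checksum},{size}\n" for path, checksum, size in rows)


# Entries from the issue description, deliberately given in reverse order.
entries = [
    ("3.iso", "ddc5a9ac99a0cb546cce44be3da447c6e591df8d4860c592f3f1be6e33b66e62", 1024),
    ("2.iso", "7ab0ad049b044879b03d3bc5acbe4e43c98c359fe52a60475e6611ee55033646", 1024),
    ("1.iso", "cbd1d07a63f8ac122b7adf75658fc22f9754796f8bbcd9395f1bcc00bbc6e2d8", 1024),
]

manifest_text = format_manifest(entries)
print(manifest_text, end="")
```

With these inputs the output matches the feed manifest quoted in the description, but consumers still should not depend on row order.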

#6 Updated by ttereshc over 1 year ago

daviddavis, the ordering by created date is a way to keep the most recently added file if there are duplicates.

+1 to focus on #4028
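The idea of ordering by creation date to keep the most recently added file among duplicates can be illustrated with a small sketch. The record fields (`relative_path`, `checksum`, `created`) and the function `dedupe_latest` are hypothetical and chosen for this example; this is not the pulp_file implementation.

```python
from datetime import datetime


def dedupe_latest(records):
    """Keep one record per relative_path: the one with the latest `created` time.

    Hypothetical sketch; `records` is a list of dicts with the keys
    relative_path, checksum, and created.
    """
    latest = {}
    # Process oldest first so that newer records overwrite older ones.
    for record in sorted(records, key=lambda r: r["created"]):
        latest[record["relative_path"]] = record
    return list(latest.values())


# Made-up example data: "1.iso" was added twice with different content.
records = [
    {"relative_path": "1.iso", "checksum": "old", "created": datetime(2019, 4, 1)},
    {"relative_path": "2.iso", "checksum": "abc", "created": datetime(2019, 4, 2)},
    {"relative_path": "1.iso", "checksum": "new", "created": datetime(2019, 4, 3)},
]

unique = dedupe_latest(records)
print([(r["relative_path"], r["checksum"]) for r in unique])
```

A side effect of this approach is that the resulting row order follows creation time rather than filename, which is one way the published manifest can end up ordered differently from the feed.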

#7 Updated by kersom over 1 year ago

My goal in filing this one was to share it and make people aware of it. I was not sure whether it was a bug either.

#8 Updated by amacdona@redhat.com over 1 year ago

  • Triaged changed from No to Yes

I think this is a problem isolated to the pulp_file plugin. Most plugins implement the API of some existing ecosystem, but pulp_file creates its own (the PULP_MANIFEST).

Since this API does not exist elsewhere, a section needs to be added to the pulp_file docs to explain what this manifest is, how it is structured (and that it is not ordered).

#9 Updated by bmbouter over 1 year ago

  • Tags deleted (Pulp 3)
