Issue #4731

closed

Order of data in PULP_MANIFEST returned by Pulp is different from feed url

Added by kersom about 5 years ago. Updated over 2 years ago.

Status: CLOSED - DUPLICATE
Priority: Normal
Assignee: -
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Platform Release:
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags:
Sprint:
Quarter:

Description

Ticket moved to GitHub: pulp/pulp_file/612 (https://github.com/pulp/pulp_file/issues/612)


Order of data in PULP_MANIFEST returned by Pulp is different than what is provided by the synced repo.

1. Create a repository
2. Create a file remote - https://repos.fedorapeople.org/pulp/pulp/fixtures/file/
3. Sync the file remote
4. Create a file publisher
5. Create a publication
6. Create a distribution from the publication
7. Fetch PULP_MANIFEST

PULP_MANIFEST provided by feed url

'1.iso,cbd1d07a63f8ac122b7adf75658fc22f9754796f8bbcd9395f1bcc00bbc6e2d8,1024\n2.iso,7ab0ad049b044879b03d3bc5acbe4e43c98c359fe52a60475e6611ee55033646,1024\n3.iso,ddc5a9ac99a0cb546cce44be3da447c6e591df8d4860c592f3f1be6e33b66e62,1024\n'

PULP_MANIFEST downloaded from Pulp

'3.iso, ddc5a9ac99a0cb546cce44be3da447c6e591df8d4860c592f3f1be6e33b66e62, 1024\n2.iso, 7ab0ad049b044879b03d3bc5acbe4e43c98c359fe52a60475e6611ee55033646, 1024\n1.iso, cbd1d07a63f8ac122b7adf75658fc22f9754796f8bbcd9395f1bcc00bbc6e2d8, 1024\n'

See: https://github.com/pulp/pulp_file

This makes verifying the integrity of PULP_MANIFEST downloaded from Pulp a bit more complex.


Related issues

Related to Pulp - Test #4519: Test 500 error while getting published metadata (CLOSED - COMPLETE, kersom)
Actions #1

Updated by kersom about 5 years ago

  • Project changed from RPM Support to Pulp
Actions #2

Updated by kersom about 5 years ago

  • Related to Test #4519: Test 500 error while getting published metadata added
Actions #3

Updated by ttereshc about 5 years ago

  • Project changed from Pulp to File Support
Actions #4

Updated by ttereshc about 5 years ago

I don't think it's a bug.
In general, it's not safe to rely on the order of metadata.
E.g. RPM packages in primary.xml can be in a different order every time; that depends on how createrepo_c handles it and is not under Pulp's control.

For PULP_MANIFEST: each row has a specific format - data in a certain order separated by commas: relative_path, checksum, size.
The order of rows is not guaranteed to be preserved.
If we decide to publish in an incremental way at some point in the future (adding metadata to the existing file), there will be no good way to preserve the order, even if we want to.

If this is needed for test purposes: split by newline and sort.

The one inconsistency I can see that is potentially a problem, and inconvenient, is that Pulp adds extra spaces between the comma-separated values.
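For test purposes, the "split by newline and sort" suggestion could look roughly like the sketch below. It also strips the extra spaces after commas, so the two manifests from the description compare equal. The function name `normalized_rows` is made up for illustration, and the sketch assumes relative paths contain no commas.

```python
from typing import List, Tuple

# The two manifests quoted in the issue description.
FEED_MANIFEST = (
    "1.iso,cbd1d07a63f8ac122b7adf75658fc22f9754796f8bbcd9395f1bcc00bbc6e2d8,1024\n"
    "2.iso,7ab0ad049b044879b03d3bc5acbe4e43c98c359fe52a60475e6611ee55033646,1024\n"
    "3.iso,ddc5a9ac99a0cb546cce44be3da447c6e591df8d4860c592f3f1be6e33b66e62,1024\n"
)
PULP_MANIFEST = (
    "3.iso, ddc5a9ac99a0cb546cce44be3da447c6e591df8d4860c592f3f1be6e33b66e62, 1024\n"
    "2.iso, 7ab0ad049b044879b03d3bc5acbe4e43c98c359fe52a60475e6611ee55033646, 1024\n"
    "1.iso, cbd1d07a63f8ac122b7adf75658fc22f9754796f8bbcd9395f1bcc00bbc6e2d8, 1024\n"
)


def normalized_rows(manifest: str) -> List[Tuple[str, str, int]]:
    """Return rows as sorted (relative_path, checksum, size) tuples.

    Hypothetical helper: ignores row order and whitespace around fields.
    Assumes no commas inside relative paths.
    """
    rows = []
    for line in manifest.splitlines():
        if not line.strip():
            continue  # skip blank lines, e.g. after a trailing newline
        relative_path, checksum, size = (field.strip() for field in line.split(","))
        rows.append((relative_path, checksum, int(size)))
    return sorted(rows)


# The raw strings differ, but the normalized content is the same.
same_content = normalized_rows(FEED_MANIFEST) == normalized_rows(PULP_MANIFEST)
```

A test can then assert on `same_content` (or on the two normalized lists) instead of comparing the raw manifest strings.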

Actions #5

Updated by daviddavis about 5 years ago

I don't think the ordering is a bug either but I find it strange that we sort the files deterministically by when they are created[0]. If we sort at all it should be by filename but we sort by created as a way to eliminate duplicates in the manifest (see [1]).

I agree that the space after commas should be fixed. Also, users shouldn't rely on the ordering of the manifest file--we should probably focus our efforts on #4028 instead.

[0] https://github.com/pulp/pulp_file/blob/dd366601de3ae8741a7f0c2ee8f288f90f74d142/pulp_file/app/tasks/publishing.py#L73
[1] https://pulp.plan.io/issues/4028

Actions #6

Updated by ttereshc about 5 years ago

@daviddavis, the ordering by created date is a way to keep the most recently added file if there are duplicates.

+1 to focus on #4028
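To illustrate the point about ordering by creation date, here is a minimal sketch of how that ordering can keep the most recently added file when two entries share a relative path. This is not the pulp_file code linked in [0]; the `Entry` type, its fields, and `latest_per_path` are hypothetical stand-ins.

```python
from datetime import datetime
from typing import Dict, Iterable, List, NamedTuple


class Entry(NamedTuple):
    """Hypothetical stand-in for a published file record."""
    relative_path: str
    checksum: str
    created: datetime


def latest_per_path(entries: Iterable[Entry]) -> List[Entry]:
    """Keep one entry per relative_path, preferring the most recently created."""
    latest: Dict[str, Entry] = {}
    # Walk oldest to newest so a later entry overwrites an earlier duplicate.
    for entry in sorted(entries, key=lambda e: e.created):
        latest[entry.relative_path] = entry
    return list(latest.values())
```

With this approach the row order of the result falls out of the creation timestamps rather than the filenames, which matches the behavior discussed above.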

Actions #7

Updated by kersom about 5 years ago

My goal in filing this one was to share it and make people aware of it. I was not sure whether it was a bug either.

Actions #8

Updated by amacdona@redhat.com about 5 years ago

  • Triaged changed from No to Yes

I think this is a problem isolated to the pulp_file plugin. Most plugins implement the API of some existing ecosystem, but pulp_file creates its own (the PULP_MANIFEST).

Since this API does not exist elsewhere, a section needs to be added to the pulp_file docs to explain what this manifest is, how it is structured (and that it is not ordered).
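As a starting point for such documentation, the row structure described earlier in this thread (relative_path, checksum, size, one row per line) can be sketched as follows. Two assumptions are made here: the checksum is taken to be a SHA-256 hex digest (the 64-character values in the description are consistent with that, but the thread does not say so), and rows are sorted by filename only to make the output reproducible, which is not something Pulp guarantees. The function names are illustrative.

```python
import hashlib
from typing import Iterable, Tuple


def manifest_row(relative_path: str, content: bytes) -> str:
    """Format one row as relative_path,checksum,size with no extra spaces.

    Assumption: the checksum is a SHA-256 hex digest of the file content.
    """
    checksum = hashlib.sha256(content).hexdigest()
    return f"{relative_path},{checksum},{len(content)}"


def build_manifest(files: Iterable[Tuple[str, bytes]]) -> str:
    """Build manifest text from (relative_path, content) pairs.

    Rows are sorted by relative path for reproducibility only; consumers
    should still treat the row order as unspecified.
    """
    ordered = sorted(files, key=lambda item: item[0])
    return "".join(manifest_row(path, content) + "\n" for path, content in ordered)
```

For example, `build_manifest([("b.iso", b"x"), ("a.iso", b"")])` yields two newline-terminated rows, with the `a.iso` row first.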

Actions #9

Updated by bmbouter about 5 years ago

  • Tags deleted (Pulp 3)
Actions #10

Updated by pulpbot over 2 years ago

  • Description updated (diff)
  • Status changed from NEW to CLOSED - DUPLICATE
