Issue #1618
closed
--checksum-type is broken
Added by jluza almost 9 years ago. Updated over 5 years ago.
Description
Because RHEL 5 doesn't support the sha256 checksum type, we need Pulp to be able to generate repodata with a checksum type other than the default one.
This is not a trivial fix and involves several significant changes. For detailed discussion of these problems, see the notes below, but the following outlines tasks that probably need to be done for this issue:
Modifying the data models
There is currently a story (#1647) that tracks properly modeling content. This involves having a table that has a record for each and every file managed by Pulp. That's probably not something we want to bite off for this issue. It is probably best to just change the parent class of RPM, DRPM, etc. to contain several checksum fields, but maybe adding it to the platform somewhere would be easier.
In addition to the checksums, the current implementation contains XML snippets with the checksum and checksum type. These need to be turned to templates that are filled in with the appropriate checksum. This rendering should probably live as a method on the model(s).
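A minimal sketch of the idea above, not Pulp's actual model code. The `PackageUnit` class, its field names, and the `$checksumtype` / `$checksum` placeholders are all assumptions made for illustration:

```python
from string import Template


class PackageUnit:
    """Hypothetical stand-in for the RPM/DRPM parent class; not Pulp's real model."""

    def __init__(self, name, checksums, primary_template):
        self.name = name
        # Maps checksum type -> hex digest, e.g. {"sha1": "...", "sha256": "..."}
        self.checksums = dict(checksums)
        # XML snippet with $checksumtype and $checksum placeholders
        self.primary_template = primary_template

    def render_primary(self, checksum_type):
        """Fill the stored template with the checksum of the requested type."""
        if checksum_type not in self.checksums:
            raise KeyError("no %s checksum stored for %s" % (checksum_type, self.name))
        return Template(self.primary_template).substitute(
            checksumtype=checksum_type,
            checksum=self.checksums[checksum_type],
        )


unit = PackageUnit(
    name="example",
    checksums={"sha1": "a" * 40, "sha256": "b" * 64},
    primary_template='<checksum pkgid="YES" type="$checksumtype">$checksum</checksum>',
)
print(unit.render_primary("sha1"))
```

With `string.Template`, any literal `$` in a real snippet would have to be escaped as `$$` when the template is created, so a different placeholder syntax may be a better fit in practice.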
Creating migrations
Both the new checksum fields (wherever they are) and the XML snippets need to be migrated/populated with the existing checksum types.
Create Task to Checksum Files
We need a way to generate these new checksums. As part of #1647, we'll want a task to "scrub" Pulp for corrupted files, and we might be able to lay the groundwork here. Perhaps not, but it's worth thinking about during implementation. In this particular instance the task wouldn't be dispatched (just called synchronously inside the publish task), but we'd be able to share code.
Ensure Publish Handles Edge Cases
With lazy syncing in the mix, we might not have the files available to checksum. We need to make sure we fail gracefully in cases where a file isn't available.
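One way to picture "failing gracefully" is sketched below. The `MissingUnitFile` exception and the `checksum_for_publish` helper are hypothetical names, not part of Pulp; the point is only that a missing file becomes a specific, reportable error instead of an unhandled I/O failure:

```python
import hashlib
import os
import tempfile


class MissingUnitFile(Exception):
    """Hypothetical error raised when a unit's file has not been downloaded."""


def checksum_for_publish(path, checksum_type):
    """Return the hex digest of the file, or raise MissingUnitFile if absent."""
    if path is None or not os.path.isfile(path):
        raise MissingUnitFile(
            "cannot calculate %s: file %r is not available" % (checksum_type, path)
        )
    digest = hashlib.new(checksum_type)
    with open(path, "rb") as handle:
        # Read in chunks so large RPMs are never held in memory at once.
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstration with a temporary file standing in for a downloaded unit.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"abc")
    present = handle.name
try:
    print(checksum_for_publish(present, "sha1"))
finally:
    os.remove(present)

try:
    checksum_for_publish(present, "sha1")  # the file no longer exists
except MissingUnitFile as error:
    print("publish failed gracefully:", error)
```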
Updated by jortel@redhat.com almost 9 years ago
- Priority changed from Normal to High
- Platform Release set to 2.8.0
- Triaged changed from No to Yes
Please verify that this is not fixed in latest 2.7.
Updated by jcline@redhat.com almost 9 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to jcline@redhat.com
Updated by jcline@redhat.com almost 9 years ago
In both 2.7 and 2.8 the repomd.xml has the configured hash algorithm, but the primary.xml always has sha256.
Updated by jcline@redhat.com almost 9 years ago
Okay, so here is what I've discovered thus far:
- There is an RCM patch for this issue: https://github.com/release-engineering/pulp_rpm/commit/5ded67b4954395cb040af2f03065dc10a6ad0188
- The ``--checksum-type`` configuration flag for repositories only applies to the metadata files themselves. That is to say, repodata/repomd.xml uses that checksum type and all the repo metadata files are named <checksum-type>-<metadata-type>.xml.gz. The package checksums in those metadata files do not change based on the checksum type specified for a repository.
- When uploading a content unit to a repository, you can provide a checksum type to use when generating the package metadata. If one is not provided, the default appears to always be sha256 (so it doesn't honor the repository setting, which is somewhat surprising to me).
The workflow that led to this issue and patch is as follows:
- Upload all RPMs to one repository. When uploading, do not specify a checksum type.
- Copy RPMs into the desired repository and have the checksum type configured on the repository.
- Publish the repository.
- The metadata for that repository uses the checksum type configured, and only that checksum type.
The expectation (which is very reasonable) is that the repo metadata uses the checksum type throughout, not just for the metadata files themselves. I'm not certain how the ``--checksum-type`` repo flag should behave, but my understanding based on the docs (or rather, the single line of text) is that the checksum type specified should be used across the board. This, however, is almost certainly not possible with deferred downloading.
The options as I see them are:
- Ensure the ``--checksum-type`` is honored for all metadata. This means we need to have every file at publish time, which in turn means if the checksum type specified doesn't match upstream's checksum type, you can't use deferred downloading with that repository. Of course, we won't know that until we download the metadata during the first sync. This is probably an edge case situation, though.
- Keep things the way they are and see if we can work with RCM to find a different workflow that meets their needs.
Updated by jcline@redhat.com almost 9 years ago
- Tracker changed from Issue to Story
- Status changed from ASSIGNED to NEW
- Assignee deleted (jcline@redhat.com)
- Platform Release deleted (2.8.0)
- Groomed set to No
- Sprint Candidate set to No
Updated by bmbouter almost 9 years ago
- Related to Story #1647: Unify checksum management to the platform and add some features added
Updated by rbarlow over 8 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to rbarlow
- Platform Release set to 2.9.0
Updated by rbarlow over 8 years ago
I've discussed this with jcline and we came up with the following plan to work around the issues with publishing lazy repositories:
- We will not allow users to set --checksum-type on lazy repositories.
- We will not allow users to copy files into a lazy repository if Pulp does not have the file at the moment of copy.
In both of the above scenarios, we will need to give the user a helpful error message if they attempt to perform these operations. By ensuring that we have the files on disk before publish time, we will be able to generate alternative checksums if requested.
A problem with this approach is that the publish currently uses pregenerated XML snippets that are stored in MongoDB that contain the checksums. There are a couple of options I can think of around this:
- We can use the snippets when the user has not requested a different checksum. The plus side to this approach is that it's a less disruptive change to Pulp. The downside is that publishes will go slowly if the user requests a different checksum type than the unit's XML snippet uses.
- We can alter the XML snippets to be a template that allows us to inject the checksum at publish time. The plus side here is that we should be able to operate at the same speed for all checksum types. The downsides would be that all publishes may go a little slower due to having to render templates, and that we will have to write a migration to update all the snippets to be a template.
Another element to consider is when the unit checksums should be calculated. We probably don't want to calculate them for every publish, so we'll want to add a new field on the units to store alternative checksums. I think there are only really two times that it makes sense to calculate alternative checksums if they don't already exist:
- At publish time, which could be slow. Since we will be storing the checksums on the unit after calculating it, the publish would only be slow due to this effect the first time any given unit is being published with a checksum type that it has not been published with before. This means that repeat publishes with the same (or mostly the same) units should be faster.
- When the unit is being added to a repository (upload, copy, sync). The downside to this is that copies could become slow, and they have historically been a quick operation in Pulp.
I think I lean towards the second option, but only really because it seems odd for a publish operation to modify units.
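The "calculate once, then store on the unit" behaviour discussed above can be sketched as follows. The `Unit` class and its `checksums` field are hypothetical, and in-memory bytes stand in for the file on disk:

```python
import hashlib


class Unit:
    """Hypothetical unit that stores alternative checksums next to its content."""

    def __init__(self, content, checksums=None):
        self.content = content          # bytes stand in for the file on disk
        self.checksums = dict(checksums or {})
        self.calculations = 0           # counts how often we had to hash

    def get_checksum(self, checksum_type):
        """Return a stored checksum, calculating and storing it on first use."""
        if checksum_type not in self.checksums:
            self.calculations += 1
            self.checksums[checksum_type] = hashlib.new(
                checksum_type, self.content
            ).hexdigest()
        return self.checksums[checksum_type]


unit = Unit(b"package payload")
first = unit.get_checksum("sha1")    # calculated and stored
second = unit.get_checksum("sha1")   # served from the stored value
print(first == second, unit.calculations)
```

Whether `get_checksum` is first called at publish time or when the unit is added to a repository is exactly the trade-off described above; the caching itself is the same in both cases.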
Updated by mhrivnak over 8 years ago
Sorry I didn't get to this until now. As promised, I'll add the thoughts that Brian and I came up with in a brief brainstorm session. It's not a complete solution, but may be useful to consider.
It sounds like RCM wants the ability to specify a checksum type and have all checksums in the publish repo metadata use that type. That seems reasonable.
The only types currently in use are sha1 and sha256, and we don't know of any plans to use more.
One way to make this happen:
The XML snippets stored in the database become templates, which requires a migration. The publish operation would render those templates.
We would add two new fields to the model called "sha1" and "sha256" or similar, and put the corresponding checksum values in there. It would duplicate data potentially from the unit key, but is worth it so we can preserve the duplicate units already present; if we tried to identify rpms that are duplicates, and consolidate, that raises other problems we might not want to solve right now. For example, if we replace RPMs that have sha1 in the unit key with their sha256 counterparts, if the sha1 unit gets orphan-removed, the file would disappear and published repos that used to have the sha1 unit might end up with broken links.
A migration would just add those two new fields and populate them. For lazy content, that's a problem as you've already pointed out, because we only know one of the checksums.
For new RPMs, both checksums can be calculated at sync or upload time. But as you point out, the lazy workflow complicates it.
In any case, calculating checksums at publish time would not likely be received well by users, so finding another way would be best.
Adding restrictions at copy time, as you've already suggested, seems like a reasonable approach. One option is for each rpm repo to have a chosen checksum type, either by user choice or by default, and refuse to add units if they don't have the required checksum. Although that wouldn't help if a user changes the repo's checksum type afterward. As an additional or separate guard, pulp could refuse to publish a repo if there are units that lack the requested checksum type. For a user who runs into that scenario, the simple solution is to call the download-repo task, which would get the files and populate all the checksums.
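The publish-time guard suggested here could look roughly like the sketch below. The function names and the dictionary layout (unit name mapped to its known checksums) are assumptions for illustration only:

```python
def units_missing_checksum(units, checksum_type):
    """Return the names of units that have no stored checksum of the given type."""
    return sorted(
        name for name, checksums in units.items() if checksum_type not in checksums
    )


def check_publishable(units, checksum_type):
    """Raise ValueError if any unit lacks the requested checksum type."""
    missing = units_missing_checksum(units, checksum_type)
    if missing:
        raise ValueError(
            "%d unit(s) lack a %s checksum (%s); download the repository "
            "content and retry" % (len(missing), checksum_type, ", ".join(missing))
        )


repo_units = {
    "bash": {"sha1": "a" * 40, "sha256": "b" * 64},
    "zsh": {"sha256": "c" * 64},   # lazy unit: only the upstream checksum is known
}

check_publishable(repo_units, "sha256")          # passes silently
print(units_missing_checksum(repo_units, "sha1"))
```

The same check could run at copy time to refuse units that lack the repository's chosen checksum type.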
When the lazy workflow gets an rpm, either through the download-repo task, or the download-deferred task, we would want to calculate the missing checksum. Hooking into that might be challenging.
It would be ideal to expand this checksum storage approach, or whatever we implement, to all units that have files. It may or may not make sense to try doing that now. Perhaps doing it just in pulp_rpm is a decent proving ground, and we can later expand it to all content.
I think those are all the thoughts we had. Hopefully some of that is helpful, and please let me know if it would be valuable to have more discussion on it.
Updated by rbarlow over 8 years ago
- Tracker changed from Story to Issue
- Subject changed from as user, I can generate repodata with different checksum than sha256 to --checksum-type is broken
- Severity set to 2. Medium
- Triaged set to No
Updated by rbarlow over 8 years ago
- Status changed from ASSIGNED to NEW
- Assignee deleted (rbarlow)
I have more security issues to deal with and haven't made much progress on this anyway. Putting it down for now.
Updated by jortel@redhat.com over 8 years ago
I propose the following as a blend of the preceding proposals.
- Add sha1 and sha256 as attributes on the ContentUnit model object.
- Alter the XML snippets to be a template and inject the checksum at publish time.
- Calculate missing checksums needed for publishing - at publish time and store them on the unit.
- No additional restrictions on "lazy" repositories.
I'm thinking that if pulp has imported RPM units with SHA256 and a user wants to publish as SHA-1 they (the user) should endure the extra overhead (time) during publish. The overhead is in direct support of a publish operation and so this is the correct place to bear the burden. Since the new checksum is stored on the unit, the additional overhead is only incurred once. We could also add a low priority background task that periodically calculates missing checksums. This could be enabled/controlled by a configuration setting.
Obviously, we'll need a migration to:
- convert XML snippets to templates
- populate the new sha1 and sha256 attributes using the metadata.
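As a rough sketch of such a migration, assume each unit is a dictionary with a hypothetical `primary_xml` field whose snippet contains a `<checksum pkgid="YES" ...>` element, and that templates use `{{ checksumtype }}` and `{{ checksum }}` placeholders. None of these names come from Pulp's code:

```python
import re

# Matches the checksum element of a stored primary.xml snippet.
CHECKSUM_RE = re.compile(
    r'<checksum pkgid="YES" type="(?P<type>[^"]+)">(?P<value>[0-9a-f]+)</checksum>'
)

TEMPLATE_ELEMENT = (
    '<checksum pkgid="YES" type="{{ checksumtype }}">{{ checksum }}</checksum>'
)


def migrate_unit(unit):
    """Turn a stored primary snippet into a template and record its checksum."""
    match = CHECKSUM_RE.search(unit["primary_xml"])
    if match is None:
        return unit  # nothing to migrate; leave the unit untouched
    migrated = dict(unit)
    # Populate the new attribute (e.g. "sha1" or "sha256") from the metadata.
    migrated[match.group("type")] = match.group("value")
    # Replace the concrete checksum element with the template placeholders.
    migrated["primary_xml"] = CHECKSUM_RE.sub(
        TEMPLATE_ELEMENT, unit["primary_xml"], count=1
    )
    return migrated


old = {
    "primary_xml": '<package><checksum pkgid="YES" type="sha1">'
    "733033d4ba6761c30fbd1086a70784f4fb317687</checksum></package>"
}
new = migrate_unit(old)
print(new["sha1"])
print(new["primary_xml"])
```

A real migration would also have to rewrite the `pkgid` references in the other.xml and filelists.xml snippets, which this sketch leaves out.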
Updated by jcline@redhat.com over 8 years ago
- Blocks Issue #1619: as user, I can export repo groups with different checksum than sha256 added
Updated by jcline@redhat.com over 8 years ago
- Description updated (diff)
- Status changed from NEW to ASSIGNED
- Assignee set to jcline@redhat.com
Updated by jcline@redhat.com over 8 years ago
Alright, so here's what I've found that's taken a bit of wind out of my sails. These XML snippets are quite large, and reference the checksum a lot. I haven't taken the time to fully understand the schema, but what jumped out at me is all the primary.xml files I looked at had something like
<checksum pkgid="YES" type="sha1">733033d4ba6761c30fbd1086a70784f4fb317687</checksum>
and then everywhere else uses the checksum as the pkgid. So this means our templates will be very unwieldy and probably very slow to process.
Here's what I propose:
- Use createrepo_c to generate the repository metadata if possible. I think this will also make doing things like DRPMs, fast incremental updates (it has a --update flag and if we know where the previous publish lives we can use that), etc much easier since we can just use the library. The current version in EPEL6 is 0.9 (the latest release is 0.10).
- When it isn't possible (when a repository contains a lazy unit), publish using the existing snippets we have, unless the requested repo checksum doesn't match the upstream repo metadata type. In those cases we can fail and inform the user to either download the content, or change the checksum type to <type>.
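The two-branch proposal above boils down to a small decision, sketched here with a hypothetical `choose_publish_strategy` function. The returned strings are just labels and nothing here calls createrepo_c itself:

```python
def choose_publish_strategy(has_lazy_units, requested_type, upstream_type):
    """Return which metadata path to use, or raise if neither can work."""
    if not has_lazy_units:
        # Every file is on disk, so metadata can be regenerated from the files.
        return "createrepo_c"
    if requested_type == upstream_type:
        # Files are missing, but the stored snippets already use this type.
        return "snippets"
    raise ValueError(
        "repository contains content that has not been downloaded; download it "
        "or change the checksum type to %s" % upstream_type
    )


print(choose_publish_strategy(False, "sha1", "sha256"))
print(choose_publish_strategy(True, "sha256", "sha256"))
```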
Updated by mhrivnak over 8 years ago
I see that each rpm is referenced once in other.xml, and once in filelists.xml. Both references are by pkgid. Are you seeing more than those two references?
Updated by rbarlow over 8 years ago
Hey Jeremy,
I too am not a fan of how we store XML in our database this way. I had been looking at the template approach too but I also found it to be unwieldy and the code was becoming even more hacky than it already is. IMO, the right approach is to go back to generating the XML as we did before. Using createrepo_c to do it seems fine to me, but I don't have any first hand experience with it to speak of.
I'd rather not have two ways we generate the XML, so I think eliminating the XML snippets in the database would be good. For lazy, I think we should just publish with the checksum we know in the database, rather than using the snippets.
Above, I had proposed disabling changing the checksum type for lazy fetching repos as a way to stop this from happening at publish time. I think that approach might be worth considering.
Updated by jcline@redhat.com over 8 years ago
createrepo_c on a 10K RPM repository (no sqlite DBs):
[vagrant@dev os]$ time createrepo_c --no-database .
Directory walk started
Directory walk done - 10572 packages
Temporary output repo path: ./.repodata/
Pool started (with 5 workers)
Pool finished
real 0m29.418s
user 1m0.550s
sys 0m6.642s
[vagrant@dev os]$
Now, as you'd probably guess, since this is calculating the checksum of every RPM this is an I/O bound process (I apologize, I can't get Redmine to do a monospace font):
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read writ| recv send| in out | int csw
0 0 100 0 0 0| 0 0 | 184B 308B| 0 0 | 132 256
4 0 96 0 0 0| 24k 0 | 476B 924B| 0 0 | 300 302
43 6 29 22 0 0| 317M 0 | 316B 728B| 0 0 | 13k 6922
51 9 15 25 0 0| 423M 896k| 66B 178B| 0 0 | 17k 10k
47 7 3 43 0 0| 466M 0 | 118B 194B| 0 0 | 12k 6984
61 7 0 32 0 0| 692M 0 | 66B 194B| 0 0 | 13k 4453
44 6 16 33 0 0| 511M 0 | 118B 194B| 0 0 | 10k 4764
38 5 7 49 0 0| 414M 48k| 66B 178B| 0 0 | 10k 5526
39 5 3 52 0 0| 423M 0 | 118B 178B| 0 0 |9643 5256
48 6 6 39 0 0| 409M 0 | 66B 194B| 0 0 | 13k 6411
48 7 4 40 0 0| 520M 0 | 232B 514B| 0 0 | 14k 6233
34 5 8 53 0 0| 427M 0 | 66B 66B| 0 0 |8231 4659
44 6 9 41 0 0| 521M 12k| 118B 194B| 0 0 |9607 4471
40 4 16 41 0 0| 405M 0 | 66B 178B| 0 0 |8870 4878
49 6 25 20 0 0| 439M 0 | 118B 178B| 0 0 | 11k 4805
77 8 2 13 0 0| 477M 0 | 66B 178B| 0 0 | 11k 12k
69 7 8 16 0 0| 397M 0 | 118B 178B| 0 0 |9024 3738
86 8 1 6 0 0| 472M 0 | 66B 178B| 0 0 | 11k 1911
85 5 2 7 0 0| 277M 15M| 118B 178B| 0 0 |8208 2594
66 5 1 28 0 0| 242M 0 | 66B 178B| 0 0 |8484 3545
51 9 9 31 0 0| 338M 0 | 118B 210B| 0 0 | 20k 9698
51 8 5 36 0 0| 372M 0 | 66B 178B| 0 0 | 14k 6742
44 6 6 44 0 0| 397M 0 | 118B 178B| 0 0 |9864 5881
37 8 5 51 0 0| 318M 24k| 66B 178B| 0 0 | 14k 7869
46 8 12 34 0 0| 369M 0 | 118B 178B| 0 0 | 15k 8857
50 7 12 31 0 0| 416M 0 | 66B 178B| 0 0 | 15k 8502
54 9 7 30 0 0| 458M 0 | 118B 178B| 0 0 | 15k 8898
52 8 14 26 0 0| 418M 0 | 66B 178B| 0 0 | 16k 8789
49 8 15 27 0 0| 397M 12k| 118B 178B| 0 0 | 15k 9508
24 4 63 10 0 0| 320M 0 | 66B 178B| 0 0 |6623 3078
43 7 2 48 0 0| 476M 0 | 118B 178B| 0 0 | 11k 6478
11 2 81 6 0 0| 105M 0 | 330B 518B| 0 0 |3242 2024
0 0 100 0 0 0| 0 0 | 118B 486B| 0 0 | 135 251
I've got an SSD, so your mileage will vary. However, we could probably optimize things by using their Python bindings to build the repodata without it touching any of the files (for the cases when we either already have the checksum from upstream or we are publishing a lazy repository when changing the checksum isn't allowed).
Now, the whole publish operation (on the same host) takes about 1 minute and 30 seconds:
[vagrant@dev lib]$ time pulp-admin rpm repo publish run --repo-id el7-copy
+----------------------------------------------------------------------+
Publishing Repository [el7-copy]
+----------------------------------------------------------------------+
This command may be exited via ctrl+c without affecting the request.
Initializing repo metadata
[-]
... completed
Publishing Distribution files
[-]
... completed
Publishing RPMs
[==================================================] 100%
10572 of 10572 items
... completed
Publishing Delta RPMs
... skipped
Publishing Errata
[==================================================] 100%
1133 of 1133 items
... completed
Publishing Comps file
[==================================================] 100%
86 of 86 items
... completed
Publishing Metadata.
[-]
... completed
Closing repo metadata
[-]
... completed
Generating sqlite files
... skipped
Publishing files to web
[\]
... completed
Writing Listings File
[-]
... completed
Task Succeeded
real 1m27.178s
user 0m1.000s
sys 0m0.130s
[vagrant@dev lib]$
My guess is that on slow hardware publishes will take quite a bit longer than they currently do. However, no matter what we do we cannot simply jam XML snippets in the database and spit them back out. We have to modify the XML depending on distributor settings.
Since the whole point of the snippets seems to be about avoiding parsing and generating XML, I think we should get rid of them. We can either store them, then parse and modify them before spitting them out (and we'll need to handle checksumming all the files and so on and so forth), or generate it in a C library written expressly for this purpose.
Updated by mhrivnak over 8 years ago
Interesting findings. I tried to reproduce just for comparison, and to have another data point. My hardware is apparently slower, which seems to have produced a substantially different comparison. Also, I copied only the RPMs into a repo and published that. createrepo_c does not help us with errata, comps.xml, distribution, etc, so I factored those out of the publish scenario by leaving them out of the repo.
My dstat output looked like this when running createrepo_c:
16 11 16 57 0 0| 144M 0 | 118B 126B| 0 0 |8676 4832
24 10 7 59 0 0| 170M 0 | 66B 126B| 0 0 |7462 4026
27 7 6 59 0 0| 183M 0 | 118B 126B| 0 0 |7097 3652
16 13 4 67 0 0| 134M 612k| 66B 126B| 0 0 |7952 5030
27 11 22 40 0 0| 168M 0 | 118B 126B| 0 0 |8274 4320
21 10 39 29 0 0| 150M 0 | 66B 134B| 0 0 |8223 4854
29 16 7 48 0 0| 145M 0 | 118B 126B| 0 0 |8612 4951
It's also an SSD, but disk I/O is roughly 3x slower than yours. Maybe it's that I ran in vagrant with NFS? Maybe your SSD is just faster? Or maybe mine is actually CPU-bound, doing the calculations? In any case...
I'm seeing publish times of about 90s for just the RPMs.
createrepo_c took 117s.
dstat from the pulp publish looked like this:
25 1 74 0 0 0|2684k 2596k| 330B 534B| 0 0 |2053 1094
25 1 74 0 0 0|2788k 0 | 382B 338B| 0 0 |1531 1221
24 1 73 1 0 0|2544k 18M| 330B 534B| 0 0 |1860 1306
24 1 73 1 0 0|4016k 7828k| 382B 636B| 0 0 |3185 2231
25 1 74 0 0 0|1408k 0 | 132B 330B| 0 0 |1402 1166
24 1 74 1 0 0|3064k 3164k| 184B 228B| 0 0 |1509 1434
25 1 74 0 0 0|2816k 0 | 132B 228B| 0 0 |1403 1105
25 1 74 1 0 0|4408k 0 | 382B 440B| 0 0 |1636 1448
Our two data points at least hint that slower hardware will have a bigger impact on the createrepo_c option. That fits what we know about the work load, that createrepo_c is bottlenecked on hardware, whereas the current publish model is presumably limited by python's ability to create model instances, loop over them, and write text to files, and/or mongo's ability to deliver the data stream.
Of course this isn't rendering templates yet. It would be interesting to do a quick proof of concept to see how template rendering affects performance.
But one other factor to consider carefully is that we're doing these tests on mostly-idle systems. If createrepo_c wants to use all of my disk IO or 250% CPU, no problem. But on a busy server, that could have a big impact on other processes, and createrepo_c might only get a fraction of the resources it wants. Consider multiple concurrent publishes, and how that might scale, not to mention other operations or API queries that need to hit the database, etc.
I suspect that the current model of keeping pre-calculated XML (although soon as templates), and rendering them at publish time will continue to be a very compelling solution due to the much lower impact on system resources. But it's valuable to have the comparison, and to continue evaluating the pros and cons of each option.
The potential to stop worrying about the XML entirely and let something else make it for us does sound compelling.
Updated by jcline@redhat.com over 8 years ago
- Status changed from ASSIGNED to NEW
- Assignee deleted (jcline@redhat.com)
I'm putting this back down since I need to start on 1769.
I wrote out a story (https://pulp.plan.io/issues/1877) that summarizes the problems I found with the RPM model as part of the investigation for this issue. I've also made a PR documenting some of the RPM model (https://github.com/pulp/pulp_rpm/pull/857).
Updated by bmbouter over 8 years ago
- Related to Story #1878: Support for choosing the checksum type in updateinfo added
Updated by mhrivnak over 8 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to mhrivnak
Updated by mhrivnak over 8 years ago
I wired in template rendering at publish time, as a PoC. It renders a template for each entry made to primary.xml, filelists.xml, or other.xml, adding the checksum where appropriate. It's the real work being proposed, just not in a polished form.
Publishing a RHEL 7 repo with about 10,000 packages, the publishes that render templates take 9% longer than before they did template rendering. There isn't a decisive difference in performance between using django vs. jinja2 as the template engine.
In all cases, even when not rendering templates, the publish is CPU-bound.
Updated by bmbouter over 8 years ago
I'm OK with the performance impact because (1) it provides a necessary fix and (2) horizontal scalability allows users to mostly reach their performance goals so I'm less concerned with the impact of any single Pulp operation's performance.
Long term, I think we should move to letting createrepo_c do all the RPM metadata maintenance and generation, but I won't suggest that now is the right time for that.
Updated by mhrivnak over 8 years ago
- Status changed from ASSIGNED to POST
Updated by bmbouter over 8 years ago
- Has duplicate Issue #627: --checksum-type does not affect the checksum used in primary.xml added
Added by mhrivnak over 8 years ago
Revision 7094ec37 | View on GitHub
Adds a function to calculate checksums of multiple types at once.
Calculation of checksums was moved out of the "verification" module, because it is useful in many more cases than just in the process of verification. The only plugin using the moved code is pulp_rpm, and corresponding changes to that plugin will be in a separate PR.
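For readers unfamiliar with the technique, calculating several checksum types in a single pass over a file can be sketched with the standard library as below. This is an illustration only, not the code from revision 7094ec37; the function name and signature are assumptions:

```python
import hashlib
import io


def calculate_checksums(file_object, checksum_types, chunk_size=1024 * 1024):
    """Read the stream once and return {type: hexdigest} for every type."""
    hashers = {name: hashlib.new(name) for name in checksum_types}
    for chunk in iter(lambda: file_object.read(chunk_size), b""):
        # Feed the same chunk to every hasher so the file is read only once.
        for hasher in hashers.values():
            hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


result = calculate_checksums(io.BytesIO(b"abc"), ["sha1", "sha256"])
print(result["sha1"])
print(result["sha256"])
```

Reading the file once matters because, as the createrepo_c measurements earlier in this thread suggest, disk I/O can be a significant share of the cost.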
Added by mhrivnak over 8 years ago
Revision d1491184 | View on GitHub
yum_distributor now uses configured checksum type for all metadata.
A lot of model-related code was moved into models.py from other places. In addition to being more object-oriented, it made that code accessible from multiple places instead of being isolated somewhere, such as in the upload code. Being able to use the code from multiple places was the primary reason for moving the code in this PR.
Updated by mhrivnak over 8 years ago
Added by mhrivnak over 8 years ago
Revision ccf4c941 | View on GitHub
adding a new error code for missing unit file
re #1618
Updated by mhrivnak over 8 years ago
Updated by pthomas@redhat.com over 8 years ago
- Status changed from MODIFIED to ASSIGNED
Seems like checksum types can be updated for the repos with download policy set to on_demand
[root@ibm-x3550m3-11 ~]# rpm -qa pulp-server
pulp-server-2.9.0-0.3.beta.el7.noarch
[root@ibm-x3550m3-11 ~]#
[root@ibm-x3550m3-11 ~]# pulp-admin rpm repo create --repo-id rhel7-os --feed http://cdn.rcm-internal.redhat.com/content/dist/rhel/rhui/server/7/7Server/x86_64/os/ --download-policy on_demand
Successfully created repository [rhel7-os]
[root@ibm-x3550m3-11 ~]# pulp-admin rpm repo sync run --repo-id rhel7-os
+----------------------------------------------------------------------+
Synchronizing Repository [rhel7-os]
+----------------------------------------------------------------------+
This command may be exited via ctrl+c without affecting the request.
Downloading metadata...
[-]
... completed
Downloading repository content...
[|]
[==================================================] 100%
RPMs: 11051/11051 items
Delta RPMs: 0/0 items
... completed
Downloading distribution files...
[==================================================] 100%
Distributions: 0/0 items
... completed
Importing errata...
[/]
... completed
Importing package groups/categories...
[\]
... completed
Cleaning duplicate packages...
[-]
... completed
Task Succeeded
Initializing repo metadata
[-]
... completed
Publishing Distribution files
[-]
... completed
Publishing RPMs
[==================================================] 100%
11053 of 11053 items
... completed
Publishing Delta RPMs
... skipped
Publishing Errata
[==================================================] 100%
1231 of 1231 items
... completed
Publishing Comps file
[==================================================] 100%
87 of 87 items
... completed
Publishing Metadata.
[-]
... completed
Closing repo metadata
[-]
... completed
Generating sqlite files
... skipped
Generating HTML files
... skipped
Publishing files to web
[\]
... completed
Writing Listings File
[-]
... completed
Task Succeeded
[root@ibm-x3550m3-11 ~]# pulp-admin rpm repo update --repo-id rhel7-os --checksum-type sha256
This command may be exited via ctrl+c without affecting the request.
[\]
Running...
Updating distributor: yum_distributor
Task Succeeded
[\]
Running...
Updating distributor: export_distributor
Task Succeeded
[root@ibm-x3550m3-11 ~]# pulp-admin rpm repo publish run --repo-id rhel7-os
+----------------------------------------------------------------------+
Publishing Repository [rhel7-os]
+----------------------------------------------------------------------+
This command may be exited via ctrl+c without affecting the request.
Copying files
[\]
... completed
Initializing repo metadata
[-]
... completed
Publishing Distribution files
[-]
... completed
Publishing RPMs
[/]
... completed
Publishing Delta RPMs
... skipped
Publishing Errata
[==================================================] 100%
1231 of 1231 items
... completed
Publishing Comps file
[==================================================] 100%
87 of 87 items
... completed
Publishing Metadata.
[-]
... completed
Closing repo metadata
[-]
... completed
Generating sqlite files
... skipped
Generating HTML files
... skipped
Publishing files to web
[\]
... completed
Writing Listings File
[-]
... completed
Task Succeeded
[root@ibm-x3550m3-11 ~]# e
Updated by mhrivnak over 8 years ago
- Status changed from ASSIGNED to MODIFIED
That is expected behavior.
I assume you have in mind the scenario where the rpm hasn't been downloaded, and the user wants to publish with a checksum type pulp doesn't already have. In that case the publish will fail gracefully. Here is the documentation for the distributor setting which hopefully explains that clearly:
Checksum type to use for metadata generation. For any units where the checksum of this type is not already known, it will be computed on-the-fly and saved for future use. If any such units have not been downloaded, then checksum calculation is impossible, and the publish will fail gracefully.
Updated by semyers over 8 years ago
- Status changed from 6 to CLOSED - CURRENTRELEASE
re #1618 https://pulp.plan.io/issues/1618