Story #7696
closedAs a plugin developer, I have the Artifacts checked at pre-save time against the ALLOWED_CONTENT_CHECKSUMS instead of at __init__
Added by bmbouter about 4 years ago. Updated about 4 years ago.
100%
Description
Background¶
Right now the Artifact.init checks to see if checksums that are not allowed are being used and raises an error. That happens here.
This is has a problem when plugin code (like in pulp_rpm for example) does this:
foo = Artifact()
foo.md5 = 'somemd5string'
foo.save()
Improvement¶
Move the checking currently in Artifact.__init__
to a Artifact.pre_save hook using django-lifecycle.
Alternative¶
Add a property to each of the checksums on Artifact that checks if it was set post construction by init. This keeps the check in two places and it's more code to carry/maintain. These downsides make it the alternative.
Related issues
Updated by daviddavis about 4 years ago
- Related to Story #5216: As a user, I can configure which checksum types I want to use in Pulp added
Updated by daviddavis about 4 years ago
- Groomed changed from No to Yes
- Sprint Candidate changed from No to Yes
Updated by daviddavis about 4 years ago
- Related to Issue #7774: `podman push` leads to missing checksums on the Artifacts added
Updated by ggainey about 4 years ago
We need to do more than the current init behavior of disallowing-FORBIDDEN. We need the presave to make sure that the Artifact has provided values for everything in ALLOWED_ as well, to fix the issue in #7774
Updated by ipanova@redhat.com about 4 years ago
ggainey wrote:
We need to do more than the current init behavior of disallowing-FORBIDDEN. We need the presave to make sure that the Artifact has provided values for everything in ALLOWED_ as well, to fix the issue in #7774
In addition to this, a migration should be provided which would check for the artifacts that miss checksums specified in the ALLOWED_ and populate missing data.
Updated by bmbouter about 4 years ago
Agreed with Comment 4, +1 to the pre-save approach.
For Comment 5, I agree users need some way to populate this data when it's already on-disk, but if we ship a migration they can only adjust it once before the migration runs. We likely need a django command that will "recompute missing checksums" or something like that. Something an administrator can run when Pulp refuses to start because their settings and their db's checksums do not match.
Updated by daviddavis about 4 years ago
For Comment 5, I agree users need some way to populate this data when it's already on-disk, but if we ship a migration they can only adjust it once before the migration runs. We likely need a django command that will "recompute missing checksums" or something like that. Something an administrator can run when Pulp refuses to start because their settings and their db's checksums do not match.
I think this is a separate use case. Yes, we need a way for users to populate this data when they make changes to ALLOWED_CONTENT_CHECKSUMS.
However, currently in 3.7 and 3.8, plugins can write null values to artifact checksum columns. Then when pulp restarts, it encounters the exception in #7487. This is what is happening in #7774. I think we need a data migration to fix this use case. The question though is whether pulpcore or the plugin is responsible for this.
Updated by ipanova@redhat.com about 4 years ago
daviddavis wrote:
For Comment 5, I agree users need some way to populate this data when it's already on-disk, but if we ship a migration they can only adjust it once before the migration runs. We likely need a django command that will "recompute missing checksums" or something like that. Something an administrator can run when Pulp refuses to start because their settings and their db's checksums do not match.
I think this is a separate use case. Yes, we need a way for users to populate this data when they make changes to ALLOWED_CONTENT_CHECKSUMS.
In my understanding once you have pulp running you cannot make additive changes but only remove checksums.
However, currently in 3.7 and 3.8, plugins can write null values to artifact checksum columns. Then when pulp restarts, it encounters the exception in #7487. This is what is happening in #7774. I think we need a data migration to fix this use case. The question though is whether pulpcore or the plugin is responsible for this.
It will probably be easier to have this migration in core since it can filter through the all artifacts only those that miss checksums. If a plugin would do that he would need to figure out through all the content it has, the artifacts that belong to it and then look for those that miss checksums.
Updated by daviddavis about 4 years ago
ipanova@redhat.com wrote:
daviddavis wrote:
For Comment 5, I agree users need some way to populate this data when it's already on-disk, but if we ship a migration they can only adjust it once before the migration runs. We likely need a django command that will "recompute missing checksums" or something like that. Something an administrator can run when Pulp refuses to start because their settings and their db's checksums do not match.
I think this is a separate use case. Yes, we need a way for users to populate this data when they make changes to ALLOWED_CONTENT_CHECKSUMS.
In my understanding once you have pulp running you cannot make additive changes but only remove checksums.
That's the current state today but see https://pulp.plan.io/issues/7561. Allowing users to populate missing checksums would have to be part of #7561.
However, currently in 3.7 and 3.8, plugins can write null values to artifact checksum columns. Then when pulp restarts, it encounters the exception in #7487. This is what is happening in #7774. I think we need a data migration to fix this use case. The question though is whether pulpcore or the plugin is responsible for this.
It will probably be easier to have this migration in core since it can filter through the all artifacts only those that miss checksums. If a plugin would do that he would need to figure out through all the content it has, the artifacts that belong to it and then look for those that miss checksums.
My only misgiving here is that users would only run this migration once and a plugin might have a bug where it continues to insert blank/null checksums (even if we add this pre-save check as things like bulk_create wouldn't hit it). If a plugin fixes such a bug, it should probably fix the data too.
That said, if we fix https://pulp.plan.io/issues/7561, plugins can just ask users to run whatever solution we come up with to fix their missing data so maybe we should address #7561 with this story.
Updated by ipanova@redhat.com about 4 years ago
daviddavis wrote:
ipanova@redhat.com wrote:
daviddavis wrote:
For Comment 5, I agree users need some way to populate this data when it's already on-disk, but if we ship a migration they can only adjust it once before the migration runs. We likely need a django command that will "recompute missing checksums" or something like that. Something an administrator can run when Pulp refuses to start because their settings and their db's checksums do not match.
I think this is a separate use case. Yes, we need a way for users to populate this data when they make changes to ALLOWED_CONTENT_CHECKSUMS.
In my understanding once you have pulp running you cannot make additive changes but only remove checksums.
That's the current state today but see https://pulp.plan.io/issues/7561. Allowing users to populate missing checksums would have to be part of #7561.
However, currently in 3.7 and 3.8, plugins can write null values to artifact checksum columns. Then when pulp restarts, it encounters the exception in #7487. This is what is happening in #7774. I think we need a data migration to fix this use case. The question though is whether pulpcore or the plugin is responsible for this.
It will probably be easier to have this migration in core since it can filter through the all artifacts only those that miss checksums. If a plugin would do that he would need to figure out through all the content it has, the artifacts that belong to it and then look for those that miss checksums.
My only misgiving here is that users would only run this migration once and a plugin might have a bug where it continues to insert blank/null checksums (even if we add this pre-save check as things like bulk_create wouldn't hit it). If a plugin fixes such a bug, it should probably fix the data too.
if we deliver migration together with #7696 and #7561 it will 1) fix/populate all existing artifacts with missing checksums and 2) make it impossible to create further artifacts with missing checksums, so even if the plugin has the bug, he would need to first fix it in the plugin and only after that he would be able to create artifact
Forgot about bulk_create(). At this point it makes sense that plugins would provide the migration, because as you have stated even if we provide the core migration, plugin would need to fix the issue eventually and then again provide another migration.
That said, if we fix https://pulp.plan.io/issues/7561, plugins can just ask users to run whatever solution we come up with to fix their missing data so maybe we should address #7561 with this story.
Updated by ipanova@redhat.com about 4 years ago
- Related to Story #7561: As a user, I can add checksums to ALLOWED_CONTENT_CHECKSUMS added
Updated by ipanova@redhat.com about 4 years ago
- Sprint/Milestone set to 3.9.0
- Sprint set to Sprint 85
Updated by ipanova@redhat.com about 4 years ago
We should provide django command that will recompute and add missing checksums, data migration whether written in core or plugin is not a viable solution
Updated by pulpbot about 4 years ago
- Status changed from NEW to POST
Added by ggainey about 4 years ago
Updated by ggainey about 4 years ago
- Status changed from POST to MODIFIED
- % Done changed from 0 to 100
Applied in changeset pulpcore|1f003206393c33bc451d9ab64601105099bbd087.
Updated by pulpbot about 4 years ago
- Status changed from MODIFIED to CLOSED - CURRENTRELEASE
Added pre_save hook to Artifact to enforce checksum rules.
Closes #7696