Project

Profile

Help

Story #7696

As a plugin developer, I have the Artifacts checked at pre-save time against the ALLOWED_CONTENT_CHECKSUMS instead of at __init__

Added by bmbouter 8 months ago. Updated 7 months ago.

Status:
CLOSED - CURRENTRELEASE
Priority:
Normal
Assignee:
Category:
-
Sprint/Milestone:
Start date:
Due date:
% Done:

100%

Estimated time:
Platform Release:
Groomed:
Yes
Sprint Candidate:
Yes
Tags:
Sprint:
Sprint 86
Quarter:

Description

Background

Right now the Artifact.init checks to see if checksums that are not allowed are being used and raises an error. That happens here.

This is has a problem when plugin code (like in pulp_rpm for example) does this:

foo = Artifact()
foo.md5 = 'somemd5string'
foo.save()

Improvement

Move the checking currently in Artifact.__init__ to a Artifact.pre_save hook using django-lifecycle.

Alternative

Add a property to each of the checksums on Artifact that checks if it was set post construction by init. This keeps the check in two places and it's more code to carry/maintain. These downsides make it the alternative.


Related issues

Related to Pulp - Story #5216: As a user, I can configure which checksum types I want to use in PulpCLOSED - CURRENTRELEASE

<a title="Actions" class="icon-only icon-actions js-contextmenu" href="#">Actions</a>
Related to Container Support - Issue #7774: `podman push` leads to missing checksums on the ArtifactsCLOSED - CURRENTRELEASE<a title="Actions" class="icon-only icon-actions js-contextmenu" href="#">Actions</a>
Related to Pulp - Story #7561: As a user, I can add checksums to ALLOWED_CONTENT_CHECKSUMSCLOSED - CURRENTRELEASE

<a title="Actions" class="icon-only icon-actions js-contextmenu" href="#">Actions</a>

Associated revisions

Revision 1f003206 View on GitHub
Added by ggainey 7 months ago

Added pre_save hook to Artifact to enforce checksum rules.

Closes #7696

History

#1 Updated by daviddavis 8 months ago

  • Related to Story #5216: As a user, I can configure which checksum types I want to use in Pulp added

#2 Updated by daviddavis 8 months ago

  • Groomed changed from No to Yes
  • Sprint Candidate changed from No to Yes

#3 Updated by daviddavis 8 months ago

  • Related to Issue #7774: `podman push` leads to missing checksums on the Artifacts added

#4 Updated by ggainey 8 months ago

We need to do more than the current init behavior of disallowing-FORBIDDEN. We need the presave to make sure that the Artifact has provided values for everything in ALLOWED_ as well, to fix the issue in #7774

#5 Updated by ipanova@redhat.com 8 months ago

ggainey wrote:

We need to do more than the current init behavior of disallowing-FORBIDDEN. We need the presave to make sure that the Artifact has provided values for everything in ALLOWED_ as well, to fix the issue in #7774

In addition to this, a migration should be provided which would check for the artifacts that miss checksums specified in the ALLOWED_ and populate missing data.

#6 Updated by bmbouter 8 months ago

Agreed with Comment 4, +1 to the pre-save approach.

For Comment 5, I agree users need some way to populate this data when it's already on-disk, but if we ship a migration they can only adjust it once before the migration runs. We likely need a django command that will "recompute missing checksums" or something like that. Something an administrator can run when Pulp refuses to start because their settings and their db's checksums do not match.

#7 Updated by daviddavis 8 months ago

For Comment 5, I agree users need some way to populate this data when it's already on-disk, but if we ship a migration they can only adjust it once before the migration runs. We likely need a django command that will "recompute missing checksums" or something like that. Something an administrator can run when Pulp refuses to start because their settings and their db's checksums do not match.

I think this is a separate use case. Yes, we need a way for users to populate this data when they make changes to ALLOWED_CONTENT_CHECKSUMS.

However, currently in 3.7 and 3.8, plugins can write null values to artifact checksum columns. Then when pulp restarts, it encounters the exception in #7487. This is what is happening in #7774. I think we need a data migration to fix this use case. The question though is whether pulpcore or the plugin is responsible for this.

#8 Updated by ipanova@redhat.com 8 months ago

daviddavis wrote:

For Comment 5, I agree users need some way to populate this data when it's already on-disk, but if we ship a migration they can only adjust it once before the migration runs. We likely need a django command that will "recompute missing checksums" or something like that. Something an administrator can run when Pulp refuses to start because their settings and their db's checksums do not match.

I think this is a separate use case. Yes, we need a way for users to populate this data when they make changes to ALLOWED_CONTENT_CHECKSUMS.

In my understanding once you have pulp running you cannot make additive changes but only remove checksums.

However, currently in 3.7 and 3.8, plugins can write null values to artifact checksum columns. Then when pulp restarts, it encounters the exception in #7487. This is what is happening in #7774. I think we need a data migration to fix this use case. The question though is whether pulpcore or the plugin is responsible for this.

It will probably be easier to have this migration in core since it can filter through the all artifacts only those that miss checksums. If a plugin would do that he would need to figure out through all the content it has, the artifacts that belong to it and then look for those that miss checksums.

#9 Updated by daviddavis 8 months ago

wrote:

daviddavis wrote:

For Comment 5, I agree users need some way to populate this data when it's already on-disk, but if we ship a migration they can only adjust it once before the migration runs. We likely need a django command that will "recompute missing checksums" or something like that. Something an administrator can run when Pulp refuses to start because their settings and their db's checksums do not match.

I think this is a separate use case. Yes, we need a way for users to populate this data when they make changes to ALLOWED_CONTENT_CHECKSUMS.

In my understanding once you have pulp running you cannot make additive changes but only remove checksums.

That's the current state today but see https://pulp.plan.io/issues/7561. Allowing users to populate missing checksums would have to be part of #7561.

However, currently in 3.7 and 3.8, plugins can write null values to artifact checksum columns. Then when pulp restarts, it encounters the exception in #7487. This is what is happening in #7774. I think we need a data migration to fix this use case. The question though is whether pulpcore or the plugin is responsible for this.

It will probably be easier to have this migration in core since it can filter through the all artifacts only those that miss checksums. If a plugin would do that he would need to figure out through all the content it has, the artifacts that belong to it and then look for those that miss checksums.

My only misgiving here is that users would only run this migration once and a plugin might have a bug where it continues to insert blank/null checksums (even if we add this pre-save check as things like bulk_create wouldn't hit it). If a plugin fixes such a bug, it should probably fix the data too.

That said, if we fix https://pulp.plan.io/issues/7561, plugins can just ask users to run whatever solution we come up with to fix their missing data so maybe we should address #7561 with this story.

#10 Updated by ipanova@redhat.com 8 months ago

daviddavis wrote:

wrote:

daviddavis wrote:

For Comment 5, I agree users need some way to populate this data when it's already on-disk, but if we ship a migration they can only adjust it once before the migration runs. We likely need a django command that will "recompute missing checksums" or something like that. Something an administrator can run when Pulp refuses to start because their settings and their db's checksums do not match.

I think this is a separate use case. Yes, we need a way for users to populate this data when they make changes to ALLOWED_CONTENT_CHECKSUMS.

In my understanding once you have pulp running you cannot make additive changes but only remove checksums.

That's the current state today but see https://pulp.plan.io/issues/7561. Allowing users to populate missing checksums would have to be part of #7561.

However, currently in 3.7 and 3.8, plugins can write null values to artifact checksum columns. Then when pulp restarts, it encounters the exception in #7487. This is what is happening in #7774. I think we need a data migration to fix this use case. The question though is whether pulpcore or the plugin is responsible for this.

It will probably be easier to have this migration in core since it can filter through the all artifacts only those that miss checksums. If a plugin would do that he would need to figure out through all the content it has, the artifacts that belong to it and then look for those that miss checksums.

My only misgiving here is that users would only run this migration once and a plugin might have a bug where it continues to insert blank/null checksums (even if we add this pre-save check as things like bulk_create wouldn't hit it). If a plugin fixes such a bug, it should probably fix the data too.

if we deliver migration together with #7696 and #7561 it will 1) fix/populate all existing artifacts with missing checksums and 2) make it impossible to create further artifacts with missing checksums, so even if the plugin has the bug, he would need to first fix it in the plugin and only after that he would be able to create artifact

Forgot about bulk_create(). At this point it makes sense that plugins would provide the migration, because as you have stated even if we provide the core migration, plugin would need to fix the issue eventually and then again provide another migration.

That said, if we fix https://pulp.plan.io/issues/7561, plugins can just ask users to run whatever solution we come up with to fix their missing data so maybe we should address #7561 with this story.

#11 Updated by ipanova@redhat.com 8 months ago

  • Related to Story #7561: As a user, I can add checksums to ALLOWED_CONTENT_CHECKSUMS added

#12 Updated by ipanova@redhat.com 8 months ago

  • Checklist item add django command that will recompute missing checksums added

#13 Updated by ipanova@redhat.com 8 months ago

  • Sprint/Milestone set to 3.9.0
  • Sprint set to Sprint 85

#14 Updated by ipanova@redhat.com 8 months ago

We should provide django command that will recompute and add missing checksums, data migration whether written in core or plugin is not a viable solution

#15 Updated by ipanova@redhat.com 8 months ago

  • Checklist item deleted (add django command that will recompute missing checksums)

#16 Updated by rchan 7 months ago

  • Sprint changed from Sprint 85 to Sprint 86

#17 Updated by pulpbot 7 months ago

  • Status changed from NEW to POST

#18 Updated by daviddavis 7 months ago

  • Assignee set to ggainey

#19 Updated by ggainey 7 months ago

  • Status changed from POST to MODIFIED
  • % Done changed from 0 to 100

#20 Updated by pulpbot 7 months ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE

Please register to edit this issue

Also available in: Atom PDF