Project

Profile

Help

Issue #6815

Pulp - Story #6134: [EPIC] Pulp import/export

Not all advisory models are defined for import/export

Added by ttereshc 5 months ago. Updated about 2 months ago.

Status:
CLOSED - CURRENTRELEASE
Priority:
Normal
Assignee:
Sprint/Milestone:
Start date:
Due date:
Estimated time:
Severity:
2. Medium
Version:
Platform Release:
OS:
Triaged:
Yes
Groomed:
No
Sprint Candidate:
No
Tags:
Sprint:
Sprint 79
Quarter:

Description

Advisory models:

class UpdateRecord(Content)

class UpdateCollection(BaseModel):
    update_record = models.ManyToManyField(UpdateRecord, related_name="collections")

class UpdateCollectionPackage(BaseModel):
   update_collection = models.ForeignKey(UpdateCollection, related_name='packages', on_delete=models.CASCADE) 

class UpdateReference(BaseModel):
    update_record = models.ForeignKey(UpdateRecord, related_name="references", on_delete=models.CASCADE)

Currently only UpdateRecord is considered.


Related issues

Related to RPM Support - Task #7195: Enforce correct uniqueness for UpdateCollection at the Model levelCLOSED - CURRENTRELEASE

<a title="Actions" class="icon-only icon-actions js-contextmenu" href="#">Actions</a>

Associated revisions

Revision db0a7926 View on GitHub
Added by ggainey 2 months ago

Add remaining RPM-entities to import/export.

Includes UpdateReference, UpdateCollection, and UpdateCollectionPackage.

Added basic test-skeleton for import/export of pulp_rpm

Updated travis-configs to enable ALLOWED_(IMPORT|EXPORT)_PATHS to let CI run the import/export testcases successfully.

fixes #6815

History

#1 Updated by ttereshc 5 months ago

  • Description updated (diff)

#2 Updated by ggainey 5 months ago

  • Parent task set to #6134

#3 Updated by ggainey 5 months ago

  • Triaged changed from No to Yes

#4 Updated by ggainey 5 months ago

@ttereshenc - A number of these have no 'natural key' supported at the DB level via uniqueness-constraints. I can work around that, but I think it might just be an oversight.

  • UpdateCollection: is 'name' expected to be unique? And what about not-null - as currently defined, all fields except pulp_id/created are allowed to be null - is that right?
  • UpdateCollectionPackage: is 'sum,sum_type' expected to be unique? sum_type can be null - is that right? What about 'filename' or 'name'?
  • UpdateReference: is 'href,ref_type' expected to be unique?

Once I have answers to these, I can make the necessary changes and then make them all exportable.

#5 Updated by ttereshc 5 months ago

Yes, all those auxiliary models have no uniqueness constraint and are created/removed for every advisory. digest on the UpdateRecord model reflects all content of advisory, including those models.

  • UpdateCollection: is 'name' expected to be unique? And what about not-null - as currently defined, all fields except pulp_id/created are allowed to be null - is that right?

In theory, it should be unique within an advisory with a specific id. It's not checked/enforced in pulp3 at the moment but we have a story filed. At a DB level they are not required to be unique. As for the null constraints, advisory can have anything/nothing, you know ;) I can check why we chose null for the name as well, probably name is always present but it definitely can be an empty string. It also depends how createrepo_c works with it. Let me know if it's critical and I can dig deeper.

  • UpdateCollectionPackage: is 'sum,sum_type' expected to be unique? sum_type can be null - is that right? What about 'filename' or 'name'?

sum and sum_type are often absent. filename is potentially a good candidate. name is not enough, should be all NEVRA then. FWIW, if we choose the path of introducing uniqueness constraints, I think it should be for all 3 models. It also affects sync code significantly and requires data migration which needs to remove duplicated entries and create correct relations for all affected advisories.

  • UpdateReference: is 'href,ref_type' expected to be unique?

Probably within the same advisory. I don't think there is any guarantee that it's this way and they can also be absent.

#6 Updated by ggainey 3 months ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to ggainey
  • Sprint set to Sprint 77

#7 Updated by ggainey 3 months ago

There are two issues with the current UpdateCollection model:

  • name is nullable. Because all nulls are unique in postgres, this means that you could have two UpdateCollections null/update-id-1 and null/update-id-1, and the would be considered 'unique'. If a collection can truly be unnamed, we need to make "not-named' be an empty string instead of null.
  • update_collection is a many-to-many relationship instead of a foreign-key relationship. This looks like just an error tha needs to be fixed.

Fixing this in existing databases will require:

  1. for all rpm_updatecollection where name is null, set name = ""
  2. create a foreign-key column, update_record_id, in rpm_updatecollection
  3. find all existing update-collections that are related to exactly one update-record
  4. for all elements found in step-2:
    1. fill in update_record_id-column with the update-record-id they are associated with
    2. remove these entries from rpm_updatecollection_update_record
  5. for all rpm_updatecollection rows where update-collection-id is still null:
    1. if none, break
    2. for each UpdateCollection, find all the UpdateRecords it is associated with
    3. create a duplicate UpdateCollection (and associated UpdateCollectionPackages), assigned each to a specific UpdateRecord
    4. remove records from rpm_updatecollection_update_record for the originating UpdateRecord
  6. if rpm_updatecollection_update_record is not empty - ERROR
  7. update rpm_updatecollection.name to NOT be nullable
  8. set a uniqueness-constraint on rpm_updatecollection of (name, update-record-id)

Definitely a Fun migration...

A thought : if the API doesn't expose creating/linking updatecollection to updaterecords directly, then we just have to understand how advisory-upload/create and sync use this model - the data may not be as broken as the db-schema allows it to be. It would let us skip steps 5-6 above.

#8 Updated by ggainey 3 months ago

  • Related to Task #7195: Enforce correct uniqueness for UpdateCollection at the Model level added

#9 Updated by rchan 3 months ago

  • Sprint changed from Sprint 77 to Sprint 78

#10 Updated by rchan 3 months ago

  • Sprint changed from Sprint 78 to Sprint 79

#11 Updated by pulpbot 2 months ago

  • Status changed from ASSIGNED to POST

#12 Updated by ggainey 2 months ago

  • Status changed from POST to MODIFIED

#13 Updated by ttereshc 2 months ago

  • Sprint/Milestone set to 3.6.0

#14 Updated by pulpbot 2 months ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE

Please register to edit this issue

Also available in: Atom PDF