Story #4049

As a user, I can know if an RPM is modular or not

Added by ttereshc about 1 year ago. Updated 7 months ago.

Status: CLOSED - CURRENTRELEASE
Priority: Normal
Assignee:
Category: -
Sprint/Milestone:
Start date:
Due date:
% Done: 100%
Platform Release: 2.18.0
Blocks Release:
Backwards Incompatible: No
Groomed: Yes
Sprint Candidate: Yes
Tags: Pulp 2
QA Contact:
Complexity:
Smash Test:
Verified: No
Verification Required: No
Sprint: Sprint 44

Description

All modular RPMs have the DISTTAG tag (not to be confused with the %{dist} tag) set to the module they were built for.

>>> headers[rpm.RPMTAG_DISTTAG]
'module(nodejs:10:20180813130636:9edba152)'

The presence of 'module(...)' indicates that an RPM is modular.
The detailed module info (NSVC) in the tag is not reliable and can't be used to identify the module the RPM belongs to. The NSVC indicates which module the RPM was originally built for, but the same RPM can potentially be used in different modules, and the DISTTAG won't be updated.
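
A minimal sketch of the check described above, assuming the DISTTAG value has already been read from the RPM header (as in the `headers[rpm.RPMTAG_DISTTAG]` example). The function name `is_modular_disttag` and the exact pattern are illustrative assumptions, not the code that was merged. Only the presence of `module(...)` is tested; the NSVC inside it is deliberately ignored because it is not reliable.

```python
import re

# Assumption: a modular DISTTAG looks like 'module(<anything>)'.
_MODULE_DISTTAG = re.compile(r"^module\(.+\)$")


def is_modular_disttag(disttag):
    """Return True if a DISTTAG header value marks the RPM as modular.

    `disttag` may be None (tag absent), bytes, or str, depending on the
    rpm bindings in use.
    """
    if disttag is None:
        return False
    if isinstance(disttag, bytes):
        disttag = disttag.decode("utf-8", errors="replace")
    return _MODULE_DISTTAG.match(disttag) is not None
```

For the header value shown above, `is_modular_disttag('module(nodejs:10:20180813130636:9edba152)')` returns `True`; a missing or empty tag returns `False`.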

Pulp needs this information to perform reliable filtering of modular RPMs.
This is required for applicability calculation.
It can also be helpful for:
  • checking module consistency
  • finding modular RPMs
  • upload of modular RPMs if/when Pulp needs to create a reference to a module

Suggested solution:
Add a new field "modular" to the RPM model which will indicate whether an RPM is modular or not.

In the case of on_demand sync, RPM headers can't be analysed, so the "modular" field can be set only by analyzing the module metadata in the modules.yaml file. For all artifacts of each module, the "modular" field should be set to True. We will rely on the repodata information provided in modules.yaml during sync, whether the policy is immediate or on_demand.
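
The sync-time approach can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged implementation: it assumes the modules.yaml documents have already been parsed into dicts, that RPM artifacts are listed under `data.artifacts.rpms` in `name-epoch:version-release.arch` form, and that RPM units are plain dicts with `name`, `epoch`, `version`, `release`, and `arch` keys. The helper names are hypothetical; the flag is called `is_modular` here, matching the field name used in the associated revision.

```python
def artifact_nevras(module_docs):
    """Collect RPM artifacts listed by all modulemd documents."""
    nevras = set()
    for doc in module_docs:
        if doc.get("document") != "modulemd":
            continue  # skip other document types, e.g. modulemd-defaults
        artifacts = doc.get("data", {}).get("artifacts") or {}
        nevras.update(artifacts.get("rpms") or [])
    return nevras


def rpm_nevra(rpm):
    """Format an RPM unit (assumed dict) as name-epoch:version-release.arch."""
    return "{name}-{epoch}:{version}-{release}.{arch}".format(**rpm)


def mark_modular(rpms, module_docs):
    """Set is_modular to True or False on every RPM, based on module artifacts."""
    modular = artifact_nevras(module_docs)
    for rpm in rpms:
        rpm["is_modular"] = rpm_nevra(rpm) in modular
    return rpms
```

Because the flag is derived only from the metadata, the same logic applies to both immediate and on_demand policies; an RPM that is modular but not listed by any module would be marked False.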

In the case of upload, the DISTTAG tag from the header can be used as described above.

A migration is needed. The "modular" field can be set by analysing existing modules in Pulp. If there are modular RPMs in Pulp which don't belong to any module, the "modular" flag will be set incorrectly. At the moment the likelihood of that is low:
  • there are no production bits for modularity content at the moment, only F29 beta content.
  • if a module is removed from Pulp, its artifacts/RPMs are removed as well.

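Checklist

  • Add a boolean field to the RPM model
  • Set this field during sync by analyzing the modules.yaml
  • Set this field during upload by analyzing a DISTTAG tag
  • Write a migration to add the field by analyzing existing modules
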
Related issues

Related to Pulp - Test #4146: Test is_modular flag in RPM units (CLOSED - COMPLETE)

Associated revisions

Revision 56b7bd07 View on GitHub
Added by ttereshc about 1 year ago

Mark modular RPMs during sync and upload

  • Add new field `is_modular` for Rpm model
  • Inspect headers to mark RPMs as modular during upload
  • Mark RPMs as modular during sync based on artifacts info in modules.yaml
  • Inspect headers to mark RPMs as modular ones upon successful download
  • Migrate all RPMs, set `is_modular` to True based on data in Modulemd units

closes #4049
https://pulp.plan.io/issues/4049

Revision 28f4d726 View on GitHub
Added by ttereshc about 1 year ago

Update progress with a custom number of processed items

re #4049
https://pulp.plan.io/issues/4049

Revision 82b9e28a View on GitHub
Added by kersom 12 months ago

Add pulp version required to run CheckIsModularFlagTestCase

Add the minimum Pulp version required in order to run `is_modular` test case.
Pulp 2.18.

#4049
https://pulp.plan.io/issues/4049

History

#1 Updated by ttereshc about 1 year ago

  • Checklist item changed from Set this field during upload by analyzing a dist tag to Set this field during upload by analyzing a DISTTAG tag
  • Description updated (diff)

#2 Updated by ipanova@redhat.com about 1 year ago

  • Description updated (diff)

#3 Updated by ipanova@redhat.com about 1 year ago

During the migration, do we plan to look just at the modules information? If so, the rest of the non-modular RPMs will miss the 'modular' field, right?
How do we plan to identify ursine modular RPMs? Analysing the header of each RPM (which is present on the FS) does not seem like a good approach.
We could add a validation later to figure out any ursine content.

#4 Updated by ttereshc about 1 year ago

I suggest that the migration rely on modulemd content only and set the new field for every RPM to either True or False.
We can have false negatives, but at the moment the probability is very low, and we can always verify or fix it with a validation tool.
So +1 to ignore that problem.
+1 for a validation tool later.
For huge repos with an immediate policy, the migration could be too expensive if we look at every RPM on the FS.

Any thoughts from someone else?

#5 Updated by milan about 1 year ago

LGTM;

FWIW, I was able to check 150k unit sizes in about 8 min with pulp-integrity; checking checksums took about 45 minutes.
Reading the RPM headers could indeed take longer than checking the unit sizes: the stats are cached in memory through i-nodes, but to figure out the flag, each RPM unit header would have to be read and parsed by the rpm lib. It would probably still take less time than the checksum calculation, as only the RPM headers need processing. We could still experiment with and measure this way of migrating if we're not in too much of a hurry; the code might be simpler too (reusing the upload bits).

#6 Updated by ttereshc about 1 year ago

  • Sprint/Milestone set to 2.18.0

#7 Updated by jortel@redhat.com about 1 year ago

Sounds like a reasonable approach/plan to me. The migration should be a best effort using the manifests in the DB. Further migration or correctness should be provided by running a CLI tool at customer request.

#8 Updated by jortel@redhat.com about 1 year ago

  • Groomed changed from No to Yes
  • Sprint Candidate changed from No to Yes

#9 Updated by ttereshc about 1 year ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to ttereshc
  • Sprint set to Sprint 44

#10 Updated by ttereshc about 1 year ago

  • Checklist item Add a boolean field to the RPM model set to Done
  • Checklist item Set this field during sync by analyzing the modules.yaml set to Done
  • Checklist item Set this field during upload by analyzing a DISTTAG tag set to Done
  • Checklist item Write a migration to add the field by analyzing existing modules set to Done
  • Status changed from ASSIGNED to POST

#11 Updated by milan about 1 year ago

Just a thought: do we want a new check function (plugin) for pulp-integrity for the migration check?

#12 Updated by ttereshc about 1 year ago

Probably not. We can no longer reliably do any checks. Not all modular RPMs come with modular metadata in a repo. Not all modular RPMs have DISTTAG set :( E.g. the F28 modular repo.

#13 Updated by ttereshc about 1 year ago

  • Status changed from POST to MODIFIED
  • % Done changed from 0 to 100

#14 Updated by kersom about 1 year ago

  • Related to Test #4146: Test is_modular flag in RPM units added

#15 Updated by ttereshc about 1 year ago

  • Platform Release set to 2.18.0

#16 Updated by ttereshc about 1 year ago

  • Status changed from MODIFIED to ON_QA

#17 Updated by ttereshc 12 months ago

  • Status changed from ON_QA to CLOSED - CURRENTRELEASE

#18 Updated by bmbouter 7 months ago

  • Tags Pulp 2 added
