Story #4049
closed
As a user, I can know if an RPM is modular or not
100%
Description
All modular RPMs have the DISTTAG tag (not to be confused with the %{dist} tag) set to the module they were built for.
>>> headers[rpm.RPMTAG_DISTTAG]
'module(nodejs:10:20180813130636:9edba152)'
Presence of 'module(...)' indicates that an RPM is a modular one.
Detailed info (NSVC) about a module the RPM was built for is not reliable and can't be used to identify the module the RPM belongs to. The NSVC indicates which module that RPM was built for originally, but the same RPM can potentially be used in different modules and the DISTTAG won't be updated.
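A minimal sketch of the check described above, assuming the DISTTAG value has already been read from the header (for example via headers[rpm.RPMTAG_DISTTAG]). The function name is_modular_disttag is hypothetical, and handling of bytes and None is an assumption about what the header lookup may return. Only the presence of 'module(...)' is tested; the NSVC inside the parentheses is deliberately ignored because it is not reliable.

```python
import re

# Assumption: a modular DISTTAG looks like 'module(<anything>)'.
# The content inside the parentheses (NSVC) is not inspected.
_MODULE_DISTTAG = re.compile(r"^module\(.*\)$")


def is_modular_disttag(disttag):
    """Return True if a DISTTAG header value marks the RPM as modular."""
    if disttag is None:
        # Tag is absent from the header.
        return False
    if isinstance(disttag, bytes):
        # Assumption: some bindings may hand back bytes rather than str.
        disttag = disttag.decode("utf-8", errors="replace")
    return bool(_MODULE_DISTTAG.match(disttag.strip()))


print(is_modular_disttag("module(nodejs:10:20180813130636:9edba152)"))
```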
Pulp needs this information to perform reliable filtering of modular RPMs.
This is required for applicability calculation.
It can also be helpful for:
- checking module consistency
- finding modular RPMs
- upload of modular RPMs if/when Pulp needs to create a reference to a module
Suggested solution:
Add a new field "modular" to the RPM model which will indicate if RPM is modular or not.
In case of on_demand sync, RPM headers can't be analysed, so the "modular" field can be set only by analysing the module metadata in the modules.yaml file. For all artifacts listed for a module, the "modular" field should be set to True. We will rely on the repodata information provided in modules.yaml during sync, regardless of whether the policy is immediate or on_demand.
In case of upload, the DISTTAG tag from the header can be used as described above.
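The sync part of the suggestion could look roughly like the sketch below. It is not Pulp code: RPMs are plain dicts, the helper names parse_nevra and mark_modular are hypothetical, and it assumes module artifacts are listed as 'name-epoch:version-release.arch' strings. The artifact string in the example is made up for illustration. Every RPM gets an explicit True or False value.

```python
def parse_nevra(nevra):
    """Split 'name-epoch:version-release.arch' into a 5-tuple of strings."""
    rest, arch = nevra.rsplit(".", 1)
    rest, release = rest.rsplit("-", 1)
    name, evr = rest.rsplit("-", 1)
    epoch, sep, version = evr.partition(":")
    if not sep:
        # No explicit epoch in the string; assume epoch 0.
        epoch, version = "0", evr
    return (name, epoch, version, release, arch)


def mark_modular(rpms, module_artifacts):
    """Set rpm['modular'] for every RPM based on module artifact lists.

    rpms: list of dicts with name/epoch/version/release/arch keys.
    module_artifacts: one list of NEVRA strings per module.
    """
    modular = {
        parse_nevra(artifact)
        for artifacts in module_artifacts
        for artifact in artifacts
    }
    for rpm in rpms:
        key = (rpm["name"], rpm["epoch"], rpm["version"],
               rpm["release"], rpm["arch"])
        rpm["modular"] = key in modular
    return rpms


# Hypothetical data for illustration only.
rpms = [
    {"name": "nodejs", "epoch": "1", "version": "10.11.0",
     "release": "1.module_2200+adbac02b", "arch": "x86_64"},
    {"name": "bash", "epoch": "0", "version": "4.4.23",
     "release": "5.fc29", "arch": "x86_64"},
]
artifacts = [["nodejs-1:10.11.0-1.module_2200+adbac02b.x86_64"]]
mark_modular(rpms, artifacts)
print([(r["name"], r["modular"]) for r in rpms])
```

Matching on the full NEVRA tuple rather than on the name alone keeps a non-modular build of the same package from being marked by accident.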
Migration is needed. The "modular" field can be set by analysing existing modules in Pulp. In case there are modular RPMs in Pulp which don't belong to any module, the "modular" flag would be set incorrectly. At this moment the likelihood of that is low:
- there are no production bits for modularity content at the moment, only F29 beta content.
- if module is removed from Pulp, its artifacts/RPMs are removed as well.
Related issues
Updated by ipanova@redhat.com about 6 years ago
during the migration - do we plan to look just into modules information? in case we do so, then the rest of non-modular rpms will miss the 'modular' field, right?
how do we plan to identify ursine modular rpms? analysing each header for rpm ( which is present on fs) does not seem like a good approach.
we could add a validation later to figure out any ursine content
Updated by ttereshc about 6 years ago
I suggest for migration to rely on modulemd content only and set a new field for every RPM either to True or to False.
We can have false negatives but at the moment the probability is very low + we can always verify or fix it with a validation tool.
So +1 to ignore that problem.
+1 for a validation tool later.
For huge repos with an immediate policy, the migration could be too expensive if we look at every RPM on the FS.
Any thoughts from someone else?
Updated by milan about 6 years ago
LGTM;
FWIW I was able to check 150k unit sizes in about 8min with pulp-integrity, checking checksums took about 45 minutes.
Reading the rpm header could indeed take longer than checking the unit sizes: the stats are cached in memory through inodes, but to figure out the flag, each rpm unit header would have to be read and parsed by the rpm lib. It will probably still take less time than the checksum calculation, as only the rpm headers need processing. We could still experiment and measure this way of migration if we are not in too much of a hurry; the code might be simpler too (reusing the upload bits).
Updated by jortel@redhat.com about 6 years ago
Sounds like a reasonable approach/plan to me. The migration should be a best effort using the manifests in the DB. Further migration or correctness should be provided by running a CLI tool at customer request.
Updated by jortel@redhat.com about 6 years ago
- Groomed changed from No to Yes
- Sprint Candidate changed from No to Yes
Updated by ttereshc about 6 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to ttereshc
- Sprint set to Sprint 44
Updated by ttereshc about 6 years ago
- Status changed from ASSIGNED to POST
Updated by milan about 6 years ago
Just a thought: do we want a new check function (plugin) for pulp-integrity for the migration check?
Added by ttereshc about 6 years ago
Added by ttereshc about 6 years ago
Revision 28f4d726 | View on GitHub
Update progress with a custom number of processed items
Updated by ttereshc about 6 years ago
Probably not. We no longer can reliably do any checks. Not all modular RPMs are with modular metadata in a repo. Not all modular RPMs have DISTTAG set :( E.g. F28 modular repo.
Updated by ttereshc about 6 years ago
- Status changed from POST to MODIFIED
- % Done changed from 0 to 100
Applied in changeset 56b7bd07bde22acc01ebae31b6ee2d4f86009588.
Updated by kersom about 6 years ago
- Related to Test #4146: Test is_modular flag in RPM units added
Added by kersom about 6 years ago
Revision 82b9e28a | View on GitHub
Add pulp version required to run CheckIsModularFlagTestCase
Add the minimum Pulp version required in order to run is_modular
test case.
Pulp 2.18.
Updated by ttereshc about 6 years ago
- Status changed from 5 to CLOSED - CURRENTRELEASE
Mark modular RPMs during sync and upload
is_modular for Rpm model
is_modular to True based on data in Modulemd units
closes #4049 https://pulp.plan.io/issues/4049