Story #5216
Story #3778 (closed): [Epic] As a user, I can run Pulp 3 in a FIPS-enabled environment
As a user, I can configure which checksum types I want to use in Pulp
Description
Background
Some users would like to disallow the use of certain checksums now determined to be insecure, e.g. md5 or sha1. It is desirable to allow users to configure which checksum types they want to use with Pulp.
When does Pulp call checksums?
When the Artifacts themselves are created, a variety of checksums are computed and then stored on the Artifact model's checksum fields.
Feature plan
Introduce a new setting called CONTENT_CHECKSUMS which would identify the set() of checksums that Pulp should be using. Here's an example of the default:
CONTENT_CHECKSUMS = {"md5", "sha1", "sha224", "sha256", "sha384", "sha512"}
In this case, all checksums would be computed and stored as they are today.
If a user configured this with:
CONTENT_CHECKSUMS = {"sha1", "sha224", "sha256", "sha384", "sha512"}
Then all checksums would be computed and used except md5.
If a user configured this with:
CONTENT_CHECKSUMS = {"sha224", "sha256", "sha384", "sha512"}
Then all checksums would be computed and used except md5 and sha1.
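As a rough illustration of the configurations above, the following sketch computes only the configured digests for a blob of data. The helper name compute_digests is hypothetical and not part of Pulp; in Pulp the setting would be read from the Django settings rather than a module-level constant.

```python
import hashlib

# Assumed value of the proposed setting (md5 and sha1 disabled).
CONTENT_CHECKSUMS = {"sha224", "sha256", "sha384", "sha512"}


def compute_digests(data, algorithms=CONTENT_CHECKSUMS):
    """Return {algorithm: hexdigest} for each configured algorithm.

    Hypothetical helper for illustration only.
    """
    # One hasher per configured algorithm; unconfigured ones are never created.
    hashers = {name: hashlib.new(name) for name in algorithms}
    for hasher in hashers.values():
        hasher.update(data)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


digests = compute_digests(b"hello")
```

Because a hasher is only created for configured algorithms, md5 and sha1 are never invoked here, which is the behaviour a FIPS-enabled environment would need.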
sha256 cannot be removed
sha256 cannot be removed and must always be present in CONTENT_CHECKSUMS because Pulp's content addressable storage requires sha256 to lay the files out on disk.
All Pulp processes should refuse to start if sha256 is not present in CONTENT_CHECKSUMS by raising a django.core.exceptions.ImproperlyConfigured exception indicating that sha256 is required in CONTENT_CHECKSUMS.
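A minimal sketch of such a startup check might look like the following. The function name validate_content_checksums is hypothetical, and where exactly Pulp would call it is not specified in this story. The fallback exception class exists only so the sketch runs without Django installed.

```python
try:
    from django.core.exceptions import ImproperlyConfigured
except ImportError:
    # Stand-in so the sketch runs without Django installed.
    class ImproperlyConfigured(Exception):
        pass


def validate_content_checksums(content_checksums):
    """Hypothetical startup check: sha256 must always be configured."""
    if "sha256" not in content_checksums:
        raise ImproperlyConfigured(
            "CONTENT_CHECKSUMS must contain 'sha256': Pulp's content "
            "addressable storage uses it to lay files out on disk."
        )


# Passes silently because sha256 is present.
validate_content_checksums({"sha256", "sha512"})
```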
Model changes
The model changes should likely become:
md5 = models.CharField(max_length=32, null=True, unique=False, db_index=True)
sha1 = models.CharField(max_length=40, null=True, unique=False, db_index=True)
sha224 = models.CharField(max_length=56, null=True, unique=False, db_index=True)
sha256 = models.CharField(max_length=64, null=False, unique=True, db_index=True)
sha384 = models.CharField(max_length=96, null=True, unique=True, db_index=True)
sha512 = models.CharField(max_length=128, null=True, unique=True, db_index=True)
Class attribute re-work
The DIGEST_FIELDS, COMMON_DIGEST_FIELDS, and RELIABLE_DIGEST_FIELDS should become properties which are memoized computations that are built from the configured CONTENT_CHECKSUMS.
Docs
The new setting should have documentation on this page in the Pulp Settings area.
NOTE: this setting must be set before any data is loaded into Pulp and can never be changed afterwards. We do not validate this; it's difficult to validate. Please document this with a .. warning:: block in the settings documentation.
An additional check at Artifact instantiation time
The stages pipeline creates in-memory Artifacts, and these are later used to query the db to see whether those Artifacts exist or not. We need to add a new Artifact.__init__ which checks that all checksum values being set are in the set of CONTENT_CHECKSUMS available. If they are not, raise a TypeError.
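The check described above could be sketched roughly as follows. This uses a plain-Python stand-in for the model; in the real Django model the constructor would also need to call super().__init__(*args, **kwargs). The ALL_CHECKSUM_FIELDS name is an assumption introduced for the example.

```python
# Assumed value of the proposed setting (md5 and sha1 disabled).
CONTENT_CHECKSUMS = {"sha224", "sha256", "sha384", "sha512"}

# Hypothetical list of every checksum field on the model.
ALL_CHECKSUM_FIELDS = ("md5", "sha1", "sha224", "sha256", "sha384", "sha512")


class Artifact:
    """Plain-Python stand-in for the Django model, for illustration only."""

    def __init__(self, **kwargs):
        for name in ALL_CHECKSUM_FIELDS:
            value = kwargs.get(name)
            # Reject any checksum value supplied for a disabled algorithm.
            if value is not None and name not in CONTENT_CHECKSUMS:
                raise TypeError(
                    f"Checksum type '{name}' is not in CONTENT_CHECKSUMS"
                )
            setattr(self, name, value)


ok = Artifact(sha256="a" * 64)
```

With this configuration, Artifact(md5="0" * 32) would raise a TypeError, while fields left unset simply stay None.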
Related issues
Added support for specifying/limiting content-checksums used by Pulp.
settings.ALLOWED_CONTENT_CHECKSUMS now drives the other checksum-related fields of Artifact (DIGEST_FIELDS, COMMON_DIGEST_FIELDS, RELIABLE_DIGEST_FIELDS)
closes #5216