Task #7178
closed
Recommended installation layout
100%
Description
While trying to address https://projects.theforeman.org/issues/30423 I was looking at the recommended layout. There are a few things I'd suggest doing differently.
https://docs.pulpproject.org/settings.html#media-root suggests /var/lib/pulp with mode 750 and SELinux context var_lib_t. I disagree with all of these.
MEDIA_ROOT should be its own directory (so /var/lib/pulp/media) because then all files within it are owned by Pulpcore. This means you can use unreferenced_files from django-extensions without adjustment.
If the directory is /var/lib/pulp then mode 750 means Apache can't read files from the directory. This conflicts with Pulp 2 and serving assets directly via Apache. This is not a concern when /var/lib/pulp/media is used.
The SELinux context should be pulpcore_var_lib_t so Apache and other services are denied access to media files. Again, this can only be done when it's in a subdirectory.
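To illustrate the first point, here is a minimal sketch of the idea behind listing unreferenced files. It is not the django-extensions implementation; the function name, the file names, and the `known` set standing in for the database are all assumptions made for illustration.

```python
import os
import tempfile
from pathlib import Path


def unreferenced_files(media_root, referenced):
    """Return sorted relative paths under media_root that are not in `referenced`."""
    root = Path(media_root)
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add((Path(dirpath) / name).relative_to(root).as_posix())
    return sorted(found - set(referenced))


def demo():
    # Hypothetical layout: one tracked artifact plus one static asset.
    with tempfile.TemporaryDirectory() as base:
        base = Path(base)
        (base / "media" / "artifact").mkdir(parents=True)
        (base / "assets").mkdir()
        (base / "media" / "artifact" / "abc123").write_text("data")
        (base / "assets" / "site.css").write_text("body {}")

        # Shared root (like MEDIA_ROOT=/var/lib/pulp): the asset is reported
        # as unreferenced although it is not a media file at all.
        shared = unreferenced_files(base, {"media/artifact/abc123"})

        # Dedicated root (like MEDIA_ROOT=/var/lib/pulp/media): only media
        # files are scanned, so nothing is falsely reported.
        known = {"artifact/abc123"}  # stands in for paths stored in the database
        dedicated = unreferenced_files(base / "media", known)
        return shared, dedicated
```

With a shared root, every file that merely lives next to the media files is reported, which is the adjustment a dedicated directory avoids.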
Looking at https://github.com/pulp/pulpcore/blob/09d05da6f38c74e3574d3d256f890b23d76cb3d0/pulpcore/app/settings.py#L36-L43 there are various settings that are derived from MEDIA_ROOT, which is IMHO incorrect. https://docs.djangoproject.com/en/2.2/ref/settings/#media-root states:
"Absolute filesystem path to the directory that will hold user-uploaded files"
STATIC_ROOT should not be part of MEDIA_ROOT. A common pattern is to introduce a setting (like ROOT_DIR or similar) and derive all locations from it. Then in production mode you can set ROOT_DIR to /var/lib/pulp and automatically get all recommended directories correct.
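A minimal sketch of that pattern as a Django settings fragment. The setting name ROOT_DIR follows the suggestion above and is hypothetical, as is the exact set of derived paths; this is not pulpcore's actual settings.py.

```python
from pathlib import Path

# Hypothetical base setting; the name and default are assumptions for illustration.
ROOT_DIR = Path("/var/lib/pulp")

# Each location derives from the base setting, not from MEDIA_ROOT.
MEDIA_ROOT = str(ROOT_DIR / "media")    # user-uploaded files only
STATIC_ROOT = str(ROOT_DIR / "assets")  # collected static files, outside MEDIA_ROOT
```

Because both paths hang off one base, a deployment only needs to override ROOT_DIR to move the whole layout, and MEDIA_ROOT never contains the static assets.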
Then there is the SELinux policy. As mentioned previously, it doesn't set the SELinux type for all of MEDIA_ROOT, but only for the artifact directory. This is IMHO incorrect.
It also sets the SELinux type for assets (which is the default directory for STATIC_ROOT) to pulpcore_var_lib_t, but Apache isn't allowed to serve that. However, serving these files directly via Apache is the most efficient way to serve them. Perhaps this could use the type httpd_sys_content_t, but I don't know if that's appropriate. A more experienced SELinux dev should weigh in on this.
Lastly, there are the SELinux types for bin/gunicorn and bin/rq. In the Katello RPM packaging these live in /usr/bin and don't get a label. I don't think it's appropriate to label those files pulpcore_exec_t, but the result is that the Pulp services run unconfined in the Katello deployment. A common pattern is to create wrappers in /usr/libexec. The systemd services could then call these wrappers with the correct context.
Related issues
Change the default deployment layout
This changes the default deployment layout. The main change is that MEDIA_ROOT gets its own directory. This allows limiting the file permissions in a shared Pulp 2 + Pulp 3 deployment and the SELinux file contexts. Another benefit is compatibility with django_extensions' unreferenced_files command which lists all files in MEDIA_ROOT that are not in the database.
Other paths are kept at the same absolute paths, but the diff looks bigger because they used to derive from MEDIA_ROOT.
The documentation is updated to show the latest best practices.
fixes #7178