Issue #4602

/var/lib/pulp/tmp/ seems to violate the FHS

Added by 7 months ago. Updated 6 months ago.

2. Medium
Sprint 51


Pulp 3 currently uses /var/lib/pulp/tmp/. This seems to be configurable as WORKING_DIRECTORY.

However, I re-reviewed the FHS (Filesystem Hierarchy Standard, which is part of the Linux Standard Base and approximately followed by flavors of Unix). The working directory should probably not be /var/lib/pulp/tmp/. It should probably be /var/tmp/pulp/.

A good question to ask is: "Can a system administrator, at their discretion, delete the contents of WORKING_DIRECTORY while Pulp is stopped without breaking or resetting their Pulp application state?" If "yes", then it should be moved to /var/tmp/pulp/.

Note that Pulp previously stopped using /var/cache/pulp (a different path, but FHS compliant) to consolidate on /var/lib/pulp/tmp in issue 3406.

However, this task is about the entire usage of the directory.

Associated revisions

Revision eee07e36
Added by bmbouter 6 months ago

Deployment note on volumes for working dir

There was not enough info for users to understand the relationship
between the WORKING_DIRECTORY and MEDIA_ROOT settings. This note
clarifies this.
closes #4602


#1 Updated by bmbouter 7 months ago

One consideration to help inform a decision in this area: subdirectories inside the temporary directory are where downloaders running inside tasks write their data. In almost all cases, after the file is saved there it is moved into place. Typically this place is on the local filesystem in the /var/lib/pulp/artifacts/... area. One performance consideration is that moving the file to its final location should not require the bits to be written out again. That is desirable at least.

One other FYI is that the final location can't be known until after the data's sha256 hash is calculated, because the hash produces the filepath where the data is to be stored. So we can't fully trust the final location until we have already saved all of the data to disk in the tmp area.
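The stage-then-move flow described above can be sketched roughly as follows. This is an illustration only, not Pulp's actual code: the function name stage_and_store, its parameters, and the two-character subdirectory layout are all assumptions made for the example.

```python
import hashlib
import os
import tempfile


def stage_and_store(chunks, working_dir, artifact_root):
    """Write chunks to a temp file, hash them, then move the file into place.

    Hypothetical helper: the name, arguments, and path layout are assumptions.
    """
    os.makedirs(working_dir, exist_ok=True)
    digest = hashlib.sha256()

    # Stage the data inside the working directory, hashing as it is written.
    with tempfile.NamedTemporaryFile(dir=working_dir, delete=False) as staged:
        for chunk in chunks:
            staged.write(chunk)
            digest.update(chunk)

    # Only now is the final path known, because it is derived from the hash.
    hexdigest = digest.hexdigest()
    final_path = os.path.join(artifact_root, hexdigest[:2], hexdigest[2:])
    os.makedirs(os.path.dirname(final_path), exist_ok=True)

    # os.replace is a rename. On the same filesystem no data is rewritten;
    # across filesystems it raises OSError and a full copy would be needed.
    os.replace(staged.name, final_path)
    return final_path
```

Because the rename only avoids a second write when both directories share a filesystem, where the working directory lives has a direct effect on performance.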

Just some things to think about along with this change.

#2 Updated by 7 months ago

When using S3, the temporary location doesn't matter. However, when using local storage, you would definitely get better performance if the temp directory were on the same physical device as the artifact storage.
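One way an administrator might check whether two directories share a device is to compare the device IDs reported by os.stat. This is a minimal sketch; on_same_filesystem is a hypothetical helper, and it assumes both paths already exist.

```python
import os


def on_same_filesystem(path_a, path_b):
    """Return True if both existing paths report the same device ID."""
    # st_dev identifies the device that holds each path.
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev
```

For example, on_same_filesystem("/var/lib/pulp/tmp", "/var/lib/pulp/artifacts") would be expected to return True on a default layout where /var/lib/pulp is a single volume.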

#3 Updated by 7 months ago

What if the default location for the temp directory was /var/tmp/pulp? At the same time, the installer could create a symlink to it from the rest of Pulp's storage to improve performance.

#4 Updated by bmbouter 7 months ago

The great thing about keeping them both inside /var/lib/pulp/ is that the installer doesn't need the extra responsibility, so the layout provides the performance benefit natively.

The best thing I can think of to do is to document this in the docs. Perhaps with a Sphinx ".. note::" directive that recommends hosting the working directory on the same volume as your backend storage when artifacts are hosted locally on that storage?

Isn't it true that any Python code that wants to use temp directories will still get /var/tmp/? So in that way Pulp is using the Django temp directory feature to save staged data there, and not really using it as the FHS temp dir. Is this right?
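For reference, Python's standard tempfile module chooses its default directory from the TMPDIR, TEMP, and TMP environment variables first, and then falls back to a platform-specific list, so the answer depends on the environment. The sketch below shows one way to observe that; resolve_tempdir is a hypothetical helper written for this illustration.

```python
import os
import tempfile


def resolve_tempdir(env_value):
    """Return the directory tempfile picks when TMPDIR is set to env_value."""
    old_env = os.environ.get("TMPDIR")
    old_cached = tempfile.tempdir
    try:
        os.environ["TMPDIR"] = env_value
        # Clear the cached choice so gettempdir() recomputes it.
        tempfile.tempdir = None
        return tempfile.gettempdir()
    finally:
        # Restore the previous state so callers are not affected.
        tempfile.tempdir = old_cached
        if old_env is None:
            os.environ.pop("TMPDIR", None)
        else:
            os.environ["TMPDIR"] = old_env
```

Calling tempfile.gettempdir() directly on a deployed host shows which directory plain Python code would actually use there.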

#5 Updated by 6 months ago

  • Triaged changed from No to Yes
  • Sprint set to Sprint 51

#6 Updated by bmbouter 6 months ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to bmbouter
  • Tags Pulp 3 added

#7 Updated by bmbouter 6 months ago

  • Status changed from ASSIGNED to POST

#8 Updated by bmbouter 6 months ago

  • Status changed from POST to MODIFIED

#9 Updated by bmbouter 6 months ago

  • Status changed from MODIFIED to POST

#10 Updated by bmbouter 6 months ago

  • Status changed from POST to MODIFIED

#11 Updated by bmbouter 6 months ago

  • Tags deleted (Pulp 3)
