Project

Profile

Help

Issue #2709

closed

saved files may not be safely persisted to disk, even after being "verified"

Added by mhrivnak almost 7 years ago. Updated almost 5 years ago.

Status:
CLOSED - WONTFIX
Priority:
Normal
Assignee:
-
Category:
-
Sprint/Milestone:
-
Start date:
Due date:
Estimated time:
Severity:
2. Medium
Version:
Platform Release:
OS:
Triaged:
Yes
Groomed:
No
Sprint Candidate:
No
Tags:
Pulp 2
Sprint:
Quarter:

Description

Thanks to Tim Waugh for proposing this. While we aren't sure if this has ever caused a real problem, his team did encounter a problem for which this is a theoretically possible cause.

When Pulp adds content to /v/l/p/content/, it does not call flush() or fsync() on the destination. Here is the current copy call, which uses shutil.copy():

https://github.com/pulp/pulp/blob/2.12-release/server/pulp/server/content/storage.py#L134

This could be unsafe. In theory, these events could unfold:

  • file is copied into place. some of the bits are in a buffer
  • file is read back to verify size and/or checksum, and some of the bits are transparently read straight out of the buffer. Verification passes.
  • something goes wrong with the storage, and the bits never all make it from the buffer onto safe persistent storage.

So even though the file was "verified", it ended up being corrupted later. The way to do this safely is after writing bits, call the flush() method of the file object to flush bits out to kernel space, and then call os.fsync() on the file descriptor, which blocks until the kernel writes the bits onto persistent storage.

I verified that shutil.copy() does not call flush() or fsync(). You can trace the code here:

https://github.com/python/cpython/blob/2.7/Lib/shutil.py#L125

This may have a performance impact during sync, especially when a large number of relatively small files are being sync'd. For example, rpm sync could be impacted, as it sometimes can save dozens of RPMs per second. If performance is impacted, we should consider making this an optional behavior controllable via a setting.

Testing this on multiple filesystems is a good idea, especially over NFS.

Actions #1

Updated by ttereshc almost 7 years ago

  • Triaged changed from No to Yes
Actions #2

Updated by bmbouter almost 5 years ago

  • Status changed from NEW to CLOSED - WONTFIX
Actions #3

Updated by bmbouter almost 5 years ago

Pulp 2 is approaching maintenance mode, and this Pulp 2 ticket is not being actively worked on. As such, it is being closed as WONTFIX. Pulp 2 is still accepting contributions though, so if you want to contribute a fix for this ticket, please reopen or comment on it. If you don't have permissions to reopen this ticket, or you want to discuss an issue, please reach out via the developer mailing list.

Actions #4

Updated by bmbouter almost 5 years ago

  • Tags Pulp 2 added

Also available in: Atom PDF