Task #1181
Story #1150 (closed): As a user, I can lazily fetch repositories
Create a download_unit platform task
0%
Description
A platform Celery task needs to exist that allows a single unit to be downloaded and saved. This task does not need a TaskStatus, but it does need a reservation. The task needs to know which unit to download; it will simply download the unit and put it on disk. Any existing symlinks will already point to the unit's final location, so moving the file into place is all that is necessary for every repo associated with the unit to begin using it.
The lock requirement ensures that if multiple users request a file, the resulting download_unit tasks are processed serially, so the copy into place cannot be corrupted by concurrent writes. If those writes came from multiple machines to the same NFS backend, the file could be corrupted. The lock tag for this type of task should be enough to guarantee it is unique in the catalog. I expect this to be the "unit id + the unit path". Together these should handle both ContentUnits and SharedContentUnits uniquely.
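To illustrate the locking idea, here is a minimal sketch of a lock tag built from the unit id and unit path, with tasks sharing a tag run one at a time. It uses an in-process `threading.Lock` as a stand-in for the platform's reservation system; `lock_tag`, `run_reserved`, and the `id:path` tag format are hypothetical names and choices made for this example, not Pulp's actual API.

```python
import threading
from collections import defaultdict


def lock_tag(unit_id, unit_path):
    """Build a hypothetical reservation key from the unit id and unit path."""
    return "{}:{}".format(unit_id, unit_path)


# One lock per tag. This stands in for the platform's resource reservation
# and only serializes work inside a single process.
_locks = defaultdict(threading.Lock)
_locks_guard = threading.Lock()


def run_reserved(tag, work):
    """Run work() while holding the lock for tag.

    Calls that share a tag execute serially; calls with different tags
    may run concurrently.
    """
    with _locks_guard:
        lock = _locks[tag]
    with lock:
        return work()
```

Because the path is part of the tag, two units that happen to share an id but live at different paths would not block each other in this sketch.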
Prior to downloading the unit, the code should check whether the file already exists. If it does, the task should exit immediately without doing any work (a noop).
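The check-then-download behavior described above could look roughly like the following sketch. It is plain Python 3 rather than a registered Celery task, and `download_unit`'s signature and the `fetch` callable are assumptions made for illustration. The file is written to a temporary name in the destination directory and then moved into place, so readers following existing symlinks never see a partial file.

```python
import os
import tempfile


def download_unit(unit_path, fetch):
    """Download one unit to unit_path.

    fetch is a hypothetical callable that writes the unit's bytes to an
    open binary file object. Returns False (a noop) if the file already
    exists, True if the unit was downloaded.
    """
    if os.path.exists(unit_path):
        return False  # already downloaded; do no work

    directory = os.path.dirname(unit_path) or "."
    os.makedirs(directory, exist_ok=True)

    # Write next to the final location so the move stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            fetch(tmp)
        os.replace(tmp_path, unit_path)  # move the finished file into place
    except BaseException:
        # Clean up the partial temporary file before re-raising.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return True
```

Calling it a second time with the same path returns `False` without invoking `fetch`, which is the noop case.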
Related issues
Updated by bmbouter over 9 years ago
- Blocks Story #1180: As a user, I can identify a lazy-loader using server.conf and Pulp will redirect for content it does not have added
Updated by rbarlow over 9 years ago
I have an implementation detail proposal to consider:
If multiple clients request the same file, we could queue one task per request with resource locking as you suggested in the description. However, each task could check to see if the file has been downloaded already before it does anything. If it finds that the file already exists, it could exit and simply become a NOOP. This is nice because it is simple, and Celery/Qpid are known to be able to handle lots of tasks per second, so I believe it would still perform well.
Updated by bmbouter over 9 years ago
- Description updated (diff)
rbarlow wrote:
I have an implementation detail proposal to consider:
If multiple clients request the same file, we could queue one task per request with resource locking as you suggested in the description. However, each task could check to see if the file has been downloaded already before it does anything. If it finds that the file already exists, it could exit and simply become a NOOP. This is nice because it is simple, and Celery/Qpid are known to be able to handle lots of tasks per second, so I believe it would still perform well.
This sounds like a good improvement; let's do it. I considered the same idea, but I've previously fixed two bugs involving a "file check" that incorrectly reported that a file existed when it didn't. That was caused by stale inode pointers cached on a client node in an NFS environment after the file had already been deleted from the server. This is a different case because nothing is going to delete the file. I've updated the proposal to have it check the file and NOOP if it exists.
Updated by jortel@redhat.com about 9 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to jortel@redhat.com
Added by jortel@redhat.com about 9 years ago
Revision f5a83095
ref #1181 - add content download task.
Updated by jcline@redhat.com about 9 years ago
- Status changed from ASSIGNED to CLOSED - WONTFIX
- Assignee deleted (jortel@redhat.com)
Since we no longer plan to have a task that is designed to download a unit, I'm going to close this.