Story #4488

Updated by daviddavis 5 months ago

We're currently using drf-chunked-uploads[0], but the library appears to have become unmaintained[1] since we adopted it. It also has some quirks and missing features. I think we should move off of it and roll our own code as part of this story.

h2. Solution

This will probably require a design that supports sha256 checksums and parallel uploads of chunks.

h3. Models

h4. Upload

id = UUID
file = File
size = BigIntegerField
user = FK
created_at = DateTimeField
completed_at = DateTimeField

h4. UploadChunk

id = UUID
upload = FK
offset = BigIntegerField
size = BigIntegerField
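
Because chunks may arrive in parallel and in any order, the server needs a way to tell whether the recorded chunks cover the whole upload before it can be committed. The following is only a rough sketch of that check, using plain dataclasses as stand-ins for the Django models above. The @chunks@ list and the @missing_ranges@ helper are hypothetical names that are not part of the proposal, and the @file@, @user@ and @created_at@ fields are omitted for brevity.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class UploadChunk:
    offset: int  # byte position of the chunk within the upload
    size: int    # number of bytes in the chunk
    id: uuid.UUID = field(default_factory=uuid.uuid4)


@dataclass
class Upload:
    size: int  # total size in bytes, declared when the session is created
    id: uuid.UUID = field(default_factory=uuid.uuid4)
    # Stand-in for the reverse side of the UploadChunk.upload FK (assumption).
    chunks: List[UploadChunk] = field(default_factory=list)
    completed_at: Optional[datetime] = None

    def missing_ranges(self) -> List[Tuple[int, int]]:
        """Return (start, end) byte ranges, end exclusive, not covered by any chunk."""
        gaps = []
        position = 0  # first byte not yet known to be covered
        for chunk in sorted(self.chunks, key=lambda c: c.offset):
            if chunk.offset > position:
                gaps.append((position, chunk.offset))
            position = max(position, chunk.offset + chunk.size)
        if position < self.size:
            gaps.append((position, self.size))
        return gaps
```

A commit could then be rejected while @missing_ranges()@ is non-empty. Sorting by offset is what makes the arrival order of chunks irrelevant.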

h3. Workflow

# create the upload session
http POST :24817/pulp/api/v3/uploads/ size=10485760 # returns a UUID (e.g. 345b7d58-f1f8-45d9-d354-82a31eb879bf)
export UPLOAD='/pulp/api/v3/uploads/345b7d58-f1f8-45d9-d354-82a31eb879bf/'

# note the order doesn't matter here
http PUT :24817${UPLOAD} file@./chunkab 'Content-Range:bytes 6291456-10485759/10485760'
http PUT :24817${UPLOAD} file@./chunkaa 'Content-Range:bytes 0-6291455/10485760'

# view the upload and its chunks
http :24817${UPLOAD}

# complete the upload
http PUT :24817${UPLOAD}commit md5=037a47d93670e64f2b1038e6f90e4cfd

# create the artifact from the upload
http --form POST :24817/pulp/api/v3/artifacts/ upload=$UPLOAD
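
On the server side, the workflow above implies two small pieces of logic: turning each @Content-Range@ header into an offset and size for an UploadChunk, and hashing the chunks in offset order at commit time so the result can be compared with the digest the client sent. The sketch below shows one way this might look using only the Python standard library. The function names are hypothetical, and holding chunk data in memory as bytes is a simplification; a real implementation would presumably stream from storage.

```python
import hashlib
import re

# Matches values such as "bytes 0-6291455/10485760"; "*" means total unknown.
CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def parse_content_range(header):
    """Return (offset, size) of the chunk described by a Content-Range value."""
    match = CONTENT_RANGE.match(header.strip())
    if match is None:
        raise ValueError(f"invalid Content-Range: {header!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if end < start:
        raise ValueError("range end precedes range start")
    return start, end - start + 1  # the end position is inclusive


def digest_chunks(chunks, algorithm="sha256"):
    """Hash chunk payloads in offset order.

    `chunks` is an iterable of (offset, data) pairs, in any order.
    """
    hasher = hashlib.new(algorithm)
    for _offset, data in sorted(chunks, key=lambda item: item[0]):
        hasher.update(data)
    return hasher.hexdigest()
```

For example, a header covering bytes 0-6291455 yields offset 0 and size 6291456. Passing the algorithm name as a parameter keeps the commit step open to sha256 as well as the md5 shown in the workflow.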

h3. Additional references

PR against drf-chunked-uploads.