Project

Profile

Help

Issue #7316

closed

Files are not being deleted from storage when calling the method delete()

Added by lmjachky over 3 years ago. Updated almost 3 years ago.

Status:
CLOSED - CURRENTRELEASE
Priority:
Normal
Assignee:
Category:
-
Sprint/Milestone:
Start date:
Due date:
Estimated time:
Severity:
2. Medium
Version:
Platform Release:
OS:
Triaged:
Yes
Groomed:
No
Sprint Candidate:
No
Tags:
Sprint:
Quarter:

Description

According to the docs, FileField does not handle the removal of files: https://docs.djangoproject.com/en/3.1/releases/1.3/#deleting-a-model-doesn-t-delete-associated-files

PulpTemporaryFile contains such a field. When calling the method delete(), the file is not removed from the storage. We need to create our custom handlers for that.


Related issues

Related to Pulp - Issue #7676: django-cleanup can silently delete files being used by existing artifacts in some circumstancesCLOSED - CURRENTRELEASEdalleyActions
Related to Pulp - Issue #8295: Disc Usage during Repository SyncCLOSED - CURRENTRELEASElmjachkyActions
Related to Pulp - Backport #8757: backport 7316 to 3.11: Files are not being deleted from storage when calling the method delete()CLOSED - CURRENTRELEASEbmbouter

Actions
Actions #1

Updated by pulpbot over 3 years ago

  • Status changed from NEW to POST
Actions #2

Updated by fao89 over 3 years ago

  • Triaged changed from No to Yes
  • Sprint set to Sprint 79
Actions #3

Updated by daviddavis over 3 years ago

We should avoid cherry picking this issue to a z-stream branch and instead align it with the 3.7 release since it involves a major change in functionality for FileFields.

Added by Lubos Mjachky over 3 years ago

Revision f508d6b2 | View on GitHub

Delete associated files from the storage

closes #7316

Actions #4

Updated by Anonymous over 3 years ago

  • Status changed from POST to MODIFIED
Actions #5

Updated by bmbouter over 3 years ago

  • Sprint/Milestone set to 3.7.0
Actions #6

Updated by pulpbot over 3 years ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE
Actions #7

Updated by daviddavis over 3 years ago

  • Related to Issue #7676: django-cleanup can silently delete files being used by existing artifacts in some circumstances added
Actions #8

Updated by dalley over 3 years ago

  • Status changed from CLOSED - CURRENTRELEASE to NEW
  • Sprint/Milestone deleted (3.7.0)
  • Sprint changed from Sprint 79 to Sprint 84

This change had some unintended consequences which unfortunately our test suite didn't catch. Due to the severity of the problem, I am reverting this commit immediately to fix the releases - so we will need to find another approach to fixing this issue (or modify it such that the problem does not recur)

https://pulp.plan.io/issues/7676#note-8 https://github.com/pulp/pulpcore/pull/985

Actions #9

Updated by rchan over 3 years ago

  • Sprint changed from Sprint 84 to Sprint 85
Actions #10

Updated by rchan over 3 years ago

  • Sprint changed from Sprint 85 to Sprint 86
Actions #11

Updated by rchan over 3 years ago

  • Sprint changed from Sprint 86 to Sprint 87
Actions #12

Updated by rchan about 3 years ago

  • Sprint changed from Sprint 87 to Sprint 88
Actions #13

Updated by rchan about 3 years ago

  • Sprint changed from Sprint 88 to Sprint 89
Actions #14

Updated by rchan about 3 years ago

  • Sprint changed from Sprint 89 to Sprint 90
Actions #15

Updated by daviddavis about 3 years ago

  • Sprint/Milestone set to 3.12.0
  • Sprint deleted (Sprint 90)
Actions #16

Updated by dalley about 3 years ago

  • Related to Issue #8296: Pulp worker directories not cleaned up added
Actions #17

Updated by dalley about 3 years ago

  • Related to deleted (Issue #8296: Pulp worker directories not cleaned up)
Actions #18

Updated by dalley about 3 years ago

  • Related to Issue #8295: Disc Usage during Repository Sync added
Actions #19

Updated by ipanova@redhat.com almost 3 years ago

  • Sprint/Milestone changed from 3.12.0 to 3.13.0
Actions #20

Updated by daviddavis almost 3 years ago

I believe we may be handling this already in our code (see HandleTempFilesMixin). It would be good to test this out and confirm though.

Actions #21

Updated by lmjachky almost 3 years ago

daviddavis wrote:

I believe we may be handling this already in our code (see HandleTempFilesMixin). It would be good to test this out and confirm though.

This was the main reason for introducing django-cleanup to our project: https://github.com/pulp/pulpcore/pull/844#discussion_r469379057. Still, I agree with the reassessment.

Actions #22

Updated by lmjachky almost 3 years ago

If the files are not being deleted for real, we will probably need to implement a custom signal handler, as I did here: https://github.com/pulp/pulp_container/pull/133/files.

Actions #23

Updated by bmbouter almost 3 years ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to bmbouter
Actions #24

Updated by bmbouter almost 3 years ago

I tested this and I can say that at least when you delete an Artifact it also deletes the data from the storage backend.

Here's a simple reproducer:

import os

import django
django.setup()

from pulpcore.plugin.models import Artifact

path = '/tmp/asdfsfsdfsdfs'
with open(path, 'w') as f:
    f.write('asdfasdfasdf')


artifact = Artifact.init_and_validate(path)
artifact.save()

storage_path = '/var/lib/pulp/media/artifact/42/1c76d77563afa1914846b010bd164f395bd34c2102e5e99e0cb9cf173c1d87'
if os.path.exists(storage_path):
    print(f"{storage_path} exists!")

artifact.delete()

if not os.path.exists(storage_path):
    print(f"{storage_path} was deleted!")

When I run that I get:

/var/lib/pulp/media/artifact/42/1c76d77563afa1914846b010bd164f395bd34c2102e5e99e0cb9cf173c1d87 exists!
/var/lib/pulp/media/artifact/42/1c76d77563afa1914846b010bd164f395bd34c2102e5e99e0cb9cf173c1d87 was deleted!
Actions #25

Updated by bmbouter almost 3 years ago

Additionally, if I sync a file repo against an empty Pulp system and tree the /var/lib/pulp/media/ folder I get:

tree /var/lib/pulp/media/
/var/lib/pulp/media/
└── artifact
    ├── 8a
    │   └── 46213a107d6dde3c5f0e74f2ded19a1a9767be2ec0d14d274f4908db8cb836
    ├── 98
    │   └── 420a2eb7a86766e2e8a0ec4461b4022f754359cd6a3f5f1192b2eaf7c62672
    ├── d4
    │   └── 1dd7b1699638bf81ee78f197f761ec18ec61d7bf566880f9c1c088e946d8b7
    └── de
        └── 1e4813fedff7ee0223d63c4aaba1b9b3399cbff0edf059083be7ab1a157f9d

5 directories, 4 files

If I then delete the publication and run orphan cleanup with:

pulp file publication destroy /pulp/api/v3/publications/file/file/f8bc0232-2dbd-472b-9f48-660fb648ad09/
pulp orphans delete

tree then shows me:

/var/lib/pulp/media/
└── artifact
    ├── 8a
    │   └── 46213a107d6dde3c5f0e74f2ded19a1a9767be2ec0d14d274f4908db8cb836
    ├── 98
    │   └── 420a2eb7a86766e2e8a0ec4461b4022f754359cd6a3f5f1192b2eaf7c62672
    ├── d4
    └── de
        └── 1e4813fedff7ee0223d63c4aaba1b9b3399cbff0edf059083be7ab1a157f9d

5 directories, 3 files

After that if I delete the repository and run orphan cleanup with:

pulp file repository destroy --name repo3692
pulp orphans delete

tree shows me all files are deleted:

/var/lib/pulp/media/
└── artifact
    ├── 8a
    ├── 98
    ├── d4
    └── de

5 directories, 0 files
Actions #26

Updated by daviddavis almost 3 years ago

Here is a test script to check whether deleting uploads are deleting upload chunk files:

curl -O https://fixtures.pulpproject.org/file-large/1.iso
split --bytes=6M 1.iso chunk
export UPLOAD=$(http POST :/pulp/api/v3/uploads/ size=`ls -l 1.iso | cut -d ' ' -f5` | jq -r '.pulp_href')
http --form PUT :$UPLOAD file@./chunkab 'Content-Range:bytes 6291456-10485759/*'
http --form PUT :$UPLOAD file@./chunkaa 'Content-Range:bytes 0-6291455/*'
http DELETE :$UPLOAD
ls /var/lib/pulp/media/upload
Actions #27

Updated by bmbouter almost 3 years ago

Using the reproduer from comment 26, I've confirmed that this is an issue. After the DELETE the /var/lib/pulp/media/upload/ dir does contain:

/var/lib/pulp/media/upload
├── 72df4f1f-9c92-45ad-9b9b-90f4e2a9b86a
└── 76ac4feb-9b61-4a24-88a3-e748c4563886

I'll try to resolve the deletion of those uploads also now.

Actions #28

Updated by pulpbot almost 3 years ago

  • Status changed from ASSIGNED to POST

Added by bmbouter almost 3 years ago

Revision b38df5ff | View on GitHub

Fixes deletion of UploadChunk when Upload deletes

According to the Django docs, the on_delete=models.CASCADE option does not call Model.delete(), but it does call the Model.post_delete() signal handler.

This replaces the UploadChunk.delete() with an UploadChunk.post_delete() signal handler which correctly handles the UploadChunk.file file deletion.

closes #7316

Actions #29

Updated by bmbouter almost 3 years ago

  • Status changed from POST to MODIFIED
Actions #30

Updated by daviddavis almost 3 years ago

  • Related to Backport #8757: backport 7316 to 3.11: Files are not being deleted from storage when calling the method delete() added
Actions #31

Updated by pulpbot almost 3 years ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE

Also available in: Atom PDF