-
Notifications
You must be signed in to change notification settings - Fork 78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
When copying Model-Instances as per django documentation, adjusting either deletes the File for the other #98
Comments
I'm currently considering adding this pre_save signal: def pre_save_file_field_containing_model(
sender: Type[Model], instance: Model, **kwargs):
"""
Handle the pre-save signal of a Model with a FileField.
These Models should all check, if the FileField they contain contains
a non-unique FileField and enforce uniqueness.
"""
if instance.pk is not None:
return
file_fields = list(get_fields_for_model(get_model_name(sender)))
for file_field in file_fields:
if getattr(instance, file_field):
other_exist = sender.objects.filter(
**{file_field: getattr(instance, file_field)}).exists()
if other_exist:
file = getattr(instance, file_field)
new_path = file.storage.get_available_name(file.path)
shutil.copyfile(file.path, new_path)
new_path = path.relpath(new_path, file.storage.location)
setattr(instance, file_field, new_path) Propably it's not the best way to write this in a long shot but it gets the job done, enforcing uniqueness in future copy operations |
Thank you :) A specific mention of "unique" or that this happens when copying instances would have been awesome to stop this in future from happening but a bit of text on this topic allready helps a lot :) For anyone finding this Issue later down the line because they duplicated a model as per django documentation and now noticed that a file is missing: What we did after figuring this outWe ran this code to find out how many hundreds of files were affected across the years: from django_cleanup import cache
from django.db.models import Count, Q
import os
import shutil
count_not_found = 0
filenames = []
for model in cache.cleanup_models():
model_name = cache.get_model_name(model)
file_fields = list(cache.get_fields_for_model(model_name))
for file_field in file_fields:
instances = list(model.objects.all().exclude(Q(**{file_field:''})|Q(**{file_field:None})))
for instance in instances:
file = getattr(instance, file_field)
if not os.path.exists(file.path):
count_not_found += 1
print(f'{model_name} | {file_field} | {str(instance.pk)} | {file.path}') We attached the above mentioned Signal to one of our apps, that helps stop this from happening. carefull on this one, this can result in more file-lossWe regenerated our few hundred files from years of backups and after trying it on local testing copies of the live system because the warnings are shown after the file is already gone (renamed) ran this code to get the few hundred duplicate files unique: from django_cleanup import cache
from django.db.models import Count, Q
import os
import shutil
count = 0
count_files = 0
filenames = []
for model in cache.cleanup_models():
model_name = cache.get_model_name(model)
file_fields = list(cache.get_fields_for_model(model_name))
for file_field in file_fields:
files = model.objects.values(file_field).annotate(count=Count('id')).order_by(file_field).filter(count__gt=1).exclude(Q(**{file_field:''})|Q(**{file_field:None}))
for file in files:
filename = file.get(file_field)
instances = list(model.objects.filter(**{file_field: filename}))
first_file = getattr(instances[0], file_field)
if filename in filenames:
print('warning: found file twice')
print(filename)
print(instances)
else:
filenames.append(filename)
if not os.path.exists(first_file.path):
continue
print(filename)
count_files += 1
save_objects = True
for instance in instances:
file_field_instance = getattr(instance, file_field)
storage = file_field_instance.storage
current_base_path = file_field_instance
new_path = storage.get_available_name(file_field_instance.path)
shutil.copyfile(file_field_instance.path, new_path)
updated_path = os.path.relpath(new_path, storage.location)
if len(updated_path) < 100:
setattr(instance, file_field, updated_path)
else:
save_objects = False
count += 1
print(f'{model_name} | {file_field} | {str(instance.pk)}')
if save_objects:
for instance in instances:
instance.save() |
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
Hi,
The Django documentation mentions this method for copying Model-Instance: https://docs.djangoproject.com/en/4.0/topics/db/queries/#copying-model-instances as per default Django behaviour this results in a non-unique filename. If one of them is afterwards adjusted, #51 takes place and The file is deleted from the other copy. Could you please mention in your documentation/enforce in some way Uniqueness in FileFields so that Files with the same name are not deleted as easily? :)
The text was updated successfully, but these errors were encountered: