Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE]: Refactor Filesystem Operations to Use pathlib Instead of os.path #1301

Closed
kremnik opened this issue Aug 7, 2024 · 5 comments
Closed
Labels
enhancement New feature or request

Comments

@kremnik
Copy link
Contributor

kremnik commented Aug 7, 2024

Description

I would like to suggest refactoring the codebase to utilize the pathlib library for all filesystem operations instead of os.path. This change will provide improved readability, code consistency, and platform independence.
If this proposal is approved, I could work on it.

Additional Info

No response

@kremnik kremnik added the enhancement New feature or request label Aug 7, 2024
@serengil
Copy link
Owner

serengil commented Aug 7, 2024

Please explain the pros. How it improve the readability, consistency and how it makes the lib platform independent.

@kremnik
Copy link
Contributor Author

kremnik commented Aug 8, 2024

@serengil Yes, of course.

  1. In the codebase, both pathlib and os.path are used. It would be better to use just one library consistently, especially in the files deepface/commons/folder_utils.py and deepface/commons/image_utils.py;
  2. Constructing paths like this output = home + "/.deepface/weights/" + file_name (there are a lot of cases of such path constructing style in the codebase) is not safe due to platform-dependent differences in path separators (/ and \);
  3. The code in deepface/commons/constant.py could be rewritten in a more readable way;
  4. There is inconsistency in how existence checks are performed. Sometimes it is written as if os.path.exists(exact_file_path) is False: and other times as if not os.path.exists(weights_path):. The same inconsistency exists with os.path.isfile;
  5. There is inconsistent use of os.path.join and path concatenation, as mentioned in point 2.

Of course, these points are not crucial, and the code is effective as it stands. However, bringing file operations to a consistent style would improve the code even further.

@serengil
Copy link
Owner

serengil commented Aug 8, 2024

1- we are doing all file related things with os.path. Pathlib is just being used to find the home folder, and given image is Path object (this is added with a PR from community, not really required feature). So, it would be better to give up Pathlib instead of os.path.
2- constructing paths like output = home + "/.deepface/weights/" + file_name makes java programs platform dependent. In python, this is not true. Even in windows, you have to use slash not backslash. If you say this usage is platform dependent, then it wouldn't work in windows but it is working. if you say that using join is more readable, that is fine but i disagree with platform independence thing.
3- don't know how you will do it better
4- I prefer to use 1st usage when I am coding, 2nd usage is most probably coming from a community PR. But that is still totally fine.
5- I am not a fan of join, prefer using slashes when I am coding. These are coming from community PRs.

So, PRs are more than welcome!

@kremnik
Copy link
Contributor Author

kremnik commented Aug 8, 2024

  1. I agree with you; replacing Pathlib with os.path will also result in consistent code;
  2. Yes, you are right if we use these paths only in the Python script. However, if I want to log these paths or use them in shell scripts later, it might be encountered potential problems while moving code from Windows to Linux and vice versa. Meanwhile, Pathlib returns platform-adapted paths. I have provided an example in Snippet1;
  3. An example is provided in Snippet2;
  4. It's totally fine; it's just about code style;
  5. One of the biggest advantages of os.path.join instead of concatenation is handling trailing slashes in directories paths. Consider the following example: output = home + deepface_dir + "/weights/" + file_name. Here, if the home variable doesn't include a trailing slash, path concatenation will fail, but os.path.join(home, deepface_dir, "weights", file_name) will work correctly.

Snippet1:

import os
from pathlib import Path

root = "c:\\f1\\"
path = "f2/data.txt"

print(os.path.join(root, path)) # output: c:\f1\f2/data.txt can't be used in logs, shell scripts, etc.
print(Path(root) / Path(path)) #  output: c:\f1\f2\data.txt correct

Snippet2:

# Now
import os

SRC_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
ROOT_DIR = os.path.dirname(SRC_DIR)

# Suggested
from pathlib import Path

SRC_DIR = Path(__file__).resolve().parent.parent
ROOT_DIR = SRC_DIR.parent

@serengil
Copy link
Owner

serengil commented Aug 9, 2024

@kremnik as i mentioned, PRs are more than welcome

@kremnik kremnik mentioned this issue Aug 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants