Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REFACTOR: From os.path to pathlib #5495

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from
Draft

REFACTOR: From os.path to pathlib #5495

wants to merge 2 commits into from

Conversation

jvela018
Copy link
Contributor

Description

Refactoring pyaedt to use python's documentation suggested replacement of os.path, namely pathlib

Issue linked

#5180

Checklist

  • I have tested my changes locally.
  • I have added necessary documentation or updated existing documentation.
  • I have followed the coding style guidelines of this project.
  • I have added appropriate tests (unit, integration, system).
  • I have reviewed my changes before submitting this pull request.
  • I have linked the issue or issues that are solved by the PR if any.
  • I have agreed with the Contributor License Agreement (CLA).

@ansys-reviewer-bot
Copy link
Contributor

Thanks for opening a Pull Request. If you want to perform a review write a comment saying:

@ansys-reviewer-bot review

@jvela018 jvela018 added the enhancement New features or code improvements label Nov 28, 2024
@jvela018 jvela018 changed the title REFACTOR!: From os.path to pathlib REFACTOR: From os.path to pathlib Nov 28, 2024
@gmalinve gmalinve requested review from jsalant22 and removed request for jsalant22 November 28, 2024 12:00
Copy link
Collaborator

@SMoraisAnsys SMoraisAnsys left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please Path from pathlib unless there is a reason not to ?
This would require to add imports in files such as from pathlib import Path.
Also, could you join path instances with / as follow:

path = Path("some/path/i/work")
other_path = path / "with" 

other_path will lead to "some/path/i/work/with"

@jvela018
Copy link
Contributor Author

jvela018 commented Nov 29, 2024

@SMoraisAnsys ,

Please not that the purpose of this draft was two fold:

  • Expose my branch and PR so that in the case that I can't continue, or someone else wants to assist in the migration process is enabled to do so
  • Test against CI before the final PR

Having said that, this is PR is nowhere near to be reviewed - that's why I hadn't assigned reviewers.

With respect to your review:

First point

Could you please Path from pathlib unless there is a reason not to ? This would require to add imports in files such as from pathlib import Path.

  • I was planning on doing so. The reason why I didn't do it initially was because I wasn't sure what data types I'd need. Note that for the most part, I'd be using PurePath and Path, not only Path.

  • The reasoning for using PurePath is because PurePath is exactly that, a path without IO operations, so in the case of path definitions it is reasonable to not check with the OS unless prompted. Concrete Path(s), inherit from PurePath but also provide IO operations. See: https://docs.python.org/3/library/pathlib.html

  • To make it simpler to anyone who wants to add to this PR (not needed at the moment though). There's a table for the mapping from os.path to pathlib in the python documentation

image

  • In summary, I'll do

from pathlib import Path PurePath

which is simply an extension of what you requested. I just wanted to clarify the use of PurePath as well.

  • A last thing to note is that we need to be very careful with the use of pathlib as it returns a Path object, whereas the os.path operation returns a string . I'm only mentioning this because this change will affect the way we define certain parameters. For instance, in the following image:

image

the project name needs to be converted to a string, otherwise it will break the examples. We should decide if we want to make the str casting in the example itself or in the member of the object itself.

Second point

Also, could you join path instances with / as follow:

path = Path("some/path/i/work")
other_path = path / "with" 

other_path will lead to "some/path/i/work/with"

  • I am joining paths using PurePath.joinpath() for two reasons:

    • Readeability
    • Consistency
  • Although, the most pythonic way of doing things would be to use a slash '/', in our context it becomes unreadable. With more than 3 path members in the way that paths have been defined becomes hard to read. Granted, this could also be my preference and we can agree on changing it if most people agree but to me one looks more readable than the other:

PurePath(edt_root) .joinpath("commonfiles", "CPython", python_version_new, "winx64", "Release", "python", "python.exe" )

versus

PurePath(edt_root) /"commonfiles"/ "CPython"/ python_version_new/ "winx64"/ "Release"/ "python"/ "python.exe"

In this example it's easier to read because it's a single line, but toss it into the rest of the code and sometimes it's hard to read, even more if you need to break the line and need to add . Granted you can also let pycharm add a line break but it would wrap things extra parentheses. Once again, it's about readability.

  • On the consistency front, it is because in the case of using '/' as the separator, it will only work as long as you don't have concatenation operations. So even if we agree on using slash as our separator, I'd rather use .joinpath() because then every path definition is defined the same

Let's take the following example
PurePath(self.toolkit_directory).joinpath(name + ".results")

The alternative would be

PurePath(self.toolkit_directory) / str( name + ".results")

  • This is also a simple case, but in some cases there are variations of string literals and variables in the same path definition. So we can choose the '/' operator but it would involve also casting every case like the example above. My main argument is that I rather have all paths defined the same way, than having two separate syntax styles. On a similar note, for those coming from os.path, would be able to read it easier as well because it's the same notation by the use of commas ',' as path separator

os.path.join(edt_root, "commonfiles", "CPython", python_version_new, "winx64", "Release", "python", "python.exe")

pathlib.PurePath(edt_root) .joinpath("commonfiles", "CPython", python_version_new, "winx64", "Release", "python", "python.exe")

One last thing I want to mention is that as I said in the beginning, this PR is nowhere to be reviewed. The reason is because once I finish the trickier cases with .replace and such, I believe the best practice would be to refactor the way paths are defined all together with reusable functions/methods get/set path, extension, file name. Otherwise it'll be quite hard to maintain for any future changes that may occur adding to our technical debt.

Please let me know if this addresses the points made, and you agree with my coding style. Otherwise, I'll just go ahead and make the requested changes. Let me know what you think.

/Jon

FYI: @gmalinve @Samuelopez-ansys @maxcapodi78

@SMoraisAnsys
Copy link
Collaborator

I'll go with the general feedback but I clearly prefer using "/" and avoiding joinpath. The main reason is that my brain easily classify the associated code as a path while it's not the case for the other.
As for the case of string literal and variables, I don't see the problem
image

Let me know if I'm missing something. Note that I'm biased toward "/" usage because it's something I've come across in multiple projects / repos. However, I'll go with the majority's choice :)

@jvela018
Copy link
Contributor Author

@SMoraisAnsys ,

Honestly, what I'm thinking is to redo everything in reusable functions. Then the implementation would be ok either way. Some of the os.path methods are just very elaborate ways to do very simple tasks- like fetching the name of the file, or the extension of the file. I think it would be better if I just create some generic functions that can be used by others in the future. This way it's easier to maintain. That would imply closing this PR for now, and working on a new PR with a fresh commit. If you agree with this approach, where should I put these methods? Any thoughts on it?

@SMoraisAnsys
Copy link
Collaborator

I'm not sure to understand everything about the method you mentioned. Could you elaborate ?
Following your last message I think this PR should be closed and multiple "simple" PR should be performed. Otherwise you'll struggle to keep that up to date with the main branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New features or code improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants