file iteration #12

Open
kylerlaird opened this issue Jun 26, 2020 · 4 comments

Comments

@kylerlaird

I'm enjoying your document! I've been moving away from Bash to Python scripting for years. I'm always looking for better ways to do it.

I'm curious about the section on iterating through lines of a file. You provide some relatively complicated ways to do it using "with", readline(), etc. Why not simply "for line in open(file):"?

@ghost

ghost commented Jun 26, 2020

Nice work moving away from Bash, @kylerlaird.
I'm quoting from this Python doc; please read the details there.

"It is good practice to use the with keyword when dealing with file objects. The advantage is that the file is properly closed after its suite finishes, even if an exception is raised at some point. Using with is also much shorter than writing equivalent try-finally blocks..."

@kylerlaird
Author

The use of with is superfluous in this case, and the intermediate name "my_file" is pointless.

These should be equivalent:

with open('my_file.txt') as my_file:
    for line in my_file:
        do_stuff_with(line.rstrip())

for line in open('my_file.txt'):
    do_stuff_with(line.rstrip())

@kylerlaird
Author

I agree that with is a good practice and a good tool to have (when needed for an object which is used multiple times). The pattern of "loop through all of the lines in a file" is so common, though, that it's a shame to add extra crud to what should be simple in Python.

@ninjaaron
Owner

It depends on what you're doing. If you're simply looping over a single file, that's fine. However, if you're looping over multiple files, then depending on what happens with the garbage collector, you could run out of OS file descriptors. This would rarely happen in practice in CPython, because it uses reference counting, but it's difficult to guarantee that it wouldn't happen in PyPy, which uses a tracing GC.
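A minimal sketch of that corner case (the glob pattern and the print call are placeholders, not code from the tutorial):

import glob

# Risky on an interpreter with a tracing GC (e.g. PyPy): nothing closes
# these file objects promptly, so a large enough directory could exhaust
# the OS file-descriptor limit.
for path in glob.glob('*.txt'):
    for line in open(path):
        print(line.rstrip())

# Safe: each file is closed deterministically when its with-block exits.
for path in glob.glob('*.txt'):
    with open(path) as f:
        for line in f:
            print(line.rstrip())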

Additionally, while the current implementation of io objects does close the underlying file descriptor when the object is destroyed, that isn't part of the documented semantics and shouldn't be relied upon.

You're right that with is frequently unnecessary, and I don't always use it in my personal scripts. However, the whole point of this tutorial is to help people write scripts that are safe and maintainable by replacing Bash with Python. Running out of file descriptors in a corner case is exactly the kind of nonsense I'm trying to help people avoid. I'm not going to skimp on a best practice to avoid one extra line of code, especially if I'm teaching others.
