change space complexity of linked list's __len__ from O(n) to O(1) #8183
Conversation
@@ -72,7 +72,7 @@ def __len__(self) -> int:
         >>> len(linked_list)
         0
         """
-        return len(tuple(iter(self)))
+        return sum(1 for _ in self)
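For context, a minimal sketch of a singly linked list with the new `__len__` (class and method names here are hypothetical, not the repository's actual code) shows that both versions compute the same length:

```python
class Node:
    """A single node of a singly linked list (illustrative sketch)."""
    def __init__(self, data):
        self.data = data
        self.next = None

class LinkedList:
    """Minimal linked list sketch, not the repo's actual class."""
    def __init__(self):
        self.head = None

    def insert(self, data):
        node = Node(data)
        node.next = self.head
        self.head = node

    def __iter__(self):
        node = self.head
        while node:
            yield node.data
            node = node.next

    def __len__(self):
        # New version: O(n) time, O(1) extra space
        return sum(1 for _ in self)

ll = LinkedList()
for i in range(5):
    ll.insert(i)

old_len = len(tuple(iter(ll)))  # old version: builds an O(n) temporary tuple
assert len(ll) == old_len == 5
```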
The sum will call iter(self) inside, so this code is still O(n).
For a classical linked list, the best achievable complexity for computing the length is O(n).
PS. collections.deque is not a real linked list; it is a wrapper around a block-based structure that maintains a running count updated by its append/delete methods, which is why its len() is O(1). https://github.com/python/cpython/blob/main/Modules/_collectionsmodule.c#L196
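The counting approach deque uses can be sketched for a linked list as well: keep a size field that every mutating method updates, making `__len__` O(1) in both time and space. This is a hypothetical illustration, not code from the repository:

```python
class CountedLinkedList:
    """Sketch of a linked list that tracks its size, as deque does."""
    def __init__(self):
        self.head = None
        self._size = 0  # updated on every insert/delete

    def insert(self, data):
        # Prepend a node and bump the counter
        self.head = {"data": data, "next": self.head}
        self._size += 1

    def delete_head(self):
        # Remove the first node, if any, and decrement the counter
        if self.head is not None:
            self.head = self.head["next"]
            self._size -= 1

    def __len__(self):
        return self._size  # O(1): no traversal needed

cll = CountedLinkedList()
for x in (1, 2, 3):
    cll.insert(x)
cll.delete_head()
```

The trade-off is that every mutating method must remember to maintain the counter correctly.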
Are you talking about "time" complexity? This line fixes the "space" complexity: it fetches one item at a time and adds 1 to a running total, as opposed to tuple(), which loads all the items into memory at once.
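The space difference can be observed directly with tracemalloc. This is a standalone sketch (the generator stands in for iterating a linked list; it is not the PR's code):

```python
import tracemalloc

def items(n):
    # Generator standing in for iterating a large linked list
    for i in range(n):
        yield i

# Old approach: tuple() materializes all n items before len() runs
tracemalloc.start()
len(tuple(items(100_000)))
_, peak_tuple = tracemalloc.get_traced_memory()
tracemalloc.stop()

# New approach: the generator expression holds one item at a time
tracemalloc.start()
sum(1 for _ in items(100_000))
_, peak_sum = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(peak_tuple, peak_sum)  # the tuple's peak is far larger
```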
I missed that, you're definitely right.
Fun exercise... Go to https://pyodide.org/en/latest/console.html to get a current Python REPL running on WASM.
Paste in the following code and hit return.
>>> from timeit import timeit
>>> setup = "from itertools import product; from string import ascii_letters"
>>> timeit("sum(1 for _ in product(ascii_letters, repeat=4))", number=10, setup=setup)
5.0610000000000355
>>> timeit("len(tuple(product(ascii_letters, repeat=4)))", number=10, setup=setup)
4.121999999999957
sum() is slower than len() for 7,311,616 items.
Refresh the webpage to clear out any clutter in memory, then paste in the following code and hit return.
>>> from timeit import timeit
>>> setup = "from itertools import product; from string import ascii_letters"
>>> timeit("sum(1 for _ in product(ascii_letters, repeat=5))", number=1, setup=setup)
26.686000000000035
>>> timeit("len(tuple(product(ascii_letters, repeat=5)))", number=1, setup=setup)
Traceback (most recent call last):
  ...
MemoryError
sum() delivers an answer for 380,204,032 items while len() raises a MemoryError.
These numbers are for long iterators but still good to know.
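A smaller version of the experiment above runs quickly on an ordinary machine (repeat=3 gives 52³ = 140,608 items). No timing numbers are claimed here, since they vary by machine:

```python
from itertools import product
from string import ascii_letters
from timeit import timeit

n_items = len(ascii_letters) ** 3  # 52**3 == 140_608

# Time both counting strategies over the same iterator
t_sum = timeit("sum(1 for _ in product(ascii_letters, repeat=3))",
               number=5, globals=globals())
t_len = timeit("len(tuple(product(ascii_letters, repeat=3)))",
               number=5, globals=globals())

# Both strategies agree on the count
count = sum(1 for _ in product(ascii_letters, repeat=3))
print(n_items, count, t_sum, t_len)
```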
I added this change to
Describe your change:
Following #5315 and #5320, I was convinced that it's better to calculate __len__ each time on demand. But the current implementation has a "space" complexity problem when dealing with a huge linked list: that temporary tuple is unnecessary. A one-line change using the built-in sum() and a generator expression solves it without breaking existing code.
Checklist:
Fixes: #{$ISSUE_NO}.