Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add itertoolz.flatten #547

Open
wants to merge 17 commits into
base: master
Choose a base branch
from
Open

Add itertoolz.flatten #547

wants to merge 17 commits into from

Conversation

groutr
Copy link
Contributor

@groutr groutr commented Aug 18, 2022

An implementation of javascript's Array.flat() (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/flat#use_generator_function)
More flexible than the flatten recipe in itertools.



def flat(level, seq):
""" Flatten a possible nested sequence by n levels """
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll fill out this docstring soon.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. Don't forget to also point to concat(seq), which flattens a sequence one level.

@eriknw
Copy link
Member

eriknw commented Aug 18, 2022

Cool! yield from syntax from Python 3.3 definitely makes this nicer.

Some questions:

  • Isn't flatten a more common name?
  • Can you flatten without limit? Should you be able to?
  • What about strings, should they be flattened? Should we have some types that we exclude by default or a function to determine what to exclude?
  • Alternatively, should we allow types that to recurse into? A function to determine whether to traverse or not would also work.

flatten seems like a useful, well-named function.

if level < 0:
raise ValueError("level must be >= 0")
for item in seq:
if level == 0 or not hasattr(item, '__iter__'):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably better to have outside the for loop:

if level == 0:
    yield from seq
    return

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! That works really well.

@groutr
Copy link
Contributor Author

groutr commented Aug 18, 2022

Yes, a very good use case for yield from.
Some answers:

  • Yes, flatten is a good name too. I don't have any strong opinions about the naming. I simply borrowed the name that javascript uses.
  • Yes, you actually can flatten without limit. If you choose a high enough integer, or do as javascript does and pass infinity for the level. This particular implementation also has the neat behavior that negative values also behave like infinity. We could allow -1 as a special value to flatten all levels. Technically, we are limited by the recursion limit in Python (so a depth of 3000). If that proves a problematic limit, we can look at a stack based implementation.
  • That would be a good addition. I have to admit I didn't test on strings since my use case was flattening what was essentially an ndarray (but without numpy).
    I have a version that accepts a function to determine if the traverse into an iterable. A nice default would be to exempt str, bytes, bytearray, and dict.
def example_descend(x):
    return not isinstance(x, (str, bytes, bytearray, dict))
def flatten(level, seq, descend=None):
    """ Flatten a possible nested sequence by n levels """
    if not callable(descend):
        raise ValueError("descend must be callable boolean function")

    if level < -1:
        # -1 flattens infinitely.
        raise ValueError("Level must be >=0 or -1")

    def flat(level, seq):
        if level == 0:
            yield from seq
            return

        for item in seq:
            if isiterable(item) and descend(item):
                yield from flat(level - 1, item)
            else:
                yield item
    
    yield from flat(level, seq)

The descend function is only called on iterable items and returns True if the iterable should be unpacked. If this looks better to you, I can go ahead and commit it.

@eriknw I really appreciate the feedback.

@groutr
Copy link
Contributor Author

groutr commented Aug 23, 2022

@eriknw I think this is ready for another review.

@groutr groutr changed the title Add itertoolz.flat Add itertoolz.flatten Aug 23, 2022
@groutr
Copy link
Contributor Author

groutr commented Dec 3, 2022

ping @eriknw

"""
if level < -1:
raise ValueError("Level must be >= -1")
if not callable(descend):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe let users pass None for descend to say "I want the default value"?

Suggested change
if not callable(descend):
if descend is None:
descend = _default_descend
if not callable(descend):

This is something I almost always recommend for default arguments, especially when the default value is private, like it is here.

Because

  1. It can help code that programmatically selects the descend argument.

  2. It can help with writing wrappers of flatten which themselves let users pass an optional descend.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants