Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add possibility to chain pipes before execution #88

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

hubald
Copy link

@hubald hubald commented Apr 24, 2023

  • additionally allow subclassing Pipe by using self.__class__ instead of Pipe as constructor

- additionally allow subclassing Pipe by using self.__class__ instead of Pipe as constructor
@JulienPalard
Copy link
Owner

Interesting!

This probably completly changes the way a long Pipe is built by the interpreter, before it was "left to right", as there was only __ror__s, now it builds right to left with __or__s, up to the data wich is eaten by the ror? I've not tried it yet.

Do you have any real use case for this?

@hubald
Copy link
Author

hubald commented Apr 26, 2023

No, it's evaluated like before - from left to right. I stepped through the code. OR-operator is used only if no iterable was passed.
My real-life use case is that I pass different Pipe methods to an evaluation function as parameter - and here comes the problem that the Pipes have to be assigned to a variable and hence are "evaluated" without the iterable first. Something like the following:

class Database:
...
  def filter_lines_by(predicate):
    filtered_lines = list(self.lines | predicate) # __ror__ is used
    #further processing

db = Database()
errors = db.filter_lines_by(has_error_code | reported_by_end_customer | older_than_days(30)) # __or__ is used

@JulienPalard
Copy link
Owner

JulienPalard commented Apr 29, 2023

I understand the use case.

Can you please add some explanations in the README about it? Maybe with your use case as an example, or someting inspired from your use case?

I'll have to run it step by step too to correctly ingest how it works before merging, I'll have the time next week I hope.

I still think it changes the "building" order (as __or__ should be picked with a higher priority than __ror__), which is not an issue: I also think that it changes nothing at runtime.

I mean, in a|b|c I think it is currently built via c.__ror__(b).__ror__(a), while with your PR I think it's built via: b.__or__(c).__ror__(a). Anyway, I'll inspect this closely after holidays :)

@jenisys
Copy link

jenisys commented May 28, 2023

Interesting (now I understand why sspipe provided operator.ror and operator.or).

There is at least one other solution to provide pipeline recipes (sorry, I needed to come up with a name for it).
A pipeline recipe is a prepared, but often complex pipeline that can be easily be applied to any iterable.
But the solution here covers only part of what is suggested above.

def test_pipeline_recipe_with_many_pipes():
    @Pipe
    def select_even_numbers_and_multiply_by(iterable, factor):
        yield from (iterable | select_even_numbers | multiply_by(factor))

    numbers = range(8)
    results = list(numbers | select_even_numbers_and_multiply_by(3))
    expected = [0, 6, 12, 18]
    assert results == expected

@JulienPalard
Maybe you could add a section in the README with this information (or I provide a pull-request).

lambda iterable, *args2, **kwargs2: other.function(
self.function(iterable, *args2, **kwargs2)
)
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am just wondering, what if other is a pipeable object but not an instance of Pipe ?!?
Would it not be better to use:

def __or__(self, other):
    cls = self.__class__
    if isinstance(other, Pipe):
        return cls(lambda iterable, *args2, **kwargs2: (
                   other.function(self.function(iterable, *args2, **kwargs2))))

    # -- CASE: other is pipeable only
    return cls(lambda iterable, *args2, **kwargs2: (
               self.function(iterable, *args2, **kwargs2) | other))

jenisys added a commit to jenisys/pipe that referenced this pull request May 29, 2023
* ADDED: Pipe.__or__() function
* USE: self.__class__ instead of Pipe class
  REASON: Simlify to derive from Pipe class and extend functionality
* Add test for pipeline recipe based on or-pipes.

SEE ALSO:

* JulienPalard#88
@JulienPalard
Copy link
Owner

Took the time to review your code, looks like this PR needs love...

It feel strange that your __or__ returns a Pipe that accepts arguments. And it feels arbitrary to pass those arguments to the first Pipe of the named chain.

In other words, with your code, this works:

batch_any = batched | take(2)
batch_two = batch_any(2)

print(list("hello world" | batch_two))  # Gives [('h', 'e'), ('l', 'l')]

Here batched is a "partial Pipe awaiting arguments", so batch_any is still partial on its first part, and batch_two is created by fulfilling the missing arg.

Why not giving the arguments to the last one:

batch_any = batched(2) | take
batch_two = batch_any(2)

?

Or why not "dispatching the right named argument to the right pipe"?

In the face of ambiguity, refuse the temptation to guess.

so I propose simplifying a bit and refuse extra arguments:

    def __or__(self, other):
        return self.__class__(lambda iterable: other.function(self.function(iterable)))

how does it looks like to you?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants