Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discuss the value of lift in the 6.0 world #2911

Closed
benlesh opened this issue Oct 6, 2017 · 16 comments
Closed

Discuss the value of lift in the 6.0 world #2911

benlesh opened this issue Oct 6, 2017 · 16 comments
Assignees

Comments

@benlesh
Copy link
Member

benlesh commented Oct 6, 2017

I'd like to talk about removing lift and the ceremony around that.

Why we had lift

lift is a really cool architecture for a couple of reasons:

  1. It allowed us to control the return type of operators for better dot-chaining. This meant that users could create custom operators on a class that extends Observable, then override lift to return the custom class, allowing the custom operators to mix in the same dot-chains as standard operators.
  2. It gave us a centralized point that we could hook to affect (almost) all operators. This was going to enable us to do fancy reporting or build tooling around RxJS Observables.

Lift In The World Of Pipeable Operators

With Pipeable operators, we're no longer dot-chaining everything. This means that controlling the return type no longer matters, and developing a custom operator is as simple as writing a higher-order function. This calls into question the value of lift, since that eliminates benefit number 1 listed above.

Given that all operators will likely go through the pipe method, or otherwise carry a similar signature. It seems like there would be ample ways to hook our operators for any sort of tooling we might want to do. Either way, the promise of lift as a centralized point to hook everything never came through, and in fact there were always some operators that just didn't use lift so it wouldn't have been perfect.

Proposed change for 6.0

Remove lift process entirely.

I think the benefit here will be a sharp reduction in the size of the library. As there will no longer be a need for all of the Operator classes in the library, nor will there be calls to lift or implementations of lift on Observable or Subject.


Interested in all of your thoughts... especially @trxcllnt

@kwonoj
Copy link
Member

kwonoj commented Oct 6, 2017

What's end-user-consumer side changes once lift is gone? ergonomics around dot operator chaining will have changes? Also I recall there was some performance reason to choose lift for first time - with all higher order, what'll be differences around those as well?

@benlesh
Copy link
Member Author

benlesh commented Oct 6, 2017

There would be no ergonomics changes around dot-chaining for the 90% use-case.

For other use cases, it would mean that you couldn't build a custom Observable and have it compose through standard operators. So (new MyObservable()).customOp().map(fn).customOp() would not work... but (new MyObservable()).customOp().map(fn) would. pipe or let would of course work at any level.

@felixfbecker
Copy link
Contributor

felixfbecker commented Oct 6, 2017

This meant that users could create custom operators on a class that extends Observable, then override lift to return the custom class

I am probably missing something, but what was the benefit of lift() over
new this.constructor()? (that's basically what Array.prototype.map() etc. do when you override Array)

@benlesh
Copy link
Member Author

benlesh commented Oct 6, 2017

@felixfbecker it controls the creation of the Subscriber, which is what you really end up hooking into to get visibility into any operator.

@cartant
Copy link
Collaborator

cartant commented Oct 6, 2017

@benlesh Regarding your point 1, to what extent was using lift to return instances with additional methods or properties encouraged? Is it likely to be a widespread practice?

It's interesting to note that prior to its 5.0.0-rc, AngularFire2 used classes that extended Observable, adding methods for mutating the Firebase database, etc. and used lift to return instances of said classes from operator chains.

In its 5.0.0-rc, AngularFire2 has been significantly refactored and no longer (as far as I can see) includes such classes.

I'm curious as to whether such classes not playing super nice with TypeScript was a factor. Perhaps we could ask @davideast?

@staltz
Copy link
Member

staltz commented Oct 6, 2017

I like lift and it indeed promises some incredible opportunities that we haven't yet leveraged, but I think we could still leverage it, specially when it comes to debugging.

@benlesh Why do pipeable operators eliminate benefit number 1? As far as I see, this use case would still work with pipe because operators right now are using lift. In the case of a custom operator called through let or pipe, there are two possibilities:

(A) it's built as a composition of other common operators
(B) it's actually custom built using class MyOperator implements Operator

lift won't break in case of (A), and in case of (B), we can instruct people to use lift. Anyway we need to instruct how to do (B) properly since it's an advanced use case.

@felixfbecker
Copy link
Contributor

felixfbecker commented Oct 6, 2017

@staltz can't any operator that can be implemented with (B) be imlpemented with

export const myOperator = () => (source: Observable<T>) => new Observable<R>(observer => {
  // operator code
})

?

@benlesh
Copy link
Member Author

benlesh commented Oct 6, 2017

Is it likely to be a widespread practice?

@cartant We've documented the approach, however given how I've seen some pretty clever people struggle a little with that approach, I sincerely doubt it's wide spread. Also, given that not all operators are implemented with lift, I think this approach is still going to result in confusion. The pipeable operators mean we have none of these problems anymore.

Why do pipeable operators eliminate benefit number 1?

@felixfbecker is talking about what I mean, @staltz... if all operators are just functions that return (source: Observable<T>) => Observable<R>, and are composed functionally, then we don't care about instance types because nothing has to be a member to dot-chain any more, because dot-chaining isn't a thing.

@staltz
Copy link
Member

staltz commented Oct 6, 2017

Oh, so maybe I misunderstood benefit 1, if it's all about "users could create custom operators on a class that extends Observable". I think I was talking the whole time about benefit 2, because I don't see benefit 1 as being that important.

It seems like there would be ample ways to hook our operators for any sort of tooling we might want to do.

Which ways?

in fact there were always some operators that just didn't use lift so it wouldn't have been perfect.

Which operators? And is it out of impossibility or just oversight?

@trxcllnt trxcllnt self-assigned this Oct 7, 2017
@jasonaden
Copy link
Collaborator

Also agreed on getting rid of it. This along with #3057 should make the library simpler to contribute to, make the generated code smaller, and improve treeshaking.

@trxcllnt
Copy link
Member

trxcllnt commented Nov 15, 2017

The Operator classes were an optimization tailored for what JIT'd well in v8 at the time (more details in the original Lifted Observable proposal, and below). If operator functions JIT (or tree-shake, etc.) better now, then we should do that.

In fact, the Operator interface was explicitly designed to mimic the runtime signature of regular functions, but for perf we implemented all of them as classes:

interface Operator<T, R> {
  call(subscriber: Subscriber<R>, source: any): TeardownLogic;
}

// these are both valid:
class OperatorA() {
  call(sink, source) { return source.subscribe(sink) }
}
source.lift(new OperatorA());

function OperatorB(source) {
  return source.subscribe(this);
}
source.lift(OperatorB);

I've even proposed updating the Operator interface to reflect this (at the time the interface was written, TS didn't support hybrid types), but got push back that it could be confusing.

What's the deal with lift?

@felixfbecker @staltz can't any operator that can be implemented with (B) be imlpemented with

export const myOperator = () => (source: Observable<T>) => new Observable<R>(observer => {
  // operator code
})

?

Totally! That's how Rx < 5 was implemented (until ~1.5 years ago or so, when Matt switched it to Observable/Observer pairs). The issue is that (at the time) JITs weren't great at inlining anonymous function invocations, so no matter how much we optimized things like inner disposables, there was always a performance barrier for throughput.

JITs did perform well on the linked-list-of-subscribers-style (as opposed to the "linked-list-of-closures" style), which was a big reason we went with that in 5. We hypothesized the linked-list-of-subscribers style is faster because vtable lookup for the Subscriber methods is very fast, and v8 can represent them as constant_function descriptors since they're on the prototype. For comparison:

function map<T, R>(source: Observable<T>, sel: (x: T) => R) {
  return new Observable<R>((sink: Observer<R>) => {
    return source.subscribe({
      next(x) { sink.next(sel(x)); }, // <-- a new function on each subscription
      error: sink.error.bind(sink),
      complete: sink.complete.bind(sink)
    });
  });
}
// vs.
class MapSubscriber<T, R> extends Subscriber<T> {
  constructor(sink: Observer<R>, sel: (x: T) => R) {
    super(sink);
    this.sel = sel;
  }
  next(x) { this.sink.next(this.sel(x)); } // <-- represented as a v8 `constant_function`
}
class Observable {
  map(sel: (x: T) => R) {
    return new Observable<R>((sink: Observer<R>) => {
      return this.subscribe(new MapSubscriber(sink, sel));
    })
  }
}

The same goes for operator classes vs. operator closures, with the added bonus that v8 optimizes lookups of properties assigned in the constructor. At the time these lookups were faster than accessing values from closure scope. I'm not sure if this is still true in turbofan, or whether turbofan does something similar as other VM's (like .NET), optimizing closures to anonymous inner classes?

If we strip away all the ceremony around JIT optimizations, lift is just a function that maps a source Observable to a result Observable using an operator function that maps a sink Observer to a source Observer:

function lift<T, R>(source: Observable<T>, operator: (sink: Observer<R>) => Observer<T>): Observable<R> {
  return new Observable<R>((sink: Observer<R>) => {
    return source.subscribe(operator(sink));
  });
}

Another way to think about it is you can create an Observable chain going "down":

srcObservable -> mapObservable -> filterObservable -> subscribe: (sink) => Subscription

...by composing operator functions that create Subscriber chains going back "up":

src <- mapSubscriber <- filterSubscriber <- sink (aka the finalSubscriber)

In a sense the Observable chain is really just a factory for creating Subscriber chains, which is where the real logic lives.

In practice, we found it was necessary for the operator function to control the source.subscribe() call in a few cases (skip/take, multicast, etc.), so the real definition of lift is more like this:

function lift<T, R>(source: Observable<T>, operator(source: Observable<T>, sink: Observer<R>) => Subscription<T>): Observable<R> {
  return new Observable<R>((sink: Observer<R>) => {
    return operator(source, sink);
  });
}

But since we know the VM can represent functions on the prototype as constant_functions (and can embed the function pointer into optimizable functions), we can transform the lift function into its prototype form:

class Observable<T> {
  lift<R>(operator: Operator<T, R>): Observable<R> {
    return new Observable<R>((sink: Observer<R>) => {
      return operator(this, sink);
    });
  }
}

We can also get rid of that non-constant anonymous function that binds the operator function to the source and sink. All we have to do is broaden the scope of our design to include the Observable prototype's subscribe method. We can assign the source and operator to the Observable being returned, then invoke the operator function inside the subscribe method:

class Observable<T> {
  lift<R>(operator: Operator<T, R>): Observable<R> {
    const observable = new Observable<R>();
    // lookups for these properties later would be faster if
    // these assignments happened in the Observable constructor
    observable.source = this;
    observable.operator = operator;
    return observable;
  }
  subscribe(observerOrNext?: PartialObserver<T> | ((value: T) => void), error?: (error: any) => void, complete?: () => void): Subscription {
    const { operator } = this;
    const sink = toSubscriber(observerOrNext, error, complete);
    if (operator) {
      operator.call(sink, this.source);
    } else {
      sink.add(this.source ? this._subscribe(sink) : this._trySubscribe(sink));
    }
    return sink;
  }

Even if turbofan is faster than crankshaft at invoking anonymous functions, the closure approach has 1 more allocation (aka 2x as many total allocations) than assigning the source and operator to the Observable instance being returned.

lift vs. pipe

Now let's compare the "pure" lift and pipe signatures:

// I'm intentionally ignoring pipe's multiple operator function args,
// since we could redefine lift to also take multiple operator functions
type Operator = <T, R>(sink: Observer<R>) => Observer<T>
type lift = <T, R>(src: Observable<T>, op: Operator) => Observable<R>;
type pipe = <T, R>(op: Operator) => (src: Observable<T>) => Observable<R>

Looking closely, pipe is just the partially-applied form of lift! Pipe just re-arranges the position of the source Observable and operator function, allowing the source Observable to be supplied at the end instead of the beginning. They're so close that we wouldn't even need two methods if we were in a functional language that partially applies functions by default.

From this perspective, the whole "lift + operators implemented in terms of Observers, + Operator classes" scheme is a thing we found that let us efficiently implement the pipe-style for operators internally before the VMs started helping us out.

I should note, I did take some liberties with the operator definitions above:

  • pipe's operator function maps an Observable<T> to an Observable<R>
  • lift's operator function maps an Observer<R> to an Observer<T>

This is just another way to represent the idea of either:

  • building an Observable chain down from the source to the sink
  • or building an Observer chain up from the sink to the source

I guess I'm unclear on what removing lift entirely would mean? Taking it off the prototype? Inlining the contents of the lift method into every operator? Reimplementing operators in terms of Observables (and anonymous Observers + Disposables)?

While I believe it's always valuable to re-evaluate our assumptions on performance, my intuition is that implementing operators in terms of Observers (or rather, Subscribers) is still faster than implementing them in terms of Observables. All the operators are currently still implemented in terms of Observers and pipe is just a wrapper around lift, so today we see no difference. With the obvious caveat that JITs are spooky, I'm dubious about any plans to reimplement the operators without some form of lift.

@benlesh Either way, the promise of lift as a centralized point to hook everything never came through, and in fact there were always some operators that just didn't use lift so it wouldn't have been perfect.

Unless something's changed, I don't think this has been true since #1941 landed?

@cartant to what extent was using lift to return instances with additional methods or properties encouraged? Is it likely to be a widespread practice?

I use it in a number of public projects, and a good number of proprietary projects as well. I know a number of other folks who rely on it to extend Observable and build custom asynchronous DSLs, but they're understandably not very active in our GH issues.

@david-driscoll
Copy link
Member

The one thing that I always liked about lift was that at the end of the day, we could in theory capture metadata about the subscriber chains. I'd love to be able to take an observable and be able to produce a string that says what the source is, what operators it's been shuffled through map -> filter -> merge -> map -> switch. Similar to how the marble rendering is handled today.

My understanding (however limited I will admit) is that lift allows us to do that easier because we could simply monkey patch the lift method with whatever we wanted at run time and continue about our day. If we can do the same thing with pipe then I'm fine with that too.

@trxcllnt
Copy link
Member

trxcllnt commented Nov 15, 2017

@david-driscoll theoretically we can monkey-patch pipe and do something similar, except since the signature is Observable<T> => Observable<R>, we have to capture at two levels:

Observable.prototype.pipe = function customPipe(...operators) {
  return operators.reduce((source, operator) => {
    return operator(new Observable((sink) => {
      return source.subscribe({
        next(x) { console.log('captured:', x); sink.next(x); }
        error: sink.error.bind(sink),
        complete: sink.complete.bind(sink),
      });
    });
  }, this);
}

edit: I linked this in my answer above, but we also use lift in the unit tests to spy on the internal details of TimeoutSubscriber to validate it cancels its scheduled actions on disposal.

@benlesh
Copy link
Member Author

benlesh commented Nov 15, 2017

Thanks for commenting on this @trxcllnt, I remember almost all of this from our original work, given our concerns with size, I'm trying to look at every avenue I can to reduce the footprint of the library. That's the major catalyst here.

Another concern I have is that at some point, a standardized Observable will land, and it will not include a lift method, so any operators we create will not work with those Observables, because all of our operators rely on lift existing on the observable instance. I suppose we could make lift a standalone function, it could cost perf, but only when you first set up the observable, not per-next or even on per-subscription.

As of even a week ago, playing around with this, I'm pretty sure I want to keep lift regardless.

What do you think about moving off as a standalone function?

pipe is a different, and unfortunate animal. It's mostly on the prototype for ergonomic reasons. I couldn't think of anything prettier (that wasn't dot-chaining). I'd love it if we could have the pipeline operator, but the TC39 seems rather anti-functional programming syntax from what I've been told.

@benlesh
Copy link
Member Author

benlesh commented Jan 25, 2018

I think lift is staying as is for now

@benlesh benlesh closed this as completed Jan 25, 2018
@lock
Copy link

lock bot commented Jun 6, 2018

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Jun 6, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants