@cacheFor decorator to read last-cached value on a @cached decorator - RFC #656

Open - wants to merge 6 commits into base: master
Conversation

DLiblik

@DLiblik DLiblik commented Aug 18, 2020

Rendered

Allow a conventional way for RFC 566's @cached getters to retrieve the last-returned value from within the next execution of the getter, reducing computation load for complex results that are incremental in nature.

@rwjblue
Member

rwjblue commented Aug 18, 2020

FWIW, the thing you want is possible if you drop down a level and use the primitive caching API. Given that @cached is basically going to end up being implemented as (see @pzuraq's tracked-toolbox):

import { createCache, getValue } from '@glimmer/tracking/primitives/cache';

function cached(target, key, value) {
  assert('@cached can only be used on getters', value && value.get);

  let { get, set } = value;

  let caches = new WeakMap();

  return {
    get() {
      let cache = caches.get(this);

      if (cache === undefined) {
        cache = createCache(get.bind(this));
        caches.set(this, cache);
      }

      return getValue(cache);
    },

    set,
  };
}

You can slightly modify how that works for your specific scenario; it doesn't feel horrible to me:

import { createCache, getValue } from '@glimmer/tracking/primitives/cache';

const bidSourceCaches = new WeakMap();

function calculateBidSources(bidders, _prevs = []) {

  // Generate a BidSource from each bidder, but avoid doing
  // it more than once per bidder.
  return bidders.map(bidder => {
    return _prevs.find(prev => prev.bidder === bidder) ||
      // expensive as it downloads initial state from server
      new BidSource(bidder);
  });
}

class OnlineItemAuction {
  /**
   Bidders who have joined this particular item auction - this array
   is set from elsewhere and grows item by item over time.
   */
  @tracked bidders = [];

  get bidSources() {
    let cache = bidSourceCaches.get(this);

    if (cache === undefined) {
      let lastValue;
      cache = createCache(() => {
        lastValue = calculateBidSources(this.bidders, lastValue);
        return lastValue;
      });
      bidSourceCaches.set(this, cache);
    }

    return getValue(cache);
  }
}

@DLiblik
Author

DLiblik commented Aug 18, 2020

Here is what we have done so far, which extends that implementation - the idea was to submit this as a PR on @pzuraq's 566 polyfill (in line with what you posted above).

The code below works very well and is concise and conventional, which helps a lot on our team (8 Ember devs with a variety of skill sets). Our projects range from small to very large, so folks can do a lot of damage quickly without conventional approaches and code reviews.

import { createCache, getValue } from '@glimmer/tracking/primitives/cache';

export const CACHED = Symbol('@cached cache');

export function cached(...args) {
  const [target, key, descriptor] = args;
  const {get: getter} = descriptor;
  const caches = new WeakMap();

  descriptor.get = function () {
    if (!caches.has(this)) {
      caches.set(this, createCache(getter.bind(this)));
    }

    const cacheGets = this[CACHED] || (this[CACHED] = {});

    return cacheGets[key] = getValue(caches.get(this));
  };
}

Given the simplicity and the convenience, it feels like this is worthwhile.

@rwjblue
Member

rwjblue commented Aug 18, 2020

I vaguely think I would prefer an implementation that didn't rely on mutating the object, something like:

  @cached
  get bidSources() {
    const _prevs = getCachedValue(this, 'bidSource') || [];
    const bidders = this.bidders;

    // Generate a BidSource from each bidder, but avoid doing
    // it more than once per bidder.
    return bidders.map(bidder => {
      return _prevs.find(prev => prev.bidder === bidder) ||
        // expensive as it downloads initial state from server
        new BidSource(bidder);
    });
  }

I don't think that access style is super fundamental to your proposal though, so it probably doesn't matter too much.

@DLiblik
Author

DLiblik commented Aug 18, 2020

Access style is not a concern at all; convention and efficiency of
both development and review is the priority (which is why we love
ember in the first place). I'll see what I can come up with here in
the next day.

Thanks for taking the time on this btw - much appreciated.

@DLiblik
Author

DLiblik commented Aug 19, 2020

Here is a slightly different approach. In this approach, the @cached decorator can be used two ways:

  • @cached - as described in 566
  • @cached(propName) - as desired here, where propName is the name (on the object) of a property that will be used to store the last-computed value and can be used by the developer to access it in the getter.

Benefits

  • Colloquial and intuitive - builds on understood functionality of 566 with a new option
  • Simple to teach and understand
  • Puts object mutation into developer's hands in a way that is consistent with normal JS properties
  • Allows for simple tracing (i.e. you can define a get/set on the cache property name if you
    want to see 'what's going on')

Example

Here is an example of its use:

@cached('_bidSources')
get bidSources() {
  const _prevs = this._bidSources || [];
  const bidders = this.bidders;

  // Generate a BidSource from each bidder, but avoid doing
  // it more than once per bidder.
  return bidders.map(bidder => {
    return _prevs.find(prev => prev.bidder === bidder) ||
      // expensive as it downloads initial state from server
      new BidSource(bidder);
  });
}

Sample Implementation

Here is an implementation built on top of @pzuraq 's code:

/**
 Decorator function callable as:
 - `@cached` - per RFC 566
 - `@cached(propName)` - per this RFC
 */
export function cached(cachePropName) {
  // Handle unbracketed form
  if (arguments.length === 3) {
    return decorator(undefined, ...arguments);
  }

  assert([
      '@cached cannot be called as a function without arguments,',
      'i.e. as @cached() - just use @cached.',
    ].join(' '),
    arguments.length);

  assert([
      '@cached takes a single string parameter which is the name of the',
      'property to store the last-computed cached value in.',
    ].join(' '),
    arguments.length === 1 && typeof cachePropName === 'string');

  return (...args) => decorator(cachePropName, ...args);
}

/**
 Actual decorator code which caches, and optionally stores the
 cached value in `cachePropName` if provided
 */
function decorator(cachePropName, target, key, descriptor) {
  const {get:getter} = descriptor;

  assert(`@cached only works on property getters. '${key}' is not a getter.`,
    typeof getter === "function");

  const caches = new WeakMap();

  descriptor.get = function () {
    if (!caches.has(this)) {
      caches.set(this, createCache(getter.bind(this)));
    }

    const result = cacheGetValue(caches.get(this));

    return cachePropName ?
      (this[cachePropName] = result) :
      result;
  };
}

@DLiblik
Author

DLiblik commented Aug 19, 2020

...if better differentiation is preferred between caching without a storage property (RFC 566) and with (this RFC), then this could also be done with two separate decorators:

  • @cached - RFC 566
  • @cachedAs(propName) - this RFC

Personal experience tells me the performance issues that drive going from the first to the second are not always immediately visible, so it's more convenient to simply add the parameter to progress from the first case to the second (possibly without having to change your imports); then again, differentiation may help 'memoize' the two concepts into the developer brain a little better 😄

@DLiblik
Author

DLiblik commented Aug 19, 2020

@rwjblue FWIW, I challenged myself on why I wanted to avoid calling getCachedValue in the getter itself as you suggested in your first reply. The reason I landed on is that, as everyone recognizes, Ember requires a new adopter (even if capable with JS) to assimilate a lot, and so keeping the feature "inside the decorator" lowers the "assimilation load" (haha, there's a term for you...).

If the person understands decorators, then the leap from there to this decorator (as 566 or 656) is small and requires no knowledge of what I agree is a more advanced layer of Ember (calling the caching/tracking methods directly). Doing it this way provides access to this feature set without anything more than application of a basic decorator - which saves time on teaching and adoption.

@DLiblik
Author

DLiblik commented Aug 19, 2020

One other thought: in the above parameterized version, providing a parameter to the decorator creates a new property on the target object - this may be considered too 'magical' - i.e. @cached('_bidSources') in the example creates a _bidSources property on the target that is not very visible and is not in line with other declaration conventions like @tracked someProp.

For conventional reasons then, the pattern could be inverted by introducing @cacheFor('bidSources') _bidSources - in this mode, the code would look like below, which is probably more congruent. Implementation-wise this is very simple once 566 is in place.

Example

@cacheFor('bidSources') _bidSources = [];

@cached
get bidSources() {
  const _prevs = this._bidSources;
  const bidders = this.bidders;

  // Generate a BidSource from each bidder, but avoid doing
  // it more than once per bidder.
  return bidders.map(bidder => {
    return _prevs.find(prev => prev.bidder === bidder) ||
      // expensive as it downloads initial state from server
      new BidSource(bidder);
  });
}

@DLiblik
Author

DLiblik commented Aug 20, 2020

After running through a number of scenarios trying out the various ergonomics, the @cacheFor() approach seems the most convenient, conventional, intuitive and idiomatic to Ember.

I am updating the RFC text to reflect this.

@DLiblik changed the title from "Initial submission of read-cached-last-value RFC" to "@cacheFor decorator to read last-cached value on a @cached decorator - RFC" on Aug 20, 2020
@buschtoens
Contributor

I would also prefer not mutating the host object. @rwjblue's getCachedValue(host, key) from #656 (comment) looks great to me.

Due to the synchronous, single-threaded nature of JS, we could even make the parameters of getCachedValue() optional. I don't know if the convenience would be worth that magic, though, or whether it might just end up being more confusing.
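To make the idea concrete, here is a minimal sketch of how the argument-defaulting could work; every name here is hypothetical, not a real Ember API. Because getters run synchronously, a module-level stack can record which (host, key) pair is currently recomputing, and a zero-argument getCachedValue() can default to "the getter I am inside of":

```javascript
// Hypothetical sketch - none of these names are real Ember APIs.
const lastValues = new WeakMap(); // host -> { [key]: last value }
const evaluationStack = [];       // frames of currently-running getters

function getCachedValue(host, key) {
  if (host === undefined) {
    // No arguments: fall back to the getter currently executing.
    const frame = evaluationStack[evaluationStack.length - 1];
    if (!frame) return undefined;
    ({ host, key } = frame);
  }
  const perHost = lastValues.get(host);
  return perHost && perHost[key];
}

// Simplified stand-in for what a @cached-style wrapper would do around
// the user's getter (no memoization here - it only records last values).
function wrapGetter(host, key, compute) {
  return function () {
    evaluationStack.push({ host, key });
    try {
      const value = compute.call(host);
      let perHost = lastValues.get(host);
      if (!perHost) lastValues.set(host, (perHost = {}));
      perHost[key] = value;
      return value;
    } finally {
      evaluationStack.pop();
    }
  };
}
```

With this in place, getCachedValue() called with no arguments from inside a wrapped getter resolves to that getter's own previous value.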

I'm not really a fan of adding another decorator to access the currently cached value and personally think that'd be more confusing than the getCachedValue function.

btw, I'm not sure if you saw, but I already made a polyfill: ember-cached-decorator-polyfill
You could try test-driving your proposal(s) as draft PRs there.

@buschtoens
Contributor

buschtoens commented Aug 20, 2020

One issue I see with the extra @cacheFor() decorator is TypeScript support. The property type could not be inferred, so users would need to type it explicitly, whereas getCachedValue(host, key) could infer it:

function getCachedValue<H extends object, K extends keyof H>(host: H, key: K): H[K] | undefined;

@DLiblik
Author

DLiblik commented Aug 20, 2020

@buschtoens, thank you - we are using your polyfill and have extended it locally to support both decorators (@cached and @cacheFor) - I was the one that referenced "John Dies at the End". I will PR on top of that, except that right now we are unable to get your code to import (I will open a separate issue there) - the build dies on not being able to find the extra imports (we npm'd your package in, then copied the code into a local file to get it up and running).

Re: TypeScript - agreed, but the explicit typing is no different than that required on any @tracked property that is not initialized to a representative value. Requiring an explicit type will catch a type mismatch in the @cached getter when it does the read itself, so this seems OK in exchange for the readability (and providing an initialization value, as with @tracked, would inform the inference). For review purposes, the obvious presence of the decorator also makes it very evident when this use case is being applied. Also, having the extra property gives a single point of retrieval for the value - if the mechanics of the cache itself evolve, this abstracts code from those changes, creating greater freedom for advancement.

Re: dropping parameters: I would prefer to avoid magic (or, where needed, hide it into semi-magical constructs like decorators themselves), and so contextualizing the call to getCachedValue such that no parameters are required based on a sync assumption to me, though convenient, feels like it would increase the bewilderment factor for end-developers.

As a teaching trial, yesterday I tried explaining it both ways and found that introducing the scenario where this is beneficial was all that was needed to then introduce both decorators, whereas getCachedValue involved continuing the explanation to describe the plumbing of what this method does in the background and how it finds the value, etc. For some reason, the decorator syntax comes along with some kind of "that's just how it works" feeling. Whether that's beneficial or not is another matter, but it sure accelerates adoption and acceptance and therefore reduces time to productive output - and that helps adoption/morale.

We then applied the two decorators to a large-scale project here as a test PoC and it dramatically improved the readability of the target classes (7 cases). It reads very well from a code-review perspective, and so again I'm ok to trade the missing inference on the decorator for the clarity of the code and not having to teach our junior devs about what getCachedValue does - emberizing JS developers quickly is challenging already 😄 and getting folks to productivity fast is important I believe.

@DLiblik
Author

DLiblik commented Sep 13, 2020

An update - we have been running with the @cacheFor() decorator now in a large project for close to a month and the results have been very positive. The simple ergonomics seem to help with both adoption and readability of code, and as expected nobody needs to be directly aware of the cache implementation (i.e. versus a direct call to the more advanced underlying auto-tracking cache, which creates implementation variability).

@rwjblue (and others of course), I would be interested in whether you think this fits into Ember's general goals; otherwise I don't want to press on people's time.

import Ember from "ember";
import { assert } from "@ember/debug";

const { _createCache: createCache, _cacheGetValue: cacheGetValue } = Ember;

const caches = new WeakMap();

/**
 Modifications to cached decorator to lift caches variable.
 */
export function cached(target, key, descriptor) {
  const {get:getter} = descriptor;

  descriptor.get = function () {
    let cacheForObject;

    if (caches.has(this)) {
      cacheForObject = caches.get(this);
    }
    else {
      cacheForObject = {};
      caches.set(this, cacheForObject);
    }

    const cacheForProp = cacheForObject[key] ||
      (cacheForObject[key] = {
        cache: createCache(getter.bind(this)),
        // could be replaced with call to internal cache property
        lastValue: undefined,
        // could be replaced with check on internal cache state version
        hasBeenCalled: false,
      });

    const value = cacheForProp.lastValue =
      cacheGetValue(cacheForProp.cache);

    // Necessary to track so it's known if the initial value should be
    // returned by cacheFor below
    cacheForProp.hasBeenCalled = true;

    return value;
  };

  delete descriptor.writable;

  return descriptor;
}

/**
 Creates a read-only property that returns the last cached value for
 another property that is decorated with @cached.

 _@cacheFor('bidders') _bidders = [];

 _@cached
 get bidders() {
   // Return the prior value along with a new value.
   return [
     ...this._bidders, // gets previous value of bidders property from _bidders
     new BidderWhichIsExpensiveToCreate(),
   ];
 }

 @param cachedPropName Name of the property this decorator tracks
 the last value of
 */
export function cacheFor(cachedPropName) {
  assert([
    '@cacheFor takes a single string parameter which is the name of the',
    'property to store the last-computed cached value for.',
  ].join(' '), arguments.length === 1 && typeof cachedPropName === 'string');

  return function(targetPrototype, key, descriptor) {
    const {initializer} = descriptor;

    descriptor.get = function () {
      let cacheForObject;

      if (caches.has(this)) {
        cacheForObject = caches.get(this);
      }
      else {
        cacheForObject = {};
        caches.set(this, cacheForObject);
      }

      const cacheForProp = cacheForObject[cachedPropName] ||
        (cacheForObject[cachedPropName] = {});

      if (!cacheForProp.hasBeenCalled) {
        Object.assign(cacheForProp, {
          hasBeenCalled: true,
          lastValue: initializer && initializer.call(this),
        });
      }

      return cacheForProp.lastValue;
    };

    // Class field has been converted to a getter
    delete descriptor.writable;
    delete descriptor.initializer;
  }
}

@buschtoens
Contributor

buschtoens commented Sep 13, 2020

I still think that either exposing the low-level createCache and cacheGetValue functions or a getCachedValue function is superior. It would allow you to very easily build a @cacheFor decorator on top of it in user-land code. Besides the better TypeScript support, it would also give more power to library authors: @cacheFor only works if it has been purposefully added to the class by the user. If you're dealing with foreign objects that did not apply the @cacheFor decorator, you cannot access the cache.
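As a sketch of that user-land layering (hypothetical: the WeakMap-backed getCachedValue below is a stub standing in for a framework-provided function, not Ember's actual API), a @cacheFor in the legacy stage-1 decorator form could be:

```javascript
// Stub standing in for a framework-provided getCachedValue(host, key).
const lastValues = new WeakMap(); // host -> { [key]: last cached value }

function getCachedValue(host, key) {
  const perHost = lastValues.get(host);
  return perHost && perHost[key];
}

// Legacy (stage-1) decorator: turns a class field into a read-only
// getter that peeks at another @cached property's last value.
function cacheFor(cachedPropName) {
  return function (target, key, descriptor) {
    const { initializer } = descriptor;
    delete descriptor.initializer;
    delete descriptor.writable;
    descriptor.get = function () {
      const cached = getCachedValue(this, cachedPropName);
      // Fall back to the field's initializer before the first computation.
      return cached !== undefined
        ? cached
        : initializer && initializer.call(this);
    };
    return descriptor;
  };
}
```

The point being: with a public getCachedValue, none of this needs to live in the framework.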

I just wanna toss in another idea that is orthogonal to getCachedValue. Technically, we could allow the following:

class {
  @tracked bidders = [];

  @cached
  get bidSources() {
    const {
      // Returns cached value or `undefined` on first computation.
      bidSources: previousBidSources = [], 
      bidders
    } = this; 

    // Generate a BidSource from each bidder, but avoid doing
    // it more than once per bidder.
    return bidders.map(bidder => {
      return previousBidSources.find(prev => prev.bidder === bidder) ||
        // expensive as it downloads initial state from server
        new BidSource(bidder);
    });
  }
}

Since getters are evaluated synchronously, we can detect that the access to this.bidSources inside the getter happens during a recomputation of the cache, and thus we could return the previously cached value instead of entering an infinite loop.

This is kinda similar to how @tracked works under the hood. Neatly this would also work well with TypeScript. The only weirdness is during the initial computation, where the return value is undefined, which does not match the types.
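A stripped-down sketch of that detection (a hypothetical helper; the real PoC builds on Ember's cache primitive and keeps the memoization, whereas this recomputes on every outer read):

```javascript
// Hypothetical sketch: detect re-entrant access to a getter and return
// its previous value instead of recursing forever. No memoization here.
function defineSelfPeekingGetter(obj, key, compute) {
  let lastValue;
  let computing = false;
  Object.defineProperty(obj, key, {
    get() {
      if (computing) {
        // Access from inside the getter's own computation: hand back
        // the previously computed value (undefined on the first run).
        return lastValue;
      }
      computing = true;
      try {
        lastValue = compute.call(this);
        return lastValue;
      } finally {
        computing = false;
      }
    },
  });
}
```

Inside the wrapped getter, reading the property's own name then yields the prior value (or undefined on the first computation), which is exactly the destructuring pattern shown above.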

@buschtoens
Contributor

I've done a PoC implementation of this here: ember-polyfills/ember-cached-decorator-polyfill#22

@DLiblik
Author

DLiblik commented Sep 13, 2020

@buschtoens Hmmmm. Something about asking the same question again and getting a different answer reminds me of the adage about doing something again and expecting a different result...

...that said, it's pretty easy to assimilate this, and it has the benefit of containing the two-position mechanism (the property and its prior-cached value) in a single place (the getter). It also blocks accidental misuse of the prior-cached-value property by any other code.

Quite literally "it's crazy but it works"... oh man... couldn't resist... sorry...

I'll give this a try for a couple weeks with our group and see how it plays - will report back then.

@buschtoens
Contributor

Regarding getCachedValue from #656 (comment), I think implementation-wise the RFC #615 "Autotracking Memoization" (cache primitive) should be amended to add a peekValue(cache) function that allows peeking into the current cache value without triggering a (re-)computation.

There should then be a cacheFor(object, key) function that returns the underlying Cache instance for a property that was decorated with @cached. getCachedValue would then just be:

function getCachedValue<H extends object, K extends keyof H>(host: H, key: K): H[K] | undefined {
  const cache = cacheFor(host, key);
  assert(`${host} does not have a property '@cached ${key}.'`, cache);
  return peekValue(cache);
}

@DLiblik
Author

DLiblik commented Sep 13, 2020

One quick thing - bidSources 'exists' on the class, so will the default array initializer in your example above ever hit? I.e. does the default get used whenever the destructured value is undefined, or only if the property itself does not exist? Not a big deal, but it does mean that the convenience of the destructuring approach may not be so simple.

@buschtoens
Contributor

This is what I was referring to with:

Neatly this would also work well with TypeScript. The only weirdness is during the initial computation, where the return value is undefined, which does not match the types.

During the very first computation, this.bidSources will return undefined, as the cache has not been populated yet. This definitely is weird and also kind of an issue for TS, as TS would not include undefined in the type.

@DLiblik
Author

DLiblik commented Sep 13, 2020

Sorry, I mean (even in JS):

const {
  // Returns cached value or `undefined` on first computation.
  bidSources: previousBidSources = [], // will this default ever apply, given that bidSources exists (even if undefined)?
  bidders
} = this;

@buschtoens
Contributor

Yes. Doesn't make a difference. 😄
We are "hijacking" the getter behind the scenes and can return a different value if we like. The getter written by the user is not re-evaluated when the property accesses itself.

But considering how confusing this already appears to be, I don't think that this is a good solution. 😅

@DLiblik
Author

DLiblik commented Sep 13, 2020

Nope it's good - you are right - ES5 is this:
var _a = this.bidSources, previousBidSources = _a === void 0 ? [] : _a;

I was concerned (and had never had occasion to confirm) whether the test on assignment to _a was done with an equality to undefined (void 0) vs. something like Object.keys(this).includes('bidSources') ? _a : [];

@DLiblik
Author

DLiblik commented Sep 13, 2020

...to expand, I was concerned that on the first get, using destructuring as in your example would leave previousBidSources undefined because the property exists - i.e. I was under the wrong (or at least suspicious) impression that if the target being destructured has the desired key, then even if the returned value is undefined, the initializer would not be used, forcing this instead:

  const previousBidSources = this.bidSources || [];

...which would create a confusing barrier to when destructuring could be used... that's clearly not the case - even if the property exists and returns undefined, the initializer will be used. My bad.
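A quick runnable check of that semantics (the destructuring default triggers on a value of strict undefined, regardless of whether the key exists):

```javascript
// The property exists on the object but its getter yields undefined;
// the destructuring default still applies, because defaults key off a
// value of undefined, not off a missing key.
const source = Object.defineProperty({}, 'bidSources', {
  get() { return undefined; },
});

const { bidSources: prev = [] } = source;
// prev is [] even though ('bidSources' in source) is true
```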

@DLiblik
Author

DLiblik commented Sep 13, 2020

(your implementation is perfectly clear to me - I am more concerned with the ergonomics - i.e. the fantastic ways our developers will unintentionally invent to throw themselves off a cliff - our code review process can only catch so many falling bodies on any given day...)

@DLiblik
Author

DLiblik commented Sep 14, 2020

@buschtoens haha ok so that went badly, which I did not expect... it seems that the self-property-reading explanation was not straightforward. Though I personally like that approach better for the reasons stated above, the "recursion behaves differently" aspect was not easy to convey (or rather it led to a lot of "how does that work??", which is contrary to the objective here - easy teachability - vs. a leap directly to a rather advanced element of how memoization works).

So... I think the separate decorator is the way to go. This is unfortunately a non-trivial pattern that nevertheless needs to be understood by junior devs for efficiency of work assignments in a team. It's not always clear that this pattern will be needed until a dev is into the work (otherwise assignment could be reserved for intermediate+ devs), and so the mechanism for this needs to be straight ahead.

Teaching this as a solution to a performance scenario as a separate tool seems straightforward: "have this issue? Use this tool." The alternative of "have this issue? Recurse yourself again and build on the result" is a bit mind bending it seems, despite its elegance.

@pzuraq
Contributor

pzuraq commented Sep 27, 2020

Just to chime in here, so far we've been pretty hesitant to expose any kind of peekValue functionality directly in the framework. While it is possible to layer on top of the existing primitives, via duplicating the value whenever it is set and reading the duplicated, non-tracked value, it is a pattern that can very easily introduce bugs. The worry is essentially that users will reach for it as soon as they encounter the "you've attempted to update this value, but it was already consumed" assertion, just to get around the assertion, when normally that means there is something very wrong with what you are trying to do.

Maybe it is alright to add something like that at the primitive level, but I would also like to try to find another way to address this specific use case. My feeling is that this is something that is very valid - using the previously computed value to build the next one - and something we should support either way. I have to think about this a bit more, just saw this RFC for the first time today!
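The layering @pzuraq describes could look roughly like this - a sketch under stated assumptions: createCacheFn/getValueFn are passed in as stand-ins for the real primitives, and none of these names are Ember's actual API. A non-tracked shadow copy of the value is refreshed on every tracked read, and peeking reads the shadow:

```javascript
// Sketch of layering peekValue on top of a create/get cache API by
// duplicating the value into an untracked shadow whenever it is computed.
function createPeekableCache(createCacheFn, getValueFn, fn) {
  let shadow;            // untracked duplicate of the last value
  let populated = false;
  const cache = createCacheFn(() => {
    shadow = fn();
    populated = true;
    return shadow;
  });
  return {
    getValue: () => getValueFn(cache),                  // tracked read (may recompute)
    peekValue: () => (populated ? shadow : undefined),  // untracked read, never recomputes
  };
}
```

The shadow is only as fresh as the last tracked read, which is exactly the foot-gun described above: code that peeks instead of reading can silently act on stale state.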

@DLiblik
Author

DLiblik commented Sep 27, 2020

@pzuraq I definitely would not want to try to circumvent consumption tracking of values - that is a very different situation than what we are focused on here, and if that is possible with this proposal, it's certainly not the purpose.

Our issue is that in the world of "stuff happening at almost the same time but not quite", like the bidders joining in, or an online meeting, etc., there are expensive setups that get done over and over. This was not an issue in Ember Classic because the @computed process worked no matter where the request came from. Now, in Glimmer, a very significant performance degradation occurs with expensive properties that are used directly by other properties (e.g. bidSources.length would be a good one - recalculating the whole thing just to get its length).

I love where Glimmer is at right now, but this shortcoming has been a very challenging case for us to deal with because it was initially very hard to see what went sideways. Now that we understand it, we use the @cacheFor decorator to ensure that recalcs are limited in scope. My hope is that Glimmer catches up to where computed(...) is - that who reads the property (template engine vs. other code) does not change how the property is read (from cache because nothing changed vs. running it again).

I'm very confident glimmer will get there, but in the meanwhile, we can't hold ourselves back to pre-tracking, but to get our entire dev team a conventional pattern to address this has proved challenging - this decorator has made that transition very workable.

@pzuraq
Contributor

pzuraq commented Jan 26, 2021

So thinking more on this proposal, I think it opens up some tricky problems about derived state. Currently, whenever we trigger a recomputation of a value, the idea is we rerun the full computation, and it accesses all of the values that made up that value, consuming them (e.g. making them signals, so that we know when the state changes).

The issue with using the previous state as input state to the next state is, what do you consume? The two options seem to be:

  1. Consume all of the values that made up the previous state again, even if they aren't actually used
  2. Do not consume previous state, only new state that is accessed the second time

Neither of these seem obviously correct to me. I think we need to spend some time thinking through this.

Coming back to the original motivating example, however, I actually think that there is an alternative to using the previous value:

import { tracked } from '@glimmer/tracking';
import { cached, cacheFor } from '...';

class OnlineItemAuction {
  @tracked bidders = [];
  
  // A separate field name avoids colliding with the getter below.
  _bidSourceCache = new WeakMap();

  @cached
  get bidSources() {
    const { bidders, _bidSourceCache } = this;

    // Generate a BidSource from each bidder, but avoid doing
    // it more than once per bidder.
    return bidders.map(bidder => {
      let bidSource = _bidSourceCache.get(bidder);

      if (bidSource === undefined) {
        bidSource = new BidSource(bidder);
        _bidSourceCache.set(bidder, bidSource);
      }

      return bidSource;
    });
  }
}

This solution requires more memory, since we are now storing the BidSources in a WeakMap as well as the array, but it would also be less CPU intensive, since we no longer have to search the previous array for each bidder. What do you think?

@DLiblik
Author

DLiblik commented Jan 26, 2021

I believe it's 2 (consume only new state), based on reading the implementation code for the cache, Glimmer VM, etc., and it seems to make sense: just by being there, the signalling you are talking about has fired at least to the degree that this property originally required. If existing items must be recomputed rather than just added to, that will update the consumption tree; if they don't, then, like a property whose second run uses different factors (because an if/else split makes the two branches use different inputs), the consumption will change. That includes the case where the items in the array are themselves updated, but that only matters if this getter reads them (in which case their alteration will re-fire the dependency). I.e. the getter is deterministic only up to the properties that it must read to establish its new state, not the properties that it could read or has read.

The pattern you show above works - in fact that's what we used to do - but all you've done is create an in situ @cacheFor - i.e. it might be the box and not the contents that is the struggle (and I'm all good to put a different wrapper on it if you who are closer to the kernel of Ember see reasons to).

Empirically/anecdotally, we've been using the @cacheFor in production code now for 4 months on a few systems and it's been great. Performance improved along with code simplified. I recognize that it smells like an anti-pattern but it walks and talks like a reliable device so we keep using it.

@pzuraq
Contributor

pzuraq commented Jan 26, 2021

I believe it's 2 (consume only new state) based on reading the implementation code for the cache, glimmer VM, etc.

Right, that is the way it would work based on the way autotracking works today. But I don't believe this is necessarily correct.

It's possible that it works for some situations but fails in many we haven't considered yet. I could imagine, for instance, that the user builds up the initial state based on an array like so:

import { tracked } from '@glimmer/tracking';
import { cached, cacheFor } from '...';

class OnlineItemAuction {
  @tracked bidders = [];

  @service locale;

  @cacheFor('bidSources') _bidSources = [];

  @cached
  get bidSources() {
    const _prevs = this._bidSources;
    const { locale } = this;

    if (_prevs.length > 0) {
      _prevs.forEach(bidSource => bidSource.updateLocale(locale.current));

      return _prevs;
    } else {
      return this.bidders.map(bidder => new BidSource(bidder, locale.current));
    }
  }
}

This logic contains a bug: on subsequent reads we don't re-access the bidders, so the getter will recompute when the locale changes (and correctly update the existing bid sources), but it will never create new bid sources when bidders are added.

It's possible this is a non-issue - the example is contrived. But I would like to make sure that this is not a common case, or incorrect in general for other reasons. In addition, doing (1) would only make us possibly recompute more often, which isn't necessarily a bad thing.
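For what it's worth, the fix in that contrived example would be to re-read the bidders on every run and reconcile - here is a hedged plain-JS sketch of that corrected logic (`computeBidSources` and the object literals standing in for `BidSource` are ours, not project code):

```javascript
// Hedged sketch: the corrected shape of the example above. The fix is to
// re-read `bidders` on every run and reconcile against the previous
// result, instead of short-circuiting when a previous value exists - so
// new bidders get sources and reused sources still pick up the new locale.
function computeBidSources(bidders, locale, prevs = []) {
  return bidders.map((bidder) => {
    const existing = prevs.find((p) => p.bidder === bidder);
    if (existing) {
      existing.locale = locale; // refresh reused sources in place
      return existing;
    }
    return { bidder, locale }; // stand-in for `new BidSource(bidder, locale)`
  });
}
```

In the decorator version, this body would live inside the @cached getter with `prevs` coming from the @cacheFor field - and because the bidders are read on every run, they stay in the dependency set.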

The pattern you show above works - in fact that's what we used to do - but all you've done is create an in situ @cacheFor

Right, and this is actually a very valid pattern. Caching like this is something I would encourage in general, regardless of whether or not we introduce @cacheFor or some other way of accessing the previous cache value, as it's a very solid pattern for deduplication and optimization when necessary. I think the only downside is potentially the extra memory usage.

@DLiblik
Author

DLiblik commented Jan 26, 2021

...actually @pzuraq I think I see what you are questioning, and you may be right: the subsequent dependency state won't include the initial dependencies if they are not re-read on recomputation. We don't use it that way, but it would, I think, fail in that circumstance. Here is the difference, as I understand it, between how we're using it and what you are getting at:

  1. (undefined) => use X=[A,B] to create C,D => result [C,D]
  2. [C,D] => use X=[A,B,E,F] to add G,H => result [C,D,G,H]

In the above, if a change to A,B should alter whether C,D remain in the result, then after step 2 that won't happen. But that's a bug, and no different from any other bug where logic fails to read a dependency it needs to check.

To put it a different way, if A,B are determinants of C,D, then step 2 would re-read them, as in:

  1. (undefined) => use X=[A,B] to create C,D => result [C,D]
  2. [C,D] => use X=[A,B,E,F] to confirm C,D (which causes a read of A,B) to ensure they still get to be here, and use E,F to create G,H => result [C,D,G,H]

If C,D get an infinite free ride in the result after the first eval, then by definition they are not dependent on A,B, and so it's not a bug. If they are dependent on A,B, then that dependency must be re-established on each run, just as it would be in "regular" - i.e. non-@cacheFor - code.

@DLiblik
Author

DLiblik commented Jan 26, 2021

Yes - we've arrived at the same point - but your code contains the bug because, as written, it declares that only one round of creation is required. That is, it is not buggy if the process really is that the initial bidders are established once and updates happen only when the locale changes. It is a bug if that's not the case - but then that's not a mechanical gap in @cacheFor, it's a procedural shortcoming in that code.

And if written to fix that bug, then the consumption tree will properly update as well 👍

@DLiblik
Author

DLiblik commented Sep 29, 2021

We're another 9 months into using @cacheFor and it's been very successful for us - but I want to check in here because, if there is a "more glimmer way" to do this now, then I suggest this RFC be retired and we'll switch. On the other hand, this decorator has become an essential part of our conventions and, though not used often, when it is used it is literally critical to performance.

@NullVoxPopuli
Contributor

Are there other use cases for cacheFor? I don't mean to be dense, but the RFC's example is just one example, and I get that it's an incrementally updated list. So... maybe a currency trading UI? Are there others?

To me, it seems specific enough to fit best in library-space? Do we have the primitives for that?

@DLiblik
Author

DLiblik commented Mar 21, 2022

These are the use cases where we are currently using it:

  • Virtual online communications (the array of communications ports as people join/leave)
  • Bidding
  • Cryptocurrency trading
  • Joining multiple discussion rooms
  • Monitoring apps (security, live drone feeds, etc.)
  • Remote medical applications (remote ultrasound feeds and/or specialists joining from various locations around the world during an operation)
  • Join annotation sessions (think people all editing a document at once)

...basically anything where there are "expensive" resources in a slowly-changing list... but it turns out that, to some degree, many not-so-expensive resources in lists benefit from not being recycled. Ember does a lot of this on the UI side for you via glimmer, but at the service/model layer this technique works very, very well.

(and sorry @NullVoxPopuli to have taken so long to respond - somehow I missed the alert on your update)

@NullVoxPopuli
Contributor

Slowly changing, expensive-to-calculate lists of things make sense.

But I wonder if there is another way to approach that general problem space?
like, when you change one thing in a list, everything else in the list shouldn't also change -- all of that stuff shouldn't be dirtied.

Now, if you're using immutable patterns with tracking, you're kind of throwing the implicit cache out the window with stuff like:

this.collection = [...this.collection, newItem];

TrackedArray and, I think, deep-tracked handle this automatically: you use "the MDN APIs", and the auto-tracking is as optimized as it can be.

Another place where I've seen folks lose performance is in {{#each}} loops, where maybe a key="some prop" would help keep rendering of unchanged things stable (like if, for some reason, you're not retaining object identity in your lists).

@DLiblik do you have any open source examples of these scenarios? I'd like to take a look :partyfrog:

@DLiblik
Author

DLiblik commented Mar 21, 2022

We are a custom dev shop (building ember and other apps "for-hire") so we don't have any open source examples I can share directly, but your read is spot on: we don't allow for mutation of arrays ever in our projects. We did a bit of analysis a couple of years ago and found out that code issues, overhead, API peculiarities (like 'pushObjects' vs. 'push'), etc., related to mutable arrays were a big cost.

And so the issue isn't around dirtied items; it's that the view-model for new items is generated in the property exactly as shown in the bidding example above, and since glimmer will re-call the getter when new items come in, the existing items need to be cached somewhere so they can be re-fed (rather than re-created).

A tracked array might solve the issue, but then a separate process would need to set the array on the component, and that leads to observation rather than binding, which we also want to avoid (we leave all observation to ember for UI matters, and that, along with immutable arrays, has been the single biggest payoff in coding productivity by no small margin).

Basically, the patterns being pushed by ember (avoid observers, stay immutable where possible, etc.) have worked very well and have also led us to this pattern, and the past 18 months have proven it to be highly effective and efficient to test, debug, code, and use.

We're happy to just "do it in our own code" - the RFC isn't essential, and our pattern is delivering very well for us. I also recognize that, at the scale of our projects (1,000-3,000 files in the app folder across the 5 that happen to be under my repo root right now), we may be pushing ember outside the "common area" of what most folks are doing with it. (Another example is that we had to stop using ember-data completely a couple of years ago due to all kinds of issues that we never saw at smaller scales - it was absolutely wonderful until it wasn't.)

@NullVoxPopuli
Contributor

Is there any chance you can whip up an example in open source, so I can see concretely what you're running into? It'd be a huge help <3

@DLiblik
Author

DLiblik commented Mar 24, 2022

Here is a sample straight out of one of our projects (this is live code today) for a system that runs a conferencing app (online meetings via WebRTC). It's JS (not TS) but the code is pretty clear.

The commManagers property creates comm managers "on demand" as participants join the meeting. This structure is clean and simple: as the separately managed participants array grows, this property returns the necessary communications managers, which are only required if the UI is active (there are cases where the meeting is only being monitored and this property will not be invoked, in which case we do not want to generate these expensive instances).

Note that participants come and go (and sometimes get disconnected and reconnect, but that needs a new commManager instance for technical reasons) and so this property is regularly and dynamically being invoked, if the user is in the meeting.

  // (imports assumed: `cached` from '@glimmer/tracking', `scheduleOnce`
  // from '@ember/runloop', plus our own `cacheFor` and `InboundCommManager`)

  @cacheFor('commManagers') _commManagers = [];

  @cached
  get commManagers() {
    const {
      participants,
      outboundCommManager,
    } = this;

    const _prev = this._commManagers;
    const cms = [];

    for (const party of participants) {
      let cm;
      if (party.isMe) {
        cm = outboundCommManager;
      } else {
        cm = _prev.find(cmPrev =>
            !cmPrev.isDestroyed &&
            cmPrev.party.uHash === party.uHash) ||
          new InboundCommManager({
            party,
            meetingManager: this,
          });
      }
      cms.push(cm);
    }

    const toDestroy = _prev.filter(
      cm => !cm.isDestroyed && !cms.includes(cm));

    if (toDestroy.length) {
      scheduleOnce('afterRender',
        () => toDestroy.forEach(d => d.destroy()));
    }

    return cms;
  }

@DLiblik
Author

DLiblik commented Mar 24, 2022

...more points on the above and why it's done this way:

  • the driving property, participants, changes asynchronously from the UI - for example, someone can be "observing" the meeting without being in it; in that instance we want participants to be tracking that, but we don't want to be creating comm managers
  • point 1 can change for a user - they can also see who is in the meeting before they themselves join - and so in this case, when the user joins and the UI is presented, it 'pulls' on commManagers and we get the desired result: lazy creation
  • if the user leaves the room but not the meeting, we need to destroy the commManagers even though participants is still tracking and not empty. This happens via a separate function (not shown) which calls destroy on the comm managers - which works great because commManagers is not reinvoked since the UI is gone, and so everything is tidy. If the user re-enters, the destroyed comm managers are ignored in the above code. It all Just Works(tm).
  • This pattern is composable: any other properties that depend on commManagers are consequently also lazily managed, and that ends up being a Good Thing(tm) as well.

Basically, we are trying to provide composition conventions that developers with as little prior experience as possible can use, and this approach (I'm open to others) has been the only one that has enabled our more junior developers to successfully code against these critical aspects of these types of applications. Plus, we can teach it in advance - it's enough of an abstraction, with clear boundaries, that you can explain it with simple analogies and/or existing code and have a high success rate of developers applying the pattern themselves in the field.
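The reconciliation at the heart of the commManagers getter can be distilled into a small generic helper - a hedged sketch, where the helper and its names are ours and not part of the project code above:

```javascript
// Hedged sketch: reuse live instances that match a current source item,
// create instances for the rest, and report which previous instances
// should be torn down (the caller decides when to actually destroy them).
function reconcile(prev, items, matches, create) {
  const next = items.map(
    (item) =>
      prev.find((p) => !p.isDestroyed && matches(p, item)) || create(item)
  );
  const toDestroy = prev.filter((p) => !p.isDestroyed && !next.includes(p));
  return { next, toDestroy };
}
```

With a matcher like `(p, item) => p.party.uHash === item.uHash` and a factory constructing an InboundCommManager, the getter above reduces to one call to this helper plus the scheduled destroy.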

@NullVoxPopuli
Contributor

NullVoxPopuli commented Apr 17, 2022

Sorry it's taking so long to reply, been doing a bit too much 😅

But what you've shown looks exactly like what resources help solve.

As a huge disclaimer, you have no obligation to change anything you currently have working -- the following is only how I would solve the problem.

For example,

The commManagers property creates comm managers "on demand" as participants join the meeting.

Can be implemented with ember-resources/util/map:

import { map } from 'ember-resources/util/map';
import { associateDestroyableChild } from '@ember/destroyable';

class Demo {
  commManagers = map(this, {
    data: () => this.participants,
    map: (participant) => {
      let commManager = new InboundCommManager({
        party: participant,
        meetingManager: this,
      });

      associateDestroyableChild(this, commManager);

      return commManager;
    }
  });
}

As this.participants changes, so does the list of InboundCommManagers.
This particular utility, map, uses object identity (of the participants) to cache the InboundCommManagers. For folks new to this pattern: as that list updates, it's important not to recreate the items themselves - something like { id: this.participants.length, ...participant } when updating participants breaks identity and is not so great for efficient auto-tracking.

Goal achieved: lazy creation
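A tiny plain-JS illustration of the identity point: appending immutably yields a new array but keeps the existing item objects, so identity-keyed caches like the map utility's still hit, while re-spreading every item defeats them (data here is made up):

```javascript
// Example list of participant objects (illustrative data).
const participants = [{ name: 'ada' }, { name: 'lin' }];

// Good: new array, same item objects - identity-keyed caches keep working.
const appended = [...participants, { name: 'joan' }];

// Bad: every item is a fresh object - every identity-keyed cache entry misses.
const respread = participants.map((p) => ({ ...p }));
```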

As for destruction / handling the disconnects, I think that should happen within the InboundCommManager, using @ember/destroyable:

import { registerDestructor } from '@ember/destroyable';


class InboundCommManager {
  constructor({ party, meetingManager }) {
    // ...
    
    registerDestructor(this, () => {
      this.destroy();
    });
  }
}

Full disclaimer though: this map utility does not call destroy on records removed from data (it probably should! PRs welcome).
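One way such a utility could handle that - a hypothetical sketch, not ember-resources code, with all names ours - is to diff the cache against the current source items on each run and tear down the leftovers:

```javascript
// Hypothetical sketch: keep a Map from source item to mapped record,
// reuse records for items we've seen, create records for new items, and
// destroy records whose source item disappeared since the last run.
function syncMappedCache(cache, items, create, destroy) {
  const next = new Map();
  for (const item of items) {
    next.set(item, cache.has(item) ? cache.get(item) : create(item));
  }
  for (const [item, mapped] of cache) {
    if (!next.has(item)) destroy(mapped); // source item gone: tear down
  }
  return next;
}
```

In an Ember setting, `destroy` could simply be `destroy` from @ember/destroyable, so registered destructors (like the one in the InboundCommManager above) would fire.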

@DLiblik
Author

DLiblik commented Apr 19, 2022

@NullVoxPopuli I really appreciate the argument, but although we could switch to an external library, we prefer not to unless it's a complex and deep issue. Thin wrappers (lodash, ember-validators, ...) tend to grow large, fall into "abandoned adoption", or, worse, stagnate with bugs - in which case we end up adopting the useful bits out of them the moment we hit a bug, forced into an unexpected hack-a-thon once we discover the problem is external.

I believe embedding this in ember is best - and if ember-resources becomes part of ember (or some variation of it, or of this, or something else does), then we'd switch for sure. For now, we have fine-grained control, our dev team understands it, and it's a bit more explicit. I also find that decorators "look like how they work" - they augment what's there. Moving back to the "function as property declaration" style of abc: map(...) is not something we are keen to reintroduce - we were happy to see it go from ember with decorators (I know decorators are functions too, but they "show intuitively").

I'll leave this RFC open until ember provides a solution directly; for now, I think we'll continue with what we have.

Again - I really appreciate the input.

@wagenet
Member

wagenet commented Jul 25, 2022

Sounds like there's still interest to get this moving. I'll do what I can to facilitate that.

chriskrycho added a commit to chriskrycho/signalis that referenced this pull request Nov 20, 2022
Peeking a value in contexts other than internal parts of the reactivity
system itself tends very strongly to produce bugs, because it decouples
consumers from the root state. (It is very, very tempting to wire your
own caching on with a "peek", rather than using caching tools composed
out of the core primitives, or to "be smarter" than the signal system.)

For an interesting background discussion from the history of Glimmer's
similar tag/signals system, see [here][github].

[github]: emberjs/rfcs#656 (comment)

This doesn't *force* us to keep that, but it sets a nice precedent and
means if we *do* introduce `peek()` as public behavior, we'll have a
clear indicator that we need to make the choice explicitly.
@wagenet wagenet added the S-Proposed In the Proposed Stage label Dec 2, 2022