Rewrite the skip stages lowering pass #8115

Merged (15 commits, Feb 27, 2024)

Conversation

@abadams (Member) commented Feb 21, 2024

Skip stages was slow due to poor computational complexity (quadratic?).

I reworked it into a two-pass linear-time algorithm. The first pass
records which pieces of IR are actually relevant to the task, and the
second pass performs the task using a bounds-inference-like algorithm.

On main, resnet50 spends 519 ms in this pass; this commit reduces it to
40 ms. Local laplacian with 100 pyramid levels spends 7.4 seconds in
this pass; this commit reduces it to ~3 ms.

This commit also moves the cache store for memoized Funcs into the
produce node, instead of at the top of the consume node, because it
naturally places it inside a condition you inject into the produce node.

Built on #8103, so don't bother reviewing until that's merged. I just want to get it tested on the bots.

This pattern has been bugging me for a long time:

```
if (scope.contains(key)) {  // first lookup
  Foo f = scope.get(key);   // second lookup of the same key
}
```

This redundantly looks up the key in the scope twice. I've finally
gotten around to fixing it. I've introduced a find method that returns
a const pointer to the value if it exists, or null otherwise. It also
searches any containing scopes, which are held by const pointer, so the
method has to return a const pointer.

```
if (const Foo *f = scope.find(key)) {  // single lookup; f is null on a miss
}
```

For cases where you want to get and then mutate, I added shallow_find,
which doesn't search enclosing scopes, but returns a mutable pointer.

We were also doing redundant scope lookups in ScopedBinding. We stored
the key in the helper object, and then did a pop on that key in the
ScopedBinding destructor. This commit changes Scope so that Scope::push
returns an opaque token that you can pass to Scope::pop to have it
remove that element without doing a fresh lookup. ScopedBinding now uses
this. Under the hood it's just an iterator on the underlying map (map
iterators are not invalidated on inserting or removing other stuff).

The net effect is to speed up local laplacian lowering by about 5%.

I also considered making it look more like an STL class, and having find
return an iterator, but that doesn't really work: the iterator it returns
might point to an entry in an enclosing scope, in which case you can't
compare it to the .end() of the scope you have. Scopes are different
enough from maps that the interface really needs to be distinct.
@zvookin (Member) commented Feb 21, 2024

Is there any chance this changes the values of, or the values available to, the cache key computation? (The latter would likely be a compile-time error. The former could introduce a change, likely a bug, in the computation.)

@abadams (Member, Author) commented Feb 21, 2024

I don't believe so. The change was from this:

```
compute cache key
perform cache lookup
realize Foo {
  produce Foo {
    if (cache miss) {
      compute Foo
    }
  }
  consume Foo {
    if (cache miss) {
      cache store
    }
    ...
  }
}
```

to this:

```
compute cache key
perform cache lookup
realize Foo {
  produce Foo {
    if (cache miss) {
      compute Foo
      cache store
    }
  }
  consume Foo {
    ...
  }
}
```

The cache lookup and key computation are in the same place as they were before.

@steven-johnson (Contributor) commented:

Ready to land pending green

@steven-johnson steven-johnson merged commit 36d74a8 into main Feb 27, 2024
3 checks passed
@steven-johnson steven-johnson deleted the abadams/rewrite_skip_stages branch February 27, 2024 01:57