Dynamically compile shaders with uniforms instead of attributes for DDS #9185

kkaefer · 2017-06-05T08:57:32Z

In 141e995, we changed the way how most values are passed to the shader from uniforms to attributes. We did this, because for DDS, we need the ability to specify a different value for every feature, e.g. to have data-driven color values. Previously, once draw call used a uniform (constant) to give all features in a draw call the same color. In 141e995, we moved to constant vertex attributes, which allows us to optionally specify vertex attribute buffers for every possible combination of DDS values. Most of the time though, these attributes are set to constants.

Constant vertex attributes are an edge case features, and the OpenGL wiki has this to say about them:

Note: It is not recommended that you use these. The performance characteristics of using fixed attribute data are unknown, and it is not a high-priority case that OpenGL driver developers optimize for. They might be faster than uniforms, or they might not.

Unfortunately, using constant vertex attributes causes a performance regression on some devices. To verify that this is the case, I created an isolated benchmark to find out what the culprit is. It consisted of ~36,000 vertices. The color shader received 6 colors as input and combined them to one color with mix. I varied the way the colors are piped through to the fragment shader in the following three combinations:

uniform vec4 u_colorN directly in the fragment shader: 36 fps
uniform vec4 u_colorN in the vertex shader, with a varying vec4 v_colorN to pass the colors from the vertex to the fragment shader: 30fps
attribute vec4 a_colorN in the vertex shader, with a varying vec4 v_colorN to pass the colors from the vertex to the fragment shader, with constant vertex attributes: 26 fps

These results make it look like this performance regression is a combination of two issues (at least on the iPad mini 2):

Passing varyings from vertex to fragment shader costs performance
Constant vertex attributes are slower than uniforms

It's worth noting, that we had a very similar issue on WebGL in IE: mapbox/mapbox-gl-js#1714. In JS, we're now using a different implementation that generates shader code at run-time, and uses uniforms when the DDS attribute is constant.

Therefore, I believe we'll have to switch our DDS code to a similar model that JS uses where we dynamically generate shader code at runtime based on whether the stylesheet uses data-driven values for a certain property.

I think the way we could support this is by adding #defines into the shader code that recompiling the shader with certain defines based on whether an attribute uses a data-driven value or not.

add #defines in the generated shader code to selectively toggle between attribute and uniform
add corresponding uniforms to DataDrivenPaintProperty-derived classes
set the uniforms as an alternative to attributes for constant data-driven values
use the correct shader with the right uniforms enabled

/cc @jfirebaugh

jfirebaugh · 2017-06-05T20:29:57Z

     VM SIZE                               FILE SIZE
 ++++++++++++++ GROWING                 ++++++++++++++
  +1.7% +43.5Ki __TEXT,__text           +43.5Ki  +1.7%
   +10% +10.4Ki __TEXT,__cstring        +10.4Ki   +10%
  +1.6% +2.69Ki [None]                  +3.49Ki  +2.0%
  +0.9% +2.58Ki __TEXT,__gcc_except_tab +2.58Ki  +0.9%
  +1.2%    +748 __TEXT,__unwind_info       +748  +1.2%
  +0.1%    +128 __TEXT,__const             +128  +0.1%
  +0.3%     +48 __DATA,__data               +48  +0.3%

 -------------- SHRINKING               --------------
  -0.2%     -10 __TEXT,__stub_helper        -10  -0.2%
  -0.2%      -8 __DATA,__la_symbol_ptr        0  [ = ]
  -0.2%      -6 __TEXT,__stubs               -6  -0.2%

  +1.7% +60.0Ki TOTAL                   +60.8Ki  +1.7%

jfirebaugh · 2017-06-05T20:38:48Z

Implementation notes:

We still generate attribute locations for all paint attributes, even if they aren't used by the shader (this is explicitly allowed by the OpenGL spec). With more work, we could generate attribute locations for only the paint attributes that are needed. This is what GL JS does. It permits another approach to situations in which we are up against the 8 attribute limit: defer the failure case to runtime, when someone tries to use too many data-driven attributes simultaneously.
ConstantAttributeBinding and related code is still present. Its use in PaintPropertyBinders is essentially a no-op, but it's also used in SymbolSizeBinder. Ideally we remove it all.
There are no limits (other than the inherent limit of possible permutations of data-driven-or-not properties) on distinct program instances, either in-memory or in the on-disk binary program cache. If you make heavy use of runtime styling or otherwise have a large variance in which properties are data-driven, it could use significant memory or disk space. I haven't tried to quantify this at all.

jfirebaugh · 2017-06-05T23:13:38Z

Bench results (iPhone 6) before:

2017-06-05 16:02:09.942756-0700 Bench GL[3994:1000717] Result:
2017-06-05 16:02:09.943195-0700 Bench GL[3994:1000717] | paris      | 52.4 fps |
2017-06-05 16:02:09.943295-0700 Bench GL[3994:1000717] | paris2     | 60.0 fps |
2017-06-05 16:02:09.943381-0700 Bench GL[3994:1000717] | alps       | 34.0 fps |
2017-06-05 16:02:09.946679-0700 Bench GL[3994:1000717] | us east    | 59.9 fps |
2017-06-05 16:02:09.946789-0700 Bench GL[3994:1000717] | greater la | 48.2 fps |
2017-06-05 16:02:09.946875-0700 Bench GL[3994:1000717] | sf         | 59.5 fps |
2017-06-05 16:02:09.946956-0700 Bench GL[3994:1000717] | oakland    | 57.8 fps |
2017-06-05 16:02:09.947037-0700 Bench GL[3994:1000717] | germany    | 30.7 fps |
2017-06-05 16:02:09.947157-0700 Bench GL[3994:1000717] Total FPS: 402.4
2017-06-05 16:02:09.947231-0700 Bench GL[3994:1000717] Average FPS: 50.3

After:

2017-06-05 15:52:51.406604-0700 Bench GL[3985:998731] Benchmark completed.
2017-06-05 15:52:51.406674-0700 Bench GL[3985:998731] Result:
2017-06-05 15:52:51.406756-0700 Bench GL[3985:998731] | paris      | 59.9 fps |
2017-06-05 15:52:51.406848-0700 Bench GL[3985:998731] | paris2     | 60.1 fps |
2017-06-05 15:52:51.406932-0700 Bench GL[3985:998731] | alps       | 40.5 fps |
2017-06-05 15:52:51.407014-0700 Bench GL[3985:998731] | us east    | 59.9 fps |
2017-06-05 15:52:51.407095-0700 Bench GL[3985:998731] | greater la | 60.1 fps |
2017-06-05 15:52:51.407177-0700 Bench GL[3985:998731] | sf         | 60.1 fps |
2017-06-05 15:52:51.407257-0700 Bench GL[3985:998731] | oakland    | 60.0 fps |
2017-06-05 15:52:51.407338-0700 Bench GL[3985:998731] | germany    | 39.3 fps |
2017-06-05 15:52:51.407410-0700 Bench GL[3985:998731] Total FPS: 439.8
2017-06-05 15:52:51.407481-0700 Bench GL[3985:998731] Average FPS: 55.0

kkaefer · 2017-06-06T11:59:33Z

Here are my benchmarks on the iPad mini 2:

benchmark	before DDS	after DDS	master	dynamic shaders
paris	30.7 fps	16.4 fps	17.8 fps	22.2 fps
paris2	32.3 fps	20.5 fps	19.6 fps	27.3 fps
alps	16.2 fps	7.8 fps	8.8 fps	14.0 fps
us east	40.0 fps	27.3 fps	25.6 fps	33.2 fps
greater la	29.5 fps	17.1 fps	17.9 fps	24.9 fps
sf	43.2 fps	25.3 fps	23.6 fps	33.5 fps
oakland	31.0 fps	19.2 fps	19.9 fps	25.3 fps
germany	18.6 fps	9.0 fps	10.1 fps	16.2 fps

Looking at the shaders, it looks like there are still attributes generated for some of the shaders even though they're #ifdefed out. Maybe calling glBindAttribLocation has an adverse effect?

kkaefer · 2017-06-06T13:24:19Z

kkaefer · 2017-06-06T14:11:47Z

Compare this with the pre-DDS state of the same program:

jfirebaugh · 2017-06-06T16:02:01Z

@kkaefer Can you run bench on iPad mini 2 one more time? I rebased on master, including 9dfc2d9, which is a significant per-frame performance speedup.

kkaefer · 2017-06-06T19:57:24Z

Results are pretty much unchanged:

benchmark	`8a341aa`
paris	21.0 fps
paris2	27.7 fps
alps	12.6 fps
us east	30.0 fps
greater la	20.2 fps
sf	33.7 fps
oakland	23.1 fps
germany	14.5 fps

kkaefer · 2017-06-13T09:14:27Z

Remaining task is to convert the size attributes for symbol programs to this format as well.

…for DDS

jfirebaugh · 2017-06-13T18:40:22Z

Capturing from chat:

Porting symbol size to dynamic shaders is difficult. It doesn't fit in the existing #pragma system. That's why it wasn't implemented that way to begin with. I think we should merge and port the current changes, and not try to switch this right now.

@ansis:

that sounds good to me. The current plan for working around the 8 attribute limit for pitch-scaling is to pack a_size into a_data
if a_size was changed to a dynamically generated separate attribute (which I guess it should be eventually) we'd need to find another way

I profiled the changes here and found #9257. With that included, FPS is maxed at 60 on my iPhone.

kkaefer added bug Core The cross-platform C++ core, aka mbgl performance Speed, stability, CPU usage, memory usage, or power usage rendering ⚠️ DO NOT MERGE Work in progress, proof of concept, or on hold labels Jun 5, 2017

kkaefer requested a review from jfirebaugh June 5, 2017 08:57

kkaefer force-pushed the dds-dynamic-shaders branch from 76909c5 to 1db4642 Compare June 5, 2017 08:58

jfirebaugh force-pushed the dds-dynamic-shaders branch from 47c249c to 3c2ab62 Compare June 5, 2017 19:56

jfirebaugh force-pushed the dds-dynamic-shaders branch from 3c2ab62 to 761a2b0 Compare June 5, 2017 20:31

jfirebaugh force-pushed the dds-dynamic-shaders branch from 761a2b0 to ca5115b Compare June 5, 2017 21:00

jfirebaugh force-pushed the dds-dynamic-shaders branch from ca5115b to 8a341aa Compare June 6, 2017 16:00

kkaefer and others added 6 commits June 13, 2017 10:22

[core] only bind uniforms that exist in the program

eb23984

[core] add shader defines for enabling/disabling attributes/uniforms …

05ccb2f

…for DDS

[core] add uniforms to DataDrivenPaintPropertys

af61758

[core] add constant DDS values as uniforms

3d487b7

[core] cleanup ProgramParameters

78d20f5

[core] Dynamic program compilation for data-driven properties

d1b1b1a

jfirebaugh force-pushed the dds-dynamic-shaders branch from 8a341aa to d1b1b1a Compare June 13, 2017 17:31

jfirebaugh mentioned this pull request Jun 13, 2017

[core] Don't upload the FrameHistory texture in frames where it's not changing #9257

Merged

jfirebaugh approved these changes Jun 13, 2017

View reviewed changes

jfirebaugh mentioned this pull request Jun 13, 2017

[core] GL performance fixes #9255

Merged

jfirebaugh merged commit 9d276f3 into master Jun 13, 2017

jfirebaugh deleted the dds-dynamic-shaders branch June 13, 2017 19:21

jfirebaugh mentioned this pull request Jun 14, 2017

[core] Eliminate constant attribute bindings #9261

Merged

1ec5 removed the ⚠️ DO NOT MERGE Work in progress, proof of concept, or on hold label Jun 16, 2017

kkaefer mentioned this pull request Jul 3, 2018

Remove unused generated OpenGL programs #12287

Merged

christophermark mentioned this pull request Jun 30, 2020

iOS build error after upgrading from 7.0.9 to 8.0.0 (on New Build System) rnmapbox/maps#932

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Dynamically compile shaders with uniforms instead of attributes for DDS #9185

Dynamically compile shaders with uniforms instead of attributes for DDS #9185

kkaefer commented Jun 5, 2017 •

edited by jfirebaugh

Loading

jfirebaugh commented Jun 5, 2017

jfirebaugh commented Jun 5, 2017 •

edited

Loading

jfirebaugh commented Jun 5, 2017

kkaefer commented Jun 6, 2017

kkaefer commented Jun 6, 2017

kkaefer commented Jun 6, 2017

jfirebaugh commented Jun 6, 2017

kkaefer commented Jun 6, 2017 •

edited

Loading

kkaefer commented Jun 13, 2017

jfirebaugh commented Jun 13, 2017

Dynamically compile shaders with uniforms instead of attributes for DDS #9185

Dynamically compile shaders with uniforms instead of attributes for DDS #9185

Conversation

kkaefer commented Jun 5, 2017 • edited by jfirebaugh Loading

jfirebaugh commented Jun 5, 2017

jfirebaugh commented Jun 5, 2017 • edited Loading

jfirebaugh commented Jun 5, 2017

kkaefer commented Jun 6, 2017

kkaefer commented Jun 6, 2017

kkaefer commented Jun 6, 2017

jfirebaugh commented Jun 6, 2017

kkaefer commented Jun 6, 2017 • edited Loading

kkaefer commented Jun 13, 2017

jfirebaugh commented Jun 13, 2017

kkaefer commented Jun 5, 2017 •

edited by jfirebaugh

Loading

jfirebaugh commented Jun 5, 2017 •

edited

Loading

kkaefer commented Jun 6, 2017 •

edited

Loading