forked from rxi/rxi.github.io
-
Notifications
You must be signed in to change notification settings - Fork 0
/
200402.html
175 lines (166 loc) · 9.19 KB
/
200402.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
<html><head><style>.logo { width: 160px; height: 160px; background: #e81633 url("logo.png"); background-size: 160px 160px; display: block; border: 0; } .logo:hover { padding: 0; margin: 0; background: #000 url("logo.png"); background-size: 160px 160px; filter: grayscale(100%) contrast(180%); } .date { margin-bottom: 30px; margin-top: -8px; display: inline-block; } .date_updated { display:inline-block; padding-left: 30px; font-style: italic; } body { margin: 60px auto; margin-bottom: 160px; max-width:750px; line-height:1.6; font-family: 'Inconsolata', monospace; color:#000; padding:0 30px; } pre, blockquote { font-family: 'Inconsolata', monospace; border-left: 1px solid #000; padding: 6px 18px; } code { font-family: 'Inconsolata', monospace; padding: 1px 4px; border: 1px solid #000; } h1,h2,h3 { line-height:1.2; padding-top: 18px; margin-bottom: 10px; font-weight: normal; } a { color: #000; text-decoration: none; border-bottom: 1px solid #000; } a:hover { padding: 4px; margin: -4px; color: #fff; background: #000; text-decoration: none; } ol { counter-reset: ol-counter; list-style: none; margin: 0; padding: 0; } ol li:before { counter-increment: ol-counter; content: "0" counter(ol-counter) ". "; } ul { list-style: none; margin: 0; padding-left: 19px; } ul li:before { content: "- "; margin-left: -19px; width: 19px; display: inline-block; } img { margin: 0 auto; display: block; }
</style></head><body><meta name="viewport" content="width=device-width, initial-scale=0.45">
<title>rxi</title>
<script>
document.body.onload = function() {
var title = document.getElementsByClassName("title")[0];
if (title) { document.title = title.innerText; }
var sections = document.getElementsByClassName("sections")[0];
if (sections) {
var elems = document.getElementsByTagName("h2");
var html = "<ol>";
for (var e of elems) {
e.id = e.innerText.replace(" ", "-");
html += "<li>";
html += "<a href='#" + e.id + "'>" + e.innerText + "</a>";
}
html += "</ol>";
sections.innerHTML = html;
}
};
</script>
<a class="logo" href="index.html"></a>
<h1 class="title">
Cached Software Rendering
</h1><div class="date">
2020.04.02
</div><div class="date_updated"> updated:
2020.04.03
</div>
<div class="sections"></div>
<p><h2>Introduction
</h2>This write-up details an approach to 2d software rendering useful for typical
application UIs where between most frames little-or-none of the screen's content
changes. This approach is used in the <a href="https://github.com/rxi/lite">lite</a> text
editor.
<p>The approach provides the performance advantages of dirty rectangles but
abstracts any of the mental overhead or code complexity usually associated with
them — the application's code can act as if it is redrawing everything each
frame while the renderer takes care of only redrawing regions of the screen
which have changed. This leads to <em>simpler</em>, less error prone application code
without the waste of CPU time which would result from continually redrawing.
<p><p><h2>Overview
</h2>The approach exists in 3 parts: the <code>Command Buffer</code>, <code>Hash Grid</code> and
<code>Renderer</code>. For each frame where the application would typically draw, it
instead pushes "draw commands" to a command buffer; at the end of the frame the
command buffer is iterated and the commands themselves are added to the hash
grid. Finally each cell of the hash grid is compared with the previous frame's
and the regions for the cells which have changed are redrawn.
<p><p><h2>Command Buffer
</h2>The command buffer is a simple byte array which each draw command is pushed to:
<pre>char command_buf[1024*1024];
int command_buf_idx;
</pre>A draw command must contain a rectangle representing the area of the screen
which the draw operation would effect when performed (this is used later by the
hash grid), as well as any additional information required to draw the item (eg.
an image pointer, color, or string for text rendering). A base command at
minimum might be represented by the following struct:
<pre>typedef struct { int x, y, w, h; } Rect;
typedef struct {
int type, size;
Rect rect;
} Command;
</pre>This would be used as a base for any other commands which would extend it with
additional information they require:
<pre>typedef struct { unsigned char r, g, b, a; } Color;
typedef struct {
Command cmd;
Color color;
} RectCommand;
</pre>Whenever we call the <code>push_rectangle</code> function we would use the arguments passed
it to fill the <code>RectCommand</code> struct and push it to the renderer's command
buffer; no actual drawing would be done at this point:
<pre>void push_rectangle(Rect r, Color clr) {
RectCommand *c = (void*) &command_buf[command_buf_idx];
c->cmd.type = RECT_COMMAND;
c->cmd.size = sizeof(*c);
c->cmd.rect = r;
c->color = clr;
command_buf_idx += c->cmd.size;
}
</pre>We would do the same for any other draw commands which occur in the frame.
<p><p><h2>Hash Grid
</h2>The <code>Hash Grid</code> is a 2d grid of cells where each cell is represented by a hash
value <em>(a 32bit unsigned integer and simple hash function such as FNV-1a should
suffice)</em> which is associated with an NxN pixel region of the screen. There must
be enough cells to account for the whole screen, eg. a screen resolution of
1920x1080 and cell size of 100x100 pixels would require a 20x11 grid of cells. A
second hash grid of the same size is also stored as we'll need to compare the
current hash grid with the one of the previous frame.
<pre>unsigned cell_buf1[CELLS_X * CELLS_Y];
unsigned cell_buf2[CELLS_X * CELLS_Y];
unsigned *cells = cell_buf1;
unsigned *cells_prev = cell_buf2;
</pre><p>At the end of each frame when the command buffer has been filled by the
application's code, we reset the hash grid to an initial state; that is,
setting each cell's hash value to the initial value used by our hash function.
<pre>for (int i = 0; i < CELLS_X * CELLS_Y; i++) {
cells[i] = HASH_INITIAL;
}
</pre><p>We then iterate the buffer of commands and for each one we hash the command
itself. Each hash value for the cells of the hash grid which the command's
<code>rect</code> overlaps would then be updated with the command's hash value:
<pre>Command *cmd = (void*) command_buf;
Command *end = (void*) &command_buf[command_buf_idx];
while (cmd != end) {
unsigned h = HASH_INITIAL;
hash(&h, cmd, cmd->size);
update_overlapping_cells(cmd->rect, h);
cmd = (void*) (((char*) cmd) + cmd->size);
}
</pre>With the implementation of <code>update_overlapping_cells</code>:
<pre>void update_overlapping_cells(Rect r, unsigned h) {
int x1 = r.x / CELL_SIZE;
int y1 = r.y / CELL_SIZE;
int x2 = (r.x + r.w) / CELL_SIZE;
int y2 = (r.y + r.h) / CELL_SIZE;
for (int y = y1; y <= y2; y++) {
for (int x = x1; x <= x2; x++) {
hash(&cells[x + y * CELLS_X], &h, sizeof(h));
}
}
}
</pre><p><h2>Rendering
</h2>Once all the commands have been iterated the hash grid is complete. We then
iterate each cell of the hash grid and compare it to the previous frame's
value, if the cells differ we need to redraw the region of the screen that the
cell represents.
<p>For each changed region we set our renderer to clip to that region and iterate
the draw commands again, for every command where the rect overlaps that cell we
perform the command's draw operation:
<pre>for (int y = 0; y < CELLS_Y; y++) {
for (int x = 0; x < CELLS_X; x++) {
int idx = x + y * CELLS_X;
if (cells[idx] != cells_prev[idx]) {
Rect rect = { x * CELL_SIZE, y * CELL_SIZE, CELL_SIZE, CELL_SIZE };
redraw_region(rect);
}
}
}
</pre><p>As a simple optimisation each changed cell shouldn't draw immediately, but
instead an effort should be made to merge the rectangle regions for adjacent
cells into larger rectangle regions which are then redrawn.
<p>Once we have finished redrawing the changed regions we swap the current frame's
hash grid with the previous frame's (such that next frame we would be
comparing with this frame's state) and reset the command buffer index to the
start of the buffer:
<pre>unsigned *tmp = cells_prev;
cells_prev = cells;
cells = tmp;
command_buf_idx = 0;
</pre><p>Although the approach is aimed towards software rendering, it is not tied to it
specifically and can be used to minimize redrawing when using hardware
rendering; though it's questionable in reality how many situations would be
improved by the approach when using hardware rendering.
<p><p><h2>Conclusion
</h2>The result of the approach means that rendering each frame is reduced to pushing
a series of commands to a linear buffer and performing a fast hash function. For
a typical UI most frames will result in nothing being redrawn, and most redraws
will do so to only a small portion of the screen. Consider the difference in
redrawing a transparent 512x512 rectangle which would require blending 262,144
pixels (1,048,576 bytes), versus hashing 180 bytes of data (the 40 byte command
and 36 hash grid cells; <em>note: the number of cells can be reduced by
increasing the cell size if you're typically drawing larger elements</em>). The more
costly the drawing operation would have been the more benefit the approach
yields.
<p><p></body></html>