devdocs/expr.md (+3 −3)
@@ -12,7 +12,7 @@
 Such functions require special options.
 see https://github.com/grafana/metrictank/issues/926#issuecomment-559596384
 
-## implement our copy-o-write approach when dealing with modifying series
+## implement our copy-on-write approach when dealing with modifying series
 
 See section 'Considerations around Series changes and reuse and why we chose copy-on-write' below.
@@ -27,7 +27,7 @@ example: an averageSeries() of 3 series:
 * will create an output series value.
 * it will use a new datapoints slice, retrieved from the pool, because the points will be different. It will also allocate a new meta section and tags map, because those too differ from the input series.
 * won't put the 3 inputs back in the pool or cache, because whoever allocated the input series is responsible for doing that. We should not add the same arrays to the pool multiple times.
-* It will however store the newly created series into the cache such that that during plan cleanup time, the series' datapoints slice will be moved back to the pool.
+* It will however store the newly created series into the cache such that during plan cleanup time, the series' datapoints slice will be moved back to the pool.
 
 # Considerations around Series changes and reuse and why we chose copy-on-write.
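As a concrete illustration of the bullets above, here is a minimal copy-on-write sketch. `Point`, `Series`, `pointPool` and `dataMap` are hypothetical stand-ins, not metrictank's actual types or API:

```go
// Minimal sketch of the copy-on-write pattern described above.
// All names here are illustrative, not metrictank's real code.
package sketch

import "sync"

type Point struct {
	Val float64
	Ts  uint32
}

type Series struct {
	Target     string
	Datapoints []Point
	Tags       map[string]string
}

var pointPool = sync.Pool{
	New: func() interface{} { return make([]Point, 0, 1024) },
}

// averageSeries reads its inputs but never mutates them (copy-on-write).
// dataMap collects every series this function allocates, so that plan
// cleanup can return each backing slice to the pool exactly once.
func averageSeries(in []Series, dataMap *[]Series) Series {
	points := pointPool.Get().([]Point)[:0] // fresh slice from the pool
	for i := range in[0].Datapoints {       // assumes inputs share interval and length
		sum := 0.0
		for _, s := range in {
			sum += s.Datapoints[i].Val
		}
		points = append(points, Point{Val: sum / float64(len(in)), Ts: in[0].Datapoints[i].Ts})
	}
	out := Series{
		Target:     "averageSeries(...)",
		Datapoints: points,                                       // new slice
		Tags:       map[string]string{"aggregatedBy": "average"}, // new map
	}
	// register only the output; the inputs belong to whoever allocated them
	*dataMap = append(*dataMap, out)
	return out
}
```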
@@ -72,7 +72,7 @@ for now we assume that multi-steps in a row is not that common, and COW seems mo
 
 This leaves the problem of effectively managing allocations and using a sync.Pool.
-Note that the expr library can be called by different clients. At this point only Metrictank uses it, but we intend this lirbrary to be easily embeddable in other programs.
+Note that the expr library can be called by different clients. At this point only Metrictank uses it, but we intend this library to be easily embeddable in other programs.
 It's up to the client to instantiate the pool, and set up the default allocation to return point slices of the desired point capacity.
 The client can then of course use this pool to store series, which it then feeds to expr.
 The expr library does the rest: it manages the series/point slices and gets new ones as a basis for the COW.
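What that client-side setup could look like, as a hedged sketch (`NewPointSlicePool` and the `Point` type are hypothetical helpers, not expr's real API):

```go
// Illustrative client-side setup: the embedding program owns the pool
// and picks the default capacity of the point slices it hands out.
package sketch

import "sync"

type Point struct {
	Val float64
	Ts  uint32
}

// NewPointSlicePool returns a pool whose default allocation is a point
// slice with whatever capacity the client expects a typical series to need.
func NewPointSlicePool(defaultCap int) *sync.Pool {
	return &sync.Pool{
		New: func() interface{} { return make([]Point, 0, defaultCap) },
	}
}

func example() {
	pool := NewPointSlicePool(2000) // e.g. sized for typical render windows
	buf := pool.Get().([]Point)[:0]
	// fill buf with fetched datapoints, build a series, hand it to expr;
	// expr (via plan cleanup) eventually returns slices to the pool.
	pool.Put(buf[:0])
}
```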
docs/render-path.md (+13 −7)
@@ -113,7 +113,7 @@ First, let's look at some definitions.
 Certain functions will return output series in an interval different from the input interval.
 For example summarize() and smartSummarize(). We refer to these as IA-functions below.
 In principle we can predict what the output interval will be during the plan phase, because we can parse the function arguments.
-However, for simplicty, we don't implement this and treat all IA functions as functions that may change the interval of series in unpredicatable ways.
+However, for simplicity, we don't implement this and treat all IA functions as functions that may change the interval of series in unpredictable ways.
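To illustrate why the output interval is predictable in principle, here is a sketch of plan-phase prediction for a summarize-style argument. This is purely hypothetical, since, as stated above, this is deliberately not implemented:

```go
// Purely illustrative: for a summarize(series, "1h") call, the output
// interval follows directly from the parsed argument, whatever the
// input interval is.
package sketch

import "time"

// predictedInterval returns the output interval (in seconds) an
// IA-function like summarize would produce, derived from its argument.
// (Graphite-style strings such as "1hour" would need their own parser;
// Go duration syntax keeps the sketch small.)
func predictedInterval(bucket string) (uint32, error) {
	d, err := time.ParseDuration(bucket)
	if err != nil {
		return 0, err
	}
	return uint32(d / time.Second), nil // e.g. "1h" -> 3600
}
```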
 
 ### Transparent aggregation
@@ -133,9 +133,16 @@ Generally, if series have different intervals, they can keep those and we return
 However, when data will be used together (e.g. aggregating multiple series together, or certain functions like divideSeries, asPercent, etc) they will need to have the same interval.
 An aggregation can be opaque or transparent as defined above.
 
-Pre-normalizing is when we can safely - during planning - set up normalization to happen right after fetching (or better: set up the fetch parameters such that normalizing is not needed) and wen we know the normalization won't affect anything else.
-This is the case when series go from fetching to transparent aggregation, possibly with some processing functions - except opaque aggregation(s) or IA-function(s) - in between, and
-with asPercent in a certain mode (where it has to normalize all inputs), but not with divideSeries where it applies the same divisor to multiple dividend inputs, for example.
+Pre-normalizing is when we can safely - during planning - set up normalization to happen right after fetching (or better: set up the fetch parameters such that normalizing is not needed) and when we know the normalization won't affect anything else.
+
+This is the case when series go from fetching to a processing function like:
+* a transparent aggregation
+* asPercent in a certain mode (where it has to normalize all inputs)
+
+possibly with some processing functions in between the fetching and the above function, except opaque aggregation(s) or IA-function(s).
+
+Some functions also have to normalize (some of) their inputs, yet cannot have their inputs pre-normalized. For example,
+divideSeries, because it applies the same divisor to multiple distinct dividend inputs (of possibly different intervals).
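Since normalization is central to this section, here is a hedged sketch of what it involves: consolidating series onto the LCM of their intervals. Types and names are illustrative, not metrictank's actual code:

```go
// Illustrative sketch of interval normalization: bring series onto a
// common interval (the LCM of their intervals) by consolidating groups
// of consecutive points.
package sketch

type Point struct {
	Val float64
	Ts  uint32
}

func gcd(a, b uint32) uint32 {
	for b != 0 {
		a, b = b, a%b
	}
	return a
}

// lcm gives the smallest interval both series can be consolidated to.
func lcm(a, b uint32) uint32 { return a / gcd(a, b) * b }

// normalize consolidates points so the series' interval goes from
// `from` to `to` (with to a multiple of from), averaging each group.
func normalize(points []Point, from, to uint32) []Point {
	group := int(to / from)
	out := make([]Point, 0, len(points)/group+1)
	for i := 0; i < len(points); i += group {
		end := i + group
		if end > len(points) {
			end = len(points) // a real implementation pads with null points here
		}
		sum := 0.0
		for _, p := range points[i:end] {
			sum += p.Val
		}
		out = append(out, Point{Val: sum / float64(end-i), Ts: points[end-1].Ts})
	}
	return out
}
```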
 
 For example if we have these schemas:
 ```
@@ -152,13 +159,12 @@ Likewise, if the query is `groupByNode(group(A,B), 2, callback='sum')` we cannot
 
 Benefits of this optimization:
 1) less work spent consolidating at runtime, less data to fetch
-2) it assures data will be fetched in a pre-canonical way. If we don't set up normalization for fetching, data may not be pre-canonical, such that
+2) it assures data will be fetched in a pre-canonical way. If we don't set up normalization for fetching, data may not be pre-canonical, which means we may have to add null points to normalize it to canonical data, lowering the accuracy of the first or last point.
 3) pre-normalized data reduces a request's chance of breaching max-points-per-req-soft and thus makes it less likely that other data that should be high-resolution gets fetched in a coarser way.
 when it eventually needs to be normalized at runtime, points at the beginning or end of the series may be less accurate.
 
 Downsides of this optimization:
-1) if you already have the raw data cached, and the rollup data is not cached yet, it may result in a slower query. But this is an edge case
-2) uses slightly more of the chunk cache.
+1) if you already have the raw data cached, and the rollup data is not cached yet, it may result in a slower query, and you'd use slightly more chunk cache after the fetch. But this is an edge case.
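To make the accuracy caveat in benefit 2 concrete, a small worked example with invented values:

```go
// Invented numbers, for illustration only: why padding a pre-canonical
// series with null points lowers the accuracy of the first (or last) point.
package sketch

// avg averages the real points a consolidation bucket contains; a padded
// null contributes nothing, so short buckets average fewer real values.
func avg(vals []float64) float64 {
	sum := 0.0
	for _, v := range vals {
		sum += v
	}
	return sum / float64(len(vals))
}

// Canonical fetch at a 10s interval, consolidated to 30s: the first
// bucket holds three real points  -> avg([]float64{1, 2, 3}) == 2.
// Pre-canonical fetch missing that bucket's first point: only two real
// points remain -> avg([]float64{2, 3}) == 2.5, a shifted first point.
```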
0 commit comments