-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor raintank/schema into a v2 version #15
Conversation
woodsaj
commented
Feb 22, 2018
•
edited
Loading
edited
- reuse the same MetricMetadata struct in MetricData and MetricDefinition
- remove "Metric" field from the MetricMetadata
- Update SetId() to use the Name field instead of Metric.
- update MetricData to be a combo of MetricMetadata and a Point
- Add a MetricPoint struct, which is just the metricId + a Point
- Remove non-zero interval requirement in validation.
- add MetricWithMetadata and DataPoint interfaces
- MetricData and MetricDefinition both implement MetricWithMetaData
- MetricData and MetricPoint both implement DataPoint
- Add new msg.Formats for FormatMetricDataArrayJson and FormatMetricDataArrayMsgp
- add new msg.Formats for FormatMetricData and FormatMetricPoint which are for messages sent to and received from kafka
- Add EncodeDataPoint and DecodeDatapoint functions
- Add EncodeMetricDataArray and DecodeMetricDataArray which are backwards compatible for schema.v1 payloads
- use a bufferPool for SetId as a byte slice is used to get the md5sum then discarded.
metric.go
Outdated
} | ||
|
||
// returns a id (hash key) in the format OrgId.md5Sum | ||
// the md5sum is a hash of the the concatination of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"concatenation"
metric.go
Outdated
buffer.WriteString(m.Unit) | ||
buffer.WriteByte(0) | ||
buffer.WriteString(m.Mtype) | ||
buffer.WriteByte(0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know the old version before this patch did the same, but anyway: Isn't that one buffer.WriteByte(0)
too many? If there are no tags then the last char of buffer
is 0
, if there are >0
tags then there's a double-0
before the first tag.
I guess we can't change that, otherwise that would lead to trouble because the IDs of existing metrics would change with the update.
metric_test.go
Outdated
Interval: 15, | ||
Unit: "gauge", | ||
Mtype: "ms", | ||
Tags: []string{"key1:val1", "key2:val2"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tag format is <key>=<value>
metric_test.go
Outdated
Interval: 15, | ||
Unit: "gauge", | ||
Mtype: "ms", | ||
Tags: []string{"key1:val1", "key2:val2"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tag format is <key>=<value>
msg/msg.go
Outdated
@@ -91,3 +40,136 @@ func CreateMsg(metrics []*schema.MetricData, id int64, version Format) ([]byte, | |||
} | |||
return buf.Bytes(), nil | |||
} | |||
|
|||
func DecodeMetricDataArray(b []byte) ([]*schema.MetricData, error) { | |||
if len(b) < 9 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a comment where the number 9
comes from would be helpful
msg/msg.go
Outdated
} | ||
return p, nil | ||
default: | ||
// if format is not known, lets assume that this is a MetricData message with no version byte |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by handling it like this, isn't there a risk that we miss a potential error case and end up inserting garbage data into the backend store because the received metrics were somehow corrupted and we didn't notice it due to this default
case?
If this default
case would just do nothing and error we'd notice corrupted input faster probably
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Messages currently being sent into kafka dont have a format byte. So we have no choice other then to treat messages like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the only new message type is the compact payload i proposed in grafana/metrictank#849, then we can also rely on the length of the message for additional confidence. (the proposed payload is exactly 28 bytes)
metric.go
Outdated
|
||
// returns a id (hash key) in the format OrgId.md5Sum | ||
// the md5sum is a hash of the the concatination of the | ||
// metric + each tag key:value pair (in metrics2.0 sense, so also fields), sorted alphabetically. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
key=value
- reuse the same MetricMetadata struct in MetricData and MetricDefinition - remove "Metric" field from the MetricMetadata - Update SetId() to use the Name field instead of Metric. - update MetricData to be a combo of MetricMetadata and a Point - Add a MetricPoint struct, which is just the metricId + a Point - Remove non-zero interval requirement in validation. - add MetricWithMetadata and DataPoint interfaces - MetricData and MetricDefinition both implement MetricWithMetaData - MetricData and MetricPoint both implement DataPoint - Add new msg.Formats for FormatMetricDataArrayJson and FormatMetricDataArrayMsgp - add new msg.Formats for FormatMetricData and FormatMetricPoint which are for messages sent to and received from kafka - Add EncodeDataPoint and DecodeDatapoint functions - Add EncodeMetricDataArray and DecodeMetricDataArray which are backwards compatible for schema.v1 payloads - use a bufferPool for SetId as a byte slice is used to get the md5sum then discarded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the embedding of different structs within one another. that seems useful. i need to look into the format stuff a bit more
|
||
type BufferPool struct { | ||
pool sync.Pool | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the purpose of embedding the sync.Pool as opposed to simply type aliasing a BufferPool to sync.Pool, or even skip the BufferPool type and have NewBufferPool return a *sync.Pool
?
if m.OrgId == 0 { | ||
return errInvalidOrgIdzero | ||
} | ||
if m.Interval == 0 { | ||
return errInvalidIntervalzero | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have now decided to not support interval autodetection any time soon, so we can add this validation back
b = append(b, []byte(m.Name)...) | ||
return b | ||
type MetricPoint struct { | ||
Id string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we agreed in #199 that Id's should be stored as a [16]byte
(and a uint32
for orgId when needed)
buffer.WriteByte(0) | ||
buffer.WriteString(m.Unit) | ||
buffer.WriteByte(0) | ||
buffer.WriteString(m.Mtype) | ||
buffer.WriteByte(0) | ||
fmt.Fprintf(buffer, "%d", m.Interval) | ||
|
||
for _, t := range m.Tags { | ||
if len(t) >= 5 && t[:5] == "name=" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why remove this clause?
huh? these formats exist also in master. am I missing something? big picture though, we should stop using messagepack. it takes too much space. I thought we established that in the discussions of #199 and I think grafana/metrictank#849 (comment) confirms it. |