Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tsdbrelay: denormalization backfill app #1076

Merged
merged 1 commit into from
Jun 22, 2015
Merged

tsdbrelay: denormalization backfill app #1076

merged 1 commit into from
Jun 22, 2015

Conversation

captncraig
Copy link
Contributor

Review on Reviewable

@maddyblue
Copy link
Contributor

Review status: 0 of 8 files reviewed at latest revision, 9 unresolved discussions, all commit checks successful.


cmd/tsdbrelay/denormalize/backfill/main.go, line 1 [r1] (raw file):
Why isn't this program in the cmd directory with all the other commands?


cmd/tsdbrelay/denormalize/backfill/main.go, line 15 [r1] (raw file):
Why do we need q at all? Can't we infer this from ruleFlag completely?


cmd/tsdbrelay/denormalize/backfill/main.go, line 16 [r1] (raw file):
I'd recommend YYYY-MM-DD here instead of slashes. Dashes are generally used if year is first, slashes if year is last. Maybe because that's OpenTSDB's date format?


cmd/tsdbrelay/denormalize/backfill/main.go, line 59 [r3] (raw file):
I think this can be omitted because it doesn't tell the user anything they didn't know.


cmd/tsdbrelay/denormalize/backfill/main.go, line 63 [r3] (raw file):
What happens if the queue length is too small?


cmd/tsdbrelay/denormalize/backfill/main.go, line 68 [r3] (raw file):
This URL should be specified on the command line.


cmd/tsdbrelay/denormalize/backfill/main.go, line 109 [r3] (raw file):
Why this instead of sending it ourselves? Because of the retry logic in collect, or some other reason?

tsdbrelay just ignores failed denormalize requests:

Maybe it should have some retry logic too?


cmd/tsdbrelay/denormalize/backfill/main.go, line 128 [r3] (raw file):
This could cause the backfill process to exit before all the datapoints have been sent. See:

sending := queue[:i]

Points are removed from the queue and then the send attempt is made. If it fails, then they are requeued. If we are only checking to see if the queue length is 0, then we could exit while there are points in-flight. This could cause the request to not be sent (since the entire program exits), or a failed batch to not be requeued.


collect/collect.go, line 116 [r3] (raw file):
I do not like that we start 3 goroutines and then do a check that can return an error. Do all possible things that can return an error before starting the goroutines.


Comments from the review on Reviewable.io

@captncraig
Copy link
Contributor Author

Review status: 0 of 8 files reviewed at latest revision, 8 unresolved discussions, all commit checks successful.


cmd/tsdbrelay/denormalize/backfill/main.go, line 1 [r1] (raw file):
just for naming I liked it being under the denormalize namespace. Also didn't want to overpopulate the cmd namespace. But I think you are right.


cmd/tsdbrelay/denormalize/backfill/main.go, line 17 [r1] (raw file):
the rule might be os.net.bytes__host but we would still want to query os.net.bytes{host=*,direction=*,interface=*} to get all points. I was thinking of auto-divining all of the aggregate tags, but it seemed easier in the end to just make the query explicit.


cmd/tsdbrelay/denormalize/backfill/main.go, line 18 [r1] (raw file):
This is the opentsdb format.


cmd/tsdbrelay/denormalize/backfill/main.go, line 63 [r3] (raw file):
If the queue overflows, we will drop stuff. That would not be good. Perhaps I should check the queue between time intervals to be safe.


cmd/tsdbrelay/denormalize/backfill/main.go, line 68 [r3] (raw file):
Oops.


cmd/tsdbrelay/denormalize/backfill/main.go, line 109 [r3] (raw file):
It was an attempt to not duplicate the gzip, json code once again. Maybe I should just centralize the "send a bunch of data points to opentsdb" bit in the collect package.


cmd/tsdbrelay/denormalize/backfill/main.go, line 128 [r3] (raw file):
using collect was an attempt to save code, but I agree I probably don't want all of the baggage. I'll make the relevant bits reusable instead of using the entire collect pipeline.


collect/collect.go, line 116 [r3] (raw file):
true. I'll revert this. The reasoning was if you are not sending self metrics, the root doesn't matter.


Comments from the review on Reviewable.io

@maddyblue
Copy link
Contributor

Review status: 0 of 8 files reviewed at latest revision, 8 unresolved discussions, all commit checks successful.


cmd/tsdbrelay/denormalize/backfill/main.go, line 17 [r1] (raw file):
Making the query explicit allows a user to accidentally do something wrong (i.e., they didn't know about the interface tag and now their backfill is useless since all their incoming data has it). OpenTSDB reports back the aggregate tags, so we can use that to make the correct query. I think it's worth the work.


Comments from the review on Reviewable.io

@maddyblue
Copy link
Contributor

Review status: 0 of 9 files reviewed at latest revision, 13 unresolved discussions, all commit checks successful.


cmd/backfill/main.go, line 1 [r4] (raw file):
Package comments should start with the name of the program: "Backfill denormalizes historic OpenTSDB data.".


cmd/backfill/main.go, line 2 [r4] (raw file):
This should have a newline above it to prevent this sentence from appearing in the godoc index.


cmd/backfill/main.go, line 21 [r4] (raw file):
opentsdb -> OpenTSDB


cmd/backfill/main.go, line 22 [r4] (raw file):
OpenTSDB


cmd/backfill/main.go, line 52 [r4] (raw file):
can remove this debug


cmd/backfill/main.go, line 112 [r4] (raw file):
Isn't 204 expected, not 200?


cmd/backfill/main.go, line 139 [r4] (raw file):
return err


cmd/backfill/main.go, line 142 [r4] (raw file):
return err


collect/collect.go, line 107 [r4] (raw file):
Why did this move?


collect/collect.go, line 109 [r4] (raw file):
I don't understand why this block moved.


collect/collect.go, line 120 [r4] (raw file):
Why the extra newline?


Comments from the review on Reviewable.io

@captncraig
Copy link
Contributor Author

Review status: 0 of 9 files reviewed at latest revision, 10 unresolved discussions, all commit checks successful.


cmd/backfill/main.go, line 1 [r4] (raw file):
Done.


cmd/backfill/main.go, line 2 [r4] (raw file):
Done.


cmd/backfill/main.go, line 21 [r4] (raw file):
Done.


cmd/backfill/main.go, line 22 [r4] (raw file):
Done.


cmd/backfill/main.go, line 52 [r4] (raw file):
Done.


cmd/backfill/main.go, line 112 [r4] (raw file):
from http://opentsdb.net/docs/build/html/api_http/query/index.html it looks like 200 is expected. That is what I get from devbosun anyway. Should I look for 2xx instead?


cmd/backfill/main.go, line 139 [r4] (raw file):
Done.


cmd/backfill/main.go, line 142 [r4] (raw file):
Done.


collect/collect.go, line 107 [r4] (raw file):
replaced


collect/queue.go, line 133 [r4] (raw file):
Good point.


Comments from the review on Reviewable.io

@maddyblue
Copy link
Contributor

Reviewed 5 of 6 files at r1, 3 of 3 files at r5.
Review status: all files reviewed at latest revision, 3 unresolved discussions, all commit checks successful.


cmd/backfill/main.go, line 113 [r5] (raw file):
no need for the else here since you have a return


cmd/backfill/main.go, line 118 [r4] (raw file):
Sorry, you're right. 200 is correct for queries, 204 is for /api/put.


Comments from the review on Reviewable.io

@maddyblue
Copy link
Contributor

Reviewed 1 of 1 files at r6, 1 of 1 files at r7.
Review status: all files reviewed at latest revision, 3 unresolved discussions, all commit checks successful.


collect/queue.go, line 117 [r7] (raw file):
I think the current changes will work. However, if it were me I'd just copy this function into the backfill program so we didn't have to change this file at all. Here's the reason: this function is only about 20 lines, and it's pretty simple. Yet it required about 40 other lines to change in order for the refactor to work. Just not worth it to me. Your choice.


Comments from the review on Reviewable.io

@maddyblue
Copy link
Contributor

LGTM with the caveat I posted below.


Review status: all files reviewed at latest revision, 3 unresolved discussions, all commit checks successful.


Comments from the review on Reviewable.io

@captncraig captncraig force-pushed the backfill branch 2 times, most recently from f1a0af0 to d5c62a8 Compare June 22, 2015 15:42
@captncraig
Copy link
Contributor Author

@mjibson I'm gonna merge as is. Most of the changes to collect were to change queue to carry datapoints instead of raw messages. This seems cosmetic now, but will be necessary in the future if scollector is to target more backends.

@captncraig captncraig changed the title WIP - denormalization backfill app tsdbrelay: denormalization backfill app Jun 22, 2015
captncraig added a commit that referenced this pull request Jun 22, 2015
tsdbrelay: denormalization backfill app
@captncraig captncraig merged commit d282960 into master Jun 22, 2015
@captncraig captncraig deleted the backfill branch June 22, 2015 15:45
@maddyblue
Copy link
Contributor

Agreed. Will be useful for multiply back ends.

On Mon, Jun 22, 2015, 11:44 AM Craig Peterson notifications@github.com
wrote:

@mjibson https://github.com/mjibson I'm gonna merge as is. Most of the
changes to collect were to change queue to carry datapoints instead of raw
messages. This seems cosmetic now, but will be necessary in the future if
scollector is to target more backends.


Reply to this email directly or view it on GitHub
#1076 (comment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants