Monitor batch write #22
Conversation
Force-pushed from e85fca2 to 895f8da
Force-pushed from 895f8da to 3bebb6a
Generally fine, but a few problems need to be handled:
- As we discussed, we won't merge Fix parse monitor URL #21, so this PR shouldn't be based on the changes from Fix parse monitor URL #21.
- We need a way to flush the data in the queue before the goroutine is cancelled; currently, the queued data is lost once the goroutine is cancelled.
(I don't know how to do this yet; maybe Go has a way for a goroutine to know it is being cancelled.)
Maybe we can do some cleanup work when we receive the SIGTERM signal; this needs some research. A simple workaround: wait more than 1m (or set a smaller batch interval) before closing the pod.
@qi-zhou Please confirm whether K8s allows this waiting time.
I suspect using a shared array and a mutex could cause "starvation" when you have high traffic.
Only the goroutines from the HTTP requests may acquire the lock, and the goroutine to write the batched requests may never get the lock.
Remember the Go "proverb":
Do not communicate by sharing memory; instead, share memory by communicating.
(https://blog.golang.org/share-memory-by-communicating)
- Create an input channel for events, instead of the shared array + lock.
- Each request writes an event to the input channel.
- The batching goroutine reads events from the channel and adds them to an array.
- Either after the timeout or after you reach some large batch size, you can send the batch of points to InfluxDB.
Benefits:
- You don't need a shared array, because the buffer is private to the batching goroutine.
- Because you don't have a shared array, you don't need the mutex.
- It will be much easier to retry when sending data to InfluxDB fails for a 5xx reason.
- You can control the timeout and the size of the buffer that you send to InfluxDB. Here's a reference
Oops. Review submitted early by accident, sorry!
What I was going to say was: Here's an (older) reference that says 5-10k events per request is a good amount: https://community.influxdata.com/t/what-is-the-highest-performance-method-of-getting-data-in-out-of-influxdb/464
move to #25
solve #13