metric name consistent hashing distribution #31
Can you share exactly how you are executing statsrelay, with all the
arguments and options?
The jump hash usually distributes items very evenly across the hash
ring. Let's check for simple problems first.
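One simple check is to hash a batch of metric names that share the long prefix and count how many land in each bucket. The sketch below is only an approximation of what a relay might do: `jump_hash` follows the published jump consistent hash algorithm, but the FNV-1a key hash, the helper names, and the synthetic metric names are all assumptions, not taken from the statsrelay source.

```python
from collections import Counter

MASK64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data: bytes) -> int:
    """64-bit FNV-1a. Using it as the key hash is an assumption."""
    h = 0xCBF29CE484222325
    for byte in data:
        h ^= byte
        h = (h * 0x100000001B3) & MASK64
    return h


def jump_hash(key: int, num_buckets: int) -> int:
    """Jump consistent hash: map a 64-bit key to a bucket in [0, num_buckets)."""
    b, j = -1, 0
    while j < num_buckets:
        b = j
        key = (key * 2862933555777941757 + 1) & MASK64
        j = int((b + 1) * ((1 << 31) / ((key >> 33) + 1)))
    return b


def distribution(names, num_buckets):
    """Count how many metric names fall into each bucket."""
    return Counter(jump_hash(fnv1a_64(n.encode()), num_buckets) for n in names)


# Synthetic names that all share the prefix from the report (hypothetical data).
PREFIX = "env.analyze.document_analyzer.analyzeDocument.entities"
names = [
    f"{PREFIX}.InternetDomainName.source.provider{p}.doc{d}.occurrences"
    for p in range(50)
    for d in range(100)
]

counts = distribution(names, 5)
for bucket in sorted(counts):
    print(bucket, counts[bucket])
```

If the counts from real metric names come out roughly even in a check like this, the skew is more likely caused by a few very high-volume metric names than by the shared prefix.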
Jack
…On Wed, Jul 28, 2021, 04:51 Anatoliy D. ***@***.***> wrote:
Hi,
We are having an issue with how statsrelay distributes metrics between
statsite hosts. We run 4 statsrelay servers and 5 statsite servers at the
backend. All are configured identically.
It appears that one statsite node always receives a higher volume of
metrics; look at is-1941b:
[image: statsrelay]
<https://user-images.githubusercontent.com/1822261/127290718-0cd480ce-897e-49fc-86ae-7f537c0726a5.png>
We tried multiple INSTANCE values (in 'HOST:PORT:INSTANCE'), but it had
no visible effect.
Our current idea is that consistent hashing is not distributing metrics
evenly.
We sometimes use very long metric names, for example:
env.analyze.document_analyzer.analyzeDocument.entities.InternetDomainName.source.DataProviderNameHERE.DocumentNameHashHere.occurrences
In our case, it looks like all metric names prefixed with
env.analyze.document_analyzer.analyzeDocument.entities are forwarded to
the same single statsite backend. That is not good; we would expect those
metric names to be distributed across different statsite backends.
Any ideas on how to solve this?
|
Adding extra buckets to the hash ring will definitely help as you
demonstrated. The "instance" part of each bucket can be used to help add
some randomness as well. I like to use a UUID here. Example:
* is-1935b:8125:b0c4b6ea-1746-4423-984b-e984748b9a33
* is-1939b:8125:b3bc5476-e049-4bd6-8a34-8439c4d86406
* is-1940b:8125:21758c10-95d8-4238-90d3-68d4a290245c
Is it possible you are using an older version of StatsRelay?
In any case, the solution you proposed looks perfectly acceptable. I have
used multiple aliases before for hash-ring-based distribution.
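As a rough sketch of how such a backend list could be generated rather than typed by hand, the snippet below builds several `HOST:PORT:INSTANCE` aliases per host. The helper name `ring_entries` is hypothetical, and using `uuid.uuid5` (deterministic) instead of random UUIDs is my own choice here, so that every relay ends up with an identical list.

```python
import uuid


def ring_entries(hosts, port, aliases_per_host):
    """Build HOST:PORT:INSTANCE strings, with several aliases per host."""
    entries = []
    for alias in range(aliases_per_host):
        for host in hosts:
            # uuid5 is deterministic: the same host and alias index always
            # produce the same instance ID, whichever machine runs this.
            instance = uuid.uuid5(uuid.NAMESPACE_DNS, f"{host}/{alias}")
            entries.append(f"{host}:{port}:{instance}")
    return entries


hosts = ["is-1935b", "is-1939b", "is-1940b", "is-1941b", "is-1942b"]
entries = ring_entries(hosts, 8125, 3)

# Print the entries in a form that can be pasted after the statsrelay options.
print(" \\\n    ".join(entries))
```

Whether the instance string changes bucket placement depends on the StatsRelay version in use, so treat the output as something to try and compare against the graphs.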
Jack
…On Thu, Jul 29, 2021 at 4:04 AM Anatoliy D. ***@***.***> wrote:
here it is:
/opt/statsrelay/statsrelay --bind 0.0.0.0 --port 8125 --packetlen 8000 --prefix statsrelay.it-s-0f5c57688efa79273 --pprof --pprof-bind :8123 --sendproto UDP \
is-1935b:8125 is-1939b:8125 is-1940b:8125 is-1941b:8125 is-1942b:8125
Yesterday, we changed it a bit by populating the hash ring with multiple
aliases:
/opt/statsrelay/statsrelay --bind 0.0.0.0 --port 8125 --packetlen 8000 --prefix statsrelay.it-s-0f5c57688efa79273 --pprof --pprof-bind :8123 --sendproto UDP \
is-1935b:8125:1 is-1939b:8125:2 is-1940b:8125:3 is-1941b:8125:4 is-1942b:8125:5 \
is-1935b:8125:10 is-1939b:8125:20 is-1940b:8125:30 is-1941b:8125:40 is-1942b:8125:50 \
is-1935b:8125:100 is-1939b:8125:200 is-1940b:8125:300 is-1941b:8125:400 is-1942b:8125:500
and we've got a better distribution:
[image: statsreray-2]
<https://user-images.githubusercontent.com/1822261/127454018-aa936299-d862-40a0-9b4a-d2086a1620dc.png>