Skip to content
This repository has been archived by the owner on Oct 3, 2023. It is now read-only.

Add details about supporting qps based sampler. #158

Merged
merged 2 commits into from
Aug 25, 2018

Conversation

bogdandrutu
Copy link
Contributor

@bogdandrutu bogdandrutu commented Aug 10, 2018

This idea belongs to a different Google engineer who does not have a github account to link.

applied to a child `Span` of a **sampled** parent `Span`, the child `Span` keeps the sampling decision.
applied to a child `Span` of a **sampled** parent `Span`, the child `Span` keeps the sampling
decision.
* `QPS` - sampler that tries to sample traces based on QPS (e.g. 1 trace/second). When applied to a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure QPS is a good name for this. It's common in Google to use QPS to mean rate of anything (not just queries) but I think this is less common elsewhere in my experience, but I would be interested to hear non-Google perspectives on this. What about "fixed rate sampler"?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Jaeger we call them rate limiting samplers

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for rate limiting - my previous suggestion of fixed rate isn't good because if the application produces fewer than this number of spans, the rate will actually be lower.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

### QPS sampler implementation details
The problem we are trying to solve is:
1. Getting QPS based sampling.
2. Providing real sampling probabilities.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this mean? Do you mean that we can attach to each span the actual sampling probability in use?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was my idea that we may want to do this.

@codefromthecrypt
Copy link

codefromthecrypt commented Aug 11, 2018 via email

@semistrict
Copy link
Contributor

Reservoir sampling is an algorithm to obtain a fixed size sample from a continuous stream where each item (usually) has a fixed chance of being included in the sample. Seems like it could be useful as a technique for implementing this but I don't like it as a name since it is more descriptive of the "how" rather than the "what".

Copy link
Contributor

@songy23 songy23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM. I agree that "rate limiting samplers" sounds like a better name.

applied to a child `Span` of a **sampled** parent `Span`, the child `Span` keeps the sampling decision.
applied to a child `Span` of a **sampled** parent `Span`, the child `Span` keeps the sampling
decision.
* `QPS` - sampler that tries to sample traces based on QPS (e.g. 1 trace/second). When applied to a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nevinheintze mentioned offline that we might want to set the default rate significantly lower than 1 trace/sec to avoid issues when there are a very large number of tasks - maybe 0.1 trace/sec.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

@@ -33,3 +37,26 @@ The OpenCensus library samples based on the following rules:
3. If the span is a child of a local `Span` the sampling decision will be:
* If a "span-scoped" `Sampler` is provided, use it to determine the sampling decision.
* Else keep the sampling decision from the parent.

### QPS sampler implementation details
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative here would be to use reservoir sampling I think, should we explore that? Sounds like it's what is used to accomplish this for other systems.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I would look into that. The advantage of this algorithm is that we update only one global atomic variable, happy to see a small pseudocode for the reservoir.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A disadvantage for this is we most likely apply the probability using the trace-id (or last part of the trace-id) which we need to be randomly generated to ensure the correctness OR we generate a random number at that point.

What is your thought on this?

@codefromthecrypt
Copy link

codefromthecrypt commented Aug 15, 2018 via email

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants