
Rewrite processQueue for better batching #719

Merged
merged 10 commits into from
May 18, 2020

Conversation


@vmihailenco vmihailenco commented May 13, 2020

Previous discussion in #702

The tests are failing and I am not sure what to do about them. As I understand it, they only passed by accident, relying on buggy behavior in the previous implementation. For example, this case:

		{
			name: "parallel span generation",
			o: []sdktrace.BatchSpanProcessorOption{
				sdktrace.WithScheduleDelayMillis(schDelay),
				sdktrace.WithMaxQueueSize(200),
			},
			wantNumSpans:   200,
			wantBatchCount: 1,
			genNumSpans:    205,
			waitTime:       waitTime,
			parallel:       true,
		},

fails with

parallel span generation: number of exported span: got 205, want 200

This result depends on call/scheduling order (specifically, on the fact that consuming is paused for ScheduledDelayMillis), and BatchSpanProcessor does not provide such guarantees in a real/production environment. So I would suggest removing these assertions instead of just updating the numbers.

I would also rename ScheduledDelayMillis -> ScheduledDelay -> BatchTimeout.


jmacd commented May 13, 2020

I also support renaming the timeout to BatchTimeout.


jmacd commented May 13, 2020

This result depends on call/scheduling order (specifically, on the fact that consuming is paused for ScheduledDelayMillis), and BatchSpanProcessor does not provide such guarantees in a real/production environment.

I would like to see the call to time.Sleep removed from the test. I believe the correct test should assert

			wantNumSpans:   205,
			wantBatchCount: 2,


vmihailenco commented May 14, 2020

PTAL. I've left the BatchTimeout rename for a separate PR since it is likely to affect several packages. There are multiple commits to ease review.

Two review threads on sdk/trace/batch_span_processor.go (outdated; resolved)
@jmacd left a comment

In my prior review I suggested using a context for safe shutdown, not a wait group. These comments would apply, if you take the suggestion.

Four review threads on sdk/trace/batch_span_processor.go (outdated; resolved)
@jmacd left a comment

I believe there is still a problem that can be fixed by adding one more pair of Add(1) and Done() calls.

Four review threads on sdk/trace/batch_span_processor.go (outdated; resolved)

jmacd commented May 15, 2020

This is very close, and I appreciate your patience!

@vmihailenco

PTAL - I've gone with the read-and-throw-away goroutine as you suggested previously. Perhaps golang/go#21985 is not such a bad idea after all.


jmacd commented May 16, 2020

I noticed near the end of golang/go#21985

Conventionally these days people use a context.Context value for this.

which is similar to the idea in #719 (comment). However, I'm inclined to accept this as-is. I may file an issue to remove throwAwayFutureSends() later.

@vmihailenco

I believe that comment suggests using context.Context instead of close to signal that there are no more messages. That is not the case here (since there can be more messages), and it would not allow removing throwAwayFutureSends. Otherwise I don't understand how you suggest using that extra context channel.

@MrAlias left a comment

Thanks for tackling this!

@MrAlias MrAlias merged commit 5a534a0 into open-telemetry:master May 18, 2020
@pellared pellared added this to the untracked milestone Nov 8, 2024