Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Jaeger tracer #1030 #1147

Merged
merged 40 commits into from
Jun 4, 2019
Merged

Support Jaeger tracer #1030 #1147

merged 40 commits into from
Jun 4, 2019

Conversation

IKSIN
Copy link
Contributor

@IKSIN IKSIN commented May 15, 2019

Changes

  • Refactoring tracing package.
  • Add support for Jaeger tracing.

Verification

  • test from pkg/tracing/tracing_test.go moved to pkg/tracing/provider/stackdriver/tracer_test.go
  • our Thanos fork already integrated with Jaeger
  • Run query on Google Compute Engine.

@IKSIN IKSIN marked this pull request as ready for review May 15, 2019 14:47
@IKSIN IKSIN closed this May 15, 2019
…tracing

# Conflicts:
#	pkg/tracing/gct.go
#	pkg/tracing/provider/stackdriver/tracer_test.go
#	pkg/tracing/tracing.go
@IKSIN IKSIN reopened this May 15, 2019
@IKSIN IKSIN mentioned this pull request May 15, 2019
Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this! It looks generally good, but I would really vote for sticking to the same configuration method we did for object storage.

Essentially factory like this: https://github.com/improbable-eng/thanos/blob/2c4f64b1b96907295a7f8e99d8fd64697f0eb12a/pkg/objstore/client/factory.go#L38
And using pathOrContent flag (: What do you think? (:

We are really grateful that you contribute back with the awesome addition like this 👍 Let us know what other awesome things you have in your fork 💪 (:

GCP project to send Google Cloud Trace tracings to.
If empty, tracing will be disabled.
--gcloudtrace.sample-factor=1
--stackdriver.sample-factor=1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about having flags totally provider agnostic, same way we do for objconfig? Benefits would be great as everytime you change or add thousands of customization flags to one provider the flag help stays clean relevant for all users?
Even though we only agreed ont jaeger and stackdriver for now. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would stick to objstore config like configuration, mainly for consistency. Shouldn't be much more code

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also it would be great to move most of configuration to one file to have prometheus-like configuration approach. Its convenient to have all flags in one place.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a bigger decision. We chose not to do it as:

  • We have many smaller components instead of big single one like Prometheus
  • Most of options are not dynamic. Adding this dynamicity is enourmous amount of work
  • Objstore is separate to tracing. And it could be multiple of objstore configs as well, so it's hard to unifiy.

Copy link
Member

@bwplotka bwplotka May 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But someday (: Also, we are fans of flags (: for readability

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, got it, thanks! :)

return f
}

// Create creates the tracer in appliance with a tracerType
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing trailing period


// Create creates the tracer in appliance with a tracerType
func (f *Factory) Create(ctx context.Context, logger log.Logger, serviceName string) (opentracing.Tracer, io.Closer) {
var tracer opentracing.Tracer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest this to make it bit more elegant

var (


)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also TBH I would just name return arguments (:

var err error
factory, ok := f.factories[*f.tracingType];
if !ok {
level.Info(logger).Log("msg", "Invalid tracer type. Tracing will be disabled.", "err", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not fan of this. I would prefer error handling done outside of factory I think.

Reason is, that you should fail fast - if user actually provide some tracing configuration - if it's wrong you want to fail Thanos process immdiately (since it's on start - it's ok)

Allowing silent errors (only log lines and Info) is not enough - it will be missed and create annoyance. Can we fix this? (:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

)

// Factory implements tracing.Factory for Jaeger Tracer.
type Factory struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
type Factory struct {
type Factory struct {}


// Create implements tracing.Factory
func (f *Factory) Create(ctx context.Context, logger log.Logger, serviceName string) (opentracing.Tracer, io.Closer, error) {
cfg, err := config.FromEnv()
Copy link
Member

@bwplotka bwplotka May 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm this is again - breaking consistency with other things we have. Having verbose config instead like objstore config would allow dynamic docs like here: https://thanos.io/storage.md/#aws-s3-configuration that is auto generated on every commit.

This way we are sure that all fields are known to the users for exact version of Thanos binary (: I know it's additional work, but IMO it would be awesome to allow specfying this not from env but from config (which is can be passed directly by flags as well). Additionally grabing from env might be possible as well

JaegerDebugHeader: tracing.ForceTracingBaggageKey,
}
cfg.Headers.ApplyDefaults()
if serviceName != "" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we would not need this if we would explicitly parse config from flags (:

logger: logger,
}
jMetricsFactory := prometheus.New()
tracer, closer, err := cfg.NewTracer(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just return cfg.NewTracer(...) would be enough here?

return tracer, closer, nil
}

// RegisterKingpinFlags implements tracing.Factory
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@@ -0,0 +1,112 @@
package stackdriver
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just curious, any change here or just move?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also can we look on pjg/objstore package? Can we stick to same approach and just skip this provider directory? What do you think? (: I think both makes sense, just let's choose one

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not addressed

@IKSIN
Copy link
Contributor Author

IKSIN commented May 17, 2019

@bwplotka Hello =) I sent this pull-request to discuss configurations and flags. I agree with configuration via the same configuration method we did for object storage, but I want to save jaeger.FromEnv configuration. We deploy prometheus with sidecar via prometheus-operator and configuration via ENV is most convenient.

@bwplotka
Copy link
Member

@IKSIN this makes sense, and I am sure we can do proper consistent config from flag thing AND support doing it from env as well (:

See https://github.com/improbable-eng/thanos/blob/aedcb2e05b28f7fa35695be9c4127646dea5246e/pkg/objstore/s3/s3.go#L333

@IKSIN
Copy link
Contributor Author

IKSIN commented May 22, 2019

@bwplotka Hello) Method to configure tracing was changed as you want (jaeger config from file not implemented yet).
what are your suggestions to use configuration from ENV?

Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments, generally that's what we want ❤️ Thanks for the good work!

@@ -201,3 +201,27 @@ func regCommonObjStoreFlags(cmd *kingpin.CmdClause, suffix string, required bool
content: bucketConf,
}
}


func regCommonTracingFlags(cmd *kingpin.Application, suffix string, required bool, extraDesc ...string) *pathOrContent {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's remove suffix and require: YAGNI - we don't need that now (: same with extraDesc

String()
gcloudTraceSampleFactor := app.Flag("gcloudtrace.sample-factor", "How often we send traces (1/<sample-factor>). If 0 no trace will be sent periodically, unless forced by baggage item. See `pkg/tracing/tracing.go` for details.").
Default("1").Uint64()
//tracingFactory := provider.NewFactory(app)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove commented code from PR if it's rdy for review (:

var confContentYaml []byte
confContentYaml, err = tracingConfig.Content()
//tracer, closer = tracingFactory.Create(ctx, logger, *debugName)
tracer, closer, err = provider.NewTracer(ctx, logger, confContentYaml)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to manually make sure closer is closed on error cases of this function

pkg/tracing/provider/factory.go Outdated Show resolved Hide resolved
level.Info(logger).Log("msg", "loading tracing configuration")
tracingConf := &TracingConfig{}
if err := yaml.UnmarshalStrict(confContentYaml, tracingConf); err != nil {
return &opentracing.NoopTracer{}, ioutil.NopCloser(nil), errors.Wrap(err, "parsing config YAML file")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return &opentracing.NoopTracer{}, ioutil.NopCloser(nil), errors.Wrap(err, "parsing config YAML file")
return &opentracing.NoopTracer{}, ioutil.NopCloser(nil), errors.Wrap(err, "parsing config tracing YAML")

}

func parseConfig(conf []byte) (Config, error) {
config := Config{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too shallow function IMO - it does not hide much. What do you think about inlining this?

}

func NewTracerWithConfig(ctx context.Context, logger log.Logger, conf Config) (opentracing.Tracer, io.Closer, error) {
cfg, err := config.FromEnv()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should grab from env only if config is empty? So Flags > Environment? That avoids confusion

logger: logger,
}
jMetricsFactory := prometheus.New()
tracer, closer, err := cfg.NewTracer(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

literally return cfg.NewTracer(.... is enough here (:

pkg/tracing/provider/stackdriver/stackdriver.go Outdated Show resolved Hide resolved
if err != nil {
return nil, nil, err
}
println(config.ProjectId)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we kill this?

@@ -156,8 +166,10 @@ func main() {
<-ctx.Done()
return ctx.Err()
}, func(error) {
if err := closeFn(); err != nil {
level.Warn(logger).Log("msg", "closing tracer failed", "err", err)
if closer != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

closer is never nil in this case, right?

Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It LGTM! Thanks for this! We need changelog entry, but we can add it ourselve as well.

)
if conf != nil {
level.Info(logger).Log("msg", "loading Jaeger tracing configuration from YAML")
cfg, err = FromYaml(conf)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cfg, err = FromYaml(conf)
cfg, err = ParseConfigFromYaml(conf)

pkg/tracing/jaeger/jaeger.go Show resolved Hide resolved
pkg/tracing/jaeger/jaeger.go Outdated Show resolved Hide resolved
pkg/tracing/jaeger/jaeger.go Show resolved Hide resolved
pkg/tracing/stackdriver/stackdriver.go Show resolved Hide resolved
pkg/tracing/stackdriver/stackdriver.go Show resolved Hide resolved
}

if e := conf.RPCMetrics; e != "" {
if value, err := strconv.ParseBool(e); err == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't we make a field bool and get rid of parsing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, missed that - We can enforce YAML unmarshalling to parse that for us.

}

if e := os.Getenv(conf.Disabled); e != "" {
if value, err := strconv.ParseBool(e); err == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

}

if e := cfg.SamplerParam; e != "" {
if value, err := strconv.ParseFloat(e, 64); err == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't Config.SamplerParam be float64 in struct?

}

if e := cfg.SamplerMaxOperations; e != "" {
if value, err := strconv.ParseInt(e, 10, 0); err == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

}

if e := cfg.SamplerRefreshInterval; e != "" {
if value, err := time.ParseDuration(e); err == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One blocker @povilasv found and LGTM

}

if e := conf.RPCMetrics; e != "" {
if value, err := strconv.ParseBool(e); err == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, missed that - We can enforce YAML unmarshalling to parse that for us.

Copy link
Member

@povilasv povilasv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should simplify Config struct, instead of having all the fields as string and parsing that, let's just let the yaml unmarshaller to do the work.

@IKSIN
Copy link
Contributor Author

IKSIN commented Jun 3, 2019

@bwplotka Done.

@brancz brancz requested a review from povilasv June 4, 2019 10:06
Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for awesome work on this. LGTM

There are some minior comment nits to be strict... ((:

Tags string `yaml:"tags"`
SamplerType string `yaml:"sampler_type"`
SamplerParam float64 `yaml:"sampler_param"`
SamplerManagerHostPort string `yaml:"sampler_manager_host_port"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some docs on this structure would be nice at some point in next PR (:

)


// Tracer extends opentracing.Tracer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing trailig period and all below. (:

"gopkg.in/yaml.v2"
)

// Config - YAML configuration
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing trailing period on comment.

SampleFactor uint64 `yaml:"sample_factor"`
}

// NewTracer create tracer from YAML
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this and below

@IKSIN
Copy link
Contributor Author

IKSIN commented Jun 4, 2019

Refactored )

@bwplotka
Copy link
Member

bwplotka commented Jun 4, 2019

Awesome! Thank you for awesome job on this! we have Jaeger support now 🎉

@bwplotka bwplotka merged commit fa9ca8b into thanos-io:master Jun 4, 2019
@d-ulyanov
Copy link
Contributor

Great job, guys!
And thanks a lot for thorough review. We'll add more spans to code cuz currently there are a lot of blind spots. I think we'll share this code soon too :)

@bwplotka
Copy link
Member

bwplotka commented Jun 4, 2019 via email

return nil, nil, err
}

r, err := gcloudtracer.NewRecorder(
Copy link
Contributor

@krasi-georgiev krasi-georgiev Jun 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure how the tests passed her as gcloudtracer is not defined anywhere and when I run make locally it fails there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you download the required go mods?

Copy link
Contributor

@krasi-georgiev krasi-georgiev Jun 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep I found the culprit the module was corrupted during the download.

it was missleading that the module name is gcloudtracer and the import path is github.com/lovoo/gcloud-opentracing, strange why they did it this way.

FUSAKLA pushed a commit to FUSAKLA/thanos that referenced this pull request Jun 8, 2019
* Add jaeger tracing feature.

* Remove comments

* Remove comments

* Refactoring tracing

* Implementing jaeger logger.

* Add jaeger force tracing header.

* Use debugName for tracing service-name

* RemoveFactory config

* Formatting fix; Use io.Closer intead func() error

* Rename gcloud => stackdriver

* Refactoring google tracing testing.

* Delete comments.

* Rename gcloud flags => stackdriver.

* Update tracing docs.

* Fix ..traceType to ..tracerType

* Refactoring NoopTracer; Comments for exported functions.

* Remove noop tracer. Fix docs.

* Remove noop tracer. Fix docs.

* Config tracing same as objstore.

* Configure jaeger tracing from YAML. Some small fixes.

* Fix errcheck

* Add X-Thanos-Trace-Id HTTP header for simplified search traces

* Cleanup

* make docs

* Add store addr to tracing tags.

* Tracing refactoring.

* Fix noop tracing closer.

* Add few tracing spans.

* Pass prometheus registry to jaeger.

* go.mod

* Refactoring

* Resolve go mod conflicts

* Remove comments.

* PR refactoring for review.

* Format files.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants