Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance / Logging when Agent is Not Available #53

Closed
msaffitz opened this issue Jan 5, 2017 · 25 comments
Closed

Performance / Logging when Agent is Not Available #53

msaffitz opened this issue Jan 5, 2017 · 25 comments
Assignees
Labels
bug Involves a bug integrations Involves tracing integrations
Milestone

Comments

@msaffitz
Copy link

msaffitz commented Jan 5, 2017

When the agent is not available we see huge numbers of log lines of the form:

Connection refused - connect(2) for "localhost" port 7777

(For example, we see this in our test environment and it seems that the tests run much more slowly and/or timeout as a result. I'd expect the Rails integration to no-op in any environment other than production.)

@palazzem palazzem self-assigned this Jan 5, 2017
@palazzem palazzem added bug Involves a bug integrations Involves tracing integrations labels Jan 5, 2017
@palazzem
Copy link
Contributor

palazzem commented Jan 6, 2017

Thank you @msaffitz to report this issue! We're working to write better (but more important) useful logs from our tracing clients. We'll provide a fix as soon as possible!

@palazzem
Copy link
Contributor

@msaffitz we could provide a patch for a "generic" Rails application so that it could detect the production mode if default environments are used (using something like Rails.env.production?). If someone is not using the default naming convention (that could surely happen), you can disable the tracer updating our settings in a tracer initializer:

# config/initializers/datadog-tracer.rb (or whatever)

# this is just an example and you can change it with a func or anything else
tracing_state = Rails.env.production?

Rails.configuration.datadog_trace = {
  enabled: tracing_state,
  auto_instrument: tracing_state,
  auto_instrument_redis: tracing_state,
  # ... and other settings if you have them
}

Later on, we'll address other things like the exponential backoff / spammy logs in case of an agent failure in a production environment.

Is it a reasonable default / approach that works for you?

@palazzem
Copy link
Contributor

@msaffitz we've merged #55 so that Rails is disabled by default in the next release. This means that you have to follow the snippet in the PR description if you're going to use the master branch. We will release this one as the next 0.4.0.

Closing that for now but feel free to continue the discussion with more ideas!

@hugobast
Copy link

hugobast commented Feb 8, 2017

Hi everyone,

Is it possible that this fix isn't fixing anything? I've fought with this locally on my dev environment for about 30 minutes and decided to look into the lib to see what was happening. On require there's an object that gets initialized: https://github.com/DataDog/dd-trace-rb/blob/master/lib/ddtrace.rb#L7 That completely disregards the config, of note no options are passed to it.

That in turn goes to init a writer that has no options passed to it: https://github.com/DataDog/dd-trace-rb/blob/master/lib/ddtrace/tracer.rb#L46

Which does what it's supposed to and ultimately tries to connect to the agent that isn't there and we get log spammed with Connection refused - connect(2) for "localhost" port 7777.

I've tried from above and in #55:

# config/initializers/datadog-tracer.rb (or whatever)

# this is just an example and you can change it with a func or anything else
tracing_state = Rails.env.production?

Rails.configuration.datadog_trace = {
  enabled: tracing_state,
  auto_instrument: tracing_state,
  auto_instrument_redis: tracing_state,
  # ... and other settings if you have them
}

I'm sure this does something it does nothing about the problem of having Connection refused - connect(2) for "localhost" port 7777 spammed everywhre.

Let me know what I am missing or if I should open a PR to fix it.

Thanks!

@robbkidd
Copy link

Just started using this agent while testing out APM. I'm seeing the Connection refused - connect(2) for "localhost" port 7777 in environments where I have explicitly disabled it (in either config/application.rb or an initializer). e.g.:

Rails.configuration.datadog_trace = {
  enabled: false,
  auto_instrument: false,
  auto_instrument_redis: false
}

I can reproduce the error running the Rails console in this environment:

E, [2017-02-22T19:41:49.949034 #7662] ERROR -- : Failed to open TCP connection to localhost:7777 (Connection refused - connect(2) for "localhost" port 7777)
D, [2017-02-22T19:41:50.054235 #7662] DEBUG -- : set_service_info: service: rails-app app: rails type: web
D, [2017-02-22T19:41:50.054521 #7662] DEBUG -- : set_service_info: service: rails-cache app: rails type: cache
D, [2017-02-22T19:41:50.054870 #7662] DEBUG -- : set_service_info: service: postgres app: postgres type: db
Loading production environment (Rails 4.2.7.1)
irb(main):001:0> 
D, [2017-02-22T19:41:50.950520 #7662] DEBUG -- : Sending data from process: 7662
E, [2017-02-22T19:41:50.951696 #7662] ERROR -- : Failed to open TCP connection to localhost:7777 (Connection refused - connect(2) for "localhost" port 7777)
D, [2017-02-22T19:41:51.952619 #7662] DEBUG -- : Sending data from process: 7662
E, [2017-02-22T19:41:51.953664 #7662] ERROR -- : Failed to open TCP connection to localhost:7777 (Connection refused - connect(2) for "localhost" port 7777)
D, [2017-02-22T19:41:52.955651 #7662] DEBUG -- : Sending data from process: 7662
E, [2017-02-22T19:41:52.956804 #7662] ERROR -- : Failed to open TCP connection to localhost:7777 (Connection refused - connect(2) for "localhost" port 7777)
D, [2017-02-22T19:41:53.957781 #7662] DEBUG -- : Sending data from process: 7662
E, [2017-02-22T19:41:53.958915 #7662] ERROR -- : Failed to open TCP connection to localhost:7777 (Connection refused - connect(2) for "localhost" port 7777)

Sending data from process has me suspect that something is telling the transport to post data regardless of the value of datadog_trace[:enabled].

@robbkidd
Copy link

Is this maybe a workaround? I've managed to turn off the connection attempts and log spam by only requiring the ddtrace gem in the initializer if it I want tracing. Setting my feature flag to 'true' does enable tracing again and send data along.

# Gemfile
gem 'ddtrace', require: false
# config/initializers/datadog_tracer.rb
if ENV['DATADOG_TRACER_ENABLED'] && ENV['DATADOG_TRACER_ENABLED'] == 'true'
  require 'ddtrace'
  Rails.configuration.datadog_trace =
  {
    auto_instrument: true,
    auto_instrument_redis: true
  }
end

@hugobast
Copy link

@robbkidd Yes that's how I fixed it too.

@palazzem
Copy link
Contributor

Thank you @hugobast and @robbkidd for reporting the issue. We're going to reopen that and be sure that with the next release the logging is working properly!

@palazzem palazzem reopened this Feb 23, 2017
@robbkidd
Copy link

Is logging the issue or is it a symptom of connection attempts being made even when the tracer is disabled?

@walterking
Copy link
Contributor

I tried to only require if its enabled like robbkidd suggests, but we want to do custom instrumentation so if the Datadog object isnt defined it will fail for any of those. Seeing something get enqueued after we set datadog_trace config but before its actually disabled when using sidekiq

@palazzem
Copy link
Contributor

palazzem commented Mar 7, 2017

#86 should remove the root cause about Redis sending traces despite defined settings. A configuration like:

Rails.configuration.datadog_trace =
  {
    # may depend by your conditional env
    enabled: false,
    # these may be 'true' to keep parity with your production
    auto_instrument: true,
    auto_instrument_redis: true
  }

ensures that:

  • Rails is auto-instrumented after bundler requires ddtrace
  • Redis is auto-instrumented after bundler requires ddtrace
  • if redis-rails is used and Redis is auto-instrumented, redis-rails will use the traced Redis client
  • disabling the tracer means that traces will not be reported but the code will still be instrumented like in your production environment.

@palazzem
Copy link
Contributor

palazzem commented Mar 8, 2017

Fixed with #86 . @robbkidd if you think that it doesn't solve the problem, feel free to re-open the discussion. A new release will be shipped by the end of the week!

@palazzem palazzem closed this as completed Mar 8, 2017
@palazzem
Copy link
Contributor

palazzem commented Mar 9, 2017

We've released a new gem version that should handle this issue: https://github.com/DataDog/dd-trace-rb/releases/tag/v0.5.0

@robbkidd think it should fix your problems, let us know if it doesn't!

@palazzem
Copy link
Contributor

palazzem commented Mar 9, 2017

@FinnBorge can you provide me more details? Actually what is your stack (Rails + Sidekiq?) and what is your Rails.configuration.datadog_trace configuration for the auto_instrument_* fields?

@FinnBorge
Copy link

FinnBorge commented Mar 9, 2017

My mistake @palazzem, misread the history of this.

Rails.configuration.datadog_trace =
  {
    # may depend by your conditional env
    enabled: false,
    # these may be 'true' to keep parity with your production
    auto_instrument: true,
    auto_instrument_redis: true
  }

works fine.

@palazzem
Copy link
Contributor

palazzem commented Mar 9, 2017

oh no worries @FinnBorge ! thank you anyway for reporting that!

Actually if the enabled flag is set to false, the expected behavior is to have Rails and Redis instrumented (so traces / spans are generated like in your staging / production / whatever environment), but they are not sent to the Agent (good thing for your CI / development environments).

@davetron5000
Copy link

I'm having this issue because, despite enabled: being false, if you call Datadog::Monkey.patch_all it seems to re-enable it.

My current config that doesn't have this problem is:

require "net/http"
require "ddtrace"

app_name           = Rails.application.class.parent.name
is_rails_console   = Rails.const_defined?("Console")
dd_tracing_enabled = ENV.has_key?("DD_AGENT_PORT_8126_TCP_ADDR") && !is_rails_console

Datadog::Monkey.patch_all if dd_tracing_enabled

Rails.configuration.datadog_trace = {
  enabled: dd_tracing_enabled,
  auto_instrument: true,
  auto_instrument_redis: true,
  default_controller_service: "rails-controller-#{app_name}",
  default_cache_service: "rails-cache-#{app_name}",
  default_database_service: "postgresql-#{app_name}",
  default_service: app_name,
  trace_agent_hostname: ENV["DD_AGENT_PORT_8126_TCP_ADDR"],
  trace_agent_port: ENV["DD_AGENT_PORT_8126_TCP_PORT"]
}

@palazzem
Copy link
Contributor

Thanks @davetron5000 for sharing your working configuration. It's possible that the problem is more related to the fact that Datadog::Monkey.patch_all patches automatically all available libraries, while the enabled argument is evaluated only when Rails calls our initializer.

This may create traces (and so logs if the agent is not available) when libraries such as Redis, Mongoid, or others, are initialized and used before our initializer is invoked (and so the tracer is disabled).

If you want, you can open a new issue on GitHub posting your current state, so we can keep track of it providing better documentation or a different approach for it. Thanks!

@stevenleeg
Copy link

stevenleeg commented Feb 2, 2018

I'm still seeing this issue even without calling DataDog::Monkey.patch_all. I've tried a configuration like so:

# config/initializers/datadog_tracer.rb
Rails.configuration.datadog_trace = {
  enabled: false,
  auto_instrument: false,
}

and yet I'm still seeing

Failed to open TCP connection to localhost:8126 (Connection refused - connect(2) for "localhost" port 8126)

all over my development server console. Any ideas as to where I may be going wrong here?

@delner
Copy link
Contributor

delner commented Feb 2, 2018

Hey @stevenleeg , what version of the gem are you using? As of 0.11.0, we added a new configuration API that replaced the old Rails.configuration.datadog_trace you're using above.

In the new implementation, integrations shouldn't be automatically enabled/patched; you're expected to define each integration you'd like to trace explicitly (otherwise they won't be activated.) E.g to activate rails, it looks something like:

Datadog.configure do |c|
  c.use :rails
end

This new configuration API is supposed to replace the use of Monkey, which past version 0.11.0, will not be available for use.

However, in version 0.11.0, despite Monkey being deprecated, this line of code was still present which was patching Rails. This line was consequently removed in version 0.11.1+ when we disabled Monkey.

All in all... maybe try our latest version 0.11.2, and see if it addresses your issue? If you still have code using our older configuration style, you'll need to follow our migration guide [listed here]((https://github.com/DataDog/dd-trace-rb/releases/tag/v0.11.0) to be compatible.

Let me know if that helps or not!

@stevenleeg
Copy link

Ahh yes! Looks like I was following some outdated documentation rather than the latest on this repository.

For those of you following along at home, this configuration ended up fixing things for me:

# config/initializers/datadog_tracer.rb

Datadog.configure do |c|
  c.tracer enabled: Rails.env.production?

  c.use :rails
  c.use :active_record
end

Once that was in place I was able to spin up a development server without the annoying log noise and deploy to production with tracing. 🎆

@delner
Copy link
Contributor

delner commented Feb 2, 2018

Glad to hear it's working for you! 😄

Where did you see the outdated documentation? Maybe we should update that.

@rashmi-tondare
Copy link

I was following this documentation - https://elements.heroku.com/buildpacks/datadog/heroku-buildpack-datadog, realised now that it was outdated 😅
Maybe you guys could put in a link to your official docs here so there will be a central point of reference then.

@delner
Copy link
Contributor

delner commented Feb 8, 2018

@rashmi-tondare Ahhh, thanks for letting us know! I think it should be updated now on the buildpack repo, and on Heroku soon.

@rashmi-tondare
Copy link

@delner awesome! thank you 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Involves a bug integrations Involves tracing integrations
Projects
None yet
Development

No branches or pull requests

10 participants