ec2 tags collection is broken when kube2iam is used #3174

arminioa · 2019-03-20T10:20:23Z

What does this PR do?

Fix ec2 tag collection when datadog agent is used in kubernetes with kube2iam.

Additional Notes

There is no need to add "/" after a IAM role when composing the url used to get AWS security credentials.

Please see these issues:
DataDog/datadog-agen#3173
jtblin/kube2iam#204

bits-bot · 2019-03-20T10:20:30Z

All committers have signed the CLA.

codecov · 2019-03-20T12:25:56Z

Codecov Report

Merging #3174 into master will increase coverage by 2.96%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #3174      +/-   ##
==========================================
+ Coverage   54.39%   57.36%   +2.96%     
==========================================
  Files         543      462      -81     
  Lines       39043    28796   -10247     
==========================================
- Hits        21239    16519    -4720     
+ Misses      16536    11165    -5371     
+ Partials     1268     1112     -156

Flag	Coverage Δ
#linux	`57.36% <100%> (ø)`	⬆️
#windows	`?`

Impacted Files	Coverage Δ
pkg/util/ec2/ec2_tags.go	`28.98% <100%> (+3.69%)`	⬆️
pkg/trace/sampler/prioritysampler.go	`49.05% <0%> (-11.99%)`	⬇️
pkg/metadata/resources/resources.go	`75% <0%> (-8.34%)`	⬇️
pkg/trace/sampler/scoresampler.go	`58.33% <0%> (-7.71%)`	⬇️
pkg/quantile/key.go	`46.15% <0%> (-7.18%)`	⬇️
pkg/metadata/host/host_tags.go	`74% <0%> (-6%)`	⬇️
pkg/procmatch/signature.go	`66.66% <0%> (-4.77%)`	⬇️
pkg/metrics/event.go	`67.1% <0%> (-4.64%)`	⬇️
pkg/metrics/service_check.go	`56.71% <0%> (-4.33%)`	⬇️
pkg/trace/test/buffer.go	`75% <0%> (-4.32%)`	⬇️
... and 386 more

arminioa · 2019-03-20T12:31:51Z

@truthbk please review

Simwar · 2019-03-28T09:53:49Z

@clamoriniere can you review?

Thanks!

2rs2ts · 2019-03-28T18:54:18Z

This may actually depend on which instance type you are running on. On m5 and c5, you do not want the trailing /, but on other instance types you do. See this issue for more details (particularly this comment) uswitch/kiam#42 (comment)

2rs2ts · 2019-03-28T18:55:26Z

I wonder if you even need to manually fetch credentials at all. Normally, if you provide no credentials the AWS SDK will attempt to use the instance profile, and it does all this work behind the scenes.

Simwar · 2019-03-29T13:01:36Z

Hi @2rs2ts

The issue linked is, I believe, about the URL:
http://169.254.169.254/latest/meta-data/iam/security-credentials having a trailing '/' or not.
Our issue here is about http://169.254.169.254/latest/meta-data/iam/security-credentials/iam-role/ so not the same URL.
I tested on a m5 and on a t2 instance with or without a '/' and it works.
Need to test on kube2iam and we can merge.

Simwar · 2019-03-29T16:37:57Z

Tested on kube2iam. Working without the '/' in the curl.

arminioa · 2019-03-30T09:48:07Z

Thank you guys!

xvello · 2019-04-03T14:59:15Z

While QA-ing this feature, I stumbled upon this known kube2iam issue about cold caches. This is an issue for the agent as our cloud provider autodetection algorithm has intentionally short timeouts, to avoid delaying too much the agent startup.

To pre-fill the kube2iam cache, I had to add the following init container to the agent daemonset:

      initContainers:
      - image: datadog/agent:latest
        imagePullPolicy: Always
        name: curl-iam
        command: ['curl', 'http://169.254.169.254/latest/meta-data/iam/security-credentials/<ASSUMED_IAM_ROLE_NAME>']

2rs2ts · 2019-04-04T19:33:10Z

That intentionally short timeout causes a lot of issues for us and many others... perhaps it should be reconsidered. One really really common problem is that the metadata endpoint isn't available (often due to setups like kube2iam or kiam) and so the agent boots up with the ec2 hostname as the agent hostname, which of course causes all sorts of problems in the backend because hostnames are not unique so you get tons of no data alerts and other confusion. If the user knows they're deploying the agent to a cloud provider, autodetection only makes things harder for them.

hkaj · 2019-04-05T12:46:57Z

Thanks for the feedback @2rs2ts

I'll add something to the backlog to make it configurable. I think it requires a different config for each hostname provider, as otherwise we'll wait for X seconds * Y hostname provider, Y being the amount of providers preceding the one that will succeed in the user's environment, and the EC2 provider is the second last one which would significantly slow down the hostname detection. Do you agree, or did you have another solution in mind?

2rs2ts · 2019-04-05T22:16:11Z

I think you probably just want to provide a config value for the cloud provider in general. If I say to the agent "you are in EC2" then it should do only EC2 things, not GCE things or anything else. Hopefully that makes sense!

Remove trailing slash from url used to retrive security credentials

a55942f

arminioa requested a review from a team as a code owner March 20, 2019 10:20

arminioa changed the title ~~Remove trailing slash from url used to retrive security credentials~~ ec2 tags collection is broken when kube2iam is used Mar 20, 2019

arminioa force-pushed the ec2_tags_iam_role_fix branch from 7b0b465 to 13c16c9 Compare March 20, 2019 12:38

reno added

3e6975b

arminioa force-pushed the ec2_tags_iam_role_fix branch from 13c16c9 to 3e6975b Compare March 20, 2019 12:40

clamoriniere approved these changes Mar 28, 2019

View reviewed changes

hush-hush added this to the 6.11.0 milestone Mar 28, 2019

hush-hush added the team/containers label Mar 28, 2019

hush-hush approved these changes Mar 28, 2019

View reviewed changes

Simwar merged commit 36d7364 into DataDog:master Mar 29, 2019

arminioa deleted the ec2_tags_iam_role_fix branch March 30, 2019 09:49

arminioa mentioned this pull request Mar 30, 2019

Invalid role: does not match annotated role because of trailing / in metadata url jtblin/kube2iam#204

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ec2 tags collection is broken when kube2iam is used #3174

ec2 tags collection is broken when kube2iam is used #3174

arminioa commented Mar 20, 2019

bits-bot commented Mar 20, 2019 •

edited

Loading

codecov bot commented Mar 20, 2019 •

edited

Loading

arminioa commented Mar 20, 2019

Simwar commented Mar 28, 2019 •

edited

Loading

2rs2ts commented Mar 28, 2019

2rs2ts commented Mar 28, 2019

Simwar commented Mar 29, 2019 •

edited

Loading

Simwar commented Mar 29, 2019

arminioa commented Mar 30, 2019

xvello commented Apr 3, 2019

2rs2ts commented Apr 4, 2019

hkaj commented Apr 5, 2019

2rs2ts commented Apr 5, 2019 •

edited

Loading

ec2 tags collection is broken when kube2iam is used #3174

ec2 tags collection is broken when kube2iam is used #3174

Conversation

arminioa commented Mar 20, 2019

What does this PR do?

Additional Notes

bits-bot commented Mar 20, 2019 • edited Loading

codecov bot commented Mar 20, 2019 • edited Loading

Codecov Report

arminioa commented Mar 20, 2019

Simwar commented Mar 28, 2019 • edited Loading

2rs2ts commented Mar 28, 2019

2rs2ts commented Mar 28, 2019

Simwar commented Mar 29, 2019 • edited Loading

Simwar commented Mar 29, 2019

arminioa commented Mar 30, 2019

xvello commented Apr 3, 2019

2rs2ts commented Apr 4, 2019

hkaj commented Apr 5, 2019

2rs2ts commented Apr 5, 2019 • edited Loading

bits-bot commented Mar 20, 2019 •

edited

Loading

codecov bot commented Mar 20, 2019 •

edited

Loading

Simwar commented Mar 28, 2019 •

edited

Loading

Simwar commented Mar 29, 2019 •

edited

Loading

2rs2ts commented Apr 5, 2019 •

edited

Loading