Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

environment variable OTEL_CLI_SERVICE_NAME and OTEL_CLI_ATTRIBUTES did not take effect #109

Closed
wolf666666 opened this issue Jul 11, 2022 · 4 comments
Labels
bug Something isn't working

Comments

@wolf666666
Copy link

When i set OTEL_CLI_SERVICE_NAME and OTEL_CLI_ATTRIBUTES with command line:
export OTEL_CLI_SERVICE_NAME="aaa"
export OTEL_CLI_ATTRIBUTES="id=123"
It seems the two variable did not take effect when the span generated, but if i set them in the command line:
eg, otel-cli span --service aaa --attr "id=123", the two attributes will be add to the span

I think it may be a bug of otel-cli

@tobert tobert added the bug Something isn't working label Aug 9, 2022
@tobert
Copy link
Collaborator

tobert commented Aug 9, 2022

Thanks for the report! I have a test in mind that I'll write first. I'm pretty swamped with work and other things and hope to work on this soon.

@tobert
Copy link
Collaborator

tobert commented Jan 9, 2023

This was trickier to fix than I expected. Here's a PR, we'll merge soon. #117

tobert added a commit that referenced this issue Jan 11, 2023
It seems like some documented environment variables never worked, or haven't in a while.
When I wrote the first pass, I included Viper because it was recommended with Cobra, and
maybe a config file will be more useful than I'm imagining right now. A couple things were
broken such that config loading and environment variables via Viper were not working at
all. I think the critical ones like `OTEL_EXPORTER_OTLP_ENDPOINT` worked because the
OpenTelemetry SDK picked it up directly.

So really this is a story about testing. We weren't testing environment variable loading.
There is still work to do but now there are some tests. The test harness already had most
everything I needed to build the tests so I did that first. At least now we won't be
embarrassed again.

I'll mention the fixes to Viper for posterity as they were tricky to figure out and should
become a post of their own. The first is that viper.Unmarshal uses mapstructure under
the covers so it's the `mapstructure:"tag"` struct tag that it needs to map pflag names
to the struct fields. The second is that Viper expects JSON maps in envvars, not the
k=v,k=v format otel-cli uses elsewhere and documents. So I had to add this, which
also took a minute to figure out:
https://github.com/tobert/otel-cli/blob/109-fix-envvar-settings/otelcli/root.go#L158

After getting tests passing with Viper in a PR, I went looking at issues and read #83 
and decided to take a little detour and see what removing Viper looked like. I always
wanted to try doing struct tags by hand too so I gave that a go and that's what we're
going with for now. So this PR / squash merge is a derivative of the Viper PR in #117 
that completely removes it for custom code, and ends up shrinking the binary on
x86_64 by 1.2 megabytes. That was enough for me to polish up the removal and land
us here.

Normally I would avoid such a big behavioral change, knowing folks are using otel-cli
on their systems, but since the functionality has been broken so long, now's the time
to make the cut and clean up a little bit.

-Amy

Fixes: #83 #109

Commit history follows:

* is_sampled was never available in span data

This never should have worked. Next commit will fix the span data
checker.

* rework checkSpanData for correctness

Working on a test for #109 it became clear there were some faults in how
checkSpanData worked. For one it wasn't validating that an expected
'is_sampled' was actually in the received span. It turns out that field
was never available for comparison. That check was added.

Also only the regex fields were being checked because of looping over
that map so I moved it up to a global map and now loop over received
fields and compare against instead of the other way around. This should
be more exhaustive.

* add failing test for #109

* plumb service attributes into CliEvent

In order to test service name setting, this needs to come through in the
tests. This patch does that.

Refactored attribute flattening code to its own function.

* make a failing test for service name

This test should pass once the bug is fixed.

* make attr map flattening more ergonomic

* also test OTEL_RESOURCE_ATTRIBUTES

also test multiple attributes are handled correctly

* add mapstructure struct tags

viper seems to need these. the service name envvar works now.
OTEL_CLI_ATTRIBUTES still does not work.

* check error on viper.Unmarshal 🤦

test fails in a more interesting way now

always check your errors, kids

* add a comment

* add parseCkvStringMap helper & test

The one in pflags is buried deep so we need our own to make Viper do the
right thing.

* hack: strip types off stringified span attributes

This isn't ideal since the type information is there somewhere and it
could be converted cleanly. That said, I dug through the grpc and otel
code trying to figure it out and didn't make much progress. Since this
server isn't super serious, it should be ok to clean up with string
manipulation and get all the tests to work as expected. This seems to
work for now so proper type conversion can happen another day. Sorry.

* implement & use mapstructure decode hook

By default viper will parse JSON in envvars but not the comma-separated
k=v style that cobra/pflag does. This implements a hook to intercept
those values and decode them manually, which makes OTEL_CLI_ATTRIBUTES
finally work as advertised.

* use strings.SplitN instead of Cut for older go

For now, let's make sure folks can build widely. Probably should make a
decision soon about minimum Go compiler though.

* replace viper with custom environment loading

We discussed this in #83. It seems to be less code overall, even before
removing the dependency.

* clean up TODOs, wrap errors

* make env parse failures a soft failure

I'm not super sure this is the right move but feels like it follows
policy.

* add envvar failure test, add Results.CliOutputRe

Wrote a test to verify parsing fails on an invalid environment variable
value.

Because softFail uses log.Fatal, it includes a date stamp which is the
right thing to do. To make testing easier, this called for a way to
modify the output before comparing to the fixture data, thus I added
Results.CliOutputRe which takes a regex and always deletes the match.

* remove accidental comment

* re-implment config file loading

It's json-only now, and an example is added. I'm not thrilled about the
ordering of Defaults -> CLI -> environment -> file but it works and is
consistent.

* update README docs for configuration

* fix up config test, add needed infra for it

In order to fix the config test, a "*" option needed to be added to the
test harness so I did that. Cleaned up up README and other things
after getting the test right, so there are a few odds & ends in this
commit that are related.

* correct comment
@tobert
Copy link
Collaborator

tobert commented Jan 11, 2023

@wolf666666 thanks so much for the report. I feel embarrassed for leaving it like this so long but it's fixed now.

Please let me know if the latest code is working for you. I have a few more issues to work on then we'll get another release out.

tobert added a commit that referenced this issue Jan 26, 2023
* add a failing test for span background --attrs

Actually modifying an existing test to use the feature.

* add missing func to support span background --attrs

test still fails but fails less badly, next to fix the map[string]string
translations

* fix test data so test passes

The test was getting the right data but failing becaues of a TODO in
`otlpserver/server.go`. That TODO involves building a type switch and
supporting at least a subset of otel types so rather than doing that
right now, just have the test accept the format with the type in it.

* is_sampled was never available in span data

This never should have worked. Next commit will fix the span data
checker.

* rework checkSpanData for correctness

Working on a test for #109 it became clear there were some faults in how
checkSpanData worked. For one it wasn't validating that an expected
'is_sampled' was actually in the received span. It turns out that field
was never available for comparison. That check was added.

Also only the regex fields were being checked because of looping over
that map so I moved it up to a global map and now loop over received
fields and compare against instead of the other way around. This should
be more exhaustive.

* add failing test for #109

* update test

The type tags leaking through in strings was fixed in #120 so
it doesn't have to be in the fixture strings anymore.
@tobert
Copy link
Collaborator

tobert commented Feb 2, 2023

This should be fixed permanently in release v0.1.0. Thanks for you patience <3

https://github.com/equinix-labs/otel-cli/releases/tag/v0.1.0

@tobert tobert closed this as completed Feb 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants