Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No error handling / logging in MetadataConfig #4479

Closed
dietervdw-spotify opened this issue Feb 11, 2019 · 6 comments
Closed

No error handling / logging in MetadataConfig #4479

dietervdw-spotify opened this issue Feb 11, 2019 · 6 comments
Assignees
Labels
api: core type: question Request for information or clarification. Not an issue.

Comments

@dietervdw-spotify
Copy link

Calls to metadata service can fail. This causes a NPE a little further.
This is a prime example of crappy error handling.

So now this is failing for some reason for me (while I can perfectly curl the service), and I don't have a clue. Logging the actual error would be much appreciated!

Not making a PR for this since I don't know what your error handling policy is.
I'd just throw this up all the way?

@dietervdw-spotify
Copy link
Author

My issue actually seems to be caused by #4269 .

@ajaaym
Copy link
Contributor

ajaaym commented Feb 11, 2019

@dietervdw-spotify can you please provide sample of code where its failing for you?

@ajaaym ajaaym added type: question Request for information or clarification. Not an issue. api: core labels Feb 11, 2019
@dietervdw-spotify
Copy link
Author

@ajaaym This was happening on version 0.70.0-alpha, it seems like that code path is now updated to handle the resulting null value.

However just swallowing errors is just bad. In my case apparently due to bug #4269 the metadata hostname didn't resolve to the actual google metadata service.
In version 0.70.0-alpha this caused an NPE, in the newer version this will just silently break anything that depends on MetadataConfig automatic metadata.

I guess it depends on Google's logging policy, but at least logging something on DEBUG or a one-line warning would help immensely!

@ajaaym
Copy link
Contributor

ajaaym commented Feb 13, 2019

@dietervdw-spotify do you mean http://metadata.google.internal is not getting resolved in GCE/GKE? error is getting silently ignored to avoid printing noise when not running in GCE/GKE mode and http://metadata.google.internal should always get resolved from GCE/GKE.

@dietervdw-spotify
Copy link
Author

@ajaaym : In a previous version the metadata hostname was specified as metadata instead of metadata.google.internal and that failed to resolve on our GCE machines for some reason.

I personally don't agree with suppressing errors and I think it's bad practice. There's a lot that can go wrong on a network, and it would be great to have a trace afterwards if something breaks.

For example in the Logback Stackdriver appender: if the automatic resolution fails for some reason, logs will get sent to resource 'global' instead of using the right resource type and metadata tags.
That is wrong on so many levels:

  • Logs get sent to the wrong destination and get the wrong tags, making them effectively get lost
  • This will not recover until the application is restarted

So if there's a small network glitch during startup, my application is now running with a completely wrong configuration. I'd rather have startup fail and print a big fat error in that case.

Maybe the Stackdriver Logback appender is not using this class in the right way then?
In any case, something seems very wrong to me here.

@ajaaym
Copy link
Contributor

ajaaym commented Feb 14, 2019

@dietervdw-spotify This can not be done bcz if logback is used outside GCE/GKE then it becomes reentrant and can not fail startup as well in that case. Closing this for now as upgrading to latest version resolves this. Please feel free to reopen if you think otherwise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: core type: question Request for information or clarification. Not an issue.
Projects
None yet
Development

No branches or pull requests

2 participants