Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: adding try/catch to initialize lsp server, added telemetry log events #320

Merged
merged 3 commits into from
Feb 11, 2025

Conversation

suprajaven
Copy link
Contributor

Problem

Relevant PRs for the background : #307, #310 and #306

As part of the comment/ followup - we want to add observability related improvements including telemetry and try/catch blocks.

Solution

As a followup to fixing the optional field 'aws' in the initialization options, this PR adds :

  1. Try/ catch block to the initialization of the router
  2. Added a telemetry log event for observability

License

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@suprajaven suprajaven requested a review from a team as a code owner February 5, 2025 09:12
@@ -89,6 +89,17 @@ export class LspServer {
token: CancellationToken
): Promise<PartialInitializeResult | ResponseError<InitializeError> | undefined> => {
if (!params.initializationOptions?.aws) {
if (this.lspConnection?.telemetry) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would lspConnection connection be undefined given it's expected in the constructor? lspConnection is a must - nothing works without it. telemetry is also expected to be defined. I do not think we should check everything for undefined.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, noted. I shall update.

@@ -89,6 +89,17 @@ export class LspServer {
token: CancellationToken
): Promise<PartialInitializeResult | ResponseError<InitializeError> | undefined> => {
if (!params.initializationOptions?.aws) {
if (this.lspConnection?.telemetry) {
this.lspConnection.telemetry.logEvent({
name: 'Initialization error',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to also get approval for the structure from Toolkits.

@@ -89,6 +89,17 @@ export class LspServer {
token: CancellationToken
): Promise<PartialInitializeResult | ResponseError<InitializeError> | undefined> => {
if (!params.initializationOptions?.aws) {
if (this.lspConnection?.telemetry) {
this.lspConnection.telemetry.logEvent({
name: 'Initialization error',

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Toolkit metric names and shapes are defined in a central place. If you look for the "metrics" array, you can see that the name fields of each entry are all standardized. For the toolkits they follow a naming convention.

Given that the language servers will be used in more places than Toolkits, the language servers don't have to use the same convention or names as the Toolkit (in fact, the Toolkits may change their conventions at some point too). It is critical that the language servers use a convention, and have consistency in the data that is emitted through the telemetry logEvent calls. This is important because destinations have to ensure that the data is valid for where they emit telemetry to, and may need to perform post-processing transformations, which can only be done with an understanding of what the language server data will look like. I believe this is the first time that telemetry logEvent is being used, so this may take some time to produce a long-term design for. I'd like to see the following:

  • define and document the metric naming convention
    • what are valid characters? (request: no whitespaces please)
    • will they follow a pattern like "(component).(occasion)"
  • centralize the metric definitions somewhere
    • at a minimum, have a central collection of consts for all of the metric names
    • even nicer: have an accompanying comment for each of these, to indicate what the metric represents and contains
    • motivation: easier to look up or discover for both language server and destination developers
  • maintain a documentation of telemetry that can be emitted (names, their purposes, and when they are emitted).

CC: @justinmk3 for additional thoughts on everything above

Unless you have a convention that influences the name, I'd like to see this metric named something like runtimeInitializationValidation. This way, the foo service can emit metrics like fooBar and we are able to more clearly understand where the metric was applicable to.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flare team was discussing OpenTelemetry, right?

BTW: if possible, it will gain a lot if this project implements logging and telemetry as one system. There is no good reason for them to be separate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@awschristou

  1. Agreed regarding the discussion on convention/documentation - However, since this is an important fix for the observability, I will create a separate item to track for the same. For now, I shall update the fields used based on the previous uses of this (There have been a few in the servers repo already)
  2. @justinmk3 Regarding OpenTelemetry, the project is indeed in progress, however, for this specific instance, it would make more sense for the client to have the understanding of this aws field than Flare's systems. Let me know if that is what you meant.

this.lspConnection.telemetry.logEvent({
name: 'Initialization error',
result: 'Failed',
data: JSON.stringify(params.initializationOptions),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

data is intended to represent a collection of key/value pairs, because any given metric can and will contain multiple datapoints. This will cause serialization problems with destinations. I don't know if it is possible to model MetricEvent.data to reflect this. data should never be treated as a scalar.

Here we should set the blob to a property, for example:

   data: {
      initializationOptions: JSON.stringify(...),
   },

In case anyone asks, we should be deliberate about using JSON.stringify here, and not just add the object into the data map. The reason is we want to be able to inspect the state received by the server, and not risk additional mutations as the event is marshalled through to a destination.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can logEvent() do that (or "deep copy") automatically/implicitly, instead of doing it here?

If we are worried about mutable state during the logEvent() codepath, it should probably enforce that somehow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the callout, I shall update it based on the schema and would use deep copy here. Ack.

return initializeResult
} catch (e) {
this.logger.log(`Unknown initialization error\n${e}`)
return new ResponseError(ErrorCodes.UnknownErrorCode, `Unknown initialization error\n${e}`)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably emit a separate metric here runtimeInitializationError containing the error reason. If a destination (like the Toolkit) gets a report that there was a response error, that team is going to want to look for additional observability to help develop an understanding of the problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack.

result: 'Failed',
data: {
hasAwsConfig: Boolean(params.initializationOptions?.aws),
hasLogLevel: params.initializationOptions?.logLevel,
Copy link
Contributor

@volodkevych volodkevych Feb 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it should be named logLevel as it's not a boolean

@suprajaven suprajaven merged commit 48b064b into main Feb 11, 2025
3 checks passed
@suprajaven suprajaven deleted the initializetest branch February 11, 2025 14:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants