Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error Messaging Updates for Diagnostics #17314

Closed
bflad opened this issue Jan 27, 2021 · 4 comments
Closed

Error Messaging Updates for Diagnostics #17314

bflad opened this issue Jan 27, 2021 · 4 comments
Labels
provider Pertains to the provider itself, rather than any interaction with AWS. service/rds Issues and PRs that pertain to the rds service. stale Old or inactive issues managed by automation, if no further action taken these will get closed. technical-debt Addresses areas of the codebase that need refactoring or redesign.

Comments

@bflad
Copy link
Contributor

bflad commented Jan 27, 2021

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Description

Given the age of the Terraform AWS Provider and some of its resources, there has been a long history of varied styles for returning resource error messages. Some places return no context (return err) while others return some additional context (return fmt.Errorf("...", err)) with varied message styles.

Through much of the history of these error messages (prior to Terraform CLI version 0.12), the user interface experience when showing these error messages was in a "go-multierror" style (destroy operation error shown below):

Error: Error applying plan:

1 error(s) occurred:

* aws_db_instance.rds_sc_node (destroy): 1 error(s) occurred:

* aws_db_instance.rds_sc_node: DB Instance FinalSnapshotIdentifier is required when a final snapshot is required

Where Terraform CLI injected all this user interface output:

Error: Error applying plan:

1 error(s) occurred:

* aws_db_instance.rds_sc_node (destroy): 1 error(s) occurred:

* aws_db_instance.rds_sc_node:

And the Terraform resource error was the remaining part (as sourced from return ...):

DB Instance FinalSnapshotIdentifier is required when a final snapshot is required

When returning AWS Go SDK (e.g. API) error messages, the lack of returning error message context meant that practitioners could be faced with very confusing errors, such as this refresh operation example:

Error refreshing state: 1 error(s) occurred:

* 1 error(s) occurred:

* AccessDenied: Access Denied
    status code: 403, request id: []

To rectify this, the maintainers moved towards requiring a consistent error messaging context including a noun and verb in addition to the actual error, so it was more clear about what was happening and where. e.g. return fmt.Errorf("error creating/reading/updating/deleting Service Thing (%s): %w", d.Id(), err). This preferred style is documented in the Contributing Guide.

Terraform CLI version 0.12 and later added support for resources to return both warning and error diagnostics in the user interface output. Diagnostics support both a summary and details fields. Given these changes, the output is now much more terse:

Error: Summary

Details
Warning: Summary

Details

To compensate for existing resources that return a raw error type, the Terraform Plugin SDK converts these errors into an error diagnostic with the error message string in the summary field. Given the prior consistency standard for error messages, the error messaging now has an awkward output:

Error: error creating/reading/updating/deleting Service Thing (abc123): ErrCode: ErrMessage

Rather than introduce yet more error message stylings with raw error types (since they should be updated anyways), it will likely be ideal to address this as or after resources are migrated to Terraform Plugin SDK version 2 and fully support diagnostics. This will allow resources to properly inject a summary and separate the details.

Some initial proposals include introducing helpers to wrap error messages before they are returned in resource logic. For example, having error diagnostic wrapper(s) such as:

// The below represents illustrative examples where the details of the
// naming, messaging, and function signatures are only to show the
// conceptual functionality and potential.

// "Simple" wrapper

func ResourceReadErrorDiagnostic(service string, component string, identifier string, err error) diag.Diagnostic {
  return diag.Diagnostic{
    Severity: diag.Error,
    Summary: fmt.Sprintf("Unable to Read %s %s (%s)", service, component, identifier),
    Detail: err.Error(),
  }
}

// "Error Type" wrapper

func ResourceReadErrorDiagnostic(service string, component string, identifier string, err error) diag.Diagnostic {
  summary := fmt.Sprintf("Unable to Read %s %s (%s)", service, component, identifier)
  var awsError awserr.Error
  var resourceTimeoutError resource.TimeoutError

  if errors.As(err, &awsError) {
    return diag.Diagnostic{
      Severity: diag.Error,
      Summary: summary,
      Detail: fmt.Sprintf("AWS API operation error: %s", awsError.Error()),
    }
  }

  if errors.As(err, &resourceTimeoutError) {
    return diag.Diagnostic{
      Severity: diag.Error,
      Summary: summary,
      Detail: fmt.Sprintf("Terraform resource timeout error: %s", resourceTimeoutError.Error()),
    }
  }

  return diag.Diagnostic{
    Severity: diag.Error,
    Summary: summary,
    Detail: err.Error(),
  }
}

// "Advanced Error Type" wrapper
// return AwsError{Operation: "DescribeVpcs", Error: err}

type AwsError struct {
  Operation string
  Error err
  // to get really detailed/fancy
  // Endpoint string
  // Region string
}

func ResourceReadErrorDiagnostic(service string, component string, identifier string, err error) diag.Diagnostic {
  summary := fmt.Sprintf("Unable to Read %s %s (%s)", service, component, identifier)
  var awsError AwsError
  var resourceTimeoutError resource.TimeoutError

  if errors.As(err, &awsError) {
    return diag.Diagnostic{
      Severity: diag.Error,
      Summary: summary,
      Detail: fmt.Sprintf("AWS API operation (%s) error: %s", awsError.Operation, awsError.Error()),
    }
  }

  if errors.As(err, &resourceTimeoutError) {
    return diag.Diagnostic{
      Severity: diag.Error,
      Summary: summary,
      Detail: fmt.Sprintf("Terraform resource timeout error: %s", resourceTimeoutError.Error()),
    }
  }

  return diag.Diagnostic{
    Severity: diag.Error,
    Summary: summary,
    Detail: err.Error(),
  }
}

Which the above could render as:

Error: Unable to Read EC2 VPC (vpc-123)

AWS API operation (DescribeVpcs) error: AccessDenied: AccessDenied

References

@bflad bflad added thinking technical-debt Addresses areas of the codebase that need refactoring or redesign. provider Pertains to the provider itself, rather than any interaction with AWS. labels Jan 27, 2021
@ghost ghost added the service/rds Issues and PRs that pertain to the rds service. label Jan 27, 2021
@bflad
Copy link
Contributor Author

bflad commented Jan 27, 2021

Until this handling is decided, we could consider dropping any leading error prefix in fmt.Error() messages to remove the Error: error ... problem now. I would just be worried about creating yet more styles that will make things harder to find/replace when we get to this point. (Unless we want to tackle that now, but I think we should let practitioners weigh in on the urgency of that type of work given everything else going on. Those updates might make the error messages confusing without further sentence/context updates, making more work.)

@YakDriver
Copy link
Member

Excellent write up. I concur on sticking with one standard to facilitate future replacing of fmt.Errorf("error....

Copy link

Marking this issue as stale due to inactivity. This helps our maintainers find and focus on the active issues. If this issue receives no comments in the next 30 days it will automatically be closed. Maintainers can also remove the stale label.

If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thank you!

@github-actions github-actions bot added the stale Old or inactive issues managed by automation, if no further action taken these will get closed. label Dec 26, 2023
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jan 29, 2024
Copy link

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 29, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
provider Pertains to the provider itself, rather than any interaction with AWS. service/rds Issues and PRs that pertain to the rds service. stale Old or inactive issues managed by automation, if no further action taken these will get closed. technical-debt Addresses areas of the codebase that need refactoring or redesign.
Projects
None yet
Development

No branches or pull requests

3 participants