ManagedClusterView should preserve original error when retrieving resources #604

BenamarMk · 2023-05-08T13:50:22Z

Problem:
Currently, the ManagedClusterView implementation does not preserve the original error when attempting to retrieve resources from a ManagedCluster. Instead, it converts all errors to a generic error and embeds the error text in the condition message (see here), which can be frustrating and difficult to parse.

For example, when a resource is not found or when there is a connectivity issue, there is nothing that distinguishes between the two cases other than the embedded error message within the .status.condition[0].Message

To address this, we recommend improving the ManagedClusterView implementation to preserve the last error when retrieving resources. Specifically, we suggest adding a new field to the .status object that holds the last error encountered during resource retrieval. Another possibility is to add another condition that holds the error.

The text was updated successfully, but these errors were encountered:

zhiweiyin318 · 2023-05-25T13:12:56Z

we have different reasons for this condition when failed to get resource:
"ResourceNameInvalid"
"ResourceTypeInvalid"
"ResourceGVKInvalid"
"GetResourceFailed"

not sure to add a not found reason can help you?
do you still need the error if the process is recovered from failure?

BenamarMk · 2023-06-06T20:05:08Z

For the reason GetResourceFailed, there is no distinction between a Not Found error and a Timeout error for example. Both errors end up in the message string.
The Not Found error tells us that the resource does not exist in the managed cluster. Here is an example of not found error

GetResourceFailed failed to get resource with err: volumereplicationgroups.ramendr.openshift.io "busybox-drpc" not found

And here is an example of an inaccessible cluster

GetResourceFailed failed to get resource with err: Get "https://172.30.0.1:443/apis/ramendr.openshift.io/v1alpha1/namespaces/busybox-workloads-3/volumereplicationgroups/busybox-drpc": dial tcp 172.30.0.1:443: i/o timeout

The line that builds the error message -> see here

zhiweiyin318 · 2023-06-08T09:41:33Z

@BenamarMk how about use the statusReason of error to replace GetResourceFailed ?

BenamarMk mentioned this issue May 8, 2023

Fix MCV resource retrieval error conversion RamenDR/ramen#873

Merged

1 task

zhiweiyin318 mentioned this issue Jun 8, 2023

use statusReason of error #620

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ManagedClusterView should preserve original error when retrieving resources #604

ManagedClusterView should preserve original error when retrieving resources #604

BenamarMk commented May 8, 2023

zhiweiyin318 commented May 25, 2023 •

edited

Loading

BenamarMk commented Jun 6, 2023 •

edited

Loading

zhiweiyin318 commented Jun 8, 2023

ManagedClusterView should preserve original error when retrieving resources #604

ManagedClusterView should preserve original error when retrieving resources #604

Comments

BenamarMk commented May 8, 2023

zhiweiyin318 commented May 25, 2023 • edited Loading

BenamarMk commented Jun 6, 2023 • edited Loading

zhiweiyin318 commented Jun 8, 2023

zhiweiyin318 commented May 25, 2023 •

edited

Loading

BenamarMk commented Jun 6, 2023 •

edited

Loading