@onematchfox
Contributor

Currently, every agent must exist in both Kubernetes and the database for this call to succeed. Thus, if for whatever reason an agent fails to reconcile on creation, any calls to this endpoint will fail. This, in turn, breaks all agents in the UI. This solution is a bit of a hack/workaround, and I do think issues should be surfaced; however, it at least ensures that the UI stays functional in this scenario for the time being.

$ k get agents                
NAME           ACCEPTED   MODELCONFIG
k8s-agent      True       default-model-config
promql-agent   False      default-model-config

…ts for one/more agents

Signed-off-by: Brian Fox <878612+onematchfox@users.noreply.github.com>
Copilot AI review requested due to automatic review settings July 31, 2025 15:34
Contributor

Copilot AI left a comment

Pull Request Overview

This PR fixes an issue where the list agents endpoint would fail completely if any single agent couldn't be loaded from the database, preventing the UI from displaying any agents. The fix changes the error handling to skip problematic agents and continue processing the rest, ensuring the API remains functional even when some agents have reconciliation issues.

  • Replaces hard failure with error logging and continuation for database lookup failures
  • Adds TODO comment acknowledging this is a temporary workaround
  • Maintains UI functionality by returning available agents instead of failing entirely
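The skip-and-continue behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `listAvailableAgents`, the `load` callback, and `ErrNotInDatabase` are hypothetical names standing in for the real handler and database lookup.

```go
package main

import (
	"errors"
	"log"
)

// ErrNotInDatabase is a hypothetical sentinel used only for this sketch.
var ErrNotInDatabase = errors.New("agent not found in database")

// listAvailableAgents returns the refs whose database lookup succeeds.
// A failed lookup is logged and skipped instead of aborting the whole call.
// The load callback is an assumed stand-in for the real database lookup.
func listAvailableAgents(refs []string, load func(ref string) error) []string {
	available := make([]string, 0, len(refs))
	for _, ref := range refs {
		if err := load(ref); err != nil {
			// Previously an error here would fail the entire request.
			log.Printf("failed to load agent from database: agent=%s err=%v", ref, err)
			continue
		}
		available = append(available, ref)
	}
	return available
}
```

With the two agents from the description, a `load` that fails only for `promql-agent` would leave `k8s-agent` in the result, so the UI still has something to render.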

// TODO: Rather than excluding the agent completely we should
// probably return it but mark it as "unhealthy" in some way so this
// is visible to callers (e.g. in the UI)
log.Error(err, "failed to load agent from database", "agent", teamRef)
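One possible shape for the alternative mentioned in the TODO, returning the agent but flagging it, is sketched below. The `AgentStatus` type, its fields, and `listAgentsWithHealth` are assumptions made for illustration; they are not part of the existing API.

```go
package main

import "errors"

// ErrNotInDatabase is a hypothetical sentinel used only for this sketch.
var ErrNotInDatabase = errors.New("agent not found in database")

// AgentStatus is an assumed response shape that carries a health flag
// so callers (for example the UI) can show that an agent is broken.
type AgentStatus struct {
	Ref     string
	Healthy bool
	Reason  string // empty when Healthy is true
}

// listAgentsWithHealth keeps every agent in the result and records
// lookup failures on the entry instead of dropping it.
func listAgentsWithHealth(refs []string, load func(ref string) error) []AgentStatus {
	out := make([]AgentStatus, 0, len(refs))
	for _, ref := range refs {
		st := AgentStatus{Ref: ref, Healthy: true}
		if err := load(ref); err != nil {
			st.Healthy = false
			st.Reason = err.Error()
		}
		out = append(out, st)
	}
	return out
}
```

Compared with skipping, this keeps the failure visible to callers at the cost of a change to the response shape.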

Copilot AI Jul 31, 2025

The error logging should include more context about the specific error type to help with debugging. Consider logging the error type or checking if it's a specific type of error (e.g., not found vs connection error) before deciding to continue.

Suggested change
- log.Error(err, "failed to load agent from database", "agent", teamRef)
+ if k8serrors.IsNotFound(err) {
+     log.Error(err, "agent not found in database", "agent", teamRef)
+ } else if strings.Contains(err.Error(), "connection") {
+     log.Error(err, "database connection error while loading agent", "agent", teamRef)
+ } else {
+     log.Error(err, "unexpected error while loading agent from database", "agent", teamRef)
+ }

@onematchfox
Contributor Author

Problematic code was commented out in #671

@onematchfox onematchfox closed this Aug 1, 2025
@onematchfox onematchfox deleted the list-agents branch August 28, 2025 08:53