charm is constantly logging errors #30

Open
nishant-dash opened this issue Jan 30, 2024 · 3 comments
Labels
bug Something isn't working

Comments

nishant-dash commented Jan 30, 2024

Bug Description

The charm gets into a state where it keeps logging TLS errors while its status stays active/idle. It is not working as expected: it does not inject the configs specified in its settings_yaml config into the other pods (in the corresponding namespaces).

There is no proper visibility into the state of this charm outside of its logs. It would be nice if the charm's workload status reflected its actual state and if it could forward its logs to COS (Loki).
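
As a rough illustration of the status request above, here is a minimal sketch of how recent workload log lines could be mapped to a status. It is not the charm's actual code: `derive_status`, `TLS_ERROR`, and the `threshold` parameter are hypothetical names, and the pattern is simply modeled on the log lines quoted in this report.

```python
import re

# Pattern modeled on the log lines in this report (assumption: the
# redacted part sits between "from" and "tls: bad certificate").
TLS_ERROR = re.compile(r"http: TLS handshake error from .*tls: bad certificate")


def derive_status(recent_log_lines, threshold=3):
    """Return a (status, message) pair derived from recent log lines.

    `threshold` is a hypothetical knob: how many matching lines are
    tolerated before the workload is reported as unhealthy.
    """
    errors = sum(1 for line in recent_log_lines if TLS_ERROR.search(line))
    if errors >= threshold:
        message = (
            f"{errors} TLS handshake errors in recent logs; "
            "webhook certificate may be invalid"
        )
        return ("blocked", message)
    return ("active", "")
```

In a charm built on the `ops` framework, the returned pair could presumably be translated into `BlockedStatus` or `ActiveStatus`, so that `juju status` no longer shows active/idle while the webhook is rejecting connections.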

To Reproduce

N/A

Environment

namespace-node-affinity                                    active       1  namespace-node-affinity  0.1/beta              5  REDACTED    no

Relevant Log Output

2024/01/29 1835 http: TLS handshake error from REDACTED remote error: tls: bad certificate
2024/01/29 1838 http: TLS handshake error from REDACTED remote error: tls: bad certificate
2024/01/29 1840 http: TLS handshake error from REDACTED remote error: tls: bad certificate
2024/01/29 1842 http: TLS handshake error from REDACTED remote error: tls: bad certificate
2024/01/29 1857 http: TLS handshake error from REDACTED remote error: tls: bad certificate
2024/01/29 1857 http: TLS handshake error from REDACTED remote error: tls: bad certificate
2024/01/29 1857 http: TLS handshake error from REDACTED remote error: tls: bad certificate
2024/01/29 1836 http: TLS handshake error from REDACTED remote error: tls: bad certificate
2024/01/29 1805 http: TLS handshake error from REDACTED remote error: tls: bad certificate

Additional Context

settings_yaml config

      controller-k8s: |
        nodeSelectorTerms:
          - matchExpressions:
            - key: kubeflowserver
              operator: In
              values:
              - true
      kubeflow: |
        nodeSelectorTerms:
          - matchExpressions:
            - key: kubeflowserver
              operator: In
              values:
              - true
      metallb: |
        nodeSelectorTerms:
          - matchExpressions:
            - key: kubeflowserver
              operator: In
              values:
              - true
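
To check whether a given pod actually received the injected config, something like the sketch below could be run against the `.spec` of `kubectl get pod -o json` output. It assumes the webhook places the configured terms under `requiredDuringSchedulingIgnoredDuringExecution`; the names `KUBEFLOW_TERMS`, `expected_affinity`, and `pod_has_terms` are hypothetical. The label value is written as the string `"true"` because the Kubernetes API represents `values` as strings.

```python
# Terms mirroring the "kubeflow" entry of the settings_yaml config above.
KUBEFLOW_TERMS = [
    {
        "matchExpressions": [
            {"key": "kubeflowserver", "operator": "In", "values": ["true"]}
        ]
    }
]


def expected_affinity(node_selector_terms):
    """Build the affinity stanza a pod is assumed to carry after injection."""
    return {
        "nodeAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": {
                "nodeSelectorTerms": node_selector_terms
            }
        }
    }


def pod_has_terms(pod_spec, node_selector_terms):
    """Return True if every configured term is present in the pod spec."""
    affinity = pod_spec.get("affinity") or {}
    node_affinity = affinity.get("nodeAffinity") or {}
    required = (
        node_affinity.get("requiredDuringSchedulingIgnoredDuringExecution") or {}
    )
    present = required.get("nodeSelectorTerms") or []
    return all(term in present for term in node_selector_terms)
```

A pod in one of the configured namespaces for which `pod_has_terms` returns `False` after a restart would be consistent with the behavior described in this report.
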
@nishant-dash nishant-dash added the bug Something isn't working label Jan 30, 2024

Thank you for reporting your feedback to us!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-5269.

This message was autogenerated

@nishant-dash
Author

This charm erroring out effectively breaks the segregation between Charmed Kubeflow services and other workloads (once the pods get restarted, that is).

@nishant-dash
Author

Also chatted with @kimwnasptd, and we agree that it makes sense for this functionality to exist in Juju itself.
