kserve-controller fails to be removed #131

Open
DnPlas opened this issue Jun 16, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@DnPlas
Contributor

DnPlas commented Jun 16, 2023

Removing kserve-controller raises an error because the remove handler expects the istio-pilot:gateway-info relation to be present. The cause is how the Kubernetes resources are rendered in preparation for removal: the rendering context is tightly coupled to the presence of certain relations, so rendering fails when such a relation is missing.

Traceback (most recent call last):                                                                                                                                                
  File "./src/charm.py", line 467, in <module>                                                                                                                                    
    main(KServeControllerCharm)                                                                                                                                                   
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/main.py", line 441, in main                                                                                  
    _emit_charm_event(charm, dispatcher.event_name)                                                                                                                               
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/main.py", line 149, in _emit_charm_event                                                                     
    event_to_emit.emit(*args, **kwargs)                                                                                                                                           
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/framework.py", line 354, in emit                                                                             
    framework._emit(event)                                                                                                                                                        
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/framework.py", line 830, in _emit                                                                            
    self._reemit(event_path)                                                                                                                                                      
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/framework.py", line 919, in _reemit                                                                          
    custom_handler(event)                                                                                                                                                         
  File "./src/charm.py", line 284, in _on_remove                                                                                                                                  
    cm_resources_manifests = self.cm_resource_handler.render_manifests()                                                                                                          
  File "./src/charm.py", line 159, in cm_resource_handler                                                                                                                         
    context=self._inference_service_context,                                                                                                                                      
  File "./src/charm.py", line 136, in _inference_service_context                                                                                                                  
    gateways_context = self._generate_gateways_context()                                                                                                                          
  File "./src/charm.py", line 348, in _generate_gateways_context                                                                                                                  
    raise ErrorWithStatus("Please relate to istio-pilot:gateway-info", BlockedStatus)                                                                                             
charmed_kubeflow_chisme.exceptions._with_status.ErrorWithStatus: Please relate to istio-pilot:gateway-info    
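
To make the call chain in the traceback easier to follow, here is a stripped-down model of it in plain Python. It is only a sketch: `CoupledCharm`, `BlockedError` and `try_remove` are hypothetical stand-ins rather than the charm's actual code, and a `gateway_relation` of `None` is assumed to represent the missing istio-pilot:gateway-info relation.

```python
class BlockedError(Exception):
    """Stand-in for chisme's ErrorWithStatus (hypothetical simplification)."""


class CoupledCharm:
    """Toy model of the call chain in the traceback; not the real charm."""

    def __init__(self, gateway_relation=None):
        # None models "istio-pilot:gateway-info is not related".
        self.gateway_relation = gateway_relation

    def _generate_gateways_context(self):
        if self.gateway_relation is None:
            raise BlockedError("Please relate to istio-pilot:gateway-info")
        return dict(self.gateway_relation)

    @property
    def cm_resource_handler(self):
        # Building the handler evaluates the context eagerly, so every
        # caller inherits the relation requirement, including removal.
        context = self._generate_gateways_context()
        return lambda: [f"ConfigMap rendered with {context}"]

    def on_remove(self):
        render_manifests = self.cm_resource_handler
        return render_manifests()


def try_remove(charm):
    """Return the error message raised during removal, or None on success."""
    try:
        charm.on_remove()
    except BlockedError as err:
        return str(err)
    return None
```

In this model the error comes from constructing the handler, not from any deletion logic, which matches the frames shown above (`_on_remove` → `cm_resource_handler` → `_inference_service_context` → `_generate_gateways_context`).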

Steps to reproduce

  1. Deploy kserve-controller: juju deploy kserve-controller --channel 0.11/stable --trust (or juju deploy kserve-controller --channel latest/edge --trust)
  2. Deploy Istio: juju deploy istio-pilot --channel 1.17/stable --trust and juju deploy istio-gateway istio-ingressgateway --channel 1.17/stable --trust --config kind=ingress
  3. Deploy Knative: juju deploy knative-serving --channel 1.10/stable --trust and juju deploy knative-operator --channel 1.10/stable --trust
  4. Configure knative-serving: juju config knative-serving namespace="knative-serving" istio.gateway.namespace=kubeflow istio.gateway.name=istio-gateway
  5. Relate the deployed charms as needed
  6. Once everything has settled, remove the application: juju remove-application kserve-controller

Environment

microk8s 1.29-strict/stable
microk8s addons: dns hostpath-storage metallb:10.64.140.43-10.64.140.49
juju 3.4/stable (3.4.4)

@DnPlas
Contributor Author

DnPlas commented Jul 4, 2024

Even after the refactoring introduced in #246 and #197, this issue is still present:

unit-kserve-controller-0: 20:00:30 INFO unit.kserve-controller/0.juju-log ingress-gateway:32: Reconcile completed successfully
unit-kserve-controller-0: 20:00:30 ERROR unit.kserve-controller/0.juju-log ingress-gateway:32: Uncaught exception while in charm code:
Traceback (most recent call last):
  File "./src/charm.py", line 638, in <module>
    main(KServeControllerCharm)
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/main.py", line 441, in main
    _emit_charm_event(charm, dispatcher.event_name)
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/main.py", line 149, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/framework.py", line 342, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/framework.py", line 839, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/framework.py", line 928, in _reemit
    custom_handler(event)
  File "./src/charm.py", line 432, in _on_event
    self.cm_resource_handler.apply()
  File "./src/charm.py", line 212, in cm_resource_handler
    context={**self._inference_service_context, **self.images_context},
  File "./src/charm.py", line 189, in _inference_service_context
    gateways_context = self._generate_gateways_context()
  File "./src/charm.py", line 517, in _generate_gateways_context
    ingress_gateway_info = self._ingress_gateway_info
  File "./src/charm.py", line 256, in _ingress_gateway_info
    return self._ingress_gateway_requirer.get_relation_data()
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/lib/charms/istio_pilot/v0/istio_gateway_info.py", line 206, in get_relation_data
    self._relation_preflight_checks(relation=relation)
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/lib/charms/istio_pilot/v0/istio_gateway_info.py", line 183, in _relation_preflight_checks
    relation_data = relation.data[remote_app]
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/model.py", line 1480, in __getitem__
    raise KeyError(
KeyError: 'Cannot index relation data with "None". Are you trying to access remote app data during a relation-broken event? This is not allowed.'
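
The KeyError above suggests that the remote application is `None` by the time the relation data is read. A minimal sketch of a guard for that case follows. `FakeRelation` and `gateway_info_or_none` are hypothetical names used only for illustration; the sketch does not reproduce the `istio_gateway_info` library or the ops API, it only mirrors the situation described by the error message.

```python
from typing import Optional


class FakeRelation:
    """Hypothetical stand-in for a relation object; not the ops class."""

    def __init__(self, app: Optional[str], data: dict):
        self.app = app  # None models the state during relation-broken
        self._data = data

    def read(self, app: Optional[str]) -> dict:
        # Mirrors the failure in the traceback: indexing with None raises.
        if app is None:
            raise KeyError('Cannot index relation data with "None".')
        return self._data[app]


def gateway_info_or_none(relation: Optional[FakeRelation]) -> Optional[dict]:
    """Return the remote app's data, or None when it cannot be read."""
    if relation is None or relation.app is None:
        # No relation, or the remote side is already gone: let the caller
        # decide what to do instead of raising from deep inside rendering.
        return None
    return relation.read(relation.app)
```

With a guard like this, the caller can decide to skip rendering or fall back to another removal path when no gateway data is available. Whether that is the right behaviour for this charm is a design decision for the maintainers.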

DnPlas added the bug label Jul 4, 2024

Thank you for reporting your feedback!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-5962.

This message was autogenerated

@DnPlas
Contributor Author

DnPlas commented Jul 10, 2024

The team suggests either:

  1. Refactoring this charm to use chisme's base charm
  2. Adding labels for apply and removal, so that removal does not depend on the context for rendering manifests
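
The second option could look roughly like the sketch below: resources are stamped with an owner label when applied, and removal selects by that label instead of re-rendering manifests. Everything here is an assumption made for illustration. `FakeCluster` is an in-memory stand-in for a Kubernetes client, and `OWNER_LABELS`, `apply_all` and `remove_all` are hypothetical names, not chisme's API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Resource:
    kind: str
    name: str
    labels: Dict[str, str] = field(default_factory=dict)


class FakeCluster:
    """In-memory stand-in for a Kubernetes client (illustration only)."""

    def __init__(self) -> None:
        self.resources: List[Resource] = []

    def apply(self, resource: Resource) -> None:
        self.resources.append(resource)

    def delete_matching(self, selector: Dict[str, str]) -> List[str]:
        """Delete every resource carrying all selector labels; return names."""
        doomed = [
            r for r in self.resources
            if all(r.labels.get(k) == v for k, v in selector.items())
        ]
        self.resources = [r for r in self.resources if r not in doomed]
        return sorted(r.name for r in doomed)


# Hypothetical label identifying resources owned by this charm.
OWNER_LABELS = {"example.com/owner": "kserve-controller"}


def apply_all(cluster: FakeCluster, rendered: List[Resource]) -> None:
    """Stamp the owner label on each rendered resource, then apply it."""
    for resource in rendered:
        resource.labels.update(OWNER_LABELS)
        cluster.apply(resource)


def remove_all(cluster: FakeCluster) -> List[str]:
    """Remove by label only; no rendering context is needed at this point."""
    return cluster.delete_matching(OWNER_LABELS)
```

In this sketch the rendering context is needed only at apply time. Removal just needs the fixed label selector, so it would keep working after the gateway relation has been removed.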
