-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Knative probes are failing after deploying sample sk-learn model, not able to call the URL #1153
Comments
Issue-Label Bot is automatically applying the labels:
Please mark this comment with 👍 or 👎 to give our bot feedback! |
Issue-Label Bot is automatically applying the labels:
Please mark this comment with 👍 or 👎 to give our bot feedback! |
1 similar comment
Issue-Label Bot is automatically applying the labels:
Please mark this comment with 👍 or 👎 to give our bot feedback! |
My Kubeflow installation is 1.1 and it is installed on premise no Dex is configured still I am getting 403 error, can you suggest if installing separate gateway is the only option here? @yuzisun - apologies for tagging but is there any other solution to this?
|
@kd303 Do you have istio sidecar injected in the inference service pod? I think you have istio security turned on in the cluster so istio is blocking the request to hitting the inference service main container. |
@yuzisun I have side care injected pls see the output, i dont think I have any security turned out its a plain cluster where I dont have to provide the logons to kubeflow dashboard.. Pls note I have the envoy debugging turned on for rbac & http2 logs enabled.. on activator my logs are failing with RBAC error, I am not sure what is wrong..
|
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
@yuzisun Pls ignore above logs now, and redeploy the cluster just to be sure ok, I cleaned all the services everything that was available on the cluster, I no longer get RBAC error, Please see the issue description now:
activator logs are below:
It is trying to probe 10.244.3.43 which is my Pod IP, where sk-learn service is deployed, the probe is failing at
|
It seems when I follow Knative Debugging guide, below step is not as expected, route names are shown in the label
|
@kd303 looks the inference service is actually in ready state now? Can you actually curl the service ? |
@yuzisun the curl returns 404
|
@yuzisun Thanks for all the help, I could resolve the issue however I would like point out the issue so that this helps others and may be included in documentation section page, its been some work all these days. The failing probes remains a mistry as after couple of re-deployment of knative pods things work fine. I was testing with Postman and other tools again this is a dev environment where no external domains are configured with Isitio, The Virtual Service definition created by Inferenceservice is looking for authority header, so if you have any tools like Postman or any similar tools would add additional header, in my case below is the IP address. Now IMO, either there should be way for development environments etc. to
HTML Header
|
@yuzisun further updates, my other hunch is ClusterRbacConfig has only istio-system name-space in exclusion (by default even on a non-dex on-remise cluster, so one may have to add the namespace where the KFServing models are getting deployed. I think some updates to the KfServing Debugging documentation may be necessary. |
Just ran into this issue as well. To clarify for future me who googles this next time: |
Ran into the same issue while performing e2e tests for net-istio: |
Good catch! Can you help add this to the debugging guide? |
@yuzisun May be I am bit confused about ClusterRbac part , we installed a new cluster with Dex and RBAC and again ran into same issue, and the problem got resolved with changing ClusterRbacConfig and adding the namespace in the ON_WITH_EXCLUSION list, now what is puzzling is there are ServiceRole and ServiceRoleBindings are created so why would we still have to add to ON_WITH_EXCLUSION? I would be happy to add this debugging guide :) |
Added the description for failing probes and 404 as per suggestion by @yuzisun on issue [1153]#(kserve#1153)
@kd303 does the service role and service rolebinding created cover for inference service ? Can you paste the output of the istio rbac? |
Closing the staled issue. |
/kind bug
What steps did you take and what happened:
Installed https://github.com/kubeflow/kfserving/tree/master/docs/samples/sklearn in on-premise Kubernetes cluster
Created a sample services, all containers are up and running (istio-init, sklearnserving, storageInitizer etc).
I have installed default Kfserving with kubeflow cluster
What did you expect to happen:
I expect the inference service to be in "ready" state
Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]
Followed the debugging guide
Environment:
Installed using Installed using - https://github.com/kubeflow/kfctl/releases/tag/v1.1.0
kubectl version
):/etc/os-release
):Logs::
Logs from Autoscaler;
Logs from Activator:
Additional Information
KLindly help as I have reached to wits end as to why is this happening.
The text was updated successfully, but these errors were encountered: