[Bug] REST request gets parsed to String if any of the element in Numpy array is String #745

Closed
lennon310 opened this issue Aug 1, 2019 · 12 comments
@lennon310
Contributor

lennon310 commented Aug 1, 2019

If I'm sending a request on /predict with payload:

'json={"data": {"names": ["sepal_length", "sepal_width", "petal_length", "petal_width"], "ndarray": [[7.233, 4.652, 7.39, 0.324]]}}'

the parsed numpy array is [[7.233, 4.652, 7.39, 0.324]].

However, if the payload is

'json={"data": {"names": ["sepal_length", "sepal_width", "petal_length", "petal_width"], "ndarray": [["str", 4.652, 7.39, 0.324]]}}'

the parsed NumPy array is [["str", "4.652", "7.39", "0.324"]] (note that every element is converted to a string, instead of the last three remaining floats).
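
The behaviour described above can be reproduced with NumPy alone, outside of any REST handling. The following is a minimal sketch, assuming the wrapper builds the array with a plain np.array call (the thread does not show the wrapper code, so that part is an assumption):

```python
import numpy as np

# All-numeric row: NumPy picks a floating-point dtype.
numeric = np.array([[7.233, 4.652, 7.39, 0.324]])

# One string in the row: NumPy needs a single dtype for the whole array,
# so it falls back to a Unicode string dtype and stringifies the floats.
mixed = np.array([["str", 4.652, 7.39, 0.324]])

print(numeric.dtype)       # a float dtype
print(mixed.dtype.kind)    # "U" means Unicode string
print(type(mixed[0, 1]))   # a NumPy string type, no longer a float
```

The conversion happens at array construction, which is why it does not depend on how the request was transported.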

@lennon310 lennon310 changed the title REST request gets parsed to String if any of the element in Numpy array is String [Bug] REST request gets parsed to String if any of the element in Numpy array is String Aug 1, 2019
@ukclivecox
Contributor

I think this might be expected: if we pass the data to the Python wrapper as a NumPy array, the array's dtype would probably be object or string.

@lennon310
Contributor Author

Thanks @cliveseldon. I was wondering whether NumPy offers some smarter dtype inference. For example, in pandas we can set dtype to object and then use infer_objects to cast each column to its actual type. I don't know whether NumPy has anything similar. Otherwise, is the user expected to handle the casting in the predict function themselves?
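
For reference, the pandas approach mentioned here looks roughly like the sketch below. It also shows the closest NumPy analogue I am aware of, dtype=object, which keeps the original Python objects but does no per-column inference. The column names are taken from the payload in the issue; the rest is illustrative.

```python
import numpy as np
import pandas as pd

names = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
row = ["str", 4.652, 7.39, 0.324]

# pandas: start with object columns, then let infer_objects pick
# a better dtype per column where one exists.
df = pd.DataFrame([row], columns=names, dtype=object)
inferred = df.infer_objects()
print(inferred.dtypes)

# NumPy: dtype=object keeps each element as the Python object it was,
# but the array as a whole still has a single (object) dtype.
obj = np.array([row], dtype=object)
print(type(obj[0, 0]), type(obj[0, 1]))
```

The difference is that pandas stores a dtype per column, while a NumPy ndarray has one dtype for all elements.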

@ukclivecox
Contributor

Yes, for multi-type data it's currently expected that you cast to the correct types yourself, as NumPy will assign a single type to the whole array. Alternatively, you could pass the payload via the binData, strData or jsonData fields we also provide. Each has its own pros and cons.
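
A minimal sketch of the casting described above, done inside the user's predict method. The Model class, the predict(self, X, features_names) signature and the assumption that column 0 is the only non-numeric column are all hypothetical choices for this illustration; check the Seldon Core Python component documentation for the signature your version expects.

```python
import numpy as np


class Model:
    """Hypothetical user model, not taken from the thread."""

    def predict(self, X, features_names=None):
        X = np.asarray(X)
        # Assumption for this sketch: column 0 holds labels,
        # every other column holds numbers that arrived as strings.
        labels = X[:, 0]
        numeric = X[:, 1:].astype(float)
        # Placeholder for real inference: sum the numeric features per row.
        return numeric.sum(axis=1, keepdims=True)


result = Model().predict(np.array([["str", 4.652, 7.39, 0.324]]))
print(result.shape, result.dtype)
```

The cast has to know which columns are numeric; NumPy cannot recover that from a string array on its own.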

@lennon310
Contributor Author

OK, I will try converting the pandas data to binary; hopefully the deserialization won't affect the original data types. Thanks @cliveseldon

@lennon310
Contributor Author

Can someone show me what the Model class would look like if binary data is passed to the predict function? It looks like I need to override predict_raw instead of predict, but when I run seldon-core-microservice <MODEL> REST, I was not able to curl 0.0.0.0:5000/predict_raw, which means I still have to pass a numpy.ndarray to predict. Did I miss something? It would be great to have an example showing how to do that.

@ukclivecox ukclivecox self-assigned this Aug 8, 2019
@ukclivecox
Contributor

The internal container endpoint would be the same, 0.0.0.0:5000/predict. If you define predict_raw, the wrapper will choose it if it exists.

https://docs.seldon.io/projects/seldon-core/en/latest/python/python_component.html#low-level-methods

@lennon310
Contributor Author

Yeah, I noticed that reference and thought it should work that way. However, in my local test it looks like the wrapper did not choose predict_raw for me: I was getting an "Empty json parameter in data" error, which I assume means it was trying to find the NumPy array for predict.

@ukclivecox
Contributor

That error would imply it's not finding the payload in the REST request.

Can you try with: https://docs.seldon.io/projects/seldon-core/en/latest/workflow/api-testing.html#microservice-api-tester

@lennon310
Contributor Author

I was able to override predict_raw by sending the request from SeldonClient instead of using curl.
Unlike predict, to which we can pass both the data and the feature names, predict_raw seems to accept only the data (binData, strData, jsonData, etc.). So should I wrap the data and the feature names list into a single payload before converting it to bytes on the client side, and then deserialize it on the server side (in predict_raw) to extract the data and the list?
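
The wrapping asked about here could look like the sketch below: names and rows go into one JSON document, which is encoded to bytes on the client and decoded on the server. The pack and unpack helpers and the "names"/"rows" keys are hypothetical, and the thread does not confirm that this is the recommended approach. How the bytes are attached to the request (for example as binData) is left out.

```python
import json


def pack(names, rows):
    """Client side: bundle feature names and rows into one bytes payload."""
    return json.dumps({"names": names, "rows": rows}).encode("utf-8")


def unpack(payload):
    """Server side: recover the feature names and rows from the bytes."""
    doc = json.loads(payload.decode("utf-8"))
    return doc["names"], doc["rows"]


names = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
rows = [["str", 4.652, 7.39, 0.324]]

payload = pack(names, rows)
got_names, got_rows = unpack(payload)
print(got_names)
print([type(v).__name__ for v in got_rows[0]])
```

Because JSON distinguishes strings from numbers, the mixed types in each row survive the round trip, unlike the conversion to a single-dtype NumPy array.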

@ukclivecox
Contributor

ukclivecox commented Aug 16, 2019

predict_raw will simply pass the message that was received to the user_model:

return user_model.predict_raw(request)

So it should be the whole SeldonMessage as a dict or proto.

It presently expects a SeldonMessage proto to be returned.
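
The dispatch described in this thread can be illustrated with the simplified sketch below. This is not Seldon Core's actual code: dispatch_predict, RawModel and ArrayModel are hypothetical names, and the models return plain Python values for brevity, whereas the comment above says a SeldonMessage proto is presently expected back from predict_raw.

```python
import numpy as np


def dispatch_predict(user_model, request):
    """Simplified illustration of 'use predict_raw if it exists'."""
    if hasattr(user_model, "predict_raw"):
        # The whole message is handed over untouched.
        return user_model.predict_raw(request)
    # Otherwise the payload is turned into a NumPy array for predict.
    data = request["data"]
    return user_model.predict(np.array(data["ndarray"]), data.get("names"))


class RawModel:
    def predict_raw(self, request):
        # Assumption: the message arrives in dict form with a jsonData field.
        rows = request["jsonData"]["rows"]
        kinds = [[type(v).__name__ for v in row] for row in rows]
        return {"jsonData": {"received_types": kinds}}


class ArrayModel:
    def predict(self, X, features_names=None):
        return X.dtype.kind


raw_out = dispatch_predict(RawModel(), {"jsonData": {"rows": [["str", 4.652]]}})
arr_out = dispatch_predict(ArrayModel(), {"data": {"ndarray": [["str", 4.652]]}})
print(raw_out)
print(arr_out)
```

With predict_raw the model sees the original element types; with predict it sees whatever single dtype NumPy chose for the array.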

@ukclivecox
Contributor

@lennon310 Any update on whether this is still an issue?

@ukclivecox ukclivecox added this to the 1.0.x milestone Aug 24, 2019
@lennon310
Contributor Author

Thank you @cliveseldon. I was trying to serialize the NumPy (or JSON) data to bytes and send it with SeldonClient. It looks like I should add it to a SeldonMessage as you suggested. I haven't worked on this part yet, but I guess this issue can be closed since it was originally asking about the data type in REST.
I will re-open this or create another issue if I have problems using predict_raw in the future. Thanks

2 participants