NDArray with values being lists not supported - RESOLVED: Proto lists were not being deep-copied #600
Comments
I can confirm there is some slightly strange/unexpected behaviour: when the output of a container/step is an array of arrays, the next step receives an array of proto lists rather than an array of Python lists. This may not be a problem. A simple end-to-end test was carried out with two containers, the first of which receives strings and splits each into words, i.e. [str1, str2] -> [[tokens from str1], [tokens from str2]]. The next step then receives an array of the form [LIST_PROTO_1, LIST_PROTO_2, ...]. The example can be found here: https://github.com/axsaucedo/test_string When running with the following command:
If we look at the logs of the container
This could cause issues if the next step expects a Python list rather than a proto list.
Ok, I've been able to narrow it down further. To replicate this, the only thing required is to send a request in the form of an array of arrays. More specifically, it all works well if you send a request with data of the form `[[1,2,3]]`, `["hello", "world"]` or `[["hello", "world"]]`. However, if we send a request with data of the form `[[1,2], [3,4]]` or `[["hello", "world"], ["hello", "world"]]`, then the inner arrays don't get converted from proto. This can be reproduced by creating a python wrapper that just prints the input, and sending it the following request:
This would then print:
The reason why it prints it as "Values" is because it's a proto array. It should be printing it as a Python array (i.e. `[[1,2],[3,4,5]]`).
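A wrapper that "just prints the input" could look roughly like the sketch below. This is only an illustration: the class name `PrintInputModel` and the helper `row_types` are hypothetical, and the `predict(self, X, features_names)` signature is assumed to match the Python wrapper's convention rather than taken from this thread.

```python
class PrintInputModel:
    """Hypothetical model class that echoes whatever it receives."""

    @staticmethod
    def row_types(X):
        # Name of the type of each top-level element, e.g. "list" for a Python list.
        return [type(row).__name__ for row in X]

    def predict(self, X, features_names=None):
        # Print the value and the element types, so a proto list
        # is easy to tell apart from a plain Python list in the logs.
        print("input:", X)
        print("element types:", self.row_types(X))
        return X


if __name__ == "__main__":
    model = PrintInputModel()
    model.predict([[1, 2], [3, 4, 5]])
```

If the inner arrays arrive unconverted, the element types printed here would be the proto list type instead of `list`.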
Right, I have finally found the issue. It was in the Python wrapper, in this line: When the proto was converted into a numpy array, the conversion was shallow, so an array of protos became a numpy array of protos. The fix is to use the MessageToDict utility from the protobuf JSON library (google.protobuf.json_format), which converts the message recursively and turns the full array into a Python array. Since Python wrapper v12 this is no longer an issue for REST, as it doesn't use protos at all, but the two tests that were added would fail for any image pre-v12.
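The difference between the shallow and the deep conversion described above can be sketched without protobuf at all. In the snippet below, `FakeProtoList` is a hypothetical stand-in for a nested proto list (it is not the real protobuf class), and `shallow_convert` / `deep_convert` are illustrative names, not functions from the wrapper.

```python
from typing import Any, List


class FakeProtoList:
    """Hypothetical stand-in for a protobuf list message (not the real class)."""

    def __init__(self, values: List[Any]) -> None:
        # Nested Python lists are wrapped too, mimicking nested proto lists.
        self.values = [FakeProtoList(v) if isinstance(v, list) else v for v in values]

    def __iter__(self):
        return iter(self.values)

    def __repr__(self) -> str:
        return f"Values({self.values!r})"


def shallow_convert(msg: FakeProtoList) -> list:
    # Only the outer container is converted; inner items stay proto-like objects.
    return list(msg)


def deep_convert(msg: Any) -> Any:
    # Recursively turn every nested proto-like list into a plain Python list.
    if isinstance(msg, FakeProtoList):
        return [deep_convert(v) for v in msg]
    return msg


payload = FakeProtoList([[1, 2], [3, 4, 5]])
shallow = shallow_convert(payload)
deep = deep_convert(payload)
print(shallow)  # inner elements are still FakeProtoList objects
print(deep)     # plain nested Python lists
```

With real protobuf messages, the recursive step is what `MessageToDict` from `google.protobuf.json_format` provides, so no hand-written recursion should be needed.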
EDIT with comment below:
Ok, I've been able to narrow it down further. To replicate this, the only thing required is to send a request in the form of an array of arrays.
More specifically, it all works well if you send a request with data of the form `[[1,2,3]]`, `["hello", "world"]` or `[["hello", "world"]]`. However, if we send a request with data of the form `[[1,2], [3,4]]` or `[["hello", "world"], ["hello", "world"]]`, then the inner arrays don't get converted from proto.
This can be reproduced by creating a python wrapper that just prints the input, and sending it the following request:
This would then print:
The reason why it prints it as "Values" is because it's a proto array. It should be printing it as a Python array (i.e. `[[1,2],[3,4,5]]`).
PREVIOUS TEXT (not fully accurate - see edit above for accurate description)
There is currently an issue that was uncovered when building the Kubeflow Seldon example (#589), which may require either constraining the implementation or extending the type-support functionality. It concerns the way SeldonCore currently handles numpy arrays of tokens (i.e. lists of strings).
Currently the example only sends one sentence as a test, which works correctly. However, if more than one sentence is sent to the SeldonEngine, it is unable to process the request.
The reason is that the return value is of the format `np.array([list(str), list(str)])`, which gets translated into a single flat array, i.e. `np.array(list(str) + list(str))`.
To be more specific with an example, if we send `np.array(["example 1", "example 2"])`, the spacytokenizer image would return the value `np.array([["example", "1"], ["example", "2"]])`, which SeldonEngine then converts into `np.array(["example", "1", "example", "2"])`.
Using JSON objects may solve this issue (see #595).