-
Notifications
You must be signed in to change notification settings - Fork 81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consistent numpy byte order in Ophyd Signal #1186
base: main
Are you sure you want to change the base?
Conversation
I'm pretty curious how to accidentally make caproto give you big-endian values! Is there a reasonable way to test this? |
If you like, I can whip my containers into shape, try to push working versions, and you can see for yourself. For what it's worth, I don't think the issue is Caproto, in that I don't think big-endian values are being generated by caproto. In my mind, it is probably EPICS somehow dictating that they get serialized and sent over the network that way. I was as surprised as anyone that this could be a problem, but here we are -- I can't run my pods unless this is fixed. |
Just as a follow-up, I did open an issue in So that adds a reason to check for this in Ophyd to make sure we just never end up passing data that will cause problems later on. |
Digging into this, big endian is the "native" format for CA: https://github.com/caproto/caproto/blob/46020e4816c7dbe8ccae9b7f0af362ce9a7f4f07/caproto/_numpy_backend.py#L11-L23 I think the bug here is actually in caproto in the
I am inclined to say that this should be fixed farther down. |
Additionally, the change upstream in orjson is not to reject bigendian, it is to reject non-native. |
Good catch on the specifics of the orjson/numpy endian-specified vs native distinction. I will try to pull your caproto fix branch and make sure it also fixes the problem. Assuming it does, I don't have a major preference about where this gets fixed. Knowing that this is specifically due to a bug in Caproto's pyepics_compat does make me hopeful that once that gets merged, an additional check in Ophyd won't be necessary. And anyway, future problems of this sort will be raised as errors by Your caproto fix does make me think that I should actually do the ophyd check in the same way, by checking for non-native byte order, and swapping to native, rather than setting the endianness, if we are going to do an additional check in Ophyd. |
This is a very niche issue, but important for some of the pod systems I'm running.
Description of the problem
Some time ago, I started running Ophyd and friends in a pod. However, I quickly ran into problems trying to move data over Kafka, which were related to the following behavior of
tiled.utils.safe_json_dump
when big-endian data is passed in:The json serializer in tiled relies very directly on the
orjson
package, which just doesn't care about byte order at all, meaning that data does not round trip.But why was my data big-endian in the first-place? This turns out to come directly from the Ophyd signal
Which means that something in my container's network stack is prompting Caproto/EPICS to pass around big-endian data. This makes sense, as networking is pretty much the only place that big-endian data is used.
Where could I fix this?
This could probably be fixed in several places. In rough order of data-handling, they are
Why fix this in Ophyd?
I believe that Ophyd is the most plausible/helpful place to correct the endianness issue.
Why not fix Caproto/EPICS/my network configuration?
I personally haven't been able to figure out why my container is causing Caproto to send big-endian data, but even if the underlying issue is in Caproto, or even in EPICS, I believe that Ophyd should be able to handle this case more gracefully for maximum compatibility. Sometimes you'll get big-endian data, apparently, and Ophyd should be able to handle that.
Why not fix the serializer in Tiled?
This was my initial thought -- just add some checks to
safe_json_dump
in Tiled! Unfortunately, by the time you get to Tiled, you are dealing with highly nested data. It was really easy to serialize a solitary array. But to serialize Tiled data correctly, I would need to delve into the values of the dictionary, see if any of those were a numpy array, and change its byte order. But now the Tiled serializer has side effects unless you make a deep copy of all the content that gets passed in. And you are on the hook for going through all the nested entries of any dictionary, list, or other container. No thanks!Why not fix orjson?
This is plausible, and I have submitted an issue to orjson. I don't know if they'll fix it, or even if they view it as a bug, or the serializer working as intended. However, even if it's fixed in orjson, it wouldn't help anybody else who's using a different serializer, or who has other issues related to inconsistent endianness in Ophyd.
Summary
I think that fixing this in Ophyd would be the most practical, consistent solution. We can check all the data that we ingest, and at least make sure that it's converted to a consistent byte order in Ophyd.
Where should we fix this in Ophyd?
The most immediate place to fix this, to me, was in the existing
_fix_type
method ofEpicsSignalBase
. It's a logical place for this kind of kludge. It's really the first place that Ophyd handles values that come in from the outside.I don't think it's possible to go lower, as the control layer is really just a shim around pyepics, caproto, etc. But I'm open to suggestions.