You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I would like to propose a change on the way XML data is decoded in pyxdf.
The current way that XML is decoded in pyxdf is to create defaultdict with a default factory constructor that creates a list. Attributes are not handled at all (which can be ok, but may restrict users) but the specification does not mention any restriction concerning the XML content.
More importantly, using a defaultdict with a list as a default creates an unnecessary deeper structure that has an important impact on how the 'info' header is handled. For example, my user code, and the pyxdf code has many of the following idiomatic examples:
self.nchns=int(xml['info']['channel_count'][0])
self.srate=round(float(xml['info']['nominal_srate'][0]))
logger.debug(' found stream '+hdr['info']['name'][0])
In these examples, one has two systematically add [0] to get the element. This can get messy with deeper XML contents. For example, I have something like:
The main problem of converting XML to a dictionary is that this is a complicated task and it would ideally need a schema. Converters from XML to JSON exist, which can be useful here, but there is an ambiguity on the attributes because JSON does not support them. There are several "standard" conventions to solve this problem, with different benefits/disadvantages (see http://wiki.open311.org/JSON_and_XML_Conversion)
What I propose is to two ideas:
Let the user provide a function to convert XML to whatever format they choose. The user may have some knowledge on what to expect on their streams, but currently, they cannot handle the XML directly.
Change the current XML parsing in favor of another library that handles this well. XML parsing is hard and it would probably be sensible to let someone else do that. I propose to replace the current _load_xml function for one that adopts the Parker convention as follows:
from xmljson import parker # https://github.com/sanand0/xmljson
def _load_xml(t):
"""Convert an attribute-less etree.Element into a dict."""
return parker.data(t, preserve_root=True)
In this proposal, it would be easier to traverse the information dictionary. The complicated example mentioned above becomes:
One point that needs discussion in my proposal is that the temporary per-stream data object (amongst other cases) would need to be updated; it would not need to index [0] everywhere to get the contents:
classStreamData:
"""Temporary per-stream data."""def__init__(self, xml):
"""Init a new StreamData object from a stream header."""
...
# number of channelsself.nchns=int(xml['info']['channel_count'])
# nominal sampling rate in Hzself.srate=round(float(xml['info']['nominal_srate']))
# format string (int8, int16, int32, float32, double64, string)self.fmt=xml['info']['channel_format']
More importantly, if the user provides a function, it might become their responsibility to provide this information object, which complicates the usage. Maybe we could leave the current _load_xml but fill the "info" element of each stream with the provided user function (which defaults to the current one).
The text was updated successfully, but these errors were encountered:
I would like to propose a change on the way XML data is decoded in
pyxdf
.The current way that XML is decoded in
pyxdf
is to createdefaultdict
with a default factory constructor that creates a list. Attributes are not handled at all (which can be ok, but may restrict users) but the specification does not mention any restriction concerning the XML content.More importantly, using a
defaultdict
with a list as a default creates an unnecessary deeper structure that has an important impact on how the 'info' header is handled. For example, my user code, and thepyxdf
code has many of the following idiomatic examples:In these examples, one has two systematically add
[0]
to get the element. This can get messy with deeper XML contents. For example, I have something like:To get the name of the first channel I need to do:
The main problem of converting XML to a dictionary is that this is a complicated task and it would ideally need a schema. Converters from XML to JSON exist, which can be useful here, but there is an ambiguity on the attributes because JSON does not support them. There are several "standard" conventions to solve this problem, with different benefits/disadvantages (see http://wiki.open311.org/JSON_and_XML_Conversion)
What I propose is to two ideas:
_load_xml
function for one that adopts the Parker convention as follows:In this proposal, it would be easier to traverse the information dictionary. The complicated example mentioned above becomes:
One point that needs discussion in my proposal is that the temporary per-stream data object (amongst other cases) would need to be updated; it would not need to index
[0]
everywhere to get the contents:More importantly, if the user provides a function, it might become their responsibility to provide this information object, which complicates the usage. Maybe we could leave the current
_load_xml
but fill the "info" element of each stream with the provided user function (which defaults to the current one).The text was updated successfully, but these errors were encountered: