-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Report which custom reference was applied #8962
Comments
Maybe we can use this existing attribute, but instead of |
Currently, the following values are supported: FIFF.FIFFV_MNE_CUSTOM_REF_OFF = 0
FIFF.FIFFV_MNE_CUSTOM_REF_ON = 1
FIFF.FIFFV_MNE_CUSTOM_REF_CSD = 2 (CSD – current source density) The correct value is then stored at the following location FIFF.FIFF_MNE_CUSTOM_REF = 3567 # Whether a custom reference was applied to the data In any case, when writing the measurement info, we do the following: mne-python/mne/io/meas_info.py Lines 1681 to 1682 in 8e3bcef
I don't think we can simply write a different data type there… As for the avg reference: FIFF.FIFFV_MNE_CUSTOM_REF_AVG = 3 however this would break compatibility with existing code, as so far |
(heads-up, I've edited my above post several times) |
Yeah, that's the old FIFF format compatibility story which is really holding us back. If we can't change |
Can one "simply" add new entries to the FIFF files? I wasn't aware of that! I was also thinking about abusing the new per-channel Annotations somehow… But this is really not my area of expertise. Let's wait for @larsoner to chime in here :) |
Also we'd want to support the REST ref as well. And maybe future references which haven't been invented yet… |
No, we can't simply add new FIFF fields, but I thought that exporting to FIFF doesn't simply dump the whole |
Yes, that seems to be the case
👌 You mean ignored when read through software that doesn't know about this field, right? |
I meant ignored by MNE when writing to FIFF - so this field shouldn't be written. However, it would be even better if we wrote that field and programs would just ignore it when reading if they don't know about it. But let's wait for @larsoner, he'll know for sure. |
But then you'd lose the info during an I/O roundtrip, which is super confusing IMHO… Let's wait for @larsoner ;) |
I know, but FIFF was never designed to hold all that information. If we can easily add optional fields, then everything is fine. Maybe we should think about a new simple format such as pickling raw? |
I do love pickles (Spreewald 🥳) but I don't like pickled data :) It's not really portable and not suitable for long-term storage. I'd actually prefer something like a simple JSON sidecar file or so… |
I don't like the pickle format either, but two files are also asking for trouble. Maybe we should just store as .mat (I'm only half-joking here)... |
Agreed,
I don't think we plan to jump to a new format anytime soon and even if we do, we need to make FIF work as well as possible so I don't see it really helping with the problem. We can add things to info the right way -- i.e., by enhancing the fiff-constants spec -- but we need to justify it and do the work of coming up with an appropriate spec. I agree it would be nice to store what each channel's set of references is more flexibly. For example in sEEG if you have a ton if bipolar-reference pairs, it would be nice to store properly these references and be able to recover them. This case is technically possible now because you can store one ref location in Is there a format that stores such arbitrary reference information well? We could look to that for guidance. Just to give one proposal -- we recently spec'ed out and added support for an enhanced channel information block, mostly to get us past the 15-char limit for FIF. To make a concrete proposal, then, we could add a new tag there like |
But this would only help you recover the location of the reference, not its name or channel index (in case the channel is present in the data). To find out if the specified reference channel is even still part of your data, you'd have to iterate through all channel locations until you hopefully find the matching coordinates … seems awkward? Which also highlights that there are two different situations we have to deal with:
When I opened this issue, I was only thinking about addressing case 1, assuming this would be easier. We'd force users to add missing reference channels via
I'm not sure we really need the weights. I'd just assume all refs are weighted equally … I mean, I've never seen anything else during EEG processing. But of course, your experience may differ. Again, here we're only talking about channel locations only, instead of indices or names. Doesn't that make things unnecessarily complicated for case 1 described above? |
Let's be clear about the "who" is here. When I'm talking about keeping track of the information in the FIF spec, the "who" I'm talking us as maintainers dealing with our internal storage of the information and I/O possibilities. For the user -- a very different "who" -- we can easily expose some function or something that converts these locations to channel names, or whatever they need. So to me the I/O spec should be complete (what I was addressing), and the user-facing API is a separate issue.
For the base case of Neuromag EEG data, you won't find one because there is a single ref electrode and it's not in the data (it would be all zero anyway). So we're already in this situation for a lot of datasets. And I'm not too worried about having to do this iteration, a suitable
Only in terms of how we internally deal with things, not to users. And it actually supports use case (2), and works like things do now ("store the location of the reference" -> "store the locations of any references"). |
I don't know what sorts of refs people have used, but having weights allows for (nearly) arbitrary linear transformations of the data, which is nice. And the storage here is pretty trivial so I don't see any problem with including it, even if we never really end up using it. |
Thanks for the clarification, @larsoner! If you think the |
How would we encode the REST referencing scheme? |
REST is weird as it depends on passing a forward etc. We could think about adding a new constant for it or something but this seems like a far less critical case than standard channel-based references |
Yep. But a way to store a name – simply the string "REST" – could make sense, don't you think? |
Actually on second thought IIRC REST is also known as an infinity reference, so we can signal it's being used even in main just by setting locs[3:6] to np.inf, and we can also do this in the system I proposed. So no need for a chance just for that |
@larsoner Good idea! What do you think, how should we go about moving forward here? Clearly I'm lacking a big deal of expertise, but if you think I could somehow contribute, please do let me know! |
using locations to know how channels were referenced should not be the primary source of information. I anticipate that we have EEG data without channel locations but we some channel based references. See channel names you see in EEG world (eg saying that it's Cz-M1 referenced to M1). This suggests that the channel block idea (which I like) should contain a list of channels and not locations). Note that it means that it may contains channel names that don't exist in inst.ch_names or info['chs"] and it contains no data... |
This is exactly the case for data acquired by Neuromag systems currently for example. So I think just keeping track of channel names won't work for systems that do actually encode physical locations, just like you think keeping track based on locations won't work for systems that don't. What if we keep track of name+location of each reference? This would just mean one more tag with is a list of names to match the list of locations. |
Sounds good to me. @agramfort's comment also reminded me of EEG recordings we'd done a few years back with reference electrodes placed on the tip of the nose. These would not have any proper location associated with them. |
but to me you will always have names even if they are ad-hoc names like
EEG001,EEG002, etc.
… |
Yep! |
Whatever you decide is best to represent the reference internally, I would really love to see a public API that simply gives me the reference as a string (e.g. Come to think of it, maybe we shouldn't even use |
It sounds like you're pushing to store only the names? To me this is still problematic because it's lossy. If we eventually want to do anything having to do with the physical arrangement of the recording setup (e.g., forward modeling), we need the locations of the references, which might not be proper channels (in the sense of having an entry in We have some recordings with just channels and no locations. We have other recordings with (at least) a single ref location and no channel name (Neuromag). And in theory it's not hard to imagine a recording with just reference locations stored in the dataset somehow (e.g., a recording with two EEG data channels, Cz and FCz, each of which was recorded referenced to the average of left and right mastoids), and in practice it's easy to get this after some picking/dropping. To me I don't see a downside to storing names+locations rather than just names. In terms of implementation it's not much harder: if we set an EEG location based on a channel name ( |
ok fair enough It seems that storing both names and locations is
necessary...
|
I like that! |
Currently, after calling
set_eeg_reference()
with a custom channel name, or invokingadd_reference_channels()
(which calls the former after adding all-zeros channels), to my knowledge it's impossible to retrieve which reference channels are being used in the current referencing scheme. The only thing that gets set ininfo
is'custom_ref_applied'
.Isn't there a way to keep track of which references were set?
The text was updated successfully, but these errors were encountered: