Report which custom reference was applied #8962

hoechenberger · 2021-02-25T23:30:03Z

Currently, after calling set_eeg_reference() with a custom channel name, or invoking add_reference_channels() (which calls the former after adding all-zeros channels), to my knowledge it's impossible to retrieve which reference channels are being used in the current referencing scheme. The only thing that gets set in info is 'custom_ref_applied'.

import os.path as op
import mne

data_path = mne.datasets.testing.data_path()
examples_dir = op.join(data_path, 'Brainvision')
vhdr_file = op.join(examples_dir, 'Analyzer_nV_Export.vhdr')

raw = mne.io.read_raw(vhdr_file, preload=True)
raw.set_eeg_reference(['F8'])

# do fancy stuff here
# ...
# Wait … WHICH reference did I apply again? Guess I'll have to find the channel
# with all-zero data :(

Isn't there a way to keep track of which references were set?

The text was updated successfully, but these errors were encountered:

cbrnr · 2021-02-26T08:03:16Z

Maybe we can use this existing attribute, but instead of True/False we set it to the reference channel name(s)? I'd also include the average reference case here, which is currently also not stored anywhere.

hoechenberger · 2021-02-26T08:43:41Z

Maybe we can use this existing attribute, but instead of True/False we set it to the reference channel name(s)?

Currently, the following values are supported:

FIFF.FIFFV_MNE_CUSTOM_REF_OFF       = 0
FIFF.FIFFV_MNE_CUSTOM_REF_ON        = 1
FIFF.FIFFV_MNE_CUSTOM_REF_CSD       = 2

(CSD – current source density)

The correct value is then stored at the following location

FIFF.FIFF_MNE_CUSTOM_REF             = 3567     # Whether a custom reference was applied to the data

In any case, when writing the measurement info, we do the following:

mne-python/mne/io/meas_info.py

Lines 1681 to 1682 in 8e3bcef

    
           if info.get('custom_ref_applied'): 
        
               write_int(fid, FIFF.FIFF_MNE_CUSTOM_REF, info['custom_ref_applied'])

I don't think we can simply write a different data type there…
At first I thought: Oh, cool, it's an int, so we simply write the index of the reference channel! But … obviously this wouldn't work for channels 0 to 2 ;) Also you wouldn't be able to specify more than just a single reference.

As for the avg reference:
It would be easy to add something like

FIFF.FIFFV_MNE_CUSTOM_REF_AVG       = 3

however this would break compatibility with existing code, as so far FIFF.FIFF_MNE_CUSTOM_REF has been set to 0 (i.e., False) when an average reference had been applied.

hoechenberger · 2021-02-26T08:48:39Z

(heads-up, I've edited my above post several times)

cbrnr · 2021-02-26T09:13:36Z

Yeah, that's the old FIFF format compatibility story which is really holding us back. If we can't change info["custom_ref_applied"], what about adding info["reference"]?

hoechenberger · 2021-02-26T09:34:16Z

Yeah, that's the old FIFF format compatibility story which is really holding us back. If we can't change info["custom_ref_applied"], what about adding info["reference"]?

Can one "simply" add new entries to the FIFF files? I wasn't aware of that!

I was also thinking about abusing the new per-channel Annotations somehow… But this is really not my area of expertise. Let's wait for @larsoner to chime in here :)

hoechenberger · 2021-02-26T09:35:21Z

Also we'd want to support the REST ref as well. And maybe future references which haven't been invented yet…

cbrnr · 2021-02-26T09:37:22Z

No, we can't simply add new FIFF fields, but I thought that exporting to FIFF doesn't simply dump the whole Info object, but instead picks out the relevant fields. If this really works like this, we can add a new field which will be ignored. This field should be a string or list of strings, so any future references are automatically supported.

hoechenberger · 2021-02-26T09:50:19Z

No, we can't simply add new FIFF fields, but I thought that exporting to FIFF doesn't simply dump the whole Info object, but instead picks out the relevant fields.

Yes, that seems to be the case

If this really works like this, we can add a new field which will be ignored.

👌 You mean ignored when read through software that doesn't know about this field, right?

cbrnr · 2021-02-26T09:55:09Z

I meant ignored by MNE when writing to FIFF - so this field shouldn't be written. However, it would be even better if we wrote that field and programs would just ignore it when reading if they don't know about it. But let's wait for @larsoner, he'll know for sure.

hoechenberger · 2021-02-26T10:00:29Z

I meant ignored by MNE when writing to FIFF - so this field shouldn't be written.

But then you'd lose the info during an I/O roundtrip, which is super confusing IMHO…

Let's wait for @larsoner ;)

cbrnr · 2021-02-26T10:10:26Z

But then you'd lose the info during an I/O roundtrip, which is super confusing IMHO…

I know, but FIFF was never designed to hold all that information. If we can easily add optional fields, then everything is fine. Maybe we should think about a new simple format such as pickling raw?

hoechenberger · 2021-02-26T10:32:49Z

Maybe we should think about a new simple format such as pickling raw?

I do love pickles (Spreewald 🥳) but I don't like pickled data :) It's not really portable and not suitable for long-term storage.

I'd actually prefer something like a simple JSON sidecar file or so…

cbrnr · 2021-02-26T10:34:43Z

I don't like the pickle format either, but two files are also asking for trouble. Maybe we should just store as .mat (I'm only half-joking here)...

larsoner · 2021-02-26T13:11:50Z

If this really works like this, we can add a new field which will be ignored.

But then you'd lose the info during an I/O roundtrip, which is super confusing IMHO…

Agreed, Info should only hold things that will survive I/O round trip. If we need to store more information there we should improve the spec.

Maybe we should think about a new simple format such as pickling raw?

I don't think we plan to jump to a new format anytime soon and even if we do, we need to make FIF work as well as possible so I don't see it really helping with the problem.

We can add things to info the right way -- i.e., by enhancing the fiff-constants spec -- but we need to justify it and do the work of coming up with an appropriate spec.

I agree it would be nice to store what each channel's set of references is more flexibly. For example in sEEG if you have a ton if bipolar-reference pairs, it would be nice to store properly these references and be able to recover them. This case is technically possible now because you can store one ref location in info['chs']['loc'][3:6]. However, our system currently only works if each channel only has a single reference, or all channels use an average reference.

Is there a format that stores such arbitrary reference information well? We could look to that for guidance.

Just to give one proposal -- we recently spec'ed out and added support for an enhanced channel information block, mostly to get us past the 15-char limit for FIF. To make a concrete proposal, then, we could add a new tag there like FIFF_CH_REFS, which is a float32 ndarray of shape (N, 4) containing the x/y/z location (first 3 cols) and weight (last col) for the N references used for the channel. For standard Neuromag data and standard one-ref EEG acquisitions, this would just be (ref_x, ref_y, ref_z, 1.). If you did an average reference using a direct application instead of a projector, this would be all channel locations with a value of 1 / N in the last column. For non-EEG channels it's empty.

hoechenberger · 2021-02-26T16:05:59Z

I agree it would be nice to store what each channel's set of references is more flexibly. For example in sEEG if you have a ton if bipolar-reference pairs, it would be nice to store properly these references and be able to recover them. This case is technically possible now because you can store one ref location in info['chs']['loc'][3:6]. However, our system currently only works if each channel only has a single reference, or all channels use an average reference.

But this would only help you recover the location of the reference, not its name or channel index (in case the channel is present in the data). To find out if the specified reference channel is even still part of your data, you'd have to iterate through all channel locations until you hopefully find the matching coordinates … seems awkward?

Which also highlights that there are two different situations we have to deal with:

reference channels that are present in the data – they'd have a channel index, an associated name, and a location
reference channels that are not (currently) present in the data. They'd only have a channel location (if any), and potentially a name.

When I opened this issue, I was only thinking about addressing case 1, assuming this would be easier. We'd force users to add missing reference channels via mne.add_reference_channels(), reducing the amount of special-casing needed to be done surrounding those references.

To make a concrete proposal, then, we could add a new tag there like FIFF_CH_REFS, which is a float32 ndarray of shape (N, 4) containing the x/y/z location (first 3 cols) and weight (last col) for the N references used for the channel. For standard Neuromag data and standard one-ref EEG acquisitions, this would just be (ref_x, ref_y, ref_z, 1.). If you did an average reference using a direct application instead of a projector, this would be all channel locations with a value of 1 / N in the last column. For non-EEG channels it's empty.

I'm not sure we really need the weights. I'd just assume all refs are weighted equally … I mean, I've never seen anything else during EEG processing. But of course, your experience may differ.

Again, here we're only talking about channel locations only, instead of indices or names. Doesn't that make things unnecessarily complicated for case 1 described above?

larsoner · 2021-02-26T16:27:55Z

To find out if the specified reference channel is even still part of your data, you'd have to iterate through all channel locations until you hopefully find the matching coordinates … seems awkward?

Let's be clear about the "who" is here. When I'm talking about keeping track of the information in the FIF spec, the "who" I'm talking us as maintainers dealing with our internal storage of the information and I/O possibilities.

For the user -- a very different "who" -- we can easily expose some function or something that converts these locations to channel names, or whatever they need.

So to me the I/O spec should be complete (what I was addressing), and the user-facing API is a separate issue.

To find out if the specified reference channel is even still part of your data, you'd have to iterate through all channel locations until you hopefully find the matching coordinates … seems awkward?

For the base case of Neuromag EEG data, you won't find one because there is a single ref electrode and it's not in the data (it would be all zero anyway). So we're already in this situation for a lot of datasets. And I'm not too worried about having to do this iteration, a suitable np.allclose should get us there. And we update set_montage or anything else that updates channel locations to be "channel reference aware". So I don't think it's too difficult for us to internally manage this.

Doesn't that make things unnecessarily complicated for case 1 described above?

Only in terms of how we internally deal with things, not to users. And it actually supports use case (2), and works like things do now ("store the location of the reference" -> "store the locations of any references").

larsoner · 2021-02-26T16:30:27Z

I'm not sure we really need the weights.

I don't know what sorts of refs people have used, but having weights allows for (nearly) arbitrary linear transformations of the data, which is nice. And the storage here is pretty trivial so I don't see any problem with including it, even if we never really end up using it.

hoechenberger · 2021-02-26T16:58:41Z

Thanks for the clarification, @larsoner! If you think the np.allclose() approach would work to construct location <> channel mappings, I'm +1 for your proposal!

hoechenberger · 2021-02-27T08:13:11Z

How would we encode the REST referencing scheme?

larsoner · 2021-02-27T15:48:01Z

REST is weird as it depends on passing a forward etc. We could think about adding a new constant for it or something but this seems like a far less critical case than standard channel-based references

hoechenberger · 2021-02-27T16:22:25Z

REST is weird as it depends on passing a forward etc.

Yep. But a way to store a name – simply the string "REST" – could make sense, don't you think?

larsoner · 2021-02-27T17:01:28Z

Actually on second thought IIRC REST is also known as an infinity reference, so we can signal it's being used even in main just by setting locs[3:6] to np.inf, and we can also do this in the system I proposed. So no need for a chance just for that

hoechenberger · 2021-02-27T18:14:05Z

@larsoner Good idea!

What do you think, how should we go about moving forward here? Clearly I'm lacking a big deal of expertise, but if you think I could somehow contribute, please do let me know!

agramfort · 2021-02-28T13:13:08Z

using locations to know how channels were referenced should not be the primary source of information. I anticipate that we have EEG data without channel locations but we some channel based references. See channel names you see in EEG world (eg saying that it's Cz-M1 referenced to M1). This suggests that the channel block idea (which I like) should contain a list of channels and not locations). Note that it means that it may contains channel names that don't exist in inst.ch_names or info['chs"] and it contains no data...

larsoner · 2021-02-28T16:12:53Z

Note that it means that it may contains channel names that don't exist in inst.ch_names

This is exactly the case for data acquired by Neuromag systems currently for example. So I think just keeping track of channel names won't work for systems that do actually encode physical locations, just like you think keeping track based on locations won't work for systems that don't.

What if we keep track of name+location of each reference? This would just mean one more tag with is a list of names to match the list of locations.

hoechenberger · 2021-02-28T16:30:12Z

What if we keep track of name+location of each reference? This would just mean one more tag with is a list of names to match the list of locations.

Sounds good to me. @agramfort's comment also reminded me of EEG recordings we'd done a few years back with reference electrodes placed on the tip of the nose. These would not have any proper location associated with them.

agramfort · 2021-03-01T08:06:13Z

but to me you will always have names even if they are ad-hoc names like EEG001,EEG002, etc.

…

hoechenberger · 2021-03-01T10:50:05Z

but to me you will always have names even if they are ad-hoc names like EEG001,EEG002, etc.

Yep!

cbrnr · 2021-03-01T11:21:12Z

Whatever you decide is best to represent the reference internally, I would really love to see a public API that simply gives me the reference as a string (e.g. "C3", "average", "REST", ...) or a list of strings (e.g. ["M1", "M2"]). Many EEG data sets do not specify the reference, so in that case we could simply set this to "unknown".

Come to think of it, maybe we shouldn't even use "average", because we could use a list of all channel names instead. This would be more explicit and would avoid ambiguities such as whether or not the original reference channel was added back to the data before the average reference was computed. If references consisting of multiple channels that are weighed differently are a thing, we could also store a list of weights in addition (which by default consists of all 1/N elements, where N is the number of reference electrodes).

larsoner · 2021-03-01T12:16:00Z

but to me you will always have names even if they are ad-hoc names like EEG001,EEG002, etc.

It sounds like you're pushing to store only the names? To me this is still problematic because it's lossy. If we eventually want to do anything having to do with the physical arrangement of the recording setup (e.g., forward modeling), we need the locations of the references, which might not be proper channels (in the sense of having an entry in info['chs']) in the dataset, hence they could be missing/gone.

We have some recordings with just channels and no locations. We have other recordings with (at least) a single ref location and no channel name (Neuromag). And in theory it's not hard to imagine a recording with just reference locations stored in the dataset somehow (e.g., a recording with two EEG data channels, Cz and FCz, each of which was recorded referenced to the average of left and right mastoids), and in practice it's easy to get this after some picking/dropping.

To me I don't see a downside to storing names+locations rather than just names. In terms of implementation it's not much harder: if we set an EEG location based on a channel name (set_montage is really the only place we do this I think?) then we need to check the EEG reference names and set those locations, too.

agramfort · 2021-03-02T10:13:27Z

ok fair enough It seems that storing both names and locations is necessary...

hoechenberger · 2021-03-02T10:36:21Z

I like that!
It's almost like eating the cake and having it too! 🍰

hoechenberger added the ENH label Feb 25, 2021

hoechenberger mentioned this issue Feb 25, 2021

Write EEG/iEEG reference to JSON sidecar mne-tools/mne-bids#713

Open

DMRoberts mentioned this issue Apr 29, 2021

New BIDS file requires testing on pipeline steps NDCLab/pepper-pipeline#59

Closed

yanbin-niu mentioned this issue Jun 18, 2021

Remaining issues with bad channels NDCLab/pepper-pipeline#163

Closed

hoechenberger mentioned this issue Nov 11, 2021

Improve I/O with EEGLAB #9989

Open

3 tasks

larsoner mentioned this issue Nov 25, 2021

CAR reference as a projector doesn't set custom_ref_applied and remains in projs #10032

Closed

dominikwelke mentioned this issue Jun 8, 2022

store digitization for (non-existing) reference channel(s) mne-tools/mne-bids#1012

Open

hoechenberger mentioned this issue Sep 14, 2023

RFC, WIP: Add eeg_reference kwarg to write_raw_bids() mne-tools/mne-bids#695

Closed

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Report which custom reference was applied #8962

Report which custom reference was applied #8962

hoechenberger commented Feb 25, 2021

cbrnr commented Feb 26, 2021

hoechenberger commented Feb 26, 2021 •

edited

Loading

hoechenberger commented Feb 26, 2021

cbrnr commented Feb 26, 2021

hoechenberger commented Feb 26, 2021 •

edited

Loading

hoechenberger commented Feb 26, 2021 •

edited

Loading

cbrnr commented Feb 26, 2021

hoechenberger commented Feb 26, 2021 •

edited

Loading

cbrnr commented Feb 26, 2021

hoechenberger commented Feb 26, 2021

cbrnr commented Feb 26, 2021

hoechenberger commented Feb 26, 2021

cbrnr commented Feb 26, 2021

larsoner commented Feb 26, 2021

hoechenberger commented Feb 26, 2021 •

edited

Loading

larsoner commented Feb 26, 2021

larsoner commented Feb 26, 2021

hoechenberger commented Feb 26, 2021 •

edited

Loading

hoechenberger commented Feb 27, 2021

larsoner commented Feb 27, 2021

hoechenberger commented Feb 27, 2021

larsoner commented Feb 27, 2021

hoechenberger commented Feb 27, 2021

agramfort commented Feb 28, 2021

larsoner commented Feb 28, 2021

hoechenberger commented Feb 28, 2021

agramfort commented Mar 1, 2021 via email

hoechenberger commented Mar 1, 2021

cbrnr commented Mar 1, 2021

larsoner commented Mar 1, 2021

agramfort commented Mar 2, 2021 via email

hoechenberger commented Mar 2, 2021

Report which custom reference was applied #8962

Report which custom reference was applied #8962

Comments

hoechenberger commented Feb 25, 2021

cbrnr commented Feb 26, 2021

hoechenberger commented Feb 26, 2021 • edited Loading

hoechenberger commented Feb 26, 2021

cbrnr commented Feb 26, 2021

hoechenberger commented Feb 26, 2021 • edited Loading

hoechenberger commented Feb 26, 2021 • edited Loading

cbrnr commented Feb 26, 2021

hoechenberger commented Feb 26, 2021 • edited Loading

cbrnr commented Feb 26, 2021

hoechenberger commented Feb 26, 2021

cbrnr commented Feb 26, 2021

hoechenberger commented Feb 26, 2021

cbrnr commented Feb 26, 2021

larsoner commented Feb 26, 2021

hoechenberger commented Feb 26, 2021 • edited Loading

larsoner commented Feb 26, 2021

larsoner commented Feb 26, 2021

hoechenberger commented Feb 26, 2021 • edited Loading

hoechenberger commented Feb 27, 2021

larsoner commented Feb 27, 2021

hoechenberger commented Feb 27, 2021

larsoner commented Feb 27, 2021

hoechenberger commented Feb 27, 2021

agramfort commented Feb 28, 2021

larsoner commented Feb 28, 2021

hoechenberger commented Feb 28, 2021

agramfort commented Mar 1, 2021 via email

hoechenberger commented Mar 1, 2021

cbrnr commented Mar 1, 2021

larsoner commented Mar 1, 2021

agramfort commented Mar 2, 2021 via email

hoechenberger commented Mar 2, 2021

hoechenberger commented Feb 26, 2021 •

edited

Loading

hoechenberger commented Feb 26, 2021 •

edited

Loading

hoechenberger commented Feb 26, 2021 •

edited

Loading

hoechenberger commented Feb 26, 2021 •

edited

Loading

hoechenberger commented Feb 26, 2021 •

edited

Loading

hoechenberger commented Feb 26, 2021 •

edited

Loading