Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAINT: Refactor _get_raw_paths #749

Merged
merged 28 commits into from
Jul 3, 2023
Merged

Conversation

larsoner
Copy link
Member

@larsoner larsoner commented Jun 28, 2023

Before merging …

  • Changelog has been updated (docs/source/changes.md)

Should be a no-op. Will push commits periodically to check that everything still works and ping when ready for review.

Copy link
Member Author

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay I think it's finally ready! It's hard to tell from the diff, but there is a lot of code deduplication and flattening of conditionals etc.

@@ -118,16 +118,6 @@ def get_sessions(config: SimpleNamespace) -> Union[List[None], List[str]]:
return sessions


@functools.lru_cache(maxsize=None)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic here was backward I think. The "public" get_runs_all_subjects should just cache under the hood so anytime we use it we benefit.

@larsoner larsoner marked this pull request as ready for review June 29, 2023 19:09
Copy link
Member Author

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hoechenberger did you want to look or are you okay with trusting the green on this one? I've added some comments to help orient you. But really that plus looking at the files themselves (either in the browser with "View file" or locally with a checkout) is much easier than looking at the diff because you can't easily tell the dedents etc.

if filter_chpi:
logger.info(**gen_log_kwargs(message="Filtering cHPI", run=task))
# allow_line_only=True is really mostly for the "noise" run
mne.chpi.filter_chpi(raw_noise_sss, allow_line_only=True)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bunch of DRY here

raw_noise.compute_psd(fmax=fmax).plot()

assert len(in_files) == 0, in_files.keys()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And a bunch more DRY here

)
for subject in get_subjects(config)
for session in get_sessions(config)
for run in get_runs(config=config, subject=subject)
for run, task in get_runs_tasks(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The key change is to iterate over run, task pairs. So the experimental runs typically end up with run, cfg.task and then the noise and rest (if present) end up as None, "noise" and None, "rest" respectively

run=run,
out_files=out_files,
)
key = f"raw_task-{task}_run-{run}"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And keys for the "data file being processed" are now standardized this way, as raw_task-{task}_run-{run}. The one exception is the reference run gets stored as raw_ref_run still, because it's in addition to / a companion of the data file being processed, and which run/task it's from is not relevant

key = f"raw_task-{task}_run-{run}"
bids_path_in = in_files.pop(key)
if _do_mf_autobad(cfg=cfg):
if key == "raw_task-noise_run-None":
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... so we can check for noise-run-ness using the standard key

)
)
raw_fname = raw_fname["fname"]
if raw_fname is not None:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got rid of 5 (!) levels of indentation by changing/refactoring the logic here. Hopefully it's (much) easier to follow when editing the files now...

@@ -416,6 +416,14 @@
```
"""

read_raw_bids_verbose: Optional[Literal["error"]] = None
Copy link
Member Author

@larsoner larsoner Jun 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this one because I was getting a bunch of warnings when processing ds000117:

/home/larsoner/python/mne-bids-pipeline/mne_bids_pipeline/_import_data.py:226: RuntimeWarning: The events file, /mnt/bakraid/data/mne_data/ds000117-full/sub-01/ses-meg/meg/sub-01_ses-meg_task-facerecognition_run-04_events.tsv, contains a "stim_type" column. This column should be renamed to "trial_type" for BIDS compatibility.
  raw = read_raw_bids(
/home/larsoner/python/mne-bids-pipeline/mne_bids_pipeline/_import_data.py:226: RuntimeWarning: Did not find any channels.tsv associated with sub-01_ses-meg_task-facerecognition_run-04.

The search_str was "/mnt/bakraid/data/mne_data/ds000117-full/sub-01/**/meg/sub-01_ses-meg*channels.tsv"
  raw = read_raw_bids(
/home/larsoner/python/mne-bids-pipeline/mne_bids_pipeline/_import_data.py:226: RuntimeWarning: Did not find any meg.json associated with sub-01_ses-meg_task-facerecognition_run-04.

The search_str was "/mnt/bakraid/data/mne_data/ds000117-full/sub-01/**/meg/sub-01_ses-meg*meg.json"

# generally be stuff from this file and joblib
tb = tb[-fi:]
break
tb = "".join(traceback.format_list(tb))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this one gives much nicer tracebacks by omitting the _run.py and joblib stuff, so we get this instead of a 5- or 6-level traceback (here where I've added a raise RuntimeError to _02_find_empty_room):

[15:13:59] │ ❌ init/_02_find_empty_room sub-01 ses-meg run-02 A critical error occurred. The error message was: 

Aborting pipeline run. The full traceback is:

  File "/home/larsoner/python/mne-bids-pipeline/mne_bids_pipeline/steps/init/_02_find_empty_room.py", line 70, in find_empty_room
    raise RuntimeError

for frame in stack:
fname = pathlib.Path(frame.filename)
if "steps" in fname.parts:
return fname
else: # pragma: no cover
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... and this can happen when you are running in parallel (e.g. with n_jobs=4). The __mne_bids_pipeline_step__ var will be used.

@@ -29,7 +29,8 @@ def failsafe_run(
) -> Callable:
def failsafe_run_decorator(func):
@functools.wraps(func) # Preserve "identity" of original function
def wrapper(*args, **kwargs):
def __mne_bids_pipeline_failsafe_wrapper__(*args, **kwargs):
__mne_bids_pipeline_step__ = pathlib.Path(inspect.getfile(func)) # noqa
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... and we set the __mne_bids_pipeline_step__ here

@@ -416,6 +416,14 @@
```
"""

read_raw_bids_verbose: Optional[Literal["error"]] = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you mention this new setting in the changelog, please?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@hoechenberger
Copy link
Member

@larsoner LMK when you think it's ready for merge / review :)

@larsoner
Copy link
Member Author

larsoner commented Jul 3, 2023

@hoechenberger yes it should be ready!

@larsoner
Copy link
Member Author

larsoner commented Jul 3, 2023

(I just ran all of ds000117 with mf_mc = True on this branch -- it's what prompted some of the actual changes like read_raw_bids_verbose and the __mne_bids_pipeline_step__!)

Copy link
Member

@agramfort agramfort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hoechenberger I let you merge if happy

thx @larsoner

Copy link
Member

@hoechenberger hoechenberger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantastic

@hoechenberger hoechenberger merged commit fffe29f into mne-tools:main Jul 3, 2023
@larsoner larsoner deleted the refactor branch July 3, 2023 20:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants