Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Enable smoothing at any analysis level #135

Merged
merged 16 commits into from
May 1, 2019

Conversation

adelavega
Copy link
Collaborator

Closes #134

Main issue of debate is the CLI argument. We could split it into two arguments, but I think its fine how it is.

@adelavega
Copy link
Collaborator Author

adelavega commented Apr 25, 2019

Looks like the fail is unrelated:

ERROR: Error installing 'file:///src/fitlins': editable mode is not supported for pyproject.toml-style projects. pip is processing this project as pyproject.toml-style because it has a pyproject.toml file. Since the project has a setup.py and the pyproject.toml has no "build-backend" key for the "build_system" value, you may pass --no-use-pep517 to opt out of pyproject.toml-style processing. See PEP 517 for details on pyproject.toml-style projects.

@adelavega
Copy link
Collaborator Author

pypa/pip#6433

@@ -96,10 +96,11 @@ def get_parser():
help="use BOLD files with the provided description label")

g_prep = parser.add_argument_group('Options for preprocessing BOLD series')
g_prep.add_argument('-s', '--smoothing', action='store', metavar="TYPE:FWHM",
g_prep.add_argument('-s', '--smoothing', action='store', metavar="TYPE:FWHM:LEVEL",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer the levels to be named, e.g. run, session, etc. So instead of -s iso:5:1, it would be -s run:iso:5.

It doesn't seem likely that we'll want to smooth multiple times, does it?

Dockerfile Outdated
@@ -100,7 +100,7 @@ RUN bash -c "source activate neuro \
&& sync

RUN bash -c "source activate neuro \
&& pip install --no-cache-dir -e \
&& pip install --no-use-pep517 --no-cache-dir -e \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fixing this in #136.

@effigies
Copy link
Collaborator

effigies commented Apr 29, 2019

Rebased the PEP517 fix into #137 and merged. Please merge/rebase the latest master.

Dockerfile Outdated Show resolved Hide resolved
@effigies
Copy link
Collaborator

This requires resolution again. Sorry. Also, any objection to switching to LEVEL:TYPE:FWHM, where LEVEL is a string instead of an int?

@adelavega
Copy link
Collaborator Author

What kind of strings? How do you refer to the group level?

@effigies
Copy link
Collaborator

Step names. So run, session, subject, dataset. And group = dataset.

@adelavega
Copy link
Collaborator Author

Of course (duh). That works for me.

@adelavega
Copy link
Collaborator Author

Okay, cool. This is ready to go. I'm not able to test it right now because for some reason our workstation is down. Give it a go @effigies? Otherwise, I'll test tomorrow if our workstation/server is back online.

Copy link
Collaborator

@effigies effigies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. A few changes, and I'll test in the morning.

fitlins/cli/run.py Outdated Show resolved Hide resolved
fitlins/workflows/base.py Show resolved Hide resolved
fitlins/workflows/base.py Outdated Show resolved Hide resolved
@effigies
Copy link
Collaborator

effigies commented May 1, 2019

One last thing to consider is what if we have a two-level model with steps "run" and "dataset", but I might for computational reasons say "I want to smooth after the run level, whether that next level is session, subject or dataset", but I don't want to inspect the model and modify my command line for each specific model.

I see a few options:

  1. Quietly fail to smooth if the smoothing level doesn't match some model step.
  2. Warn/error if the smoothing level doesn't match some model step.
  3. Interpret "session" as applying to the first level session or above (and similarly for other levels).
  4. Permit l1, l2, etc. to explicitly indicate a positional step.

Do you have any preference? From a minimal parsing option, 3 is appealing, but I think 4 + 2 will be the least surprising.

effigies and others added 2 commits May 1, 2019 08:59
Co-Authored-By: adelavega <aleph4@gmail.com>
Co-Authored-By: adelavega <aleph4@gmail.com>
@adelavega
Copy link
Collaborator Author

I had similar concerns when I wanted to add negative indexes (e.g. I want to smooth at the last step (-1))

I would say 2 + 4 is a sensible plan, although I think its a bit of a corner case. I suspect most ppl will smooth either at the first or last level.

@tyarkoni
Copy link
Collaborator

tyarkoni commented May 1, 2019

What are valid values for type?

@effigies
Copy link
Collaborator

effigies commented May 1, 2019

What are valid values for type?

Right now, just iso for isotropic. SUSAN smoothing is planned for the future.

@effigies
Copy link
Collaborator

effigies commented May 1, 2019

It's mostly the fact that they're so closely related that it seems weird to break them into different options.

Additionally, if there's any chance people will want to smooth multiple times at different levels, the current approach allows us to do something like --smooth l1:susan:2 dataset:iso:5 or something. (I'm not sure if that's ever something we'd want, but breaking them apart will make that much worse.)

@tyarkoni
Copy link
Collaborator

tyarkoni commented May 1, 2019

One other thing to keep in mind is that smarter algorithms may need additional parameters... e.g., SUSAN has several. I don't know if these need much twiddling in practice, or if the defaults work well pretty much everywhere, but there's something to be said for using a simple, dumb approach as the default, and letting people explicitly indicate if they want to use something more complex.

I have a hard time imagining a scenario where someone would want to smooth twice, so I don't know if that's even worth supporting right off the bat... it just gives people one more researcher degree of freedom to play with something that would be hard to justify on principled grounds.

@adelavega
Copy link
Collaborator Author

Realistically, how many smoothing algorithms will we really support anyways? Probably only SUSAN and Iso, right?

If so, we could just make them two separate arguments with their own sensible defaults.

Tal, would you think smoothing prior to the group level (as we are), or after the first level is most sensible as a default?

@effigies
Copy link
Collaborator

effigies commented May 1, 2019

Sure, let's do FWHM:LEVEL:TYPE, then. I'd check for param[0] == 'iso', though to interpret it as <FWHM>:l1:iso and emit a FutureWarning that the options have changed. This parameter was added because people needed it, so there might be a few scripts that depend on that behavior that I'd rather not break.

@effigies
Copy link
Collaborator

effigies commented May 1, 2019

My feeling is that the expectation is generally going to be to smooth before any fitting, but I'm okay with violating the expectation as long as it's a defensible default.

@adelavega
Copy link
Collaborator Author

I agree that's probably the expectation as well. So I'm fine with that being the default.

@tyarkoni
Copy link
Collaborator

tyarkoni commented May 1, 2019

Agree with all of the above.

@tyarkoni
Copy link
Collaborator

tyarkoni commented May 1, 2019

Actually, just to clarify: my expectation re: smoothing level would be that it happens as early as possible. Where that becomes ambiguous is when the type of input can vary. I.e., ideally, when fitting run-level models, we'd want to smooth the timeseries inputs, and not the output statistical maps—but I'm assuming that for practical reasons, we're currently doing the latter. OTOH, when fitting group-level models where the earliest inputs are subject-level maps, we'd want to smooth the input volumes, and not the output. Not sure what the solution here is... "Smooth the first set of 3D volumes either found or generated during pipeline execution" captures it semantically, but is maybe a bit counterintuitive.

@effigies
Copy link
Collaborator

effigies commented May 1, 2019

The current default is to smooth the BOLD series, not the statistical maps. This PR adds the possibility of moving the smoothing to immediately prior to any stage.

@adelavega
Copy link
Collaborator Author

Yes, and in the results I showed you I was smoothing prior to the group level. From what I understand, smoothing prior to the first level would be primarily for noise, whereas prior to the group level would be mostly anatomical heterogeneity.

And if I remember correctly, we decided to do the latter as its much faster for the number of models we're running.

In the future, it might be best if we could feed in pre-smoothed run-level inputs to fitlins, especially if using SUSAN which is quite slow.

@tyarkoni
Copy link
Collaborator

tyarkoni commented May 1, 2019

Oh, good. That's what I would expect intuitively for a default—smooth the very first set of inputs.

@effigies
Copy link
Collaborator

effigies commented May 1, 2019

@adelavega Can you also merge master when you work on this next? There are tests now, so might as well make sure you're not breaking them.

@adelavega
Copy link
Collaborator Author

Also to clarify, should the default level be "l1" but it also accepts named levels?

@effigies
Copy link
Collaborator

effigies commented May 1, 2019

Yes, I think so.

@adelavega adelavega requested a review from effigies May 1, 2019 18:34
Copy link
Collaborator

@effigies effigies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic looks good, but nit-picky style changes.

"e.g. `--smothing iso:5` will use an isotropic 5mm FWHM kernel")
"Defaults: LEVEL='subject' TYPE='iso';"
"e.g. `--smoothing 5:dataset:iso will use an isotropic 5mm"
"FWHM on subject-level maps, before evaluating the dataset level.")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a couple minor comments, in particular the default should be l1 and the metavar should use square brackets to indicate optional components. I also think we skip around a little bit and want to restructure the help text, so here's the whole thing rewritten:

    g_prep.add_argument('-s', '--smoothing', action='store', metavar="FWHM[:LEVEL:[TYPE]]",
                        help="Smooth BOLD series with FWHM mm kernel prior to fitting at LEVEL. "
                             "Optional analysis LEVEL (default: l1) may be specified numerically "
                             "(e.g., `l1`) or by name (`run`, `subject`, `session` or `dataset`). "
                             "Optional smoothing TYPE (default: iso) must be one of: `iso` (isotropic). "
                             "e.g., `--smoothing 5:dataset:iso` will perform a 5mm FWHM isotropic "
                             "smoothing on subject-level maps, before evaluating the dataset level.")

Feel free to edit for clarity.

if len(smoothing_params) > 2:
if smoothing_params[2] != 'iso':
raise ValueError(
f"Unknown smoothing type {smoothing_params[1]}")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again have a bunch of little comments, so here's a replacement for L63-L80:

        smoothing_params = smoothing.split(':', 2)
        # Convert old style and warn; this should turn into an (informative) error around 0.5.0
        if smoothing_params[0] == 'iso':
            smoothing_params = (smoothing_params[1], 'l1', smoothing_params[0])
            warnings.warn(
                "The format for smoothing arguments has changed. Please use "
                f"{':'.join(smoothing_params)} instead of {smoothing}.", FutureWarning)
        # Add defaults to simplify later logic
        if len(smoothing_params) == 1:
            smoothing_params.extend(('l1', 'iso'))
        elif len(smoothing_params) == 2:
            smoothing_params.append('iso')

        smoothing_fwhm, smoothing_level, smoothing_type = smoothing_params
        smoothing_fwhm = float(smoothing_fwhm)
        if smoothing_type not in ('iso'):
            raise ValueError(f"Unknown smoothing type {smoothing_type}")

fitlins/workflows/base.py Outdated Show resolved Hide resolved
@adelavega
Copy link
Collaborator Author

Okay, pushed those changes. FYI, might be easier in the future if you do something like:
git fetch origin pull/135/head:enh/smoothing
you can then make your changes locally and you should have the power to push to my branch.

@effigies
Copy link
Collaborator

effigies commented May 1, 2019

Cool. Wanted to give you the opportunity to disagree though...

@effigies effigies changed the title ENH: Add smoothing to higher level models ENH: Enable smoothing at any analysis level May 1, 2019
@effigies effigies merged commit 20af0e2 into poldracklab:master May 1, 2019
@adelavega
Copy link
Collaborator Author

I'm a very agreeable guy...

@adelavega
Copy link
Collaborator Author

It would be cool if you could suggest an entire commit though. It's a bit too fussy with the web interface.

@adelavega adelavega deleted the enh/smoothing branch May 1, 2019 19:36
@effigies
Copy link
Collaborator

effigies commented May 1, 2019

Sure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Smoothing at higher levels
3 participants