[ENH]: Add support for subject(s) file for junifer run #182

Merged
merged 19 commits into from
Oct 10, 2023
Commits
9075c93
update: add support for reading elements from file
synchon Aug 24, 2023
66ea53a
update: add test for reading elements from file
synchon Aug 24, 2023
517a1c5
update: add filter() method for BaseDataGrabber
synchon Aug 24, 2023
744b36a
update: add tests for BaseDataGrabber.filter()
synchon Aug 24, 2023
3c79b64
update: adapt code to allow element filtering while fitting MarkerCol…
synchon Aug 24, 2023
96d32a0
chore: improve type annotation and docstring in CLI functions
synchon Aug 24, 2023
9e011f2
chore: improve commentary in CLI functions
synchon Aug 24, 2023
f19f110
chore: improve log messages in CLI functions
synchon Aug 24, 2023
4aa1d49
chore: improve docstring in CLI function tests
synchon Aug 24, 2023
e823da4
chore: improve type annotation and docstring in BaseDataGrabber
synchon Aug 24, 2023
7e24f46
fix: correct element for test_run_single_element_with_preprocessing()
synchon Sep 6, 2023
dc95cb0
docs: add info about specifying elements via a file for junifer run
synchon Sep 6, 2023
2d24214
chore: add changelog 182.enh
synchon Sep 6, 2023
8c757dc
update: make api.cli._parse_elements_file() return list of tuple usin…
synchon Oct 6, 2023
430ea63
update: slight refactor in element parsing logic in api.cli._parse_el…
synchon Oct 6, 2023
17703ff
chore: add edge cases for test_run_using_element_file()
synchon Oct 6, 2023
569782f
fix: remove trailing whitespace from cell entries in dataframe when p…
synchon Oct 10, 2023
123d815
update: add test for multi-element access via element file
synchon Oct 10, 2023
8666562
chore: update docstrings for tests in test_cli.py
synchon Oct 10, 2023
1 change: 1 addition & 0 deletions docs/changes/newsfragments/182.enh
@@ -0,0 +1 @@
+Support element(s) to be specified via text file for ``--element`` option of ``junifer run`` by `Synchon Mandal`_
50 changes: 49 additions & 1 deletion docs/using/running.rst
@@ -29,20 +29,68 @@ The ``run`` command accepts the following additional arguments:
 * ``--element``: The *element* to run. If not specified, all elements will be
   run. This parameter can be specified multiple times to run multiple elements.
   If the *element* requires several parameters, they can be specified by
-  separating them with ``,``.
+  separating them with ``,``. It also accepts a file (e.g., ``elements.txt``)
+  containing complete or partial element(s).

 Example of running two elements:
 --------------------------------

 .. code-block:: bash

     junifer run config.yaml --element sub-01 --element sub-02

+You can also specify the elements via a text file like so:
+
+.. code-block:: bash
+
+    junifer run config.yaml --element elements.txt
+
+And the corresponding ``elements.txt`` would be like so:
+
+.. code-block:: text
+
+    sub-01
+    sub-02
+
 Example of elements with multiple parameters and verbose output:
 ----------------------------------------------------------------

 .. code-block:: bash

     junifer run --verbose info config.yaml --element sub-01,ses-01

+You can also specify the elements via a text file like so:
+
+.. code-block:: bash
+
+    junifer run --verbose info config.yaml --element elements.txt
+
+And the corresponding ``elements.txt`` would be like so:
+
+.. code-block:: text
+
+    sub-01,ses-01
+
+In case you wanted to run for all possible sessions (e.g., ``ses-01``,
+``ses-02``, ``ses-03``) but only for ``sub-01``, you could also do:
+
+.. code-block:: bash
+
+    junifer run --verbose info config.yaml --element sub-01
+
+or,
+
+.. code-block:: bash
+
+    junifer run --verbose info config.yaml --element elements.txt
+
+and then the ``elements.txt`` would be like so:
+
+.. code-block:: text
+
+    sub-01
+

 .. _collect:

 Collecting Results
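Reviewer note: the documented file format (one element per line, with the parts of a multi-part element separated by ``,``) is easy to generate from a script. The sketch below is only an illustration and is not part of this PR. The helper names `write_elements_file` and `build_run_command` are hypothetical, and the command is built but deliberately not executed.

```python
import csv
import tempfile
from pathlib import Path
from typing import Iterable, List, Sequence


def write_elements_file(path: Path, elements: Iterable[Sequence[str]]) -> None:
    """Write one element per line, comma-separating multi-part elements.

    Hypothetical helper: each element must be a tuple or list of strings,
    e.g. ("sub-01",) or ("sub-01", "ses-01").
    """
    with open(path, "w", newline="") as f:
        # Use "\n" so the file matches the plain-text examples in the docs
        writer = csv.writer(f, lineterminator="\n")
        for element in elements:
            writer.writerow(element)


def build_run_command(config: str, elements_file: Path) -> List[str]:
    """Build (but do not run) the documented ``junifer run`` invocation."""
    return ["junifer", "run", config, "--element", str(elements_file)]


with tempfile.TemporaryDirectory() as tmp:
    elements_path = Path(tmp) / "elements.txt"
    write_elements_file(
        elements_path, [("sub-01", "ses-01"), ("sub-02", "ses-01")]
    )
    file_text = elements_path.read_text()
    command = build_run_command("config.yaml", elements_path)
```

The resulting list could be handed to `subprocess.run` on a machine where junifer is installed; a partial element such as `("sub-01",)` is written as a single-column line.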
79 changes: 65 additions & 14 deletions junifer/api/cli.py
@@ -8,9 +8,10 @@
 import subprocess
 import sys
 from pathlib import Path
-from typing import Dict, List, Union
+from typing import Dict, List, Tuple, Union

 import click
+import pandas as pd

 from ..utils.logging import (
     configure_logging,
@@ -32,27 +33,45 @@
 )


-def _parse_elements(element: str, config: Dict) -> Union[List, None]:
+def _parse_elements(element: Tuple[str], config: Dict) -> Union[List, None]:
     """Parse elements from cli.

     Parameters
     ----------
-    element : str
-        The element to operate on.
+    element : tuple of str
+        The element(s) to operate on.
     config : dict
         The configuration to operate using.

     Returns
     -------
-    list
-        The element(s) as list.
+    list or None
+        The element(s) as list or None.
+
+    Raises
+    ------
+    ValueError
+        If no element is found either in the command-line options or
+        the configuration file.
+
+    Warns
+    -----
+    RuntimeWarning
+        If elements are specified both via the command-line options and
+        the configuration file.

     """
     logger.debug(f"Parsing elements: {element}")
+    # Early return None to continue with all elements
     if len(element) == 0:
         return None
-    # TODO: If len == 1, check if its a file, then parse elements from file
-    elements = [tuple(x.split(",")) if "," in x else x for x in element]
+    # Check if the element is a file for single element;
+    # if yes, then parse elements from it
+    if len(element) == 1 and Path(element[0]).resolve().is_file():
+        elements = _parse_elements_file(Path(element[0]).resolve())
+    else:
+        # Process multi-keyed elements
+        elements = [tuple(x.split(",")) if "," in x else x for x in element]
     logger.debug(f"Parsed elements: {elements}")
     if elements is not None and "elements" in config:
         warn_with_log(
@@ -61,19 +80,49 @@ def _parse_elements(element: str, config: Dict) -> Union[List, None]:
             "over the configuration file. That is, the elements specified "
             "in the command line will be used. The elements specified in "
             "the configuration file will be ignored. To remove this warning, "
-            'please remove the "elements" item from the configuration file.'
+            "please remove the `elements` item from the configuration file."
         )
     elif elements is None:
+        # Check in config
         elements = config.get("elements", None)
         if elements is None:
             raise_error(
-                "The 'elements' key is set in the configuration, but its value"
-                " is 'None'. It is likely that there is an empty 'elements' "
+                "The `elements` key is set in the configuration, but its value"
+                " is `None`. It is likely that there is an empty `elements` "
                 "section in the yaml configuration file."
             )
     return elements


+def _parse_elements_file(filepath: Path) -> List[Tuple[str, ...]]:
+    """Parse elements from file.
+
+    Parameters
+    ----------
+    filepath : pathlib.Path
+        The path to the element file.
+
+    Returns
+    -------
+    list of tuple of str
+        The element(s) as list.
+
+    """
+    # Read CSV into dataframe
+    csv_df = pd.read_csv(
+        filepath,
+        header=None,  # no header # type: ignore
+        index_col=False,  # no index column
+        skipinitialspace=True,  # no leading space after delimiter
+    )
+    # Remove trailing whitespace in cell entries
+    csv_df_trimmed = csv_df.apply(
+        lambda x: x.str.strip() if x.dtype == "object" else x
+    )
+    # Convert to list of tuple of str
+    return list(map(tuple, csv_df_trimmed.to_numpy().astype(str)))
+
+
 def _validate_verbose(
     ctx: click.Context, param: str, value: str
 ) -> Union[str, int]:
@@ -133,7 +182,9 @@ def cli() -> None: # pragma: no cover
     callback=_validate_verbose,
     default="info",
 )
-def run(filepath: click.Path, element: str, verbose: Union[str, int]) -> None:
+def run(
+    filepath: click.Path, element: Tuple[str], verbose: Union[str, int]
+) -> None:
     """Run command for CLI.

     \f
@@ -142,8 +193,8 @@ def run(filepath: click.Path, element: str, verbose: Union[str, int]) -> None:
     ----------
     filepath : click.Path
         The filepath to the configuration file.
-    element : str
-        The element to operate using.
+    element : tuple of str
+        The element to operate on.
     verbose : click.Choice
         The verbosity level: warning, info or debug (default "info").

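Reviewer note: the parsing flow in `junifer/api/cli.py` can be tried out in isolation with the rough sketch below. It is not the PR's implementation: it swaps `pd.read_csv` for the standard-library `csv` module so that it runs without pandas, and the names `parse_elements_file` and `parse_elements` are stand-ins that leave out the configuration-file handling, logging and warnings.

```python
import csv
import tempfile
from pathlib import Path
from typing import List, Optional, Sequence, Tuple, Union

Element = Union[str, Tuple[str, ...]]


def parse_elements_file(filepath: Path) -> List[Tuple[str, ...]]:
    """Read comma-separated elements, one per line, trimming whitespace.

    Stand-in for the pandas-based parser in the PR (assumption: same
    line-per-element, comma-separated layout).
    """
    with open(filepath, newline="") as f:
        reader = csv.reader(f, skipinitialspace=True)
        return [
            tuple(cell.strip() for cell in row)
            for row in reader
            # Ignore empty or whitespace-only lines
            if any(cell.strip() for cell in row)
        ]


def parse_elements(element: Sequence[str]) -> Optional[List[Element]]:
    """Dispatch on the ``--element`` values: none, one file, or inline."""
    # No --element given: the caller falls back to all elements
    if len(element) == 0:
        return None
    # A single value that names an existing file is read as an element file
    if len(element) == 1 and Path(element[0]).resolve().is_file():
        return list(parse_elements_file(Path(element[0]).resolve()))
    # Otherwise split multi-part elements on commas
    return [tuple(x.split(",")) if "," in x else x for x in element]


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "elements.txt"
    path.write_text(" sub-01 , ses-01 \n sub-02, ses-01 ")
    from_file = parse_elements((str(path),))

inline = parse_elements(("sub-01,ses-01", "sub-02"))
nothing = parse_elements(())
```

One consequence of the single-value file check is worth keeping in mind: an element whose name happens to match an existing file in the working directory would be read as a file instead of being treated as an element.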
4 changes: 3 additions & 1 deletion junifer/api/functions.py
@@ -163,7 +163,9 @@ def run(
     # Fit elements
     with datagrabber_object:
         if elements is not None:
-            for t_element in elements:
+            for t_element in datagrabber_object.filter(
+                elements  # type: ignore
+            ):
                 mc.fit(datagrabber_object[t_element])
         else:
             for t_element in datagrabber_object:
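Reviewer note: the body of `BaseDataGrabber.filter()` is not part of this diff view, so the sketch below is only a guess at the behaviour the docs describe, where a partial element such as `sub-01` selects every session of that subject. It assumes prefix matching on element tuples, and `filter_elements` is a hypothetical free function, not junifer's method.

```python
from typing import Iterator, Sequence, Tuple, Union

Element = Tuple[str, ...]
Selection = Union[str, Sequence[str]]


def _as_tuple(selection: Selection) -> Element:
    """Normalise a bare string or a sequence of parts to a tuple of str."""
    return (selection,) if isinstance(selection, str) else tuple(selection)


def filter_elements(
    available: Sequence[Element], selections: Sequence[Selection]
) -> Iterator[Element]:
    """Yield available elements whose leading parts match any selection.

    Assumption: a selection with fewer parts than the element acts as a
    prefix, so ("sub-01",) matches ("sub-01", "ses-01").
    """
    wanted = [_as_tuple(s) for s in selections]
    for element in available:
        if any(element[: len(w)] == w for w in wanted):
            yield element


available = [
    ("sub-01", "ses-01"),
    ("sub-01", "ses-02"),
    ("sub-02", "ses-01"),
]
partial = list(filter_elements(available, ["sub-01"]))
complete = list(filter_elements(available, [("sub-02", "ses-01")]))
```

Under this reading, iterating over the filtered elements in the fit loop only touches elements the datagrabber actually provides, which is what lets a partial element in the file expand to several runs.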
113 changes: 110 additions & 3 deletions junifer/api/tests/test_cli.py
@@ -5,13 +5,13 @@
 # License: AGPL

 from pathlib import Path
-from typing import Tuple
+from typing import List, Tuple

 import pytest
 from click.testing import CliRunner
 from ruamel.yaml import YAML

-from junifer.api.cli import collect, run, selftest, wtf
+from junifer.api.cli import _parse_elements_file, collect, run, selftest, wtf


 # Configure YAML class
@@ -35,7 +35,16 @@
 def test_run_and_collect_commands(
     tmp_path: Path, elements: Tuple[str, ...]
 ) -> None:
-    """Test run and collect commands."""
+    """Test run and collect commands.
+
+    Parameters
+    ----------
+    tmp_path : pathlib.Path
+        The path to the test directory.
+    elements : tuple of str
+        The parametrized elements to operate on.
+
+    """
     # Get test config
     infile = Path(__file__).parent / "data" / "gmd_mean.yaml"
     # Read test config
@@ -74,6 +83,104 @@ def test_run_and_collect_commands(
     assert collect_result.exit_code == 0


+@pytest.mark.parametrize(
+    "elements",
+    [
+        "sub-01",
+        "sub-01\nsub-02",
+        " sub-01 ",
+        "sub-01\n sub-02",
+    ],
+)
+def test_run_using_element_file(tmp_path: Path, elements: str) -> None:
+    """Test run command using element file.
+
+    Parameters
+    ----------
+    tmp_path : pathlib.Path
+        The path to the test directory.
+    elements : str
+        The parametrized elements to write to the element file.
+
+    """
+    # Create test file
+    test_file_path = tmp_path / "elements.txt"
+    with open(test_file_path, "w") as f:
+        f.write(elements)
+
+    # Get test config
+    infile = Path(__file__).parent / "data" / "gmd_mean.yaml"
+    # Read test config
+    contents = yaml.load(infile)
+    # Working directory
+    workdir = tmp_path / "workdir"
+    contents["workdir"] = str(workdir.resolve())
+    # Output directory
+    outdir = tmp_path / "outdir"
+    # Storage
+    contents["storage"]["uri"] = str(outdir.resolve())
+    # Write new test config
+    outfile = tmp_path / "in.yaml"
+    yaml.dump(contents, stream=outfile)
+    # Run command arguments
+    run_args = [
+        str(outfile.absolute()),
+        "--verbose",
+        "debug",
+        "--element",
+        str(test_file_path.resolve()),
+    ]
+    # Invoke run command
+    run_result = runner.invoke(run, run_args)
+    # Check
+    assert run_result.exit_code == 0
+
+
+@pytest.mark.parametrize(
+    "elements, expected_list",
+    [
+        ("sub-01,ses-01", [("sub-01", "ses-01")]),
+        (
+            "sub-01,ses-01\nsub-02,ses-01",
+            [("sub-01", "ses-01"), ("sub-02", "ses-01")],
+        ),
+        ("sub-01, ses-01", [("sub-01", "ses-01")]),
+        (
+            "sub-01, ses-01\nsub-02, ses-01",
+            [("sub-01", "ses-01"), ("sub-02", "ses-01")],
+        ),
+        (" sub-01 , ses-01 ", [("sub-01", "ses-01")]),
+        (
+            " sub-01 , ses-01 \n sub-02, ses-01 ",
+            [("sub-01", "ses-01"), ("sub-02", "ses-01")],
+        ),
+    ],
+)
+def test_multi_element_access(
+    tmp_path: Path, elements: str, expected_list: List[Tuple[str, ...]]
+) -> None:
+    """Test multi-element parsing.
+
+    Parameters
+    ----------
+    tmp_path : pathlib.Path
+        The path to the test directory.
+    elements : str
+        The parametrized elements to write to the element file.
+    expected_list : list of tuple of str
+        The parametrized list of element tuples to expect.
+
+    """
+    # Create test file
+    test_file_path = tmp_path / "elements_multi.txt"
+    with open(test_file_path, "w") as f:
+        f.write(elements)
+    # Load element file
+    read_elements = _parse_elements_file(test_file_path)
+    # Check
+    assert read_elements == expected_list
+
+
 def test_wtf_short() -> None:
     """Test short version of wtf command."""
     # Invoke wtf command
2 changes: 1 addition & 1 deletion junifer/api/tests/test_functions.py
@@ -120,7 +120,7 @@ def test_run_single_element_with_preprocessing(tmp_path: Path) -> None:
         preprocessor={
             "kind": "fMRIPrepConfoundRemover",
         },
-        elements=["sub-001"],
+        elements=["sub-01"],
     )
     # Check files
     files = list(outdir.glob("*.sqlite"))