ENH: Add ability for prep to create datasets where the source for training samples $x$ is audio #652
Status: Closed · Tracked by #614 · Label: ENH: enhancement (enhancement; new feature or request)
Pasting in the code below:

```python
from __future__ import annotations

import pathlib
from typing import Callable, Literal

import numpy.typing as npt
...
from .helper import x_vectors_from_df


class WindowDataset(VisionDataset):
    ...
    # class attribute, constant used by several methods
    # with window_inds, to mark invalid starting indices for windows
    INVALID_WINDOW_VAL = -1

    VALID_SPLITS = ("train", "val", "test", "all")
    def __init__(
        self,
        root: str | pathlib.Path,
        x_source: Literal['audio', 'spect'],
        window_inds: npt.NDArray,
        source_ids: npt.NDArray,
        source_inds: npt.NDArray,
        source_paths: list | npt.NDArray,
        annots: list,
        labelmap: dict,
        timebin_dur: float,
        window_size: int,
        spect_key: str = "s",
        timebins_key: str = "t",
        transform: Callable | None = None,
        target_transform: Callable | None = None,
    ):
r"""Initialize a WindowDataset instance.
Parameters
----------
root : str, Path
Path to a .csv file that represents the dataset.
Name 'root' is used for consistency with torchvision.datasets.
x_source: str
One of {'audio', 'spect'}. The source
of the data, either audio files ('audio')
or spectrograms in array files ('spect'),
from which we take windows.
These windows become the samples :math:`x`
that are inputs for the network during training.
...
"""
        ...  # implement as described above using `x_source` for control flow


# note use of constant in pre-conditions
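# --- below: the helper module, imported above as `.helper` ---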
def x_vectors_from_df(
    df: pd.DataFrame,
    x_source: Literal['audio', 'spect'],
    split: str,
    window_size: int,
    audio_format: str = "wav",
    spect_key: str = "s",
    timebins_key: str = "t",
    crop_dur: float | None = None,
    timebin_dur: float | None = None,
    labelmap: dict | None = None,
) -> tuple[npt.NDArray, npt.NDArray, npt.NDArray]:
r"""Get source_ids and spect_ind_vector from a dataframe
that represents a dataset of vocalizations.
See ``vak.datasets.WindowDataset`` for a
detailed explanation of these vectors.
Parameters
----------
df : pandas.DataFrame
That represents a dataset of vocalizations.
x_source: str
One of {'audio', 'spect'}. The source
of the data, either audio files ('audio')
or spectrograms in array files ('spect'),
from which we take windows.
These windows become the samples :math:`x`
that are inputs for the network during training.
window_size : int
Size of the window, in number of time bins,
that is taken from the audio array
or spectrogram to become a training sample.
audio_format : str
Valid audio file format. One of {"wav", "cbin"}.
Defaults to "wav".
spect_key : str
Key to access spectograms in array files.
Default is "s".
timebins_key : str
Key to access time bin vector in array files.
Default is "t".
crop_dur : float
Duration to which dataset should be "cropped". Default is None,
in which case entire duration of specified split will be used.
timebin_dur : float
Duration of a single time bin in spectrograms. Default is None.
Used when "cropping" dataset with ``crop_dur``, and required if a
value is specified for that parameter.
labelmap : dict
Dict that maps labels from dataset to a series of consecutive integers.
To create a label map, pass a set of labels to the `vak.utils.labels.to_map` function.
Used when "cropping" dataset with ``crop_dur``
to make sure all labels in ``labelmap`` are still
in the dataset after cropping.
Required if a value is specified for ``crop_dur``.
Returns
-------
source_ids : numpy.ndarray
Represents the "id" of any spectrogram,
i.e., the index into spect_paths that will let us load it.
source_inds : numpy.ndarray
Valid indices of windows we can grab from each
audio array or spectrogram.
window_inds : numpy.ndarray
Vector of all valid starting indices of all windows in the dataset.
This vector is what is used by PyTorch to determine
the number of samples in the dataset, via the
``WindowDataset.__len__`` method.
Without cropping, a dataset with ``t`` total time bins
across all audio arrays or spectrograms will have
(``t`` - ``window_size``) possible windows
with indices (0, 1, 2, ..., t-1).
But cropping with ``crop_dur`` will
remove some of these indices.
"""
    from .class_ import WindowDataset  # avoid circular import

    # ---- pre-conditions
    if x_source not in constants.VALID_X_SOURCES:
        raise ValueError(
            f"`x_source` must be one of {constants.VALID_X_SOURCES} but was: {x_source}"
        )
```
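
For reference, the `constants.VALID_X_SOURCES` that the pre-condition checks against is presumably just the two allowed values of `x_source` (an assumption based on this issue, not verified against the repo):

```python
# presumed definition, matching the two values of `x_source` discussed here
VALID_X_SOURCES = ("audio", "spect")
```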
I ended up just pushing a branch "add-prep-audio-dataset" with the changes I'd made in `WindowDataset` + the helper for audio format; will find a way to incorporate those changes via git suffering later.
NickleDave added a commit that referenced this issue on Jun 25, 2023:
…ixes #630 #652 #667 (#670) * WIP: Add vak/nets/das sub-package with DASNet * Add script to generate test data for DAS * DEV: Include DAS data in source test data Decide to include in 'source' not 'generated' since it isn't generated by vak proper, and I don't want to have to rerun this every time we generate the generated test data. * TST: Add tests/fixtures/das.py * TST: Add tests/test_nets/test_das/ * Add src/vak/models/das.py * DOC: Add comment about Conv2dTF to nets/tweetynet module * CLN/DOC: Revise docstrings in models/tweetynet.py * Add tests/test_models/test_das.py * Add constants.VALID_X_SOURCES, used in datasets.window_dataset * CLN: Rename internal WindowedFrameClassificationModel variable -> `out` * WIP: Add src/vak/datasets/das.py * Rewrite VocalDataset as FrameClassificationEvalDataset * Rewrite WindowDataset as FrameClassificationWindowDataset * fixup Rewrite WindowDataset as FrameClassificationWindowDataset * Fix WindowedFrameClassificationModel.validation_step to handle case where batch has 4 dimensions (one extra) * Fix vak.prep.prep_helper.move_files_into_split_subdirs so it does not move files in 'None' split * WIP: Rewrite vak.prep.prep to create datasets as concatenated arrays * WIP: Add function to prep_helper to convert spectrogram dataset to concatenated arrays * WIP: Add function to prep_helper: make_arrays_for_each_split * WIP: Writing vak.prep.prep_helper.make_arrays_for_each_split * WIP: Writing unit test for vak.prep.prep_helper.make_arrays_for_each_split * Move frame_classification_eval_dataset into vak/datasets/frame_classification * Move window_dataset into vak/dataset/frame_classification/ * Add vak/datasets/frame_classification/constants.py with names of arrays that are sources of data for the frame classification dataset classes * Add src/vak/datasets/frame_classification/__init__.py with imports * Fix relative imports in src/vak/datasets/frame_classification/window_dataset/helper.py * Fix import in src/vak/datasets/frame_classification/window_dataset/__init__.py -- FrameClassificationWindowDataset * Fix import in src/vak/datasets/frame_classification/__init__.py, FrameClassificationWindowDataset * Change imports in src/vak/datasets/__init__.py, import frame_classification sub-package * Use filenames from vak.datasets.frame_classification.constants in vak.prep.prep_helper.make_arrays_for_each_split * Rename VocalDataset -> FrameClassificationEvalDataset throughtout src/vak/ * WIP: Write more of unit test for vak.prep.prep_helper.make_arrays_for_each_split * Rename WindowDataset -> FrameClassificationWindowDataset throughtout src/vak/ * Fix imports from (moved/renamed) modules vocal_dataset, window_dataset in src/vak/ * Fix import of FrameClassificationEvalDataset in src/vak/predict.py * Factor out make_arrays helper function in src/vak/prep/prep_helper.py * Further rewrite make_frame_classification_arrays_from_spectrogram_dataset, using common.annotation.from_df * Make further fixes to unit test: test_make_frame_classification_arrays_from_spectrogram_dataset * Remove out-dated comment from src/vak/prep/prep_helper.py * WIP: Add src/vak/prep/audio_dataset.py * Add dataset_type and input_type attributes to PrepConfig * Add options 'dataset_type' and 'input_type' to vak/config/valid.toml * Add vak/prep/frame_classification.py with prep function * Rewrite vak/prep/prep.py to call dataset prep functions, e.g. 
frame_classification.prep * Rename env files in tests/scripts used to generate DAS test data * Add docstring at top of tests/scripts/generate_das_test_data.py * Rename env in das-test-data-env.yml * Make prep.frame_classification sub-package, move frame_classification and prep_helper into it, fix imports, rename prep_helper -> helper * Fix import of prep_helper -> helper in frame_classification/frame_classification.py * Move tests/test_prep/test_prep_helper.py into new sub-package: tests/test_prep/test_frame_classification * Move test_prep into tests/test_prep/test_frame_classification/, fix name vak.prep.prep -> frame_classification.prep * Fix/add imports in vak/prep/__init__.py * Add module docstring in prep/spectrogram_dataset/__init__.py * Add src/vak/prep/constants.py * Use constants.VALID_PURPOSES in vak/prep/prep.py * Remove VALID_PURPOSES from vak.prep.frame_classification.helper, moved into vak.prep.constants * Use constants.VALID_PURPOSES in prep/frame_classification/frame_classification.py * Fix docstring at top of test_prep/test_frame_classification/test_prep.py * Use metadata to get timebin duration in vak/eval/eval.py * Add tests/test_prep/test_prep.py with unit test that mocks prep to test it calls correct dataset prep functions * Remove default values for dataset_type and input_type from PrepConfig * Import constants module in vak/prep/__init__.py * Remove INPUT_TYPES, DATASET_TYPE_FUNCTION_MAP and DATASET_TYPES from vak/prep/prep.py -- moving into constants * Add INPUT_TYPES, DATASET_TYPE_FUNCTION_MAP and DATASET_TYPES to vak/prep/constants.py * Use prep.constants to check pre-conditions in frame_classification.py * Use prep.constants for dataset_type and input_type validators in PrepConfig * Add dataset_type and input_type options to configs in tests/data_for_tests/configs/ * Fix cli/prep.py to pass dataset_type and input_type args to vak.prep.prep * Use constants to get DATASET_TYPES and INPUT_TYPES in vak/prep/prep.py * Fix dataset type and if-else statements in vak/prep/prep.py * Pass input_type arg into frame_classification.prep inside vak.prep.prep * Fix arg order and add missing args to frame_classification.prep in vak.prep.prep * Fix ref to class variable that was remove in window_dataset/helper.py, replace with constant in module * Use constants from vak.datasets.frame_classification.constans in frame_classification_eval_dataset.py * Remove INDS_IN_SOURCE_ARRAY_FILENAME from vak.datasets.frame_classification.constants, no longer used * Fix imports, add __all__ in vak/datasets/frame_classification/__init__.py * Use datasets.frame_classification.constants to refer to array name constants in src/vak/prep/frame_classification/helper.py * Use datasets.frame_classification.constants to refer to array name constants in vak.datasets.frame_classificaton.window_dataset.class_ * Fix error message formatting in src/vak/prep/frame_classification/frame_classification.py * Add functions to prep/frame_classification/helper.py and use there - Add `sort_source_paths_and_annots_by_label_freq` - Add `crop_arrays_keep_classes` - If `annots` are passed into `make_frame_classification_arrays_from_spect_and_annot_paths` then that function calls `sort_source_paths_and_annots_by_label_freq` to sort before (potentially) cropping - Remove `inds_in_source_vec` from functions in this module, it is no longer used * Call `make_frame_classification_arrays_from_spectrogram_dataset` in vak.prep.frame_classification.prep * Make fixes in vak/prep/frame_classification/helper.py - Check `if 
frame_labels is not None` to avoid numpy error about truth values of a narray - Save all the arrays we create (forgot to save ``inputs``) - Copy `source_paths` and `annots` inside `sort_source_paths_and_annots_by_label_freq` so we don't remove items from originals, which breaks downstream logic -- other functions that uses these arrays/lists * Rewrite `make_learncurve_splits_from_dataset_df` to make and save arrays the same way `frame_classification.prep` does, but for each training duration / replicate in the learning curve. We save the arrays in sub-directories inside `./dataset_path/learncurve/`. * Remove `window_size` parameter from `prep.frame_classification.prep`, no longer needed" * Remove `window_size` parameter from `vak.prep.prep`, no longer needed * Remove `window_size` arg passed into `vak.prep.prep` inside `vak.cli.prep` -- arg no longer exists" * Fix log message in vak/prep/frame_classification/helper.py to avoid error * Chang evak.prep.spectrogram_dataset.spect_helper.make_dataframe_of_spect_files to convert mat files to npz and save in spect_output_dir * Pass `spect_output_dir` arg into `make_dataframe_of_spect_files` inside `vak.prep.spectrogram_dataset.prep_spectrogram_dataset` * Fix how make_dataframe_of_spect_files saves npz files -- need to add extension so it's correct in returned dataframe * Change prep_spectrogram_dataset to always make spect_output_dir with timestamp * Fix vak.train.train to make train_dataset by calling FrameClassificationWindowDataset.from_dataset_path * Add `learncurve` module to `common` with helper function `get_train_dur_replicate_split_name` * Rewrite `vak.prep.learncurve.make_learncurve_splits_from_dataset_df` We now make one directory in `dataset_path` for every split, and save arrays in those. This means `WindowDataset.from_dataset_path` can work the same way for these splits/folds used in a learning curve and the overall splits of the dataset, instead of needing to add logic that special cases learning curve. It works because the learncurve split name becomes a sub-directory of `dataset_path`, just like `train`/`test`/`val` splits. We no longer make a dataframe / csv for each split and then make a json file where the keys are training set durations + replicates and the values are the csv paths. Instead we make a single dataframe containing all the split dataframes, and we save that over the original `dataset_df` passed in, located at `csv_path`. 
* Add train_dur and replicate_num columns to csv saved by `vak.prep.learncurve.make_learncurve_splits_from_dataset_df` -- used by vak.learncurve function * Add `split` parameter to `vak.train.train`, will be used by `vak.learncurve.learncurve` * Rewrite `vak.learncurve.learncurve` to use train_durs and replicate_nums from dataset_df, and get split name from `common.learncurve` then pass it into `vak.train.train` * Fix FrameClassificationWindowDataset to load timebin_dur from metadata and use that for duration property * Fix FrameClassificationEvalDataset to load timebin_dur from metadata and use that for duration property * fixup window_dataset/class_.py * fixup src/vak/datasets/frame_classification/frame_classification_eval_dataset.py * Add `shape` property back to FrameClassificationWindowDataset * Fix `val_dataset` in `vak.train.train` to use `from_dataset_path` classmethod * Rewrite as , uses arrays from frame classification dataset * Use `StandardizeSpect.fit_dataset_path` classmethod in vak.train.train * Fix `vak.common.learncurve.get_train_dur_replicate_split_name` to cast `train_dur` to float and `replicate_num` to int * Add `shape` property back to FrameClassificationEvalDataset * In vak.eval.eval use FrameClassificationEvalDataset.from_dataset_path, remove unneeded use of dataset_csv_path, move metadata to where it's used * Fix how vak.learncurve.learncurve builds dataframe of results * Change `train_set_dur` -> `train_dur` in vak.learncurve.learncurve for consistency * Delete datasets/frame_classification/window_dataset/helper.py, no longer used * Rename FrameClassificationWindowDataset -> WindowDataset, move up from sub-package into module inside vak.datasets.frame_classification" * Rename FrameClassificationEvalDataset -> FramesDataset * Fix names -> `WindowDataset`, `FramesDataset` throughout rest of package * Re-organize and make fixes in vak/prep - Rename `make_frame_classification_arrays_from_spect_and_annot_paths` to `make_frame_classification_arrays_from_source_paths_and_annots` - Add `spect_key` and `timebins_key` parameters to that function - move `learncurve` module into `frame_classification` since it specifically makes learncurve datasets for a frame classification model - use `spect_key` and `timebins_key` in `learncurve.make_learncurve_splits_from_dataset_df` when calling `make_frame_classification_arrays_from_source_paths_and_annots` * Fix relative imports in datasets/frame_classification/window_dataset.py * Rewrite/rename `prep.audio_dataset.make_dataframe_of_audio_files` -> `prep_audio_dataset` * WIP: Rewrite vak.prep.frame_classification.prep to build datasets from audio also * Add pre-conditions for input type and revise log messages in prep/frame_classification/frame_classification.py * Fix imports in vak/prep/__init__.py * Fix imports in vak/prep/audio_dataset.py * Make fixes so that vak.prep.audio_dataset.prep_audio_dataset runs without crash * BUG: Fix `prep_audio_dataset` to use `audio_path` not `audio_file` in nested function for dask parallelization * Move `datasets.seq.validators` into `prep`, rewrite to work with audio - Have it take `dataset_df` instead of `dataset_csv_path`, etc., as returned by `prep_spectrogram_dataset` or `prep_audio_dataset`. - We already have 'duration' as a column in `dataset_df` so that's all we need, no reason to re-open files - Note this means we need to *keep* the 'duration' column! 
- Make some docstring fixes too * Clean up type hints + docstrings + error message for vak.common.annotation.has_unlabeled * fixup src/vak/prep/sequence_dataset.py * Fix frame_classificaton.prep to use sequence_dataset.has_unlabeled_segments * Delete function `prep.frame_classification.helper.move_files_into_split_subdirs`, no longer used * Move functions from frame_classifcation.helper into new module prep.dataset_df_helper * Delete functions from prep/audio_dataset.py that just repeat logic of functions that were already in datasets.frame_classification.helper * Rename/rewrite frame_classification/helper.py -> dataset_arrays.py; functions now work with both input types, spect and audio * Fix type annotations + missing import in prep/dataset_df_helper.py * Fix imports in prep/frame_classification/frame_classification.py * Fix import and function name and add arg to docstring in prep/frame_classification/learncurve.py * Add arg to docstring and fix function name in prep/frame_classification/dataset_arrays.py * Fix imports and __all__ in vak/prep/frame_classification/__init__.py * Fix import in vak/prep/__init__.py * Fix how vak.prep.frame_classification.dataset_arrays.make_from_dataset_df validates and handles input_type * Add missing parameter `audio_format` to `make_learncurve_splits_from_dataset_df`, fix args to `make_from_source_paths_and_annots` in /prep/frame_classification/learncurve.py * Add missing args in calls to functions in src/vak/prep/frame_classification/frame_classification.py * Fix arg order in call to make_from_source_paths_and_annots in prep/frame_classification/dataset_arrays.py * Make sure that vak.prep.sequence_dataset.has_unlabeled segments returns type bool, not np.bool_ * Fix how we concatenate inputs in prep/frame_classification/dataset_arrays.py, to handle 1-d audio vectors too * Remove `validate_and_get_timebin_dur` from vak.prep.dataset_df_helper; since it is specific to frame classification prep, will make validator module in that sub-package instead * Add prep/frame_classification/validators.py with function `validate_and_get_frame_dur` * Import validators module in prep/frame_classification/__init__.py * In vak.prep.frame_classification.prep, use validators.validate_and_get_frame_dur, and rename `timebin_dur` -> `frame_dur` * Fix docstrings in prep/frame_classification/dataset_arrays.py * Rename Metadata attribute `timebin_dur` -> `frame_dur`, and fix docstring accordingly * Rename timebin_dur -> frame_dur in datasets/frame_classification, vak/train, vak/eval, vak/learncurve, and vak/predict * Make fixes in prep/frame_classification/learncurve: use input_type to get source_paths, and rename timebin_dur -> frame_dur * Fix/add columns to dataframe returned by prep_audio_dataset: add sample_dur * Fix pre-conditions in prep.frame_classification.frame_classification.prep * Rename vak.prep.frame_classification.prep -> prep_frame_classification_dataset to be extra explicit in code inside sub-package. 
Remove some __all__s we don't really need * Fix models.get to correctly handle DAS model * WIP: Rewrite prep/frame_classification/dataset_arrays functions to save npy files in split sub-directories * Get dataset_arrays functions to point where I'm ready to test * Rename/add constants in src/vak/datasets/frame_classification/constants.py * Fix/revise docstring in src/vak/prep/audio_dataset.py * Fix/revise docstring in src/vak/prep/spectrogram_dataset/audio_helper.py * Remove commented-out code from src/vak/prep/frame_classification/dataset_arrays.py * Rewrite vak.prep.frame_classification.learncurve.make_learncurve_splits_from_dataset_df to use vak.prep.frame_classification.dataset_arrays.make_npy_files_for_each_split * Use dataset_arrays.make_npy_files_for_each_split in frame_classification.prep, save dataset_df to csv and save metadata at very end of prep function * Make fixes in src/vak/prep/frame_classification/dataset_arrays.py * Remove extra arg in src/vak/prep/frame_classification/frame_classification.py * Add constants +FRAMES_NPY_PATH_COL_NAME and FRAME_LABELS_NPY_PATH_COL_NAME to datasets/frame_classification/constants.py * Use constants FRAMES_NPY_PATH_COL_NAME and FRAME_LABELS_NPY_PATH_COL_NAME in prep/frame_classification/dataset_arrays.py * Rewrite datasets.frame_classification.FramesDataset to work with npy files * Rewrite datasets.frame_classification.WindowDataset to work with npy files * Make fixes in src/vak/datasets/frame_classification/window_dataset.py * Fix keys used to unpack batch in val_step of src/vak/models/windowed_frame_classification_model.py * Fix keys for items in item_transforms of src/vak/transforms/defaults.py * Fix StandardizeSpect method fit_dataset_path to work with data as npy arrays * Fix FramesDataset to use correct split from dataset_df * Fix WindowDataset to use correct split from dataset_df * Fix how prep/frame_classification/learncurve.py concatenates dataframes * Fix variable name in vak.transforms.SpectScaler.fit_dataset_path * Fix vak.prep.frame_classification.dataset_arrays.make_npy_files_for_each_split to convert cbins to ~wav-like float64 values * Fix how we determine n_audio_channels and num_samples for DAS in models.get * Fix glob used by fixture in tests/fixtures/spect.py * WIP: Add tests/test_prep/test_frame_classification/test_dataset_arrays.py * Add tests/test_prep/test_audio_dataset.py * Add tests/test_prep/test_frame_classification/test_validators.py * Add constants in models/_api used to map model family name to model name + class * Add src/vak/train/frame_classification.py * Rewrite src/vak/train/train.py to call family-specific model training functions * Remove args in cli/train.py that were removed from vak.train.train * Add src/vak/eval/frame_classification.py * Rewrite src/vak/eval/eval.py to call family-specific model evaluation functions * Remove args in cli/eval.py that were removed from vak.eval.eval * Add module-level docstring and import annotations from __future__ in vak/train/train.py * Rename vak.datasets.metadata.Metadata -> datasets.frame_classification_dataset.FrameClassificationDatasetMetadata, add attribute 'input_type' * Fix prep/frame_classification/frame_classification.py to use FrameClassificationDatasetMetadata and to save input_type with it * Fix docstring in src/vak/datasets/frame_classification/metadata.py * Fix window_dataset to use FrameClassificationDatasetMetadata * Fix frames_dataset to use FrameClassificationDatasetMetadata, add input_type as parameter to dataset __init__, and use to add 
source_paths attribute that is used by __getitem__ * Fix WindowedFrameClassification.predict_step to use 'frames' and 'source_path' from batch * Add audio_format and spect_format attributes to FrameClassificationDatasetMetadata * Add audio_format and spect_format when creating FrameClassificationDatasetMetadata in prep_frame_classification_dataset * Add src/vak/predict/frame_classification.py, fix so function works with both audio and spectrograms * Rewrite src/vak/predict/predict.py to call model family-specific prediction functions * Fix src/vak/eval/frame_classification.py to use FrameClassificationDatasetMetadata * Fix src/vak/train/frame_classification.py to use FrameClassificationDatasetMetadata * Move helper functions from vak/learncurve/learncurve.py into vak/learncurve/dirname.py * Add src/vak/learncurve/frame_classification.py * Add module-level docstring to src/vak/learncurve/frame_classification.py * Rewrite src/vak/learncurve/learncurve.py to call model family-specific learncurve functions * Remove args in cli/learncurve.py that were removed from vak.learncurve.learncurve * Fix import in src/vak/datasets/__init__.py * Fix FrameClassificationDatatsetMetadata validators for audio_format and spect_format to be optional * Fix how we check if attribute is None in src/vak/datasets/frame_classification/frames_dataset.py * Fix Metadata -> FrameClassificationDatasetMetadata in src/vak/transforms/transforms.py * Fix arg names and keys in returned items in EvalItemTransform and PredictItemTransform in src/vak/transforms/defaults.py * Rename vak/transforms/labeled_timebins -> vak/transforms/frame_labels. Rewrite docstrings accordingly * Rename transforms.labeled_timebins -> transforms.frame_labels throughout rest of code base * fixup rename labeled_timebins -> frame_labels * Fix Metadata -> FrameClassificationDatasetMetadata in vak/learncurve/learncurve.py
To be able to train DAS models we need the inputs to the network $x$ to be audio.

- Add an `x_source` option that can be either 'spect' or 'audio', and have `vak.io.dataframe` take this argument: if 'spect' the behavior should be the same; if 'audio' then we only get the audio files from the data dir and put them in the output dataset dir -- i.e., do this after #650 ("ENH: Have prep create a directory with standardized format for each prepared dataset") so we're not dealing with the issue of audio files being everywhere
- Save `x_source` as part of `meta.json`?
- Add an `x_source` parameter to `WindowDataset` (and later any other base dataset classes, e.g. `FileDataset`)
- Add `x_source` to `WindowDataset` to use for control flow, e.g. if `x_source` is `'audio'` then our window slice should only have 1 dimension, but if it is `'spect'` then we also need a slice with two dimensions (with a colon, "all frequency bins")

This means we need to know `x_source` ahead of time, while prepping the dataset -- i.e., we could just infer `x_source` dynamically, but I'd rather use a parameter like `x_source` to make the class' internals more explicit.
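
Roughly, the slicing logic that `x_source` would control looks like the sketch below. This is an illustration of the idea only, not code from the branch; the function name `get_window` and the example array shapes are made up for the example.

```python
import numpy as np
import numpy.typing as npt


def get_window(
    frames: npt.NDArray,  # 1-d audio array, or 2-d spectrogram (freq bins x time bins)
    x_source: str,        # 'audio' or 'spect'
    start_ind: int,       # starting index of the window within `frames`
    window_size: int,
) -> npt.NDArray:
    """Slice one training window out of an audio array or a spectrogram."""
    if x_source == "audio":
        # audio: a 1-d slice of samples, shape (window_size,)
        return frames[start_ind : start_ind + window_size]
    else:  # "spect"
        # spectrogram: keep all frequency bins with a colon slice,
        # take window_size time bins -> shape (n_freq_bins, window_size)
        return frames[:, start_ind : start_ind + window_size]


rng = np.random.default_rng()
audio = rng.normal(size=44100)         # fake 1-d audio
spect = rng.normal(size=(257, 1000))   # fake spectrogram with 257 frequency bins
print(get_window(audio, "audio", start_ind=100, window_size=88).shape)  # (88,)
print(get_window(spect, "spect", start_ind=100, window_size=88).shape)  # (257, 88)
```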