RuntimeError: Audio file <> with offset 1472.0742892449407 and duration 10.0 is empty! #21

Open
cyrusvahidi opened this issue Nov 11, 2023 · 4 comments

Comments

@cyrusvahidi

RuntimeError: Audio file <> with offset 1472.0742892449407 and duration 10.0 is empty!

I've been getting this error about 1-2 hours into training. It happens for a different audio file every time.

Will look into it. Wonder if you know whether it's a bug or an issue with the audio files?

@hugofloresgarcia
Owner

hmm, might be a problem with audiotools or the encoding of your audio files since you mention it happening with many audio files. what format are your audio files encoded in?

I've gotten this issue before. It's usually been due to a corrupt audio file, though that may not be the case here. Would you mind sharing a full stack trace + audio file?

@cyrusvahidi
Author

they're all mp3s

RuntimeError: Caught RuntimeError in DataLoader worker process 2.
Original Traceback (most recent call last):
  File "/homes/cv300/venvs/sd2/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/homes/cv300/venvs/sd2/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/homes/cv300/venvs/sd2/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/homes/cv300/venvs/sd2/lib/python3.9/site-packages/audiotools/data/datasets.py", line 419, in __getitem__
    item[keys[0]] = loader(**loader_kwargs)
  File "/homes/cv300/venvs/sd2/lib/python3.9/site-packages/audiotools/data/datasets.py", line 103, in __call__
    signal = AudioSignal.salient_excerpt(
  File "/homes/cv300/venvs/sd2/lib/python3.9/site-packages/audiotools/core/audio_signal.py", line 281, in salient_excerpt
    excerpt = cls.excerpt(audio_path, state=state, **kwargs)
  File "/homes/cv300/venvs/sd2/lib/python3.9/site-packages/audiotools/core/audio_signal.py", line 221, in excerpt
    signal = cls(audio_path, offset=offset, duration=duration, **kwargs)
  File "/homes/cv300/venvs/sd2/lib/python3.9/site-packages/audiotools/core/audio_signal.py", line 154, in __init__
    self.load_from_file(
  File "/homes/cv300/venvs/sd2/lib/python3.9/site-packages/audiotools/core/audio_signal.py", line 510, in load_from_file
    raise RuntimeError(
RuntimeError: Audio file <>.mp3 with offset 1472.0742892449407 and duration 10.0 is empty!

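The audio file is 24:32 long at 44.1 kHz. I figure the offset is out of bounds: 1472.07 s is 24.53 minutes, i.e. about 24:32, so the excerpt starts right at the end of the file and there is nothing left to read for a 10 s excerpt.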
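A quick way to sanity-check that arithmetic, as a minimal sketch: it converts the offset from the traceback to minutes and seconds and tests whether a 10 s excerpt fits inside the file. The helper names `fmt_mmss` and `excerpt_in_bounds` are hypothetical, and the total length of 1472 s is an assumption taken from the reported 24:32, rounded to the second.

```python
def fmt_mmss(seconds: float) -> str:
    """Format a time in seconds as m:ss.ss."""
    minutes, secs = divmod(seconds, 60)
    return f"{int(minutes)}:{secs:05.2f}"


def excerpt_in_bounds(offset: float, duration: float, total: float) -> bool:
    """True if the window [offset, offset + duration] lies inside the file."""
    return 0.0 <= offset and offset + duration <= total


offset = 1472.0742892449407   # from the error message
duration = 10.0               # from the error message
total_seconds = 24 * 60 + 32  # assumed: 24:32 rounded to whole seconds

print(fmt_mmss(offset))                                    # 24:32.07
print(excerpt_in_bounds(offset, duration, total_seconds))  # False
```

Under these assumptions the excerpt would have to start no later than 1462 s (24:22) to fit.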

@cyrusvahidi
Author

cyrusvahidi commented Nov 13, 2023

I just managed to complete 100K iterations of coarse. Now moving to c2f I get the same error:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /homes/cv300/Documents/vampnet/scripts/exp/train.py:680 in <module>                              │
│                                                                                                  │
│   677 │   │   with Accelerator() as accel:                                                       │
│   678 │   │   │   if accel.local_rank != 0:                                                      │
│   679 │   │   │   │   sys.tracebacklimit = 0                                                     │
│ ❱ 680 │   │   │   train(args, accel)                                                             │
│   681                                                                                            │
│                                                                                                  │
│ /homes/cv300/venvs/sd2/lib/python3.9/site-packages/argbind/argbind.py:159 in cmd_func            │
│                                                                                                  │
│   156 │   │   │   │   else:                                                                      │
│   157 │   │   │   │   │   scope = None                                                           │
│   158 │   │   │   │   print(_format_func_debug(prefix, kwargs, scope))                           │
│ ❱ 159 │   │   │   return func(*cmd_args, **kwargs)                                               │
│   160 │   │                                                                                      │
│   161 │   │   if is_class:                                                                       │
│   162 │   │   │   setattr(object_or_func, "__init__", cmd_func)                                  │
│                                                                                                  │
│ /homes/cv300/Documents/vampnet/scripts/exp/train.py:659 in train                                 │
│                                                                                                  │
│   656 │   │   │   │   save_samples(state, val_idx, writer)                                       │
│   657 │   │   │                                                                                  │
│   658 │   │   │   if tracker.step % val_freq == 0 or last_iter:                                  │
│ ❱ 659 │   │   │   │   validate(state, val_dataloader, accel)                                     │
│   660 │   │   │   │   checkpoint(                                                                │
│   661 │   │   │   │   │   state=state,                                                           │
│   662 │   │   │   │   │   save_iters=save_iters,                                                 │
│                                                                                                  │
│ /homes/cv300/venvs/sd2/lib/python3.9/site-packages/audiotools/ml/decorators.py:375 in decorated  │
│                                                                                                  │
│   372 │   │   def decorator(fn):                                                                 │
│   373 │   │   │   @wraps(fn)                                                                     │
│   374 │   │   │   def decorated(*args, **kwargs):                                                │
│ ❱ 375 │   │   │   │   output = fn(*args, **kwargs)                                               │
│   376 │   │   │   │   if self.rank == 0:                                                         │
│   377 │   │   │   │   │   nonlocal value_type, label                                             │
│   378 │   │   │   │   │   metrics = self.metrics[label][value_type]                              │
│                                                                                                  │
│ /homes/cv300/Documents/vampnet/scripts/exp/train.py:319 in validate                              │
│                                                                                                  │
│   316                                                                                            │
│   317                                                                                            │
│   318 def validate(state, val_dataloader, accel):                                                │
│ ❱ 319 │   for batch in val_dataloader:                                                           │
│   320 │   │   output = val_loop(state, batch, accel)                                             │
│   321 │   # Consolidate state dicts if using ZeroRedundancyOptimizer                             │
│   322 │   if hasattr(state.optimizer, "consolidate_state_dict"):                                 │
│                                                                                                  │
│ /homes/cv300/venvs/sd2/lib/python3.9/site-packages/torch/utils/data/dataloader.py:633 in         │
│ __next__                                                                                         │
│                                                                                                  │
│    630 │   │   │   if self._sampler_iter is None:                                                │
│    631 │   │   │   │   # TODO(https://github.com/pytorch/pytorch/issues/76750)                   │
│    632 │   │   │   │   self._reset()  # type: ignore[call-arg]                                   │
│ ❱  633 │   │   │   data = self._next_data()                                                      │
│    634 │   │   │   self._num_yielded += 1                                                        │
│    635 │   │   │   if self._dataset_kind == _DatasetKind.Iterable and \                          │
│    636 │   │   │   │   │   self._IterableDataset_len_called is not None and \                    │
│                                                                                                  │
│ /homes/cv300/venvs/sd2/lib/python3.9/site-packages/torch/utils/data/dataloader.py:1325 in        │
│ _next_data                                                                                       │
│                                                                                                  │
│   1322 │   │   │   # Check if the next sample has already been generated                         │
│   1323 │   │   │   if len(self._task_info[self._rcvd_idx]) == 2:                                 │
│   1324 │   │   │   │   data = self._task_info.pop(self._rcvd_idx)[1]                             │
│ ❱ 1325 │   │   │   │   return self._process_data(data)                                           │
│   1326 │   │   │                                                                                 │
│   1327 │   │   │   assert not self._shutdown and self._tasks_outstanding > 0                     │
│   1328 │   │   │   idx, data = self._get_data()                                                  │
│                                                                                                  │
│ /homes/cv300/venvs/sd2/lib/python3.9/site-packages/torch/utils/data/dataloader.py:1371 in        │
│ _process_data                                                                                    │
│                                                                                                  │
│   1368 │   │   self._rcvd_idx += 1                                                               │
│   1369 │   │   self._try_put_index()                                                             │
│   1370 │   │   if isinstance(data, ExceptionWrapper):                                            │
│ ❱ 1371 │   │   │   data.reraise()                                                                │
│   1372 │   │   return data                                                                       │
│   1373 │                                                                                         │
│   1374 │   def _mark_worker_as_unavailable(self, worker_id, shutdown=False):                     │
│                                                                                                  │
│ /homes/cv300/venvs/sd2/lib/python3.9/site-packages/torch/_utils.py:644 in reraise                │
│                                                                                                  │
│   641 │   │   │   # If the exception takes multiple arguments, don't try to                      │
│   642 │   │   │   # instantiate since we don't know how to                                       │
│   643 │   │   │   raise RuntimeError(msg) from None                                              │
│ ❱ 644 │   │   raise exception                                                                    │
│   645                                                                                            │
│   646                                                                                            │
│   647 def _get_available_device_type():                                                          │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: Caught RuntimeError in DataLoader worker process 5.
Original Traceback (most recent call last):
  File "/homes/cv300/venvs/sd2/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/homes/cv300/venvs/sd2/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/homes/cv300/venvs/sd2/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/homes/cv300/venvs/sd2/lib/python3.9/site-packages/audiotools/data/datasets.py", line 419, in __getitem__
    item[keys[0]] = loader(**loader_kwargs)
  File "/homes/cv300/venvs/sd2/lib/python3.9/site-packages/audiotools/data/datasets.py", line 103, in __call__
    signal = AudioSignal.salient_excerpt(
  File "/homes/cv300/venvs/sd2/lib/python3.9/site-packages/audiotools/core/audio_signal.py", line 281, in salient_excerpt
    excerpt = cls.excerpt(audio_path, state=state, **kwargs)
  File "/homes/cv300/venvs/sd2/lib/python3.9/site-packages/audiotools/core/audio_signal.py", line 221, in excerpt
    signal = cls(audio_path, offset=offset, duration=duration, **kwargs)
  File "/homes/cv300/venvs/sd2/lib/python3.9/site-packages/audiotools/core/audio_signal.py", line 154, in __init__
    self.load_from_file(
  File "/homes/cv300/venvs/sd2/lib/python3.9/site-packages/audiotools/core/audio_signal.py", line 510, in load_from_file
    raise RuntimeError(
RuntimeError: Audio file /import/c4dm-05/cv/x.mp3 with offset 1335.1792652140298 and duration 3.0 is empty!

I guess this could be an issue with audiotools
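Until the root cause is found, one possible stopgap is to keep a single bad excerpt from killing a long run. The following is only a sketch under assumptions: `RetryOnError` is a hypothetical wrapper, not part of audiotools or vampnet, and it assumes a map-style dataset (one that supports `len()` and indexing) that raises `RuntimeError` when an excerpt is empty, as in the traceback above.

```python
class RetryOnError:
    """Wrap a map-style dataset and fall back to the next index on RuntimeError.

    Hypothetical helper for illustration; it hides failures rather than
    fixing them, so the failing indices are kept for later inspection.
    """

    def __init__(self, dataset, max_tries=5):
        self.dataset = dataset
        self.max_tries = max_tries
        self.failed = []  # (index, error message) pairs

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, idx):
        last_error = None
        for attempt in range(self.max_tries):
            candidate = (idx + attempt) % len(self.dataset)
            try:
                return self.dataset[candidate]
            except RuntimeError as err:
                # Record the failure and try the neighbouring item instead.
                self.failed.append((candidate, str(err)))
                last_error = err
        raise RuntimeError(
            f"no loadable item after {self.max_tries} tries from index {idx}"
        ) from last_error
```

Note that with DataLoader workers each worker process holds its own copy of the wrapper, so `failed` would need to be logged from inside the worker to be useful.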

@hugofloresgarcia
Owner

yeah, looks like an issue with audiotools and the way that it gets excerpts from audio files (see audiotools.util.info and AudioSignal.salient_excerpt). I'm currently taking a break, but I'll try to dig a bit deeper into this over the weekend!
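For illustration, here is a sketch of how excerpt offsets can be sampled so that the window always fits inside the file. This is not the audiotools implementation. It assumes, without confirmation from this thread, that the duration reported for an mp3 can be slightly longer than what actually decodes, which is why a safety margin is subtracted. The function name `sample_offset` and the `margin` parameter are hypothetical.

```python
import random


def sample_offset(reported_duration, excerpt_duration, margin=0.5, rng=None):
    """Pick a start time so that offset + excerpt_duration stays in bounds.

    `margin` (seconds) is an assumed safety buffer for files whose reported
    duration overestimates the decodable audio.
    """
    rng = rng or random.Random()
    # Latest allowed start; clamp to 0 for files shorter than the excerpt.
    upper = max(0.0, reported_duration - excerpt_duration - margin)
    return rng.uniform(0.0, upper)


rng = random.Random(0)
offset = sample_offset(1472.0, 10.0, rng=rng)
print(offset + 10.0 <= 1472.0)  # True
```

With the numbers from the first traceback, the latest start this sketch allows is 1461.5 s, so an offset of 1472.07 s could not be drawn.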
