Update convention section in README

pytorch · Jul 20, 2021 · 89de503
1 parent 380800c
commit 89de503

Showing 2 changed files with 38 additions and 39 deletions.
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -119,6 +119,44 @@ make html
 
 The built docs should now be available in `docs/build/html`
 
+## Conventions
+
+With torchaudio being a machine learning library and built on top of PyTorch,
+torchaudio is standardized around the following naming conventions. Tensors are
+assumed to have "channel" as the first dimension and time as the last
+dimension (when applicable). This makes it consistent with PyTorch's dimensions.
+For size names, the prefix `n_` is used (e.g. "a tensor of size (`n_freq`, `n_mels`)")
+whereas dimension names do not have this prefix (e.g. "a tensor of
+dimension (channel, time)")
+
+* `waveform`: a tensor of audio samples with dimensions (channel, time)
+* `sample_rate`: the rate of audio dimensions (samples per second)
+* `specgram`: a tensor of spectrogram with dimensions (channel, freq, time)
+* `mel_specgram`: a mel spectrogram with dimensions (channel, mel, time)
+* `hop_length`: the number of samples between the starts of consecutive frames
+* `n_fft`: the number of Fourier bins
+* `n_mels`, `n_mfcc`: the number of mel and MFCC bins
+* `n_freq`: the number of bins in a linear spectrogram
+* `f_min`: the lowest frequency of the lowest band in a spectrogram
+* `f_max`: the highest frequency of the highest band in a spectrogram
+* `win_length`: the length of the STFT window
+* `window_fn`: for functions that creates windows e.g. `torch.hann_window`
+
+Transforms expect and return the following dimensions.
+
+* `Spectrogram`: (channel, time) -> (channel, freq, time)
+* `AmplitudeToDB`: (channel, freq, time) -> (channel, freq, time)
+* `MelScale`: (channel, freq, time) -> (channel, mel, time)
+* `MelSpectrogram`: (channel, time) -> (channel, mel, time)
+* `MFCC`: (channel, time) -> (channel, mfcc, time)
+* `MuLawEncode`: (channel, time) -> (channel, time)
+* `MuLawDecode`: (channel, time) -> (channel, time)
+* `Resample`: (channel, time) -> (channel, time)
+* `Fade`: (channel, time) -> (channel, time)
+* `Vol`: (channel, time) -> (channel, time)
+
+Here, and in the documentation, we use an ellipsis "..." as a placeholder for the rest of the dimensions of a tensor, e.g. optional batching and channel dimensions.
+
 ## License
 
 By contributing to Torchaudio, you agree that your contributions will be licensed

diff --git a/README.md b/README.md
@@ -138,45 +138,6 @@ API Reference
 
 API Reference is located here: http://pytorch.org/audio/
 
-Conventions
------------
-
-With torchaudio being a machine learning library and built on top of PyTorch,
-torchaudio is standardized around the following naming conventions. Tensors are
-assumed to have "channel" as the first dimension and time as the last
-dimension (when applicable). This makes it consistent with PyTorch's dimensions.
-For size names, the prefix `n_` is used (e.g. "a tensor of size (`n_freq`, `n_mel`)")
-whereas dimension names do not have this prefix (e.g. "a tensor of
-dimension (channel, time)")
-
-* `waveform`: a tensor of audio samples with dimensions (channel, time)
-* `sample_rate`: the rate of audio dimensions (samples per second)
-* `specgram`: a tensor of spectrogram with dimensions (channel, freq, time)
-* `mel_specgram`: a mel spectrogram with dimensions (channel, mel, time)
-* `hop_length`: the number of samples between the starts of consecutive frames
-* `n_fft`: the number of Fourier bins
-* `n_mel`, `n_mfcc`: the number of mel and MFCC bins
-* `n_freq`: the number of bins in a linear spectrogram
-* `min_freq`: the lowest frequency of the lowest band in a spectrogram
-* `max_freq`: the highest frequency of the highest band in a spectrogram
-* `win_length`: the length of the STFT window
-* `window_fn`: for functions that creates windows e.g. `torch.hann_window`
-
-Transforms expect and return the following dimensions.
-
-* `Spectrogram`: (channel, time) -> (channel, freq, time)
-* `AmplitudeToDB`: (channel, freq, time) -> (channel, freq, time)
-* `MelScale`: (channel, freq, time) -> (channel, mel, time)
-* `MelSpectrogram`: (channel, time) -> (channel, mel, time)
-* `MFCC`: (channel, time) -> (channel, mfcc, time)
-* `MuLawEncode`: (channel, time) -> (channel, time)
-* `MuLawDecode`: (channel, time) -> (channel, time)
-* `Resample`: (channel, time) -> (channel, time)
-* `Fade`: (channel, time) -> (channel, time)
-* `Vol`: (channel, time) -> (channel, time)
-
-Here, and in the documentation, we use an ellipsis "..." as a placeholder for the rest of the dimensions of a tensor, e.g. optional batching and channel dimensions.
-
 Contributing Guidelines
 -----------------------
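
Note on the conventions moved by this commit: the dimension mappings in the "Transforms expect and return the following dimensions" list can be checked with simple shape arithmetic. The sketch below is illustrative only and does not call torchaudio. The helper names `spectrogram_shape` and `mel_scale_shape` are hypothetical, and the frame-count formula assumes a one-sided, centered STFT (as with `torch.stft` when `center=True` and `onesided=True`), which the diff itself does not state.

```python
from typing import Tuple


def spectrogram_shape(
    waveform_shape: Tuple[int, int], n_fft: int, hop_length: int
) -> Tuple[int, int, int]:
    """Map (channel, time) -> (channel, freq, time) for a spectrogram.

    Hypothetical helper. Assumes a one-sided, centered STFT, so that
    n_freq = n_fft // 2 + 1 and the number of frames is 1 + time // hop_length.
    """
    channel, time = waveform_shape
    n_freq = n_fft // 2 + 1            # size name uses the `n_` prefix
    n_frames = 1 + time // hop_length  # frames start every `hop_length` samples
    return (channel, n_freq, n_frames)


def mel_scale_shape(
    specgram_shape: Tuple[int, int, int], n_mels: int
) -> Tuple[int, int, int]:
    """Map (channel, freq, time) -> (channel, mel, time).

    Hypothetical helper. Only the frequency dimension changes size.
    """
    channel, _freq, time = specgram_shape
    return (channel, n_mels, time)


# A two-channel, one-second clip at a sample_rate of 16000.
waveform_shape = (2, 16000)
specgram = spectrogram_shape(waveform_shape, n_fft=400, hop_length=200)
mel_specgram = mel_scale_shape(specgram, n_mels=128)
print(specgram)      # (2, 201, 81)
print(mel_specgram)  # (2, 128, 81)
```

In both results "channel" stays the first dimension and time stays the last, which matches the convention described in the moved text.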