U2 updates #25
Conversation
@@ -35,18 +36,28 @@ classifier = pipeline(
 )
 ```

-This pipeline expects the audio data as a NumPy array. All the preprocessing of the raw audio data will be conveniently
-handled for us by the pipeline. Let's pick an example to try it out:
+All the preprocessing of the raw audio data will be conveniently handled for us by the pipeline, including any resampling.
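As a side note to the diff above, a minimal sketch (not part of the PR) of the input format being described: the pipeline consumes a plain 1-D float32 NumPy waveform. Here a synthetic sine tone stands in for a real recording; the checkpoint-specific feature extraction, padding, and resampling are left to the pipeline, as the new wording says:

```python
import numpy as np

# Synthetic stand-in for a real recording: one second of a 440 Hz tone
# sampled at 16 kHz, the rate wav2vec2-style checkpoints expect.
sampling_rate = 16_000
t = np.arange(sampling_rate) / sampling_rate
waveform = np.sin(2 * np.pi * 440.0 * t).astype(np.float32)

# This 1-D float32 array is the shape of input the classification
# pipeline accepts directly, e.g. `classifier(waveform)`; everything
# else (feature extraction, padding, resampling) then happens inside
# the pipeline.
```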
(Once huggingface/transformers#23445 is merged)
```py
minds = load_dataset("PolyAI/minds14", name="en-AU", split="train")
minds = minds.cast_column("audio", Audio(sampling_rate=16_000))
```
Think it's cleaner if we don't have to do any data pre-/post-processing and let the pipeline handle this
Same as above, IMO it is good to reinforce the idea of sampling rates.
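The idea of sampling rates the reviewers want to reinforce can be illustrated with a toy resampler (an illustration only, not what `datasets` actually does — the `Audio(sampling_rate=16_000)` cast resamples properly, with filtering):

```python
import numpy as np

def resample_linear(waveform, orig_sr, target_sr):
    """Toy linear-interpolation resampler, only meant to show what
    changing the sampling rate does to the sample buffer."""
    duration = len(waveform) / orig_sr
    n_target = int(round(duration * target_sr))
    old_t = np.arange(len(waveform)) / orig_sr
    new_t = np.arange(n_target) / target_sr
    return np.interp(new_t, old_t, waveform).astype(np.float32)

# One second of audio at 8 kHz becomes 16 000 samples at 16 kHz,
# while still representing the same one second of sound.
audio_8k = np.random.default_rng(0).standard_normal(8_000).astype(np.float32)
audio_16k = resample_linear(audio_8k, 8_000, 16_000)
```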
```py
example = minds[0]
example["transcription"]
"ich möchte gerne Geld auf mein Konto einzahlen"
```
-Find a pre-trained ASR model for German language on the 🤗 Hub, instantiate a pipeline, and transcribe the example:
+Next, we can find a pre-trained ASR model for German language on the 🤗 Hub, instantiate a pipeline, and transcribe the example.
+Here, we'll use the checkpoint [maxidl/wav2vec2-large-xlsr-german](https://huggingface.co/maxidl/wav2vec2-large-xlsr-german):
I personally like having links to the checkpoints on the Hub so that I can look at the model cards
Lovely, how about we swap the community checkpoint to an official DE checkpoint for XLSR: https://huggingface.co/facebook/wav2vec2-large-xlsr-53-german
 the right format for a model
 - if the result isn't ideal, this still gives you a quick baseline for future fine-tuning
 - once you fine-tune a model on your custom data and share it on Hub, the whole community will be able to use it quickly
-and effortlessly via the `pipeline()` method making AI more accessible.
+and effortlessly via the `pipeline()` method, making AI more accessible
Consistency with previous bullet points
Very cool! Two small nits about the sampling rate and the XLSR model used for transcription.
Think it makes sense to use the official checkpoints where possible, helps build credibility IMO.
```py
minds = load_dataset("PolyAI/minds14", name="de-DE", split="train")
minds = minds.cast_column("audio", Audio(sampling_rate=16_000))
```
Personal preference: it makes sense to explicitly resample here, just to reinforce the idea of sampling rates to the attendee. We can later on explicitly write that `sampling_rate` handles different rates automagically. WDYT?
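A small illustration (not from the PR) of why the explicit resample the comment above argues for matters: the same buffer of samples means a different duration at a different rate, so feeding 8 kHz audio to a model that assumes 16 kHz silently halves the apparent duration (and doubles the apparent pitch):

```python
# One second of speech recorded at 8 kHz is 8 000 samples.
n_samples = 8_000

# Interpreted correctly at 8 kHz, the buffer lasts one second...
duration_at_8k = n_samples / 8_000

# ...but a 16 kHz model reading the same buffer "hears" only half a
# second of sped-up audio, which is what the explicit
# Audio(sampling_rate=16_000) cast (or the pipeline's own resampling)
# prevents.
duration_at_16k = n_samples / 16_000
```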
```py
example = minds[0]
example["transcription"]
"ich möchte gerne Geld auf mein Konto einzahlen"
```
LGTM! Sorry about delay. Feel free to merge
Small updates to U2 (some formatting, some updating of the code samples)