What was the intended way to use feature normalization? #207

janvainer · 2021-02-26T15:37:29Z


The global feature stats are computed as follows:

cuts = CutSet.from_json("manifests/cuts.json.gz")
stats = cuts.compute_global_feature_stats()

I was wondering how exactly the stats were intended to be used. Some options that come to mind:

1. Make stats an attribute of CutSet and automatically normalize features during loading:

cuts = cuts.transform_features(lambda x: normalize(x, stats))
# or
cuts = cuts.normalize_features(**stats)
# queries called on cuts will return normalized features, but features on disk stay unnormalized
# basically follows the online augmentations pattern

2. Use stats when initializing a Dataset and normalize features manually before returning an item to the DataLoader.
3. Use stats as parameters of the first layer of a neural net (e.g. the first layer is a Normalization layer with fixed params).
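For readers wondering what the `normalize(x, stats)` call in the first option might look like, here is a minimal sketch. It assumes `stats` is a dict holding per-bin means and standard deviations under the keys `norm_means` and `norm_stds`; those key names, the `eps` guard, and the toy data are assumptions made for illustration, so check the actual return value of `compute_global_feature_stats()` before relying on them.

```python
import numpy as np


def normalize(feats, stats, eps=1e-8):
    """Per-bin mean/variance normalization of a (num_frames, num_bins) matrix.

    `stats` is assumed to be a dict with "norm_means" and "norm_stds" arrays
    of shape (num_bins,); the key names are an assumption of this sketch.
    """
    means = np.asarray(stats["norm_means"], dtype=np.float32)
    stds = np.asarray(stats["norm_stds"], dtype=np.float32)
    # Broadcasting applies the per-bin stats to every frame.
    return (feats - means) / (stds + eps)


# Toy feature matrix standing in for features loaded from a cut.
rng = np.random.default_rng(0)
feats = rng.normal(loc=5.0, scale=2.0, size=(1000, 4)).astype(np.float32)

# Stats computed on the toy data itself, only to keep the example self-contained.
stats = {"norm_means": feats.mean(axis=0), "norm_stds": feats.std(axis=0)}

normalized = normalize(feats, stats)
```

After normalization each feature bin should have roughly zero mean and unit variance over the data the stats were computed on.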

Are there any plans to support the first option? Or is it already possible with mixing in some way?
I will be happy to update the docstring for feature normalization once this becomes clear :)

pzelasko · 2021-02-26T15:49:42Z

Maintainer

The intention is definitely for the normalization to be applied dynamically. I am starting to be a bit concerned that if we keep adding things like these as fields to CutSet it is going to grow into something very messy. So I am leaning more towards options 2 or 3.

We have something similar to option 2 right now for noise mixing and cut concatenation (see K2SpeechRecognitionDataset -> cut_transforms). To add feature normalization we'd need a separate field like feature_transforms (where we could add things like SpecAug/reverb/dynamic compression/etc.); I'm not sure if that becomes too complex from the user's perspective or not? Open to discussion.
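To make the feature_transforms idea concrete, here is a rough sketch of a dataset that applies a list of callables to each feature matrix before returning it. `ToyFeatureDataset` and `GlobalMVN` are hypothetical names invented for this illustration; they are not Lhotse classes, and the sketch works on plain NumPy arrays instead of cuts.

```python
import numpy as np


class GlobalMVN:
    """Hypothetical callable transform: fixed per-bin mean/std normalization."""

    def __init__(self, means, stds, eps=1e-8):
        self.means = np.asarray(means, dtype=np.float32)
        self.stds = np.asarray(stds, dtype=np.float32)
        self.eps = eps

    def __call__(self, feats):
        return (feats - self.means) / (self.stds + self.eps)


class ToyFeatureDataset:
    """Hypothetical map-style dataset with a feature_transforms field (not a Lhotse class)."""

    def __init__(self, feature_matrices, feature_transforms=None):
        self.feature_matrices = feature_matrices
        self.feature_transforms = list(feature_transforms or [])

    def __len__(self):
        return len(self.feature_matrices)

    def __getitem__(self, idx):
        feats = self.feature_matrices[idx]
        # Apply transforms in order; the stored features are never modified.
        for transform in self.feature_transforms:
            feats = transform(feats)
        return feats


matrices = [
    np.full((3, 2), 4.0, dtype=np.float32),
    np.full((5, 2), 6.0, dtype=np.float32),
]
dataset = ToyFeatureDataset(
    matrices,
    feature_transforms=[GlobalMVN(means=[5.0, 5.0], stds=[1.0, 1.0])],
)
first = dataset[0]
```

Because the transforms are ordinary callables, other steps such as SpecAug could in principle be appended to the same list without changing the dataset class.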

Option 3 with layers is not bad either, and possibly more helpful for packaging the model for deployment. But then you could further argue it also makes sense to add the feature extraction as a layer, which we don't support at this time.
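As an illustration of option 3, the stats could be stored as non-trainable buffers in a small PyTorch module placed in front of the network. This is only a sketch under stated assumptions: `FixedNormalization` is a hypothetical name, and the means and standard deviations are passed in by hand instead of being taken from Lhotse.

```python
import torch
from torch import nn


class FixedNormalization(nn.Module):
    """Hypothetical first layer that normalizes inputs with fixed statistics."""

    def __init__(self, means, stds, eps=1e-8):
        super().__init__()
        # Buffers are saved in the state_dict but are not updated by the optimizer.
        self.register_buffer("means", torch.as_tensor(means, dtype=torch.float32))
        self.register_buffer("stds", torch.as_tensor(stds, dtype=torch.float32))
        self.eps = eps

    def forward(self, feats):
        # feats: (..., num_bins); stats broadcast over the leading dimensions.
        return (feats - self.means) / (self.stds + self.eps)


model = nn.Sequential(
    FixedNormalization(means=[5.0, 5.0], stds=[2.0, 2.0]),
    nn.Linear(2, 3),
)

batch = torch.full((4, 10, 2), 7.0)  # (batch, frames, bins)
out = model(batch)
```

Since the stats travel with the model's state_dict, the packaged model does not need a separate stats file at inference time, which is the deployment benefit mentioned above.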

@danpovey WDYT?

6 replies

pzelasko Feb 26, 2021
Maintainer

Yeah I'd like the feature transforms "API" in Lhotse to be compatible with torchaudio transforms.

janvainer Feb 26, 2021
Author

From a user's perspective, I would probably be most comfortable with the first option. It feels like it more closely follows the pattern of keeping functionality related to data transformations near the data. For inference, I would be completely fine with extracting the stats and using them in some custom layer, because a layer for feature computation will have to be added to the network anyway. But I see your point regarding additional CutSet fields.

The dataset option is probably fine too, especially if it were a standard field of Lhotse's datasets - thumbs up for the feature_transforms option :) . It could also be accomplished with the dataloader's collate function, but the main program would not be as clean anymore.

pzelasko Feb 26, 2021
Maintainer

I'm OK with cut_transforms and feature_transforms being standard fields in all Lhotse datasets. At this point, I'd really like to avoid adding any data fields to CutSet, but we can think about it in the future - at some point, it will be good to review the design decisions and make necessary adjustments before Lhotse claims that it's "stable".

pzelasko Feb 26, 2021
Maintainer

Also, we'll be doing a major revision of "what's where" when we start adding support for TF data API.

janvainer Feb 27, 2021
Author

I created a draft PR #211 showing how feature and token manipulation could be handled via stateful Collate functions.
