WIP: Speech Activity Detection #11

Open
wants to merge 245 commits into
base: master
8a55b0a
Bug fix in nnet3-latgen-faster which missed uttspk option
vimalmanohar Nov 23, 2016
12c619f
Bug fix in sparse-matrix.cc
vimalmanohar Nov 23, 2016
21e6e9f
asr_diarization: Adding get_frame_shift.sh
vimalmanohar Nov 24, 2016
ecdae90
Pass --no-text option to validate data dir in speed perturbation
vimalmanohar Nov 24, 2016
5b7f150
Print Cuda profile in nnet3-compute
vimalmanohar Nov 23, 2016
a1a5e0e
asr_diarization: Fix stats printing
vimalmanohar Nov 6, 2016
5d16205
asr_diarization: Add --skip-dims option to apply-cmvn-sliding
vimalmanohar Nov 24, 2016
4c5cd54
asr_diarization: Adding length-tolerace to extract ivector scripts
vimalmanohar Nov 22, 2016
a71da1a
asr_diarization: Adding --do-average option to matrix-sum-rows
vimalmanohar Nov 25, 2016
8fa2b21
asr_diarization: Added weight-pdf-post, vector-to-feat, kaldi-matrix …
vimalmanohar Sep 24, 2016
fb43c8c
asr_diarization: Modify subsegment_feats and add fix_subsegmented_fea…
vimalmanohar Sep 25, 2016
d7e0b7f
asr_diarization: Utility scripts get_reco2utt, get_utt2dur and get_se…
vimalmanohar Aug 30, 2016
4c64613
asr_diarization: SAD post-processing
vimalmanohar Nov 25, 2016
1913caf
asr_diarization: Modify modify_speaker_info to add --respect-recordin…
vimalmanohar Nov 24, 2016
bd98bea
asr_diarization: Modify subset_data_dir.sh, copy_data_dir.sh to copy …
vimalmanohar Nov 24, 2016
64d863a
asr_diarization: Moved evaluate_segmentation.pl to steps/segmentation
vimalmanohar Nov 24, 2016
7478ae1
asr_diarization: Modify perturb_data_dir_volume.sh to write reco2vol …
vimalmanohar Nov 24, 2016
aaa35ff
asr_diarization: Get reverberated version of scp
vimalmanohar Nov 24, 2016
e04f86f
asr_diarization: Adding script split_data_on_reco.sh
vimalmanohar Nov 24, 2016
3dc4692
asr_diarization: add per-reco option to split_data.sh
vimalmanohar Nov 24, 2016
bfec702
asr_diarization: Added deriv weights and xent per dim objective
vimalmanohar Nov 23, 2016
99dcd96
asr_diarization: Adding compress format option
vimalmanohar Nov 23, 2016
fb4737e
asr_diarization: nnet3-get-egs etc. modified with deriv weights and c…
vimalmanohar Nov 23, 2016
9436732
asr_diarization: Log and Exp component
vimalmanohar Dec 7, 2016
828544e
asr_diarization: Adding ScaleGradientComponent
vimalmanohar Nov 23, 2016
b80cf24
asr_diarization: Adding AddGradientSacaleLayer to components.py
vimalmanohar Nov 24, 2016
9ef5422
asr_diarization: Adding get_egs changes into get_egs_targets
vimalmanohar Nov 24, 2016
3827e1c
asr_diarization: Multiple outputs in nnet3
vimalmanohar Nov 23, 2016
e9535d8
raw_python_script: Made LSTM and TDNN raw configs similar
vimalmanohar Nov 24, 2016
7806dd6
asr_diarization: Create prepare_unsad_data.sh
vimalmanohar Nov 23, 2016
b281cea
asr_diarization: Temporary changes to mfcc_hires_bp.conf and path.sh …
vimalmanohar Nov 24, 2016
30bb964
asr_diarization: Modified reverberation script by moving some functio…
vimalmanohar Nov 24, 2016
9ca5aa0
asr_diarization: Add extra_egs_copy_cmd
vimalmanohar Nov 24, 2016
c5796c3
asr_diarization: Create get_egs.py supporting multiple targets
vimalmanohar Nov 24, 2016
4baeb72
asr_diarization: Modify the egs binaries and utilities to support mul…
vimalmanohar Nov 24, 2016
687b0f1
asr_diarization: Adding local/snr/make_sad_tdnn_configs.py and stats …
vimalmanohar Nov 24, 2016
fbc0333
asr_diarization: compute_output.sh, SAD decoding scripts and do_segme…
vimalmanohar Nov 24, 2016
e80e4a9
asr_diarization: Adding min-extra-left-context
vimalmanohar Nov 19, 2016
b79f0fa
asr_diarization: Segmentation tools
vimalmanohar Nov 27, 2016
8c11f77
asr_diarization: Adding do_corruption_data_dir.sh for corruption with…
vimalmanohar Nov 30, 2016
5fccac1
asr_diarization: Add do_corruption_data_dir_music.sh for corruption w…
vimalmanohar Nov 30, 2016
265c216
asr_diarization: Recipe for music-id on broadcast news
vimalmanohar Nov 30, 2016
82bfc5a
asr_diarization: Utilities invert_vector.pl and vector_get_max.pl
vimalmanohar Nov 25, 2016
709ac92
asr_diarization: Recipe for segmentation on AMI SDM dev set
vimalmanohar Nov 30, 2016
64d1456
asr_diarization: Fisher recipe from data preparation, training nnet a…
vimalmanohar Nov 30, 2016
99ce6c8
asr_diarization: created compute-snr-targets
vimalmanohar Nov 24, 2016
ef36cf5
asr_diarization: make_snr_targets.sh
vimalmanohar Nov 24, 2016
c3da17a
asr_diarization: Added script to get DCT matrix
vimalmanohar Nov 24, 2016
83cbdd6
asr_diarization_clean: Adding run_train_sad.sh
vimalmanohar Nov 30, 2016
1d1610c
asr_diarization: Modified online ivector extraction to accept frame w…
vimalmanohar Sep 6, 2016
88970c8
asr_diarization: Added script to resolve CTM overlaps
vimalmanohar Sep 2, 2016
eb727f1
asr_diarization: AMI script without ivectors
vimalmanohar Sep 2, 2016
8d529d9
asr_diarization: Adding run_tdnn_1a.sh
vimalmanohar Dec 10, 2016
37f5713
asr_diarization: Modified AMI scoring script to add overlap resolutio…
vimalmanohar Sep 10, 2016
c5aeb91
asr_diarization: Get times per word from mbr sausage links
vimalmanohar Sep 10, 2016
ae14b71
asr_diarization: Adding a general wrapper to chain decoding script
vimalmanohar Sep 10, 2016
194030d
asr_diarization: Adding RT
vimalmanohar Nov 22, 2016
32977de
asr_diarization: Adding ami_normalize_transcripts.pl
vimalmanohar Dec 10, 2016
4219de1
asr_diarization: Two-stage decoding baseline AMI
vimalmanohar Sep 21, 2016
318b52e
asr_diarization: Adding modify_stm.pl to remove beginning and end fro…
vimalmanohar Dec 10, 2016
f141eb0
asr_diarization: Removing sctk and other additions to AMI path.sh
vimalmanohar Nov 22, 2016
d95b27e
asr_diarization: Updating AMI nnet3 recipes
vimalmanohar Nov 18, 2016
0ad80e2
asr_diarization: Add initial training scripts for overlapped speech d…
vimalmanohar Dec 10, 2016
c31e895
asr_diarization: Babel changes
vimalmanohar Dec 10, 2016
3408934
diarization: Adding num-fft-bins
vimalmanohar Dec 10, 2016
0755e2c
asr_diarization: Removing stats component from make_jesus_configs.py
vimalmanohar Dec 10, 2016
3cda27b
asr_diarzation: Raw nnet3 changes
vimalmanohar Dec 10, 2016
d2e7742
asr_diarzation: Minor consmetic change
vimalmanohar Dec 10, 2016
ebb3c5a
asr_diarization: fixing bug in reverberate_data_dir.py
vimalmanohar Dec 10, 2016
f99d7cd
asr_diarization: Adding learning rate facor
vimalmanohar Dec 10, 2016
b472987
asr_diarization: Adding dropout
vimalmanohar Dec 10, 2016
df3319e
asr_diarization: Adding more info in nnet3-info
vimalmanohar Dec 10, 2016
99c8845
asr_diarization: Minor bug fix in AMI run_cleanup_segmentation.sh
vimalmanohar Dec 10, 2016
26b49b1
Adding missing break in nnet-test-utils
vimalmanohar Dec 10, 2016
ad1c10c
asr_diarization: adding compute-fscore binary
vimalmanohar Dec 10, 2016
62e18da
asr_diarization: Create make_overlapped_data_dir.py for overlapped sp…
vimalmanohar Nov 24, 2016
be41b74
asr_diarization: Added do_corruption_data_dir_overlapped_speech.sh
vimalmanohar Nov 24, 2016
a1be1dd
asr_diarization: Added train_sad_ovlp{,_prob}.sh
vimalmanohar Nov 24, 2016
bbbd4ed
asr_diarization: New copy-egs-overlap-detection in nnet3bin/Makefile
vimalmanohar Nov 22, 2016
25a21f9
Modifying do_corruption_data_dir_overlapped_speech.sh
vimalmanohar Dec 10, 2016
c7ba208
dropout_schedule: Changing default in dropout-schedule option
vimalmanohar Dec 11, 2016
851e98a
Bug fix in xconfig/basic_layers.py
vimalmanohar Dec 13, 2016
693bd14
asr_diarization: Addind stats_layer to xconfigs
vimalmanohar Dec 13, 2016
ed938f6
asr_diarization: Making xconfigs support more general networks
vimalmanohar Dec 13, 2016
c6ade8c
asr_diarization: Update do_corruption_data_dir.sh with better default…
vimalmanohar Dec 13, 2016
d639b31
asr_diarization: Update do_corruption_data_dir_overlapped_speech.sh w…
vimalmanohar Dec 13, 2016
8668bcf
asr_diarization: Moving ../egs/aspire/s5/local/segmentation/train_sta…
vimalmanohar Dec 13, 2016
a7db707
asr_diarization: Bug fix in random extra contexts
vimalmanohar Dec 13, 2016
0dc172c
asr_diarization: New tuning scrits for music id
vimalmanohar Dec 13, 2016
869b669
asr_diarization: New overlap detection with stats script
vimalmanohar Dec 13, 2016
3b6b460
asr_diarization: remove junk from ami path.sh
vimalmanohar Dec 13, 2016
7cb6d56
asr_diarization: Adding segmentation-init-from-additive-signals-info
vimalmanohar Dec 14, 2016
8f1ee41
asr_diarization: Update make_overlapped_data_dir.py and data_dir_Mani…
vimalmanohar Dec 14, 2016
7b3723d
asr_diarization: Update reverberate_data_dir.py
vimalmanohar Dec 14, 2016
b9328f7
asr_diarization: Better way of checking vol perturbation
vimalmanohar Dec 14, 2016
1ac065d
asr_diarization: Updated train_stats_sad_overlap_1a.sh
vimalmanohar Dec 14, 2016
7c6e40a
asr_diarization: New version of corruption_data_dir for overlapped_sp…
vimalmanohar Dec 14, 2016
ac7e716
asr_diarization: Updated AMI segmentation recipe
vimalmanohar Dec 14, 2016
bd499c8
asr_diarization: Restructuring do_segmentation_data_dir.sh
vimalmanohar Dec 16, 2016
eaa31a4
asr_diarization: SAD on Aspire
vimalmanohar Dec 16, 2016
c9a8da1
asr_diaization: Changes to run_segmentation_ami based on restructuring
vimalmanohar Dec 16, 2016
5d0b828
Bug fix in basic_layers.py
vimalmanohar Dec 16, 2016
93fe5b3
asr_diarization: Minor fix in get_egs_multiple_targets.py
vimalmanohar Dec 16, 2016
87511da
asr_diarization: Change the way do_corruption_data_dir_overlapped_spe…
vimalmanohar Dec 16, 2016
3368166
Bug fix in nnet3 training
vimalmanohar Dec 18, 2016
56f087b
asr_diarization: Adding multilingual egs
vimalmanohar Dec 19, 2016
34e34e9
asr_diarization: Add fake targets to get-egs-multiple-targets
vimalmanohar Dec 19, 2016
2d4eeeb
asr_diarization: Support scaling of nnet3 egs feats
vimalmanohar Dec 20, 2016
c1799f1
asr_diarization: Fix bugs and restructure multiple egs targets source
vimalmanohar Dec 20, 2016
cfb71e5
asr_diarization: Minor fixes to get_egs_multiple_targets
vimalmanohar Dec 20, 2016
54cc836
asr_diarization: Add objective-scale to xconfig output
vimalmanohar Dec 20, 2016
eb04aab
asr_diarization: Support multitask egs at script level
vimalmanohar Dec 20, 2016
8174b3c
asr_diarization: Support multi output training diagnostics correctly
vimalmanohar Dec 20, 2016
59c9a2d
asr_diarization: Add data to libs __init__
vimalmanohar Dec 20, 2016
38f2515
asr_diarization: Adding new overlapped speech recipe
vimalmanohar Dec 20, 2016
6c9efb6
asr_diarization: Add iter option to run_segmentation_ami
vimalmanohar Dec 20, 2016
7de8a83
asr_diarization: Add iter to aspire segmentation
vimalmanohar Dec 20, 2016
885d17e
asr_diarization: Optional resolve_ctm overlaps in multicondition get_…
vimalmanohar Dec 20, 2016
58d62ab
asr_diarization: Adding tuning scripts for music and SAD
vimalmanohar Dec 20, 2016
61d6f1e
segmentation: Modify segmentation codes
vimalmanohar Jan 2, 2017
5ac90c8
asr_diarization: Support objective type in basic_layers
vimalmanohar Jan 2, 2017
6e3889b
asr_diarization: Update multilingual egs creation
vimalmanohar Jan 2, 2017
3d10480
asr_diarization: Add per-dim accuracy to diagnostics
vimalmanohar Jan 2, 2017
a5d7881
sar_diarization: Minor bug fix in ../egs/wsj/s5/steps/nnet3/get_egs_m…
vimalmanohar Jan 2, 2017
894279b
asr_diarization: Some deep restructuring to decode and segmentation
vimalmanohar Jan 2, 2017
73eb943
asr_diarization: Bug fix in get_reco2num_frames.sh
vimalmanohar Jan 2, 2017
0177743
asr_diarization: Relax some errors in normalize_data_range
vimalmanohar Jan 2, 2017
a638cca
asr_diarization: more tuning scripts for music detection
vimalmanohar Jan 2, 2017
47bf4fd
asr_diarization: Add more tuning scripts for sad overlap
vimalmanohar Jan 2, 2017
e738191
asr_diarization: Modify overlapping sad recipe for AMI
vimalmanohar Jan 2, 2017
e071dec
asr_diarization: Fisher+ Babel SAD recipe
vimalmanohar Jan 2, 2017
d16de41
asr_diarization: Prepare labels for AMI
vimalmanohar Jan 2, 2017
5ac841e
asr_diarization: segmentation configs
vimalmanohar Jan 2, 2017
baa5bf4
asr_diarization: Support per-utt gmm global
vimalmanohar Jan 16, 2017
dafec02
asr_diarization: Fix some bugs in segmenter code and make it simpler
vimalmanohar Jan 19, 2017
31a3e79
asr_diarzation: Rename get_subsegmented_feats.sh
vimalmanohar Jan 19, 2017
6d29bb2
asr_diarization: Modify utt2num_frames etc.
vimalmanohar Jan 19, 2017
bf44dda
asr_diarization: gmm-global-get-post to support archives of models
vimalmanohar Jan 25, 2017
9bd17f2
asr_diarization: Add some debugging stuff to segmenter
vimalmanohar Jan 25, 2017
1e6b3c9
asr_diarization: Preprare for SimpleHmm
vimalmanohar Jan 25, 2017
eaa56b4
asr_diarization: Old version of SimpleHmm
vimalmanohar Jan 25, 2017
b05406d
asr_diarization: Add SimpleHmm
vimalmanohar Jan 25, 2017
3d4cba8
asr_diarization: Moving SimpleHmm
vimalmanohar Jan 25, 2017
be89229
asr_diarization: Convert GMM posteriors to feats
vimalmanohar Jan 25, 2017
c23060e
asr_diarization: Remove some accidentally added files
vimalmanohar Jan 25, 2017
8786dea
asr_diarzation: Update do_corruption_data_dir{,_music}
vimalmanohar Jan 25, 2017
eb54322
asr_diarization: Prepare unsad data fisher and babel
vimalmanohar Jan 25, 2017
e52f032
asr_diarization: Bug fix in reverberate_data_dir.py
vimalmanohar Jan 25, 2017
b7fba13
asr_diarization: Updated compute_output.sh to compute from Am
vimalmanohar Jan 25, 2017
84889b6
asr_diarization: Bug fix in get_egs_multiple_targets
vimalmanohar Jan 25, 2017
911d1d0
asr_diariztion: Add compute-per-dim-accuracy
vimalmanohar Jan 25, 2017
abd45fe
asr_diarization: Update some segmentation scripts
vimalmanohar Jan 25, 2017
0cd44c8
asr_diarization: SimpleHmm version of segmentation
vimalmanohar Jan 25, 2017
a4b823c
More segmentation script updated
vimalmanohar Jan 25, 2017
dd51f1c
subsegment_data_dir fix
vimalmanohar Jan 25, 2017
310f42e
asr_diarization: Update get_sad_map
vimalmanohar Jan 25, 2017
c9a44e0
asr_diarization: downsample_data_dir.sh perturb_data_dir_speed_random.sh
vimalmanohar Jan 25, 2017
0e276b3
asr_diarization: normalize_data_range.pl
vimalmanohar Jan 25, 2017
4637f02
asr_diarization: Add reco2utt to split_data.sh
vimalmanohar Jan 25, 2017
bf1647b
asr_diarization: Possibly deprecated update to do_segmentation_data_d…
vimalmanohar Jan 25, 2017
b63787a
asr_diarization: Minor logging to nnet3-copy-egs
vimalmanohar Jan 25, 2017
ea50042
asr_diarization: Partial update to aspire segmentation
vimalmanohar Jan 25, 2017
6a0fca9
asr_diarization: Update overlapping speech detection in ami
vimalmanohar Jan 25, 2017
fd96de7
asr_diarization: Add simplehmmbin to common_path
vimalmanohar Jan 25, 2017
7a678fd
asr_diarization: Add IB clustering
vimalmanohar Jan 25, 2017
403bde7
asr_diarization: Add intersect int vectors
vimalmanohar Jan 25, 2017
71f0de6
asr_diarization: Clustering using IB
vimalmanohar Jan 25, 2017
120ac02
asr_diarization: aib cluster
vimalmanohar Jan 25, 2017
932073b
asr_diarization: LSTM SAD music
vimalmanohar Jan 25, 2017
53b7649
asr_diarization: segmentation configs
vimalmanohar Jan 25, 2017
cb8c718
An old version of resolve_ctm_overlaps
vimalmanohar Jan 25, 2017
58dc6a6
asr_diarization: Add steps/data/make_corrupted_data_dir.py
vimalmanohar Jan 25, 2017
613f0aa
asr_diarization: Add deprecated sad run scripts
vimalmanohar Jan 25, 2017
840bee2
asr_diarization: Add deprecated do_corruption_whole_data_dir_overlapp…
vimalmanohar Jan 25, 2017
2725cd1
asr_diarization: steps/data/wav_scp2noise_list.py
vimalmanohar Jan 25, 2017
4a35cec
asr_diarization: Cluster segments AIB
vimalmanohar Jan 25, 2017
268e017
asr_diarization: Train simple HMM
vimalmanohar Jan 25, 2017
7f10cd5
asr_diarization: Add deprecated data_lib.py
vimalmanohar Jan 25, 2017
311d31f
asr_diarization: Add nnet3-copy-egs-overlapped
vimalmanohar Jan 25, 2017
d54b412
segmenterbin/Makefile
vimalmanohar Jan 25, 2017
e27267f
asr_diarization: Overlapping speech detection tuning scripts
vimalmanohar Jan 25, 2017
27ab5b2
asr_diarization: add nnet3-am-compute
vimalmanohar Jan 25, 2017
9156d29
asr_diarization: Update simple hmm
vimalmanohar Feb 6, 2017
2d13d90
asr_diarization: Update cluster-utils
vimalmanohar Feb 6, 2017
e5988b7
asr_diarization: ib clusterable
vimalmanohar Feb 6, 2017
9a86fc0
asr_diarization: init-models-from-feats
vimalmanohar Feb 6, 2017
4646f14
asr_diarization: Clustering script
vimalmanohar Feb 6, 2017
53e167d
asr_diarization: Added virtual destructor
vimalmanohar Feb 6, 2017
53dec62
error_msg: Simplifying err_msg
vimalmanohar Feb 13, 2017
94a419f
Modify the way some of the segmentation scripts work
vimalmanohar Feb 23, 2017
0465262
asr_diarization: add more checks and messages to segmentation binaries
vimalmanohar Feb 23, 2017
ff438b9
asr_diarization: Add more control over speed in the SAD scripts
vimalmanohar Mar 1, 2017
6ef5b58
asr_diarization: prepare unsad data
vimalmanohar Mar 1, 2017
1a17123
asr_diarization: Better logging in compute_cmvn_stats
vimalmanohar Mar 1, 2017
d02ef22
asr_diarization: Add perturb_data_dir_speed_random.sh
vimalmanohar Mar 1, 2017
997d17d
asr_diarization: AMI Segmentation run script
vimalmanohar Mar 1, 2017
3c4193b
asr_diarization: Trap PIPE failure in get_egs.sh
vimalmanohar Mar 1, 2017
03af27a
asr_diarization: merging with kaldi 5.1 master
vimalmanohar Mar 2, 2017
805e300
asr_diarization: Fix merging with kaldi 5.1 master
vimalmanohar Mar 3, 2017
40a2086
asr_diarization: remove short segments.
vimalmanohar Mar 6, 2017
95a550b
segmenter: Fixing RemoveSegments
vimalmanohar Apr 24, 2017
ecc483f
sad: Updating subsegment_data_dir
vimalmanohar Apr 24, 2017
20f3072
sad: xconfig stats layer
vimalmanohar Apr 24, 2017
ed129f1
sad: Make utt2num_frames default in feats extraction
vimalmanohar Apr 24, 2017
ddf58d3
segmenter: Update local recipes
vimalmanohar Apr 24, 2017
ddc85cf
segmenter: Adding some missing files
vimalmanohar Apr 25, 2017
c90097e
segmenter: resample data directory
vimalmanohar Apr 25, 2017
3ad4355
segmenter: Updating major scripts
vimalmanohar Apr 25, 2017
1dd03c7
segmenter: snr preparation
vimalmanohar Apr 25, 2017
ff4c061
segmenter: Merging from master
vimalmanohar Apr 25, 2017
c85d161
segmenter: Temporary fix for nnet3 computation
vimalmanohar Apr 25, 2017
e125693
sad: Cleaning up
vimalmanohar Apr 25, 2017
9927b22
SAD: Cleaning up steps and utils
vimalmanohar Apr 26, 2017
7b36a0f
SAD: Removing overlap detection
vimalmanohar Apr 26, 2017
42651aa
SAD: Removing diarization stuff
vimalmanohar Apr 26, 2017
e263e13
segmenter: Adding missed changes
vimalmanohar Apr 26, 2017
10eaea6
SAD: remove duplicate file
vimalmanohar Apr 26, 2017
acd23ca
SAD: More tuning recipes
vimalmanohar Apr 26, 2017
2be65d7
SAD: Removing AMI examples
vimalmanohar Apr 26, 2017
7b42322
SAD: Adding Fisher recipe
vimalmanohar Apr 26, 2017
95a212f
SAD: prepare musan music
vimalmanohar Apr 28, 2017
d1bb65b
SAD: Reorganizing some segmenter functions
vimalmanohar Apr 28, 2017
7c60cbd
segmenter: Prepare fisher data music
vimalmanohar Apr 28, 2017
3a95f37
segmenter: Remove some per-recording level stuff
vimalmanohar Apr 28, 2017
5b05099
SAD: Cleaning up corruption
vimalmanohar Apr 28, 2017
b1a934a
SAT: Bug fixes and changes
vimalmanohar May 9, 2017
30de782
SAD: Bug fixes
vimalmanohar May 10, 2017
8f38988
SAD: Updating prepare_babel_data
vimalmanohar May 10, 2017
3e5acfb
Merging from master
vimalmanohar May 10, 2017
3161e9e
segmentaion: Adding more recipes
vimalmanohar May 10, 2017
88d9ad1
Fix length tolerance
vimalmanohar May 12, 2017
a5107d6
SAD: important bug fixes
vimalmanohar May 12, 2017
5a4e6e2
SAD: Minor fixes
vimalmanohar May 12, 2017
d450f0b
SAD: Updating recipes
vimalmanohar May 12, 2017
91cc6ed
SAD: Cleaning up stuff
vimalmanohar May 16, 2017
af2068e
SAD: count frames in segmentatin-init-from-ali
vimalmanohar May 17, 2017
7a60d7d
sad: fixed bug in get multiple targets
vimalmanohar May 17, 2017
2616153
sad: fixing minor bugs
vimalmanohar May 18, 2017
28d16b3
SAD: Commit from master
vimalmanohar May 18, 2017
fbe1828
SAD: Fix simple-hmm-utils.cc
vimalmanohar May 18, 2017
f10a009
SAD: Reverting some changes to allocate_multilingual_examples
vimalmanohar May 18, 2017
5dc7f36
SAD: Fixing multitask stuff
vimalmanohar May 19, 2017
eaa17fe
Merge branch 'sat' of github.com:vimalmanohar/kaldi into asr_diarizat…
vimalmanohar May 19, 2017
37c2ee6
SAD: Fixing order
vimalmanohar May 19, 2017
849b904
SAD: Fixing compilation issues
vimalmanohar May 19, 2017
15 changes: 15 additions & 0 deletions egs/aspire/s5/conf/mfcc_hires_bp.conf
@@ -0,0 +1,15 @@
# config for high-resolution MFCC features, intended for neural network training.
# Note: we keep all cepstra, so it has the same info as filterbank features,
# but MFCC is more easily compressible (because less correlated) which is why
# we prefer this method.
# This config is defined only on the frequencies from 330 Hz to
# 3000 Hz, corresponding to the telephone bandwidth.
--use-energy=false # use average of log energy, not energy.
--sample-frequency=8000 # Switchboard is sampled at 8kHz
--num-mel-bins=28
--num-ceps=28
--cepstral-lifter=0
--low-freq=330 # low cutoff frequency for mel bins
--high-freq=-1000 # high cutoff frequency, relative to Nyquist of 4000 (=3000)


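The negative `--high-freq` value in this config is an offset from the Nyquist frequency, not an absolute cutoff. A minimal sketch of that arithmetic follows, assuming the convention described in the config comment (a non-positive value is added to the Nyquist frequency). The function name `effective_high_freq` is hypothetical and not part of Kaldi.

```python
def effective_high_freq(sample_frequency, high_freq):
    """Return the absolute upper cutoff of the mel bins in Hz.

    Assumption: a value <= 0 is an offset from the Nyquist frequency,
    as the comment on --high-freq in the config suggests.
    """
    nyquist = sample_frequency / 2.0
    if high_freq <= 0:
        return nyquist + high_freq
    return float(high_freq)


# 8 kHz audio with --high-freq=-1000: Nyquist is 4000 Hz, cutoff is 3000 Hz.
print(effective_high_freq(8000, -1000))
```

With `--low-freq=330`, the mel bins in this sketch would therefore cover roughly 330 Hz to 3000 Hz, which matches the band named in the config header.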
14 changes: 14 additions & 0 deletions egs/aspire/s5/conf/segmentation_music.conf
@@ -0,0 +1,14 @@
# General segmentation options for segmentation on music / non-music
pad_length=-1 # Pad speech segments by this many frames on either side
max_blend_length=-1 # Maximum duration of speech that will be removed as part
# of smoothing process. This is only if there are no other
# speech segments nearby.
max_intersegment_length=0 # Merge nearby speech segments if the silence
# between them is less than this many frames.
post_pad_length=-1 # Pad speech segments by this many frames on either side
# after the merging process using max_intersegment_length
max_segment_length=1000 # Segments that are longer than this are split into
# overlapping pieces.
overlap_length=250 # Overlapping frames when segments are split.
# See the above option.
min_silence_length=100000 # Min silence length at which to split very long segments
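The segmentation configs in this PR are plain `key=value` lines with trailing `#` comments. As a rough illustration of how such a file could be inspected outside the shell, here is a small sketch. The helper names `parse_segmentation_conf` and `is_enabled` are hypothetical, and treating a negative value as "step disabled" is an assumption based on the comments in `segmentation_speech_simple.conf`.

```python
def parse_segmentation_conf(text):
    """Parse shell-style `key=value # comment` lines into a dict of ints."""
    options = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "=" not in line:
            continue  # blank line or comment continuation
        key, value = line.split("=", 1)
        options[key.strip()] = int(value)
    return options


def is_enabled(options, key):
    """Assumption: a negative value (such as -1) disables the step."""
    return options.get(key, -1) >= 0


conf_text = """
pad_length=-1 # Pad speech segments by this many frames on either side
max_intersegment_length=0 # Merge nearby speech segments if the silence
# between them is less than this many frames.
max_segment_length=1000
"""
opts = parse_segmentation_conf(conf_text)
print(opts, is_enabled(opts, "pad_length"))
```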
14 changes: 14 additions & 0 deletions egs/aspire/s5/conf/segmentation_speech.conf
@@ -0,0 +1,14 @@
# General segmentation options for SAD
pad_length=20 # Pad speech segments by this many frames on either side
max_relabel_length=10 # Maximum duration of speech that will be removed as part
# of smoothing process. This is only if there are no other
# speech segments nearby.
max_intersegment_length=30 # Merge nearby speech segments if the silence
# between them is less than this many frames.
post_pad_length=10 # Pad speech segments by this many frames on either side
# after the merging process using max_intersegment_length
max_segment_length=1000 # Segments that are longer than this are split into
# overlapping pieces.
overlap_length=250 # Overlapping frames when segments are split.
# See the above option.
min_silence_length=20 # Min silence length at which to split very long segments
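The `pad_length` and `max_intersegment_length` options describe two post-processing steps: padding each speech segment, then merging segments separated by a short silence. The sketch below illustrates that idea on frame ranges. It is not the segmentation code from this PR; the function name `pad_and_merge` is hypothetical, and the handling of negative `pad_length` as "disabled" is an assumption.

```python
def pad_and_merge(segments, pad_length=20, max_intersegment_length=30):
    """Pad each (start, end) frame range, then merge ranges split by a short gap.

    `segments` must be sorted by start frame; `end` is exclusive.
    Assumption: a negative pad_length means padding is disabled.
    """
    if pad_length > 0:
        segments = [(max(0, s - pad_length), e + pad_length) for s, e in segments]
    merged = []
    for start, end in segments:
        # Merge when the silence between neighbours is shorter than the limit.
        if merged and start - merged[-1][1] < max_intersegment_length:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


speech = [(100, 200), (250, 300), (500, 600)]
print(pad_and_merge(speech))          # defaults taken from segmentation_speech.conf
print(pad_and_merge(speech, -1, 0))   # values taken from segmentation_music.conf
```

With the speech defaults, the first two segments end up 10 frames apart after padding and are merged; with the music values nothing is padded or merged.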
15 changes: 15 additions & 0 deletions egs/aspire/s5/conf/segmentation_speech_simple.conf
@@ -0,0 +1,15 @@
# Simple segmentation post-processing options for SAD
pad_length=20 # Pad speech segments by this many frames on either side
max_relabel_length=-1 # Maximum duration of speech that will be removed as part
# of smoothing process. This is only if there are no other
# speech segments nearby. -1 is to disable this step.
max_intersegment_length=30 # Merge nearby speech segments if the silence
# between them is less than this many frames.
post_pad_length=-1 # Pad speech segments by this many frames on either side
# after the merging process using max_intersegment_length
# -1 is to disable this step.
max_segment_length=1000 # Segments that are longer than this are split into
# overlapping pieces.
overlap_length=250 # Overlapping frames when segments are split.
# See the above option.
min_silence_length=20 # Min silence length at which to split very long segments
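The `max_segment_length` and `overlap_length` options say that long segments are cut into pieces that overlap by a fixed number of frames. One way this could work is sketched below. The function name `split_long_segment` is hypothetical, and the sketch ignores `min_silence_length`, which the real tools presumably use to prefer split points inside silence.

```python
def split_long_segment(start, end, max_segment_length=1000, overlap_length=250):
    """Split [start, end) into pieces of at most max_segment_length frames.

    Consecutive pieces share overlap_length frames. This is a uniform split;
    it does not look for silence to split at.
    """
    if overlap_length >= max_segment_length:
        raise ValueError("overlap_length must be smaller than max_segment_length")
    if end - start <= max_segment_length:
        return [(start, end)]
    step = max_segment_length - overlap_length
    pieces = []
    piece_start = start
    while piece_start + max_segment_length < end:
        pieces.append((piece_start, piece_start + max_segment_length))
        piece_start += step
    pieces.append((piece_start, end))  # final, possibly shorter, piece
    return pieces


print(split_long_segment(0, 2000))
```

Overlapping pieces of this kind are what the CTM overlap-resolution step later has to reconcile, since words in the shared region are decoded twice.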
11 changes: 8 additions & 3 deletions egs/aspire/s5/local/multi_condition/get_ctm.sh
@@ -7,8 +7,7 @@ decode_mbr=true
filter_ctm_command=cp
glm=
stm=
-window=10
-overlap=5
+resolve_overlaps=true
[ -f ./path.sh ] && . ./path.sh
. parse_options.sh || exit 1;

@@ -62,7 +61,13 @@ lattice-align-words-lexicon --output-error-lats=true --output-if-empty=true --ma
lattice-to-ctm-conf $frame_shift_opt --decode-mbr=$decode_mbr ark:- $decode_dir/score_$LMWT/penalty_$wip/ctm.overlapping || exit 1;

# combine the segment-wise ctm files, while resolving overlaps
-python local/multi_condition/resolve_ctm_overlaps.py --overlap $overlap --window-length $window $data_dir/utt2spk $decode_dir/score_$LMWT/penalty_$wip/ctm.overlapping $decode_dir/score_$LMWT/penalty_$wip/ctm.merged || exit 1;
+if $resolve_overlaps; then
+  steps/resolve_ctm_overlaps.py $data_dir/segments \
+    $decode_dir/score_$LMWT/penalty_$wip/ctm.overlapping \
+    $decode_dir/score_$LMWT/penalty_$wip/ctm.merged || exit 1;
+else
+  cp $decode_dir/score_$LMWT/penalty_$wip/ctm.overlapping $decode_dir/score_$LMWT/penalty_$wip/ctm.merged || exit 1;
+fi
merged_ctm=$decode_dir/score_$LMWT/penalty_$wip/ctm.merged

cat $merged_ctm | utils/int2sym.pl -f 5 $lang/words.txt | \
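The hunk above replaces the window/overlap-based script with `steps/resolve_ctm_overlaps.py`, which takes the `segments` file instead. The exact rule that script uses is not shown in this diff. As an illustration only, one common way to resolve words decoded twice in an overlapping region is to cut at the midpoint of the overlap, as sketched here. The function name and the tuple layout are assumptions made for the example.

```python
def resolve_overlap(first_words, second_words, overlap_start, overlap_end):
    """Combine the CTM words of two overlapping segments.

    Each word is (start_time, duration, word) in recording time.
    Words whose centre lies before the midpoint of the overlap are taken
    from the first segment, the rest from the second segment.
    """
    midpoint = (overlap_start + overlap_end) / 2.0
    kept_first = [w for w in first_words if w[0] + w[1] / 2.0 < midpoint]
    kept_second = [w for w in second_words if w[0] + w[1] / 2.0 >= midpoint]
    return kept_first + kept_second


first = [(0.0, 0.5, "a"), (9.0, 0.4, "b"), (9.6, 0.3, "c")]
second = [(9.0, 0.4, "b"), (9.6, 0.3, "c"), (12.0, 0.5, "d")]
# The two segments overlap between 8.0 s and 10.0 s of the recording.
print(resolve_overlap(first, second, 8.0, 10.0))
```

When `resolve_overlaps` is false, the script simply copies `ctm.overlapping` to `ctm.merged`, which is only appropriate when the segments do not overlap.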
153 changes: 153 additions & 0 deletions egs/aspire/s5/local/nnet3/prep_test_aspire_segmentation.sh
@@ -0,0 +1,153 @@
#!/bin/bash

# Copyright Johns Hopkins University (Author: Daniel Povey, Vijayaditya Peddinti) 2016. Apache 2.0.
# This script generates the ctm files for dev_aspire, test_aspire and eval_aspire
# for scoring with ASpIRE scoring server.
# It also provides the WER for dev_aspire data.

set -e
set -o pipefail
set -u

# general opts
iter=final
stage=0
decode_num_jobs=30
num_jobs=30
affix=

# ivector opts
max_count=75 # parameter for extract_ivectors.sh
sub_speaker_frames=6000
ivector_scale=0.75
filter_ctm=true
weights_file=
silence_weight=0.00001

# decode opts
pass2_decode_opts="--min-active 1000"
lattice_beam=8
extra_left_context=0 # change for (B)LSTM
extra_right_context=0 # change for BLSTM
frames_per_chunk=50 # change for (B)LSTM
acwt=0.1 # important to change this when using chain models
post_decode_acwt=1.0 # important to change this when using chain models

. ./cmd.sh
[ -f ./path.sh ] && . ./path.sh
. utils/parse_options.sh || exit 1;

if [ $# -ne 5 ]; then
echo "Usage: $0 [options] <data-set> <seg-data-dir> <lang-dir> <graph-dir> <model-dir>"
echo " Options:"
echo " --stage <n> # start the script from part-way through (stages run up to 8)."
echo "e.g.:"
echo "$0 dev_aspire <seg-data-dir> data/lang exp/tri5a/graph_pp exp/nnet3/tdnn"
exit 1;
fi

data_set=$1
seg_data_dir=$2
lang=$3 # data/lang
graph=$4 #exp/tri5a/graph_pp
dir=$5 # exp/nnet3/tdnn

model_affix=`basename $dir`
ivector_dir=exp/nnet3
ivector_affix=${affix:+_$affix}_chain_${model_affix}_iter$iter
affix=_${affix}_iter${iter}
act_data_set=${data_set} # we will modify the data dir when segmenting it,
# so we keep track of the original data dir for the glm and stm files

if [[ "$data_set" =~ "test_aspire" ]]; then
out_file=single_dev_test${affix}_$model_affix.ctm
elif [[ "$data_set" =~ "eval_aspire" ]]; then
out_file=single_eval${affix}_$model_affix.ctm
elif [[ "$data_set" =~ "dev_aspire" ]]; then
# we will just decode the directory without oracle segments file
# as we would like to operate in the actual evaluation condition
out_file=single_dev${affix}_${model_affix}.ctm
else
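echo "$0: unrecognized data set '$data_set' (expected a name containing test_aspire, eval_aspire or dev_aspire)" >&2
exit 1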
fi

# uniform segmentation script would have created this dataset
# so update that script if you plan to change this variable
segmented_data_set=${data_set}${affix}_seg

if [ $stage -le 1 ]; then
utils/copy_data_dir.sh $seg_data_dir data/${segmented_data_set}
fi

if [ $stage -le 2 ]; then
mfccdir=mfcc_reverb
if [[ $(hostname -f) == *.clsp.jhu.edu ]] && [ ! -d $mfccdir/storage ]; then
date=$(date +'%m_%d_%H_%M')
utils/create_split_dir.pl /export/b0{1,2,3,4}/$USER/kaldi-data/egs/aspire-$date/s5/$mfccdir/storage $mfccdir/storage
fi

utils/copy_data_dir.sh data/${segmented_data_set} data/${segmented_data_set}_hires
steps/make_mfcc.sh --nj 30 --cmd "$train_cmd" \
--mfcc-config conf/mfcc_hires.conf data/${segmented_data_set}_hires \
exp/make_reverb_hires/${segmented_data_set} $mfccdir
steps/compute_cmvn_stats.sh data/${segmented_data_set}_hires \
exp/make_reverb_hires/${segmented_data_set} $mfccdir
utils/fix_data_dir.sh data/${segmented_data_set}_hires
utils/validate_data_dir.sh --no-text data/${segmented_data_set}_hires
fi

decode_dir=$dir/decode_${segmented_data_set}_pp
if [ $stage -le 5 ]; then
echo "Extracting i-vectors, stage 2"
# this does offline decoding, except we estimate the iVectors per
# speaker, excluding silence (based on alignments from a DNN decoding), with a
# different script. This is just to demonstrate that script.
# the --sub-speaker-frames is optional; if provided, it will divide each speaker
# up into "sub-speakers" of at least that many frames... can be useful if
# acoustic conditions drift over time within the speaker's data.
steps/online/nnet2/extract_ivectors.sh --cmd "$train_cmd" --nj 20 \
--sub-speaker-frames $sub_speaker_frames --max-count $max_count \
data/${segmented_data_set}_hires $lang $ivector_dir/extractor \
$ivector_dir/ivectors_${segmented_data_set}${ivector_affix};
fi

if [ $stage -le 6 ]; then
echo "Generating lattices, stage 2 with --acwt $acwt"
rm -f ${decode_dir}_tg/.error
steps/nnet3/decode.sh --nj $decode_num_jobs --cmd "$decode_cmd" --config conf/decode.config $pass2_decode_opts \
--acwt $acwt --post-decode-acwt $post_decode_acwt \
--extra-left-context $extra_left_context \
--extra-right-context $extra_right_context \
--frames-per-chunk "$frames_per_chunk" \
--skip-scoring true --iter $iter --lattice-beam $lattice_beam \
--online-ivector-dir $ivector_dir/ivectors_${segmented_data_set}${ivector_affix} \
$graph data/${segmented_data_set}_hires ${decode_dir}_tg || touch ${decode_dir}_tg/.error
[ -f ${decode_dir}_tg/.error ] && echo "$0: Error decoding" && exit 1;
fi

if [ $stage -le 7 ]; then
echo "Rescoring lattices"
steps/lmrescore_const_arpa.sh --cmd "$decode_cmd" \
--skip-scoring true \
${lang}_pp_test{,_fg} data/${segmented_data_set}_hires \
${decode_dir}_{tg,fg};
fi

decode_dir=${decode_dir}_fg

if [ $stage -le 8 ]; then
local/score_aspire.sh --cmd "$decode_cmd" \
--min-lmwt 1 --max-lmwt 20 \
--word-ins-penalties "0.0,0.25,0.5,0.75,1.0" \
--ctm-beam 6 \
--iter $iter \
--decode-mbr true \
--resolve-overlaps false \
--tune-hyper true \
$lang $decode_dir $act_data_set $segmented_data_set $out_file
fi

# Two-pass decoding baseline
# %WER 27.8 | 2120 27217 | 78.2 13.6 8.2 6.0 27.8 75.9 | -0.613 | exp/chain/tdnn_7b/decode_dev_aspire_whole_uniformsegmented_win10_over5_v6_200jobs_iterfinal_pp_fg/score_9/penalty_0.0/ctm.filt.filt.sys
# Using automatic segmentation
# %WER 28.2 | 2120 27214 | 76.5 12.4 11.1 4.7 28.2 75.2 | -0.522 | exp/chain/tdnn_7b/decode_dev_aspire_seg_v7_n_stddev_iterfinal_pp_fg/score_10/penalty_0.0/ctm.filt.filt.sys
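The two result lines at the end of the script can be compared column by column. The sketch below parses them, assuming the usual sclite summary column order (Corr, Sub, Del, Ins, Err, S.Err), and checks that the error rate is the sum of substitutions, deletions and insertions. The function name `parse_sys_line` is hypothetical, and the trailing path column is omitted from the sample strings for brevity.

```python
def parse_sys_line(line):
    """Parse a '%WER ... | sents words | Corr Sub Del Ins Err S.Err | ...' line."""
    fields = [f.split() for f in line.split("|")]
    # fields[2] holds the six percentage columns (assumed sclite order).
    corr, sub, dele, ins, err, sent_err = (float(x) for x in fields[2])
    return {"corr": corr, "sub": sub, "del": dele, "ins": ins, "err": err}


baseline = parse_sys_line(
    "%WER 27.8 | 2120 27217 | 78.2 13.6 8.2 6.0 27.8 75.9 | -0.613 |")
automatic = parse_sys_line(
    "%WER 28.2 | 2120 27214 | 76.5 12.4 11.1 4.7 28.2 75.2 | -0.522 |")

for name, r in (("two-pass baseline", baseline), ("automatic segmentation", automatic)):
    total = round(r["sub"] + r["del"] + r["ins"], 1)
    print(name, "sub+del+ins =", total, "reported err =", r["err"])
```

Read this way, the automatic segmentation has fewer substitutions and insertions than the baseline but more deletions (11.1 versus 8.2), which accounts for the 0.4 absolute difference in WER.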
9 changes: 4 additions & 5 deletions egs/aspire/s5/local/score_aspire.sh
@@ -14,10 +14,9 @@ word_ins_penalties=0.0,0.25,0.5,0.75,1.0
default_wip=0.0
ctm_beam=6
decode_mbr=true
-window=30
-overlap=5
 cmd=run.pl
 stage=1
+resolve_overlaps=true
tune_hyper=true # if true:
# if the data set is "dev_aspire" we check for the
# best lmwt and word_insertion_penalty,
@@ -89,7 +88,7 @@ if $tune_hyper ; then
# or use the default values

if [ $stage -le 1 ]; then
-if [ "$act_data_set" == "dev_aspire" ]; then
+if [[ "$act_data_set" =~ "dev_aspire" ]]; then
wip_string=$(echo $word_ins_penalties | sed 's/,/ /g')
temp_wips=($wip_string)
$cmd WIP=1:${#temp_wips[@]} $decode_dir/scoring/log/score.wip.WIP.log \
@@ -98,8 +97,8 @@ if $tune_hyper ; then
echo \$wip \&\& \
$cmd LMWT=$min_lmwt:$max_lmwt $decode_dir/scoring/log/score.LMWT.\$wip.log \
local/multi_condition/get_ctm.sh --filter-ctm-command "$filter_ctm_command" \
-    --window $window --overlap $overlap \
     --beam $ctm_beam --decode-mbr $decode_mbr \
+    --resolve-overlaps $resolve_overlaps \
--glm data/${act_data_set}/glm --stm data/${act_data_set}/stm \
LMWT \$wip $lang data/${segmented_data_set}_hires $model $decode_dir || exit 1;

@@ -124,7 +123,7 @@ wipfile.close()
fi


-if [ "$act_data_set" == "test_aspire" ] || [ "$act_data_set" == "eval_aspire" ]; then
+if [[ "$act_data_set" =~ "test_aspire" ]] || [[ "$act_data_set" =~ "eval_aspire" ]]; then
# check for the best values from dev_aspire decodes
dev_decode_dir=$(echo $decode_dir|sed "s/test_aspire/dev_aspire_whole/g; s/eval_aspire/dev_aspire_whole/g")
if [ -f $dev_decode_dir/scoring/bestLMWT ]; then
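The `score_aspire.sh` hunks change the data-set checks from exact string equality (`==`) to `[[ ... =~ "..." ]]`. With a quoted right-hand side, bash treats the pattern as a literal string and matches it anywhere in the value, so names such as `dev_aspire_whole` now pass the check. The Python sketch below mirrors the two behaviours and the `sed` substitution used to locate the dev decode directory. The function names and the sample path are hypothetical.

```python
def matches_exact(data_set, target):
    """Old behaviour: [ "$act_data_set" == "dev_aspire" ]."""
    return data_set == target


def matches_substring(data_set, target):
    """New behaviour: [[ "$act_data_set" =~ "dev_aspire" ]] (literal, unanchored)."""
    return target in data_set


def dev_decode_dir_for(decode_dir):
    """Mirror of: sed "s/test_aspire/dev_aspire_whole/g; s/eval_aspire/dev_aspire_whole/g"."""
    return (decode_dir
            .replace("test_aspire", "dev_aspire_whole")
            .replace("eval_aspire", "dev_aspire_whole"))


print(matches_exact("dev_aspire_whole", "dev_aspire"))      # exact match fails
print(matches_substring("dev_aspire_whole", "dev_aspire"))  # substring match succeeds
print(dev_decode_dir_for("exp/model/decode_test_aspire_seg_pp_fg"))
```

One consequence worth checking in review: the substitution always maps to `dev_aspire_whole`, so the tuned `bestLMWT` file is only found if the dev set was decoded under a directory name containing exactly that string.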