Speed up AddColSumMat with transform reduce kernel template #1530

Merged
merged 1 commit into from
Apr 4, 2017

kangshiyin
Contributor

Speed up AddColSumMat with transform reduce kernel template

Speed (GFLOPS), old vs. new:

    function                                   dim     old     new speedup
    CuVector::AddColSumMat<float>[no-trans],    16  0.0057  0.0172   3.01x
    CuVector::AddColSumMat<float>[no-trans],    32  0.0242  0.0668   2.76x
    CuVector::AddColSumMat<float>[no-trans],    64  0.0992  0.2577   2.60x
    CuVector::AddColSumMat<float>[no-trans],   128  0.3747  0.9280   2.48x
    CuVector::AddColSumMat<float>[no-trans],   256  1.4711  3.0541   2.08x
    CuVector::AddColSumMat<float>[no-trans],   512  5.1709  9.4713   1.83x
    CuVector::AddColSumMat<float>[no-trans],  1024 12.4352 20.4517   1.64x
    CuVector::AddColSumMat<double>[no-trans],   16  0.0060  0.0175   2.91x
    CuVector::AddColSumMat<double>[no-trans],   32  0.0240  0.0672   2.80x
    CuVector::AddColSumMat<double>[no-trans],   64  0.1006  0.2712   2.70x
    CuVector::AddColSumMat<double>[no-trans],  128  0.3691  0.9097   2.46x
    CuVector::AddColSumMat<double>[no-trans],  256  1.4530  3.1044   2.14x
    CuVector::AddColSumMat<double>[no-trans],  512  4.4524  7.5872   1.70x
    CuVector::AddColSumMat<double>[no-trans], 1024 11.1212 16.1423   1.45x
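
For readers unfamiliar with the operation being benchmarked: as I understand the Kaldi documentation, AddColSumMat(alpha, M, beta) computes v = alpha * (sum of the columns of M) + beta * v, so each output element is a reduction over one row of M. The sketch below is a CPU-side reference of that arithmetic written in a transform-reduce shape. It is not the kernel from this PR; the names SumOp, TransformReduceRows and AddColSumMatReference are hypothetical and exist only for illustration, and the row-major std::vector layout is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical operator bundle for a transform-reduce (not a Kaldi type).
// Transform maps each matrix element, Reduce combines two partial results,
// PostReduce merges the reduced value with the existing output element.
template <typename Real>
struct SumOp {
  Real alpha;
  Real beta;
  Real Init() const { return Real(0); }
  Real Transform(Real x) const { return x; }
  Real Reduce(Real a, Real b) const { return a + b; }
  Real PostReduce(Real sum, Real old_value) const {
    return alpha * sum + beta * old_value;
  }
};

// Reduces every row of a row-major num_rows x num_cols matrix into one
// element of *vec. On a GPU the inner loop would typically be a parallel
// reduction within a thread block; here it is a plain serial loop.
template <typename Real, typename Op>
void TransformReduceRows(const Op& op, const std::vector<Real>& mat,
                         std::size_t num_rows, std::size_t num_cols,
                         std::vector<Real>* vec) {
  assert(mat.size() == num_rows * num_cols);
  assert(vec->size() == num_rows);
  for (std::size_t r = 0; r < num_rows; ++r) {
    Real acc = op.Init();
    for (std::size_t c = 0; c < num_cols; ++c) {
      acc = op.Reduce(acc, op.Transform(mat[r * num_cols + c]));
    }
    (*vec)[r] = op.PostReduce(acc, (*vec)[r]);
  }
}

// Reference for v = alpha * (sum of columns of M) + beta * v.
template <typename Real>
void AddColSumMatReference(Real alpha, const std::vector<Real>& mat,
                           std::size_t num_rows, std::size_t num_cols,
                           Real beta, std::vector<Real>* vec) {
  TransformReduceRows(SumOp<Real>{alpha, beta}, mat, num_rows, num_cols, vec);
}
```

Separating the operator from the loop is what lets one reduction template serve several operations (sum, max, norms and so on) by swapping the functor; whether the PR's template is factored exactly this way is not shown on this page.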
@danpovey
Contributor

danpovey commented Apr 4, 2017

Thanks! Merging.

@danpovey danpovey merged commit d8b34d4 into kaldi-asr:master Apr 4, 2017
kronos-cm added a commit to kronos-cm/kaldi that referenced this pull request Apr 5, 2017
* 'master' of https://github.com/kaldi-asr/kaldi:
  [src] Cosmetic change: remove 'train.tra' from usage messages (kaldi-asr#1529)
  [src] cudamatrix: speed up AddColSumMat with transfrom reduce kernel template (kaldi-asr#1530)
  [build]: remove openfst check (kaldi-asr#1531)
  [build,src,doc] Modify get_version.sh to deal better with whitespace (avoid space in version); minor fixes (kaldi-asr#1526)
  [scripts,egs] Adding options for using PCA instead of LDA+MLLT for ivectors used in ASR. Results are reported in the default TDNN recipe in AMI. Updating steps/online/nnet2/{train_diag_ubm.sh,train_ivector_extractor.sh} so that they now backup the contents of their destination directory if it already exists. (kaldi-asr#1514)
  [src] (minor) Added missing SetZero() to NaturalGradientAffineComponent::Scale() if scale==0.0 (kaldi-asr#1522)
  [src,doc] Fix several unrelated minor problems.  Thanks: gaoxinglong
  [src] Adding noexcept to hashing function objects (kaldi-asr#1519)
  [egs] Fix to egs/wsj/s5/run.sh (unset variable) (kaldi-asr#1517)
  [misc] remove eXecute permissions where not needed (kaldi-asr#1515)
  [src,scripts]: Several unrelated cosmetic changes
  [egs] fixes to babel pipeline; thanks to Fred Richardson (kaldi-asr#1509)
  [src] Fix exit code of extract-rows.cc (kaldi-asr#1510)
david-ryan-snyder pushed a commit to david-ryan-snyder/kaldi that referenced this pull request Apr 12, 2017
…template (kaldi-asr#1530)
Skaiste pushed a commit to Skaiste/idlak that referenced this pull request Sep 26, 2018
…template (kaldi-asr#1530)