Benchmark of fgemv for Givaro::Integer over Givaro::ZRing

ZHG2017 edited this page Mar 12, 2019 · 3 revisions
Time Gflops q(field characteristic) b(nbBits) p(0 for sequential, 1 for <Recursive,Thread>, 2 for <Row,Thread>, 3 for <Row,Grain>) m(dimension m of the matrix) k(dimension k of the matrix) N(number of NUMA blocks per dimension) i(number of repetitions) s(seed) number of threads to drive the partition
2.17232 0.0147308 0 100 0 4000 4000 1 10 -1244234020 1
3.72062 0.00860072 0 200 0 4000 4000 1 10 -1244234020 1
0.534122 0.0149779 0 100 0 2000 2000 1 10 -1244234020 1
0.90522 0.00883763 0 200 0 2000 2000 1 10 -1244234020 1
1.60806 0.0198998 0 100 1 4000 4000 1 10 -1244234020 2
1.60362 0.0199548 0 100 2 4000 4000 1 10 -1244234020 2
1.39404 0.0229548 0 100 3 4000 4000 1 10 -1244234020 2
1.47595 0.0216809 0 100 1 4000 4000 1 10 -1244234020 4
1.47145 0.0217472 0 100 2 4000 4000 1 10 -1244234020 4
1.08249 0.0295613 0 100 3 4000 4000 1 10 -1244234020 4
2.84081 0.0112644 0 200 1 4000 4000 1 10 -1244234020 2
2.86566 0.0111667 0 200 2 4000 4000 1 10 -1244234020 2
2.33891 0.0136816 0 200 3 4000 4000 1 10 -1244234020 2
2.73772 0.0116886 0 200 1 4000 4000 1 10 -1244234020 4
2.61494 0.0122374 0 200 2 4000 4000 1 10 -1244234020 4
2.32346 0.0137726 0 200 3 4000 4000 1 10 -1244234020 4
0.387274 0.0206572 0 100 1 2000 2000 1 10 -1244234020 2
0.391406 0.0204391 0 100 2 2000 2000 1 10 -1244234020 2
0.343588 0.0232837 0 100 3 2000 2000 1 10 -1244234020 2
0.353509 0.0226302 0 100 1 2000 2000 1 10 -1244234020 4
0.351224 0.0227775 0 100 2 2000 2000 1 10 -1244234020 4
0.248488 0.0321947 0 100 3 2000 2000 1 10 -1244234020 4
0.680392 0.0117579 0 200 1 2000 2000 1 10 -1244234020 2
0.686015 0.0116616 0 200 2 2000 2000 1 10 -1244234020 2
0.575811 0.0138934 0 200 3 2000 2000 1 10 -1244234020 2
0.701522 0.0114038 0 200 1 2000 2000 1 10 -1244234020 4
0.662669 0.0120724 0 200 2 2000 2000 1 10 -1244234020 4
0.461292 0.0173426 0 200 3 2000 2000 1 10 -1244234020 4
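
The Gflops column appears to follow the usual gemv operation count, 2·m·k operations divided by the time in seconds. This is inferred from the numbers above rather than stated on the page; the formula matches without any factor for the repetition count i, which suggests (as an assumption) that Time is reported per repetition. A minimal Python sketch that re-derives the column for a few rows follows; the helper names `parse_row` and `gflops` are illustrative and not part of any library.

```python
# A few rows copied verbatim from the table above.
# Columns: Time Gflops q b p m k N i s threads
SAMPLE = """\
2.17232 0.0147308 0 100 0 4000 4000 1 10 -1244234020 1
3.72062 0.00860072 0 200 0 4000 4000 1 10 -1244234020 1
0.534122 0.0149779 0 100 0 2000 2000 1 10 -1244234020 1
0.90522 0.00883763 0 200 0 2000 2000 1 10 -1244234020 1
1.08249 0.0295613 0 100 3 4000 4000 1 10 -1244234020 4
"""


def parse_row(line):
    """Turn one whitespace-separated benchmark line into a dict."""
    f = line.split()
    return {
        "time": float(f[0]),
        "gflops": float(f[1]),
        "bits": int(f[3]),
        "p": int(f[4]),
        "m": int(f[5]),
        "k": int(f[6]),
        "threads": int(f[10]),
    }


def gflops(m, k, seconds):
    """Assumed formula: 2*m*k operations per matrix-vector product."""
    return 2.0 * m * k / seconds / 1e9


rows = [parse_row(line) for line in SAMPLE.splitlines()]
for r in rows:
    derived = gflops(r["m"], r["k"], r["time"])
    print(f"m={r['m']} bits={r['bits']} p={r['p']} "
          f"reported={r['gflops']:.7f} derived={derived:.7f}")
```

Note that each "operation" here is a multi-precision integer multiply or add, so the rate is far below what the same machine would reach on hardware floating-point types.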

Benchmarks of the recursive thread-splitting method with different numbers of threads

Time Gflops q(field characteristic) b(nbBits) p(0 for sequential, 1 for <Recursive,Thread>, 2 for <Row,Thread>, 3 for <Row,Grain>) m(dimension m of the matrix) k(dimension k of the matrix) N(number of NUMA blocks per dimension) i(number of repetitions) s(seed) number of threads to drive the partition
1.51053 0.0211846 0 100 1 4000 4000 1 10 -1244234020 8
1.47636 0.0216749 0 100 1 4000 4000 1 10 -1244234020 16
1.39447 0.0229477 0 100 1 4000 4000 1 10 -1244234020 32
1.19449 0.0267897 0 100 1 4000 4000 1 10 -1244234020 64
1.12058 0.0285567 0 100 1 4000 4000 1 10 -1244234020 128
1.07563 0.02975 0 100 1 4000 4000 1 10 -1244234020 256
1.05241 0.0304063 0 100 1 4000 4000 1 10 -1244234020 512
1.14813 0.0278715 0 100 1 4000 4000 1 10 -1244234020 1024
1.60951 0.0198818 0 100 1 4000 4000 1 10 -1244234020 2048

Benchmarks of the row-based thread-splitting method with different numbers of threads

Time Gflops q(field characteristic) b(nbBits) p(0 for sequential, 1 for <Recursive,Thread>, 2 for <Row,Thread>, 3 for <Row,Grain>) m(dimension m of the matrix) k(dimension k of the matrix) N(number of NUMA blocks per dimension) i(number of repetitions) s(seed) number of threads to drive the partition
1.42987 0.0223797 0 100 2 4000 4000 1 10 -1244234020 8
1.393 0.022972 0 100 2 4000 4000 1 10 -1244234020 16
1.31675 0.0243022 0 100 2 4000 4000 1 10 -1244234020 32
1.08845 0.0293997 0 100 2 4000 4000 1 10 -1244234020 64
0.957888 0.0334068 0 100 2 4000 4000 1 10 -1244234020 128
0.910546 0.0351437 0 100 2 4000 4000 1 10 -1244234020 256
0.84995 0.0376493 0 100 2 4000 4000 1 10 -1244234020 512
0.866475 0.0369312 0 100 2 4000 4000 1 10 -1244234020 1024
1.2199 0.0262316 0 100 2 4000 4000 1 10 -1244234020 2048
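
To make the idea of a row-based split concrete, here is a small Python sketch that cuts the rows of A into a given number of contiguous blocks and computes each block of y = A·x as a separate task. It is only an illustration under stated assumptions: the functions `row_blocks`, `gemv_rows` and `parallel_gemv` are hypothetical names, this is not the FFLAS-FFPACK implementation, and Python threads will not reproduce the timings above. Python's built-in integers are arbitrary precision, so the arithmetic is exact as with Givaro::Integer.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def row_blocks(m, parts):
    """Split range(m) into contiguous blocks whose sizes differ by at most one."""
    parts = max(1, min(parts, m))
    base, extra = divmod(m, parts)
    blocks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        blocks.append((start, start + size))
        start += size
    return blocks


def gemv_rows(A, x, lo, hi):
    """Compute entries lo..hi-1 of A*x with exact integer arithmetic."""
    return [sum(a * b for a, b in zip(A[r], x)) for r in range(lo, hi)]


def parallel_gemv(A, x, parts):
    """Row-split gemv: one task per block, results concatenated in row order."""
    blocks = row_blocks(len(A), parts)
    with ThreadPoolExecutor(max_workers=len(blocks)) as pool:
        chunks = list(pool.map(lambda blk: gemv_rows(A, x, blk[0], blk[1]), blocks))
    return [value for chunk in chunks for value in chunk]


# Tiny demo with 100-bit entries (the sizes in the tables are 2000 and 4000).
rng = random.Random(0)
A = [[rng.getrandbits(100) for _ in range(5)] for _ in range(8)]
x = [rng.getrandbits(100) for _ in range(5)]
y_parallel = parallel_gemv(A, x, parts=3)
y_sequential = gemv_rows(A, x, 0, len(A))
print(y_parallel == y_sequential)
```

The last column of the tables goes up to 2048, so it presumably controls how many blocks are created rather than how many OS threads actually run; that reading is an assumption, not something the page states.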

Benchmarks of the grain-size-based thread-splitting method with different numbers of threads

Time Gflops q(field characteristic) b(nbBits) p(0 for sequential, 1 for <Recursive,Thread>, 2 for <Row,Thread>, 3 for <Row,Grain>) m(dimension m of the matrix) k(dimension k of the matrix) N(number of NUMA blocks per dimension) i(number of repetitions) s(seed) number of threads to drive the partition
1.18626 0.0269755 0 100 3 4000 4000 1 10 -1244234020 8
1.20364 0.0265859 0 100 3 4000 4000 1 10 -1244234020 16
1.06228 0.030124 0 100 3 4000 4000 1 10 -1244234020 32
1.08435 0.0295107 0 100 3 4000 4000 1 10 -1244234020 64
1.18055 0.0271059 0 100 3 4000 4000 1 10 -1244234020 128
1.09445 0.0292385 0 100 3 4000 4000 1 10 -1244234020 256
1.17548 0.0272229 0 100 3 4000 4000 1 10 -1244234020 512
1.07947 0.0296441 0 100 3 4000 4000 1 10 -1244234020 1024
1.18223 0.0270676 0 100 3 4000 4000 1 10 -1244234020 2048
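
The three sweeps above (b=100, m=k=4000) can be compared by picking the fastest setting of each strategy and relating it to the sequential time of 2.17232 s from the first table. The following Python sketch does this with the times copied from the tables; the dictionary layout and the helper `best` are illustrative choices, not part of the benchmark.

```python
# Sequential reference: the p=0, b=100, m=k=4000 row of the first table.
SEQUENTIAL_TIME = 2.17232

# Time in seconds, keyed by the last column (number of threads driving the partition).
SWEEPS = {
    "recursive": {8: 1.51053, 16: 1.47636, 32: 1.39447, 64: 1.19449,
                  128: 1.12058, 256: 1.07563, 512: 1.05241,
                  1024: 1.14813, 2048: 1.60951},
    "row": {8: 1.42987, 16: 1.393, 32: 1.31675, 64: 1.08845,
            128: 0.957888, 256: 0.910546, 512: 0.84995,
            1024: 0.866475, 2048: 1.2199},
    "grain": {8: 1.18626, 16: 1.20364, 32: 1.06228, 64: 1.08435,
              128: 1.18055, 256: 1.09445, 512: 1.17548,
              1024: 1.07947, 2048: 1.18223},
}


def best(sweep):
    """Return (threads, seconds) for the fastest entry of one sweep."""
    threads = min(sweep, key=sweep.get)
    return threads, sweep[threads]


for name, sweep in SWEEPS.items():
    threads, seconds = best(sweep)
    speedup = SEQUENTIAL_TIME / seconds
    print(f"{name:9s} fastest at {threads:4d}: {seconds:.5f} s "
          f"(speedup {speedup:.2f}x over sequential)")
```

On these numbers the recursive and row-based splits are both fastest at 512 and slow down again at 2048, while the grain-based times stay within a narrow band across the whole sweep.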