[tex] include also ggttg/ggttggg in the table
Revision c2e67b4 [nvcc 11.6.55 (gcc 10.2.0)] [inlineHel=0]
            eemumu      ggtt        ggttg       ggttgg      ggttggg
CUD/none    1.35e+09    1.41e+08    1.45e+07    5.20e+05    1.18e+04
CPP/none    1.67e+06    2.01e+05    2.48e+04    1.81e+03    7.22e+01
CPP/sse4    3.13e+06    3.17e+05    4.54e+04    3.34e+03    1.32e+02
CPP/avx2    5.54e+06    5.64e+05    8.86e+04    6.83e+03    2.61e+02
CPP/512y    5.82e+06    6.15e+05    9.83e+04    7.49e+03    2.88e+02
CPP/512z    4.65e+06    3.75e+05    7.19e+04    6.52e+03    2.94e+02

Revision 4f3229d [nvcc 11.6.55 (icx 20210400, clang 13.0.0, gcc 10.2.0)] [inlineHel=0]
            eemumu      ggtt        ggttg       ggttgg      ggttggg
CUD/none    1.33e+09    1.42e+08    1.45e+07    5.14e+05    1.19e+04
CPP/none    7.60e+06    2.15e+05    2.43e+04    1.50e+03    7.21e+01
CPP/sse4    7.89e+06    4.45e+05    4.57e+04    2.82e+03    1.05e+02
CPP/avx2    1.19e+07    6.93e+05    1.04e+05    7.61e+03    2.42e+02
CPP/512y    1.19e+07    7.50e+05    1.09e+05    8.45e+03    2.74e+02
CPP/512z    9.37e+06    5.09e+05    7.85e+04    5.85e+03    2.66e+02
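
To compare the two revisions in the tables above, each entry of the icx build (4f3229d) can be divided by the matching entry of the gcc build (c2e67b4). The following is only a minimal sketch of that arithmetic; `speedup` is a hypothetical helper name and is not part of the repository.

```shell
#!/bin/sh
# Hypothetical helper: prints the ratio new/old with two decimals.
# Arguments are throughputs in the same notation as the tables (e.g. 1.67e+06).
speedup() {
  awk -v old="$1" -v new="$2" 'BEGIN { printf "%.2f\n", new / old }'
}

# eemumu, CPP/none: gcc 10.2.0 (c2e67b4) versus icx 20210400 (4f3229d)
speedup 1.67e+06 7.60e+06
```

For eemumu CPP/none this prints 4.55. A ratio below 1 means the icx build was slower for that entry.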
valassi committed Jan 25, 2022
1 parent c2e67b4 commit 1e7fd92
Showing 9 changed files with 537 additions and 534 deletions.
15 changes: 9 additions & 6 deletions epochX/cudacpp/tput/latexParser.sh
@@ -4,11 +4,11 @@ cd $(dirname $0)/..

# Select revisions
revs=""
revs="$revs 6307b62" # cuda116/gcc102 (BASELINE 24 Jan 2022)
revs="$revs 37537ce" # cuda116/icx2021 (24 Jan 2022)
revs="$revs c2e67b4" # cuda116/gcc102 BASELINE (25 Jan 2022) eemumu/ggtt/ggttgg x f/d x inl0/inl1 + ggttg/ggttggg x f/d
revs="$revs 4f3229d" # cuda116/icx2021 (25 Jan 2022) eemumu/ggtt/ggttgg x f/d x inl0/inl1 + ggttg/ggttggg x f/d

# Select processes
procs="eemumu ggtt ggttgg"
procs="eemumu ggtt ggttg ggttgg ggttggg"

# Select fptype
fpts="d"
@@ -20,19 +20,22 @@ inls="inl0"

# Iterate through log files
for rev in $revs; do
###echo "*** REVISION $rev ***"
files=""
for proc in $procs; do
for fpt in $fpts; do
for inl in $inlss; do
for inl in $inls; do
file=tput/logs_${proc}_manu/log_${proc}_manu_${fpt}_${inl}_hrd0.txt
files="$files $file"
if [ -f $file ]; then files="$files $file"; fi
done
done
done
###echo "*** FILES $files ***"
if [ "$files" == "" ]; then continue; fi
git checkout $rev $files >& /dev/null
###cat $files | awk '/^Process/{print $0}; /Workflow/{print $0}; /MECalcOnly/{print $0}'; exit 0
cat $files | awk -vrev=$rev\
'/^Process(.)*nvcc/{split($0,a,"["); comp="["a[2]; if ( comp != complast ){print "Revision", rev, comp; complast=comp}};\
'/^Process(.)*nvcc/{split($0,a,"["); comp="["a[2]"["a[3]; if ( comp != complast ){print "Revision", rev, comp; complast=comp}};\
/^Process/{proc=""; split($3,a,"_"); proc=a[3]"_"a[4]};\
/Workflow/{tag=""; split($4,a,":"); tag=a[1]; split($4,a,"+"); split(a[4],b,"/"); tag=tag"/"b[2]};\
/MECalcOnly/{tput=sprintf("%.2e", $5); tput_proc_tag[proc,tag]=tput}; \
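
The three awk rules visible in the hunk above (`/^Process/`, `/Workflow/`, `/MECalcOnly/`) can be tried in isolation. The sketch below is not the full latexParser.sh, whose remaining lines are not shown here; `summarize_tput` is a hypothetical helper name, and the sample input lines are copied from the eemumu log in this commit.

```shell
#!/bin/sh
# Hypothetical helper: reads throughput log lines on stdin and prints
# "<proc> <tag> <throughput>" for every MECalcOnly line.
summarize_tput() {
  awk '
    # "Process = SIGMA_SM_EPEM_MUPMUM_CPP [...]": the third field holds the process name
    /^Process/ { split($3, a, "_"); proc = a[3] "_" a[4] }
    # "Workflow summary = CPP:DBL+...+MESHST/sse4+CXVBRK": build a tag such as CPP/sse4
    /Workflow/ { split($4, a, ":"); tag = a[1]; split($4, a, "+"); split(a[4], b, "/"); tag = tag "/" b[2] }
    # "EvtsPerSec[MECalcOnly] (3a) = ( 3.127267e+06 ) sec^-1": the fifth field is the value
    /MECalcOnly/ { printf "%s %s %.2e\n", proc, tag, $5 }
  '
}

summarize_tput <<'EOF'
Process = SIGMA_SM_EPEM_MUPMUM_CPP [gcc 10.2.0] [inlineHel=0] [hardcodeCIPC=0]
Workflow summary = CPP:DBL+CXS:CURHST+RMBHST+MESHST/sse4+CXVBRK
EvtsPerSec[MECalcOnly] (3a) = ( 3.127267e+06 ) sec^-1
EOF
```

The rounded value corresponds to the eemumu CPP/sse4 entry for revision c2e67b4 in the commit message. Unlike the script, this sketch prints one line per measurement instead of storing values in the `tput_proc_tag` array.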
124 changes: 62 additions & 62 deletions epochX/cudacpp/tput/logs_eemumu_manu/log_eemumu_manu_d_inl0_hrd0.txt
@@ -68,112 +68,112 @@ make[1]: Entering directory `/data/avalassi/GPU2020/madgraph4gpuX/epochX/cudacpp
make[1]: Nothing to be done for `all.512z_d_inl0_hrd0_hasCurand'.
make[1]: Leaving directory `/data/avalassi/GPU2020/madgraph4gpuX/epochX/cudacpp/ee_mumu/SubProcesses/P1_Sigma_sm_epem_mupmum'

DATE: 2022-01-24_22:27:27
DATE: 2022-01-24_22:45:38

On itscrd70.cern.ch [CPU: Intel(R) Xeon(R) Silver 4216 CPU] [GPU: 1x Tesla V100S-PCIE-32GB]:
=========================================================================
runExe /data/avalassi/GPU2020/madgraph4gpuX/epochX/cudacpp/ee_mumu/SubProcesses/P1_Sigma_sm_epem_mupmum/build.none_d_inl0_hrd0/gcheck.exe -p 2048 256 12 OMP=
Process = SIGMA_SM_EPEM_MUPMUM_CUDA [nvcc 11.6.55 (gcc 10.2.0)] [inlineHel=0] [hardcodeCIPC=0]
Process = SIGMA_SM_EPEM_MUPMUM_CUDA [nvcc 11.6.55 (icx 20210400, clang 13.0.0, gcc 10.2.0)] [inlineHel=0] [hardcodeCIPC=0]
Workflow summary = CUD:DBL+THX:CURDEV+RMBDEV+MESDEV/none+NAVBRK
FP precision = DOUBLE (NaN/abnormal=0, zero=0)
EvtsPerSec[Rmb+ME] (23) = ( 6.289489e+07 ) sec^-1
EvtsPerSec[MatrixElems] (3) = ( 6.592383e+08 ) sec^-1
EvtsPerSec[MECalcOnly] (3a) = ( 1.350779e+09 ) sec^-1
EvtsPerSec[Rmb+ME] (23) = ( 6.636460e+07 ) sec^-1
EvtsPerSec[MatrixElems] (3) = ( 6.560540e+08 ) sec^-1
EvtsPerSec[MECalcOnly] (3a) = ( 1.330886e+09 ) sec^-1
MeanMatrixElemValue = ( 1.371706e-02 +- 3.270315e-06 ) GeV^0
TOTAL : 0.984690 sec
822,046,743 cycles:u # 0.690 GHz
1,590,022,484 instructions:u # 1.93 insn per cycle
1.289560643 seconds time elapsed
TOTAL : 1.204962 sec
1,124,525,580 cycles:u # 0.814 GHz
2,602,874,292 instructions:u # 2.31 insn per cycle
1.520311103 seconds time elapsed
==PROF== Profiling "sigmaKin": launch__registers_per_thread 128
==PROF== Profiling "sigmaKin": sm__sass_average_branch_targets_threads_uniform.pct 100%
.........................................................................
=========================================================================
runExe /data/avalassi/GPU2020/madgraph4gpuX/epochX/cudacpp/ee_mumu/SubProcesses/P1_Sigma_sm_epem_mupmum/build.none_d_inl0_hrd0/check.exe -p 2048 256 12 OMP=
Process = SIGMA_SM_EPEM_MUPMUM_CPP [gcc 10.2.0] [inlineHel=0] [hardcodeCIPC=0]
Process = SIGMA_SM_EPEM_MUPMUM_CPP [icx 20210400 (clang 13.0.0, gcc 10.2.0)] [inlineHel=0] [hardcodeCIPC=0]
Workflow summary = CPP:DBL+CXS:CURHST+RMBHST+MESHST/none+NAVBRK
FP precision = DOUBLE (NaN/abnormal=0, zero=0)
Internal loops fptype_sv = SCALAR ('none': ~vector[1], no SIMD)
EvtsPerSec[Rmb+ME] (23) = ( 1.092425e+06 ) sec^-1
EvtsPerSec[MatrixElems] (3) = ( 1.673414e+06 ) sec^-1
EvtsPerSec[MECalcOnly] (3a) = ( 1.673414e+06 ) sec^-1
EvtsPerSec[Rmb+ME] (23) = ( 4.052371e+06 ) sec^-1
EvtsPerSec[MatrixElems] (3) = ( 7.600715e+06 ) sec^-1
EvtsPerSec[MECalcOnly] (3a) = ( 7.600715e+06 ) sec^-1
MeanMatrixElemValue = ( 1.371706e-02 +- 3.270315e-06 ) GeV^0
TOTAL : 6.243676 sec
16,491,797,245 cycles:u # 2.636 GHz
40,384,986,574 instructions:u # 2.45 insn per cycle
6.259891175 seconds time elapsed
=Symbols in CPPProcess.o= (~sse4: 280) (avx2: 0) (512y: 0) (512z: 0)
TOTAL : 2.266843 sec
5,513,636,071 cycles:u # 2.462 GHz
12,074,915,325 instructions:u # 2.19 insn per cycle
2.284609318 seconds time elapsed
=Symbols in CPPProcess.o= (~sse4: 579) (avx2: 0) (512y: 0) (512z: 0)
-------------------------------------------------------------------------
runExe /data/avalassi/GPU2020/madgraph4gpuX/epochX/cudacpp/ee_mumu/SubProcesses/P1_Sigma_sm_epem_mupmum/build.none_d_inl0_hrd0/runTest.exe
[ PASSED ] 6 tests.
-------------------------------------------------------------------------
runExe /data/avalassi/GPU2020/madgraph4gpuX/epochX/cudacpp/ee_mumu/SubProcesses/P1_Sigma_sm_epem_mupmum/build.sse4_d_inl0_hrd0/check.exe -p 2048 256 12 OMP=
Process = SIGMA_SM_EPEM_MUPMUM_CPP [gcc 10.2.0] [inlineHel=0] [hardcodeCIPC=0]
Workflow summary = CPP:DBL+CXS:CURHST+RMBHST+MESHST/sse4+CXVBRK
Process = SIGMA_SM_EPEM_MUPMUM_CPP [icx 20210400 (clang 13.0.0, gcc 10.2.0)] [inlineHel=0] [hardcodeCIPC=0]
Workflow summary = CPP:DBL+CXS:CURHST+RMBHST+MESHST/sse4+NOVBRK
FP precision = DOUBLE (NaN/abnormal=0, zero=0)
Internal loops fptype_sv = VECTOR[2] ('sse4': SSE4.2, 128bit) [cxtype_ref=YES]
EvtsPerSec[Rmb+ME] (23) = ( 1.543693e+06 ) sec^-1
EvtsPerSec[MatrixElems] (3) = ( 3.127267e+06 ) sec^-1
EvtsPerSec[MECalcOnly] (3a) = ( 3.127267e+06 ) sec^-1
Internal loops fptype_sv = VECTOR[2] ('sse4': SSE4.2, 128bit) [cxtype_ref=NO]
EvtsPerSec[Rmb+ME] (23) = ( 4.054897e+06 ) sec^-1
EvtsPerSec[MatrixElems] (3) = ( 7.894124e+06 ) sec^-1
EvtsPerSec[MECalcOnly] (3a) = ( 7.894124e+06 ) sec^-1
MeanMatrixElemValue = ( 1.371706e-02 +- 3.270315e-06 ) GeV^0
TOTAL : 4.559802 sec
12,002,164,935 cycles:u # 2.624 GHz
26,019,087,868 instructions:u # 2.17 insn per cycle
4.576347546 seconds time elapsed
=Symbols in CPPProcess.o= (~sse4: 1271) (avx2: 0) (512y: 0) (512z: 0)
TOTAL : 2.262850 sec
5,540,607,092 cycles:u # 2.476 GHz
12,493,297,981 instructions:u # 2.25 insn per cycle
2.280598954 seconds time elapsed
=Symbols in CPPProcess.o= (~sse4: 481) (avx2: 0) (512y: 0) (512z: 0)
-------------------------------------------------------------------------
runExe /data/avalassi/GPU2020/madgraph4gpuX/epochX/cudacpp/ee_mumu/SubProcesses/P1_Sigma_sm_epem_mupmum/build.sse4_d_inl0_hrd0/runTest.exe
[ PASSED ] 6 tests.
-------------------------------------------------------------------------
runExe /data/avalassi/GPU2020/madgraph4gpuX/epochX/cudacpp/ee_mumu/SubProcesses/P1_Sigma_sm_epem_mupmum/build.avx2_d_inl0_hrd0/check.exe -p 2048 256 12 OMP=
Process = SIGMA_SM_EPEM_MUPMUM_CPP [gcc 10.2.0] [inlineHel=0] [hardcodeCIPC=0]
Workflow summary = CPP:DBL+CXS:CURHST+RMBHST+MESHST/avx2+CXVBRK
Process = SIGMA_SM_EPEM_MUPMUM_CPP [icx 20210400 (clang 13.0.0, gcc 10.2.0)] [inlineHel=0] [hardcodeCIPC=0]
Workflow summary = CPP:DBL+CXS:CURHST+RMBHST+MESHST/avx2+NOVBRK
FP precision = DOUBLE (NaN/abnormal=0, zero=0)
Internal loops fptype_sv = VECTOR[4] ('avx2': AVX2, 256bit) [cxtype_ref=YES]
EvtsPerSec[Rmb+ME] (23) = ( 2.015033e+06 ) sec^-1
EvtsPerSec[MatrixElems] (3) = ( 5.536150e+06 ) sec^-1
EvtsPerSec[MECalcOnly] (3a) = ( 5.536150e+06 ) sec^-1
Internal loops fptype_sv = VECTOR[4] ('avx2': AVX2, 256bit) [cxtype_ref=NO]
EvtsPerSec[Rmb+ME] (23) = ( 4.952317e+06 ) sec^-1
EvtsPerSec[MatrixElems] (3) = ( 1.193943e+07 ) sec^-1
EvtsPerSec[MECalcOnly] (3a) = ( 1.193943e+07 ) sec^-1
MeanMatrixElemValue = ( 1.371706e-02 +- 3.270315e-06 ) GeV^0
TOTAL : 3.605390 sec
8,998,997,083 cycles:u # 2.487 GHz
15,411,588,737 instructions:u # 1.71 insn per cycle
3.621990875 seconds time elapsed
=Symbols in CPPProcess.o= (~sse4: 0) (avx2: 1047) (512y: 0) (512z: 0)
TOTAL : 1.990901 sec
4,605,821,165 cycles:u # 2.344 GHz
9,075,434,597 instructions:u # 1.97 insn per cycle
2.008830585 seconds time elapsed
=Symbols in CPPProcess.o= (~sse4: 0) (avx2: 402) (512y: 0) (512z: 0)
-------------------------------------------------------------------------
runExe /data/avalassi/GPU2020/madgraph4gpuX/epochX/cudacpp/ee_mumu/SubProcesses/P1_Sigma_sm_epem_mupmum/build.avx2_d_inl0_hrd0/runTest.exe
[ PASSED ] 6 tests.
-------------------------------------------------------------------------
runExe /data/avalassi/GPU2020/madgraph4gpuX/epochX/cudacpp/ee_mumu/SubProcesses/P1_Sigma_sm_epem_mupmum/build.512y_d_inl0_hrd0/check.exe -p 2048 256 12 OMP=
Process = SIGMA_SM_EPEM_MUPMUM_CPP [gcc 10.2.0] [inlineHel=0] [hardcodeCIPC=0]
Workflow summary = CPP:DBL+CXS:CURHST+RMBHST+MESHST/512y+CXVBRK
Process = SIGMA_SM_EPEM_MUPMUM_CPP [icx 20210400 (clang 13.0.0, gcc 10.2.0)] [inlineHel=0] [hardcodeCIPC=0]
Workflow summary = CPP:DBL+CXS:CURHST+RMBHST+MESHST/512y+NOVBRK
FP precision = DOUBLE (NaN/abnormal=0, zero=0)
Internal loops fptype_sv = VECTOR[4] ('512y': AVX512, 256bit) [cxtype_ref=YES]
EvtsPerSec[Rmb+ME] (23) = ( 2.057477e+06 ) sec^-1
EvtsPerSec[MatrixElems] (3) = ( 5.819569e+06 ) sec^-1
EvtsPerSec[MECalcOnly] (3a) = ( 5.819569e+06 ) sec^-1
Internal loops fptype_sv = VECTOR[4] ('512y': AVX512, 256bit) [cxtype_ref=NO]
EvtsPerSec[Rmb+ME] (23) = ( 4.949493e+06 ) sec^-1
EvtsPerSec[MatrixElems] (3) = ( 1.186647e+07 ) sec^-1
EvtsPerSec[MECalcOnly] (3a) = ( 1.186647e+07 ) sec^-1
MeanMatrixElemValue = ( 1.371706e-02 +- 3.270315e-06 ) GeV^0
TOTAL : 3.540015 sec
8,855,009,739 cycles:u # 2.492 GHz
15,336,091,071 instructions:u # 1.73 insn per cycle
3.556549810 seconds time elapsed
=Symbols in CPPProcess.o= (~sse4: 0) (avx2: 1021) (512y: 1) (512z: 0)
TOTAL : 1.984810 sec
4,605,315,754 cycles:u # 2.348 GHz
8,796,050,694 instructions:u # 1.91 insn per cycle
2.002812172 seconds time elapsed
=Symbols in CPPProcess.o= (~sse4: 0) (avx2: 387) (512y: 0) (512z: 0)
-------------------------------------------------------------------------
runExe /data/avalassi/GPU2020/madgraph4gpuX/epochX/cudacpp/ee_mumu/SubProcesses/P1_Sigma_sm_epem_mupmum/build.512y_d_inl0_hrd0/runTest.exe
[ PASSED ] 6 tests.
-------------------------------------------------------------------------
runExe /data/avalassi/GPU2020/madgraph4gpuX/epochX/cudacpp/ee_mumu/SubProcesses/P1_Sigma_sm_epem_mupmum/build.512z_d_inl0_hrd0/check.exe -p 2048 256 12 OMP=
Process = SIGMA_SM_EPEM_MUPMUM_CPP [gcc 10.2.0] [inlineHel=0] [hardcodeCIPC=0]
Workflow summary = CPP:DBL+CXS:CURHST+RMBHST+MESHST/512z+CXVBRK
Process = SIGMA_SM_EPEM_MUPMUM_CPP [icx 20210400 (clang 13.0.0, gcc 10.2.0)] [inlineHel=0] [hardcodeCIPC=0]
Workflow summary = CPP:DBL+CXS:CURHST+RMBHST+MESHST/512z+NOVBRK
FP precision = DOUBLE (NaN/abnormal=0, zero=0)
Internal loops fptype_sv = VECTOR[8] ('512z': AVX512, 512bit) [cxtype_ref=YES]
EvtsPerSec[Rmb+ME] (23) = ( 1.889569e+06 ) sec^-1
EvtsPerSec[MatrixElems] (3) = ( 4.646248e+06 ) sec^-1
EvtsPerSec[MECalcOnly] (3a) = ( 4.646248e+06 ) sec^-1
Internal loops fptype_sv = VECTOR[8] ('512z': AVX512, 512bit) [cxtype_ref=NO]
EvtsPerSec[Rmb+ME] (23) = ( 4.521361e+06 ) sec^-1
EvtsPerSec[MatrixElems] (3) = ( 9.372420e+06 ) sec^-1
EvtsPerSec[MECalcOnly] (3a) = ( 9.372420e+06 ) sec^-1
MeanMatrixElemValue = ( 1.371706e-02 +- 3.270315e-06 ) GeV^0
TOTAL : 3.815880 sec
8,530,639,662 cycles:u # 2.228 GHz
12,459,295,177 instructions:u # 1.46 insn per cycle
3.832491540 seconds time elapsed
=Symbols in CPPProcess.o= (~sse4: 0) (avx2: 241) (512y: 2) (512z: 787)
TOTAL : 2.114991 sec
4,419,722,407 cycles:u # 2.115 GHz
8,111,055,354 instructions:u # 1.84 insn per cycle
2.132862464 seconds time elapsed
=Symbols in CPPProcess.o= (~sse4: 0) (avx2: 173) (512y: 0) (512z: 206)
-------------------------------------------------------------------------
runExe /data/avalassi/GPU2020/madgraph4gpuX/epochX/cudacpp/ee_mumu/SubProcesses/P1_Sigma_sm_epem_mupmum/build.512z_d_inl0_hrd0/runTest.exe
[ PASSED ] 6 tests.