Unfortunately, the .cali format isn't the most efficient. You can try to speed things up by filtering, in particular by reducing the number of records the query has to look at. Assuming you want to sort by function start times, you can try this:
cali-query -q "select function,event.begin#mpi.function,mpi.rank,time.offset where event.begin#mpi.function format table order by time.offset"
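To make the effect of that query concrete, here is a small Python sketch of the same filter-then-sort logic applied to records that are already in memory. The record layout (a list of dicts keyed by attribute name), the function name `exchange_ghosts`, and the sample values are assumptions for illustration only; this is not Caliper's reader API.

```python
def mpi_begin_timeline(records):
    """Keep only MPI begin events and order them by local time offset."""
    key = "event.begin#mpi.function"
    # Filter first so the sort only touches the records we actually want.
    begins = [r for r in records if key in r]
    return sorted(begins, key=lambda r: r["time.offset"])


# Hypothetical sample records. A real trace also contains end events and
# function begin/end events, which the filter drops.
sample = [
    {"event.end#mpi.function": "MPI_Isend", "mpi.rank": 0, "time.offset": 130},
    {"event.begin#mpi.function": "MPI_Irecv", "mpi.rank": 1,
     "time.offset": 90, "function": "exchange_ghosts"},
    {"event.begin#function": "exchange_ghosts", "mpi.rank": 0,
     "time.offset": 100},
    {"event.begin#mpi.function": "MPI_Isend", "mpi.rank": 0,
     "time.offset": 120, "function": "exchange_ghosts"},
]

timeline = mpi_begin_timeline(sample)
for rec in timeline:
    print(rec["time.offset"], rec["mpi.rank"], rec["event.begin#mpi.function"])
```

Since only the begin events of the whitelisted MPI calls survive the filter, the sort works on a fraction of the original records.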
However, note that the event timestamps are local - they're set to 0 when each process starts, but MPI ranks may start at different times, so the timestamps will likely be shifted by some amount between MPI ranks. You can't really compare them between processes.
I have some tools for processing Caliper MPI traces, which among other things have a heuristic to compute global timestamps. Let me know if you're interested in those.
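To illustrate the shift problem, here is a rough Python sketch of one possible way to put per-rank timestamps on a common axis. It is not the heuristic from the tools mentioned above. It assumes every rank leaves its first MPI_Allreduce at roughly the same instant, which ignores network skew, and it uses a hypothetical in-memory layout of `(local_time, kind, name)` tuples per rank.

```python
def align_to_first_allreduce(events_by_rank):
    """Shift each rank's local timestamps so that the exit of its first
    MPI_Allreduce lands at the same instant on every rank.

    events_by_rank: {rank: [(local_time, kind, name), ...]} where kind is
    "begin" or "end". This layout is an assumption for illustration.
    """
    # Local exit time of the first MPI_Allreduce on each rank.
    # min() raises ValueError if a rank never called MPI_Allreduce.
    sync = {}
    for rank, events in events_by_rank.items():
        sync[rank] = min(
            t for t, kind, name in events
            if kind == "end" and name == "MPI_Allreduce"
        )

    # Use the latest sync time as the reference so every shift is >= 0.
    reference = max(sync.values())

    merged = []
    for rank, events in events_by_rank.items():
        shift = reference - sync[rank]
        merged.extend((t + shift, rank, kind, name) for t, kind, name in events)
    return sorted(merged)


# Hypothetical traces: rank 1 started later, so its local clock reads lower.
traces = {
    0: [(100, "begin", "MPI_Isend"), (110, "end", "MPI_Isend"),
        (200, "begin", "MPI_Allreduce"), (250, "end", "MPI_Allreduce")],
    1: [(70, "begin", "MPI_Irecv"), (75, "end", "MPI_Irecv"),
        (180, "begin", "MPI_Allreduce"), (210, "end", "MPI_Allreduce")],
}

global_timeline = align_to_first_allreduce(traces)
for entry in global_timeline:
    print(entry)
```

In this sample, rank 1 is shifted by 40 time units so that both ranks leave the collective at time 250. Only after a shift like this does sorting by timestamp across ranks give a meaningful order.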
Greetings,
We are trying to create a timeline of MPI function calls among 8 ranks that perform irregular communication in an AMR code, so the MPI whitelist consists of only MPI_Isend, MPI_Irecv, and MPI_Allreduce. Each rank generates a 45-megabyte .cali file. Our cali-query command to generate the timeline is the following:
cali-query -t -o query.out msgtrace* --sort-by=time.offset
The command starts but never completes, and it is killed after ~50 minutes. This happens even when we use multiple threads (specified with the --threads option) to process the multiple .cali files.
We have CALI_MARK_FUNCTION macros in every function that contains communication, to track the parent functions of the MPI function calls.
Our caliper configuration within the slurm job script is the following:
export CALI_SERVICES_ENABLE=event,recorder,timestamp,trace
export CALI_EVENT_TRIGGER=function,mpi.function
export CALI_TIMER_SNAPSHOT_DURATION=false
export CALI_TIMER_INCLUSIVE_DURATION=false
export CALI_TIMER_OFFSET=true
export CALI_MPI_MSG_TRACING=true
export CALI_MPI_WHITELIST=MPI_Isend,MPI_Irecv,MPI_Allreduce
export CALI_RECORDER_FILENAME=msgtrace-%mpi.rank%.cali
We are wondering whether this is the appropriate way to use Caliper to create a timeline table of MPI function calls within our code base. Any suggestions are welcome.
Thank you!