Describe the Bug
When running comm_replay on ET traces, I get the following error on every rank (output from ranks 0 and 56 shown):
$ comm_replay --enable-profiler --trace-type et --trace-path /workspace/traces --num-replays 1
0: [rank0]: Traceback (most recent call last):
0: [rank0]: File "/usr/local/bin/comm_replay", line 8, in <module>
0: [rank0]: sys.exit(main())
0: [rank0]: File "/usr/local/lib/python3.10/dist-packages/et_replay/tools/comm_replay.py", line 1671, in main
0: [rank0]: traceBench.runBench(commsParams)
0: [rank0]: File "/usr/local/lib/python3.10/dist-packages/et_replay/tools/comm_replay.py", line 1324, in runBench
0: [rank0]: self.benchTime(commsParams)
0: [rank0]: File "/usr/local/lib/python3.10/dist-packages/et_replay/tools/comm_replay.py", line 1236, in benchTime
0: [rank0]: self.replayTrace(commsParams=commsParams, warmup=True)
0: [rank0]: File "/usr/local/lib/python3.10/dist-packages/et_replay/tools/comm_replay.py", line 1063, in replayTrace
0: [rank0]: (latency, global_latency) = self.runComms(
0: [rank0]: File "/usr/local/lib/python3.10/dist-packages/et_replay/tools/comm_replay.py", line 820, in runComms
0: [rank0]: self.collectiveArgs.waitObjIds[curComm.req] = retObj
0: [rank0]: TypeError: unhashable type: 'list'
56: [rank56]: Traceback (most recent call last):
56: [rank56]: File "/usr/local/bin/comm_replay", line 8, in <module>
56: [rank56]: sys.exit(main())
56: [rank56]: File "/usr/local/lib/python3.10/dist-packages/et_replay/tools/comm_replay.py", line 1671, in main
56: [rank56]: traceBench.runBench(commsParams)
56: [rank56]: File "/usr/local/lib/python3.10/dist-packages/et_replay/tools/comm_replay.py", line 1324, in runBench
56: [rank56]: self.benchTime(commsParams)
56: [rank56]: File "/usr/local/lib/python3.10/dist-packages/et_replay/tools/comm_replay.py", line 1236, in benchTime
56: [rank56]: self.replayTrace(commsParams=commsParams, warmup=True)
56: [rank56]: File "/usr/local/lib/python3.10/dist-packages/et_replay/tools/comm_replay.py", line 1063, in replayTrace
56: [rank56]: (latency, global_latency) = self.runComms(
56: [rank56]: File "/usr/local/lib/python3.10/dist-packages/et_replay/tools/comm_replay.py", line 820, in runComms
56: [rank56]: self.collectiveArgs.waitObjIds[curComm.req] = retObj
56: [rank56]: TypeError: unhashable type: 'list'
The Chakra schema version is 1.1.1-chakra.0.0.4.
I've tried both param@main and param@7b19f58, as the Chakra user guide recommends, and get the same error with each.
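For reference, the `TypeError` itself can be reproduced outside comm_replay. The sketch below assumes that `curComm.req` arrives from the trace as a list rather than a scalar; I have not confirmed what the field actually contains in these traces. The dict `wait_obj_ids`, the example value of `req`, and the helper `to_hashable_key` are hypothetical names used only for illustration and are not part of et_replay.

```python
def to_hashable_key(req):
    """Hypothetical helper: turn a (possibly nested) list into a tuple
    so it can be used as a dict key. Other values pass through unchanged."""
    if isinstance(req, list):
        return tuple(to_hashable_key(item) for item in req)
    return req


# Stand-in for collectiveArgs.waitObjIds (assumed to be a plain dict).
wait_obj_ids = {}

# Assumed shape of the request id; the real trace value may differ.
req = [12, True]

# Using a list as a dict key raises the same TypeError as in the traceback.
try:
    wait_obj_ids[req] = "work-handle"
    raised = False
except TypeError as exc:
    raised = True
    message = str(exc)

# Converting the list to a tuple first makes the assignment succeed.
wait_obj_ids[to_hashable_key(req)] = "work-handle"
```

If the assumption holds, any later lookup in the same dict would need the same conversion so that the keys match. Whether a tuple key is the right fix, or whether the trace parser should produce a scalar `req` in the first place, is a question for the maintainers.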