LOG:
Version 4.1.0 versus 4.0.0
-Features:
1- Supporting L1 write-allocate with a sub-sector writing policy, as in Volta+ hardware, and changing the Volta+ card configs to use L1 write-allocate with write-through
2- Making the L1 adaptive cache policy configurable
3- Adding Ampere RTX 3060 config files
-Bugs:
1- Fixing L1 bank hash function bug
2- Fixing L1 read hit counters in gpgpu-sim to match nvprof, achieving more accurate L1 correlation with the hardware
3- Fixing bugs in lazy write handling; thanks to Gwendolyn Voskuilen from Sandia Labs for this fix
4- Fixing the backend pipeline for sub_core model
5- Fixing a memory stomp bug in the shader_config
6- Some code refactoring
Version 4.0.0 (development branch) versus 3.2.3
-Front-End:
1- Support .nc cache modifier and __ldg function to access the read-only L1D cache
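As an illustrative sketch of the feature above (the kernel and all names here are hypothetical, not taken from the GPGPU-Sim sources): marking a pointer `const __restrict__` lets the compiler route loads through the read-only L1D cache, and `__ldg` requests that path explicitly, which the compiler lowers to an `ld.global.nc` instruction:

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: scales an input array, reading the input through
// the read-only L1D cache. const + __restrict__ tell the compiler the
// data is read-only for the kernel's lifetime; __ldg makes the
// non-coherent (.nc) load explicit.
__global__ void scale(const float* __restrict__ in,
                      float* __restrict__ out,
                      float k, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = k * __ldg(&in[i]);  // emitted as ld.global.nc
}
```

Running such a kernel under the simulator exercises the read-only L1D path that this changelog entry adds support for.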
Christopher Ng, Negar Goli, Matthew D. Sinclair, Timothy G. Rogers, Tor M. Aamodt
Analyzing Machine Learning Workloads Using a Detailed GPU Simulator, arXiv:1811.08933,
https://arxiv.org/abs/1811.08933
If you use the Tensor Core model in GPGPU-Sim or GPGPU-Sim's CUTLASS Library
for your research please cite:
The documentation resides at doc/doxygen/html.
To run Pytorch applications with the simulator, install the modified Pytorch library as well by following instructions [here](https://github.com/gpgpu-sim/pytorch-gpgpu-sim).
## Step 3: Run
Before we run, we need to make sure the application's executable is dynamically linked to the CUDA runtime library. This can be done during compilation by passing the nvcc flag "--cudart shared" in your Makefile (without the quotes).
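As a minimal sketch (the source and executable names `vectoradd.cu` / `vectoradd` are hypothetical), the flag can also be passed directly on the nvcc command line, and `ldd` can confirm the dynamic link:

```shell
# Compile with the CUDA runtime dynamically linked
# (nvcc links cudart statically by default)
nvcc --cudart shared -o vectoradd vectoradd.cu

# Verify: libcudart.so should appear among the dynamic dependencies
ldd vectoradd | grep libcudart
```

If `libcudart` does not appear in the `ldd` output, the simulator's interposed runtime library will not be picked up when the application runs.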