
Optimization failed with GPU memory allocation, but running takes only half of GPU memory #3

Open
hzhshok opened this issue Oct 11, 2022 · 1 comment

Comments


hzhshok commented Oct 11, 2022

Hello,
Thanks for your great work on nvdiffrec!

This project improves quality (better PSNR than nvdiffrec), but the run failed during the optimization phase.

Could you please suggest whether this is a bug or a misconfiguration on my part? While running, GPU usage is only about half of the total GPU memory, but the optimization step failed to allocate GPU memory.

GPU Hardware: RTX 3090 (24 GB)

GPU Running cost:
C:\Users\jinshui>nvidia-smi.exe
Tue Oct 11 10:59:11 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 512.77 Driver Version: 512.77 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... WDDM | 00000000:01:00.0 On | N/A |
| 58% 68C P2 161W / 350W | 12548MiB / 24576MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+

Image resolution: 2976 × 2976

Configuration file:
{
    "ref_mesh": "data/nerf_synthetic/xxx",
    "random_textures": true,
    "iter": 9000,
    "save_interval": 100,
    "texture_res": [ 2048, 2048 ],
    "train_res": [1408, 1408],
    "batch": 1,
    "learning_rate": [0.05, 0.003],
    "dmtet_grid" : 128,
    "mesh_scale" : 2.5,
    "validate" : true,
    "n_samples" : 10,
    "denoiser" : "bilateral",
    "laplace_scale" : 3000,
    "display": [{"latlong" : true}, {"bsdf" : "kd"}, {"bsdf" : "ks"}, {"bsdf" : "normal"}],
    "background" : "white",
    "transparency" : true,
    "out_dir": "nerf_xxx"
}

Console error:
Running validation
MSE, PSNR
0.00557304, 22.581
kd shape torch.Size([1, 2048, 2048, 4])
Cuda path C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.4
End of OptiXStateWrapper
Base mesh has 75052 triangles and 37853 vertices.
Avg edge length: 0.023254
OptiXStateWrapper destructor
Writing mesh: out/nerf_jianlong\dmtet_mesh/mesh.obj
writing 37853 vertices
writing 67605 texcoords
writing 37853 normals
writing 75052 faces
Writing material: out/nerf_jianlong\dmtet_mesh/mesh.mtl
Done exporting mesh
Traceback (most recent call last):
File "D:\zhansheng\proj\windows\3d\nvdiffrecmc\train.py", line 665, in <module>
geometry, mat = optimize_mesh(denoiser, glctx, glctx_display, geometry, mat, lgt, dataset_train, dataset_validate, FLAGS,
File "D:\zhansheng\proj\windows\3d\nvdiffrecmc\train.py", line 424, in optimize_mesh
img_loss, reg_loss = geometry.tick(glctx, target, lgt, opt_material, image_loss_fn, it, FLAGS, denoiser)
File "D:\zhansheng\proj\windows\3d\nvdiffrecmc\geometry\dlmesh.py", line 65, in tick
buffers = render.render_mesh(FLAGS, glctx, opt_mesh, target['mvp'], target['campos'], target['light'] if lgt is None else lgt, target['resolution'],
File "D:\zhansheng\proj\windows\3d\nvdiffrecmc\render\render.py", line 327, in render_mesh
accum = composite_buffer(key, layers, torch.zeros_like(layers[0][0][key]), True)
File "D:\zhansheng\proj\windows\3d\nvdiffrecmc\render\render.py", line 290, in composite_buffer
accum = dr.antialias(accum.contiguous(), rast, v_pos_clip, mesh.t_pos_idx.int())
File "C:\Users\jinshui\anaconda3\envs\dmodel\lib\site-packages\nvdiffrast\torch\ops.py", line 702, in antialias
return _antialias_func.apply(color, rast, pos, tri, topology_hash, pos_gradient_boost)
File "C:\Users\jinshui\anaconda3\envs\dmodel\lib\site-packages\nvdiffrast\torch\ops.py", line 650, in forward
out, work_buffer = _get_plugin().antialias_fwd(color, rast, pos, tri, topology_hash)
RuntimeError: CUDA out of memory. Tried to allocate 62.00 MiB (GPU 0; 24.00 GiB total capacity; 19.61 GiB already allocated; 0 bytes free; 20.20 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
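The error message above suggests trying max_split_size_mb. A minimal, untested sketch of how that could be set on Windows before launching training (128 is an arbitrary starting value, and the config filename is a placeholder, not something confirmed in this thread):

```shell
REM Hedged sketch: cap the caching allocator's split block size to reduce
REM fragmentation, since the error shows reserved (20.20 GiB) well above
REM allocated (19.61 GiB). Must be set before the training process starts.
set PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
python train.py --config configs/nerf_xxx.json
```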

Regards
Zhansheng

@jmunkberg
Collaborator

Thanks @hzhshok ,

Yes, you are running out of memory. Try nvidia-smi --query-gpu=memory.used --format=csv -lms 100 in a second prompt to log memory usage more frequently.

On a 24 GB GPU, you likely need to reduce "train_res" further.
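To see why lowering "train_res" helps: framebuffer-sized tensors grow quadratically with resolution. A rough back-of-the-envelope sketch (assuming float32 RGBA buffers; the actual number of per-pixel buffers the pipeline holds is not modeled here):

```python
def buffer_mib(res: int, channels: int = 4, bytes_per_elem: int = 4) -> float:
    """Approximate size in MiB of one res x res buffer with the given
    channel count and element size (defaults: RGBA float32)."""
    return res * res * channels * bytes_per_elem / 2**20

# One 1408x1408 RGBA float32 buffer is ~30 MiB, the same order of magnitude
# as the 62 MiB allocation that failed above; at 1024x1024 the same buffer
# shrinks to 16 MiB, and every per-pixel buffer in the pipeline shrinks
# with it.
print(f"{buffer_mib(1408):.1f} MiB at 1408x1408")
print(f"{buffer_mib(1024):.1f} MiB at 1024x1024")
```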
