Add function for recursively printing parameter memory #2560

Open
wants to merge 1 commit into base: master

charleskawczynski
Contributor

As Clima has developed increasingly complex models and fused increasingly complex broadcast expressions, we've been running into parameter memory issues more frequently.

One issue I have with the existing printed message is that it is not granular enough for large objects: it does not show which parts of the object account for the memory.

This PR implements a recursive print function/macro @rprint_parameter_memory(some_object) that users can call (and build options around) to print parameter memory usage field by field. For example, here is the output for an object fmb (I've tentatively implemented this in MultiBroadcastFusion):

fmb
size: 72, fmb.pairs::Tuple{…}
size: 16, fmb.pairs.1::Pair{…}
size: 64, fmb.pairs.1.first::CUDA.CuArray{…}
size: 16, fmb.pairs.1.first.data::GPUArrays.DataRef{…}
size: 24, fmb.pairs.1.first.data.rc::GPUArrays.RefCounted{…}
size: 64, fmb.pairs.1.first.data.rc.obj::CUDA.Managed{…}
size: 48, fmb.pairs.1.first.data.rc.obj.mem::CUDA.DeviceMemory
size: 16, fmb.pairs.1.first.data.rc.obj.mem.ctx::CUDA.CuContext
size: 40, fmb.pairs.1.first.data.rc.obj.stream::CUDA.CuStream
size: 16, fmb.pairs.1.first.data.rc.obj.stream.ctx::CUDA.CuContext
size: 40, fmb.pairs.1.first.dims::NTuple{…}
size: 64, fmb.pairs.1.second::Base.Broadcast.Broadcasted{…}
size: 64, fmb.pairs.1.second.args::Tuple{…}
size: 64, fmb.pairs.1.second.args.1::CUDA.CuArray{…}
size: 16, fmb.pairs.1.second.args.1.data::GPUArrays.DataRef{…}
size: 24, fmb.pairs.1.second.args.1.data.rc::GPUArrays.RefCounted{…}
size: 64, fmb.pairs.1.second.args.1.data.rc.obj::CUDA.Managed{…}
size: 48, fmb.pairs.1.second.args.1.data.rc.obj.mem::CUDA.DeviceMemory
size: 16, fmb.pairs.1.second.args.1.data.rc.obj.mem.ctx::CUDA.CuContext
size: 40, fmb.pairs.1.second.args.1.data.rc.obj.stream::CUDA.CuStream
size: 16, fmb.pairs.1.second.args.1.data.rc.obj.stream.ctx::CUDA.CuContext
size: 40, fmb.pairs.1.second.args.1.dims::NTuple{…}
size: 24, fmb.pairs.2::Pair{…}
size: 64, fmb.pairs.2.first::CUDA.CuArray{…}
size: 16, fmb.pairs.2.first.data::GPUArrays.DataRef{…}
size: 24, fmb.pairs.2.first.data.rc::GPUArrays.RefCounted{…}
size: 64, fmb.pairs.2.first.data.rc.obj::CUDA.Managed{…}
size: 48, fmb.pairs.2.first.data.rc.obj.mem::CUDA.DeviceMemory
size: 16, fmb.pairs.2.first.data.rc.obj.mem.ctx::CUDA.CuContext
size: 40, fmb.pairs.2.first.data.rc.obj.stream::CUDA.CuStream
size: 16, fmb.pairs.2.first.data.rc.obj.stream.ctx::CUDA.CuContext
size: 40, fmb.pairs.2.first.dims::NTuple{…}
size: 128, fmb.pairs.2.second::Base.Broadcast.Broadcasted{…}
size: 128, fmb.pairs.2.second.args::Tuple{…}
size: 64, fmb.pairs.2.second.args.1::CUDA.CuArray{…}
size: 16, fmb.pairs.2.second.args.1.data::GPUArrays.DataRef{…}
size: 24, fmb.pairs.2.second.args.1.data.rc::GPUArrays.RefCounted{…}
size: 64, fmb.pairs.2.second.args.1.data.rc.obj::CUDA.Managed{…}
size: 48, fmb.pairs.2.second.args.1.data.rc.obj.mem::CUDA.DeviceMemory
size: 16, fmb.pairs.2.second.args.1.data.rc.obj.mem.ctx::CUDA.CuContext
size: 40, fmb.pairs.2.second.args.1.data.rc.obj.stream::CUDA.CuStream
size: 16, fmb.pairs.2.second.args.1.data.rc.obj.stream.ctx::CUDA.CuContext
size: 40, fmb.pairs.2.second.args.1.dims::NTuple{…}
size: 64, fmb.pairs.2.second.args.2::CUDA.CuArray{…}
size: 16, fmb.pairs.2.second.args.2.data::GPUArrays.DataRef{…}
size: 24, fmb.pairs.2.second.args.2.data.rc::GPUArrays.RefCounted{…}
size: 64, fmb.pairs.2.second.args.2.data.rc.obj::CUDA.Managed{…}
size: 48, fmb.pairs.2.second.args.2.data.rc.obj.mem::CUDA.DeviceMemory
size: 16, fmb.pairs.2.second.args.2.data.rc.obj.mem.ctx::CUDA.CuContext
size: 40, fmb.pairs.2.second.args.2.data.rc.obj.stream::CUDA.CuStream
size: 16, fmb.pairs.2.second.args.2.data.rc.obj.stream.ctx::CUDA.CuContext
size: 40, fmb.pairs.2.second.args.2.dims::NTuple{…}
size: 32, fmb.pairs.3::Pair{…}
size: 64, fmb.pairs.3.first::CUDA.CuArray{…}
size: 16, fmb.pairs.3.first.data::GPUArrays.DataRef{…}
size: 24, fmb.pairs.3.first.data.rc::GPUArrays.RefCounted{…}
size: 64, fmb.pairs.3.first.data.rc.obj::CUDA.Managed{…}
size: 48, fmb.pairs.3.first.data.rc.obj.mem::CUDA.DeviceMemory
size: 16, fmb.pairs.3.first.data.rc.obj.mem.ctx::CUDA.CuContext
size: 40, fmb.pairs.3.first.data.rc.obj.stream::CUDA.CuStream
size: 16, fmb.pairs.3.first.data.rc.obj.stream.ctx::CUDA.CuContext
size: 40, fmb.pairs.3.first.dims::NTuple{…}
size: 192, fmb.pairs.3.second::Base.Broadcast.Broadcasted{…}
size: 192, fmb.pairs.3.second.args::Tuple{…}
size: 64, fmb.pairs.3.second.args.1::CUDA.CuArray{…}
size: 16, fmb.pairs.3.second.args.1.data::GPUArrays.DataRef{…}
size: 24, fmb.pairs.3.second.args.1.data.rc::GPUArrays.RefCounted{…}
size: 64, fmb.pairs.3.second.args.1.data.rc.obj::CUDA.Managed{…}
size: 48, fmb.pairs.3.second.args.1.data.rc.obj.mem::CUDA.DeviceMemory
size: 16, fmb.pairs.3.second.args.1.data.rc.obj.mem.ctx::CUDA.CuContext
size: 40, fmb.pairs.3.second.args.1.data.rc.obj.stream::CUDA.CuStream
size: 16, fmb.pairs.3.second.args.1.data.rc.obj.stream.ctx::CUDA.CuContext
size: 40, fmb.pairs.3.second.args.1.dims::NTuple{…}
size: 64, fmb.pairs.3.second.args.2::CUDA.CuArray{…}
size: 16, fmb.pairs.3.second.args.2.data::GPUArrays.DataRef{…}
size: 24, fmb.pairs.3.second.args.2.data.rc::GPUArrays.RefCounted{…}
size: 64, fmb.pairs.3.second.args.2.data.rc.obj::CUDA.Managed{…}
size: 48, fmb.pairs.3.second.args.2.data.rc.obj.mem::CUDA.DeviceMemory
size: 16, fmb.pairs.3.second.args.2.data.rc.obj.mem.ctx::CUDA.CuContext
size: 40, fmb.pairs.3.second.args.2.data.rc.obj.stream::CUDA.CuStream
size: 16, fmb.pairs.3.second.args.2.data.rc.obj.stream.ctx::CUDA.CuContext
size: 40, fmb.pairs.3.second.args.2.dims::NTuple{…}
size: 64, fmb.pairs.3.second.args.3::CUDA.CuArray{…}
size: 16, fmb.pairs.3.second.args.3.data::GPUArrays.DataRef{…}
size: 24, fmb.pairs.3.second.args.3.data.rc::GPUArrays.RefCounted{…}
size: 64, fmb.pairs.3.second.args.3.data.rc.obj::CUDA.Managed{…}
size: 48, fmb.pairs.3.second.args.3.data.rc.obj.mem::CUDA.DeviceMemory
size: 16, fmb.pairs.3.second.args.3.data.rc.obj.mem.ctx::CUDA.CuContext
size: 40, fmb.pairs.3.second.args.3.data.rc.obj.stream::CUDA.CuStream
size: 16, fmb.pairs.3.second.args.3.data.rc.obj.stream.ctx::CUDA.CuContext
size: 40, fmb.pairs.3.second.args.3.dims::NTuple{…}
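
For readers who want a feel for how output like the listing above could be produced, here is a minimal sketch of a recursive field walk. It is not the implementation in this PR: the names `rprint_sizes`, `_rprint_fields`, and `@rprint_sizes` are hypothetical, and it assumes that `sizeof` is an acceptable stand-in for the reported size of each field. It uses only Base reflection (`nfields`, `fieldname`, `getfield`, `sizeof`).

```julia
# Hypothetical sketch, NOT the PR's @rprint_parameter_memory implementation.
# Walks an object's fields recursively and prints `sizeof` for each one.
# Assumptions: `sizeof` is defined for every field value that is reached,
# and the object graph has no cycles (there is no cycle detection here).

function rprint_sizes(io::IO, obj, name::AbstractString = "obj")
    println(io, name)              # header line, e.g. "fmb"
    _rprint_fields(io, obj, name)
    return nothing
end

function _rprint_fields(io::IO, obj, prefix::AbstractString)
    T = typeof(obj)
    for i in 1:nfields(obj)
        isdefined(obj, i) || continue          # skip uninitialized fields
        child = getfield(obj, i)
        # For tuples `fieldname` returns the index, so paths look like `x.1`.
        path = string(prefix, ".", fieldname(T, i))
        println(io, "size: ", sizeof(child), ", ", path, "::", nameof(typeof(child)))
        _rprint_fields(io, child, path)        # recurse into the child
    end
    return nothing
end

# Hypothetical convenience macro: uses the expression text as the root name.
macro rprint_sizes(ex)
    return :(rprint_sizes(stdout, $(esc(ex)), $(string(ex))))
end

# Example on a small isbits object.
params = (a = 1.0, b = (2.0, 3.0))
rprint_sizes(stdout, params, "params")
# prints:
# params
# size: 8, params.a::Float64
# size: 16, params.b::Tuple
# size: 8, params.b.1::Float64
# size: 8, params.b.2::Float64
```

With the macro, `@rprint_sizes params` prints the same listing without passing the name by hand. This sketch prints every field, including zero-size ones, and prints only the type name without parameters; a real tool may want to filter or format differently.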

I'm cc-ing some people who may also be interested in this: @glwagner @simonbyrne @simone-silvestri

@charleskawczynski
Contributor Author

I'm open to changing the format, but I do think the current one is simple and explicit.
