Add a stack to the statistics resource #1563

Merged
Changes from 10 commits
Commits
50 commits
6cf4ee7
use std::shared_mutex
madsbk May 16, 2024
d13ee6b
clean up
madsbk May 16, 2024
25ff814
impl. push_counters() and pop_counters()
madsbk May 16, 2024
b453393
python bindings
madsbk May 16, 2024
1f7daa1
python tests
madsbk May 16, 2024
d6fd147
doc
madsbk May 17, 2024
fc49fe9
test_statistics
madsbk May 17, 2024
b9d57db
current allocation statistics
madsbk May 17, 2024
0e2f19c
clean up
madsbk May 17, 2024
7f7f940
Context to enable allocation statistics
madsbk May 17, 2024
68fcd08
Apply suggestions from code review
madsbk May 23, 2024
0254b73
doc
madsbk May 23, 2024
e6dd682
add_counters_from_tracked_sub_block
madsbk May 23, 2024
bf49dab
c++ tests
madsbk May 23, 2024
aef3b9f
Merge branch 'branch-24.06' into statistics_resource_counters_stack
madsbk May 23, 2024
145cb96
Merge branch 'branch-24.06' of github.com:rapidsai/rmm into statistic…
madsbk May 24, 2024
9bd1c2e
Merge branch 'branch-24.08' of github.com:rapidsai/rmm into statistic…
madsbk May 24, 2024
cef74e5
Merge branch 'branch-24.08' of github.com:rapidsai/rmm into statistic…
madsbk May 27, 2024
04b39cb
use dataclass Statistics
madsbk May 27, 2024
1dbd022
memory profiler
madsbk May 27, 2024
badbb56
clean up
madsbk May 27, 2024
6f35d23
fix typo
madsbk May 27, 2024
d8ee633
descriptive name
madsbk May 27, 2024
2df77d1
default_profiler_records
madsbk May 27, 2024
89827ad
tracking_resource_adaptor: use std::shared_mutex
madsbk May 28, 2024
24157b5
fix pytorch test
madsbk May 28, 2024
8df2d0a
pretty_print: added memory units
madsbk May 28, 2024
a82a7b6
doc
madsbk May 28, 2024
2b4b7d3
profiler: accept name argument
madsbk May 28, 2024
0067c05
profiler: now also a context manager
madsbk May 28, 2024
ab97d2a
cleanup
madsbk May 28, 2024
189ca30
pretty_print: output format
madsbk May 28, 2024
6debd83
fix doc build
madsbk May 28, 2024
796e159
Apply suggestions from code review
madsbk May 29, 2024
d2d64a1
style clean up
madsbk May 29, 2024
394d39f
doc
madsbk May 29, 2024
499c173
rename Data => MemoryRecord
madsbk May 29, 2024
463172d
rename pretty_print => report
madsbk May 29, 2024
c11b1c5
ruff check --fix --select D400
madsbk May 29, 2024
3d929d6
report: style
madsbk May 29, 2024
62a3870
doc
madsbk May 29, 2024
8d71415
spelling
madsbk May 30, 2024
8b8176b
Merge branch 'branch-24.08' of github.com:rapidsai/rmm into statistic…
madsbk May 30, 2024
a230794
style
madsbk May 30, 2024
17d9fd9
doc
madsbk May 30, 2024
23eb075
doc
madsbk Jun 4, 2024
9e92414
doc
madsbk Jun 4, 2024
42fb6c7
Merge branch 'branch-24.08' of github.com:rapidsai/rmm into statistic…
madsbk Jun 5, 2024
8b52c83
Update python/rmm/docs/guide.md
madsbk Jun 6, 2024
0b59246
Merge branch 'branch-24.08' into statistics_resource_counters_stack
madsbk Jun 6, 2024
86 changes: 71 additions & 15 deletions include/rmm/mr/device/statistics_resource_adaptor.hpp
@@ -21,6 +21,7 @@
#include <cstddef>
#include <mutex>
#include <shared_mutex>
#include <stack>

namespace rmm::mr {
/**
@@ -36,20 +37,23 @@ namespace rmm::mr {
* resource in order to satisfy allocation requests, but any existing
* allocations will be untracked. Tracking statistics stores the current, peak
* and total memory allocations for both the number of bytes and number of calls
* to the memory resource. `statistics_resource_adaptor` is intended as a debug
* adaptor and shouldn't be used in performance-sensitive code.
* to the memory resource.
* A stack of counters is maintained. Use `.push_counters()` and `.pop_counters()`
* to track statistics at different nesting levels.
*
* `statistics_resource_adaptor` is intended as a debug adaptor and shouldn't be
* used in performance-sensitive code.
*
* @tparam Upstream Type of the upstream resource used for
* allocation/deallocation.
*/
template <typename Upstream>
class statistics_resource_adaptor final : public device_memory_resource {
public:
// can be a std::shared_mutex once C++17 is adopted
using read_lock_t =
std::shared_lock<std::shared_timed_mutex>; ///< Type of lock used to synchronize read access
std::shared_lock<std::shared_mutex>; ///< Type of lock used to synchronize read access
using write_lock_t =
std::unique_lock<std::shared_timed_mutex>; ///< Type of lock used to synchronize write access
std::unique_lock<std::shared_mutex>; ///< Type of lock used to synchronize write access
/**
* @brief Utility struct for counting the current, peak, and total value of a number
*/
@@ -83,6 +87,23 @@ class statistics_resource_adaptor final : public device_memory_resource {
value -= val;
return *this;
}

/**
* @brief Add `val` to the current value and update the peak value if necessary
*
* @note When updating the peak value, we assume that `val` is the inner counter of
* `this` on the counter stack so its peak value becomes `this->value + val.peak`.
*
* @param val Value to add
* @return Reference to this object
*/
counter& operator+=(const counter& val)
{
peak = std::max(value + val.peak, peak);
value += val.value;
total += val.total;
return *this;
}
};

/**
@@ -96,6 +117,8 @@ class statistics_resource_adaptor final : public device_memory_resource {
statistics_resource_adaptor(Upstream* upstream) : upstream_{upstream}
{
RMM_EXPECTS(nullptr != upstream, "Unexpected null upstream resource pointer.");
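// Initially, we push a single counter pair on the stack. This is done directly
// rather than via push_counters(), which reads top() and is undefined on an
// empty stack.
counter_stack_.push(std::make_pair(counter{}, counter{}));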
}

statistics_resource_adaptor() = delete;
@@ -131,7 +154,7 @@ class statistics_resource_adaptor final : public device_memory_resource {
{
read_lock_t lock(mtx_);

return bytes_;
return counter_stack_.top().first;
}

/**
@@ -145,7 +168,40 @@ class statistics_resource_adaptor final : public device_memory_resource {
{
read_lock_t lock(mtx_);

return allocations_;
return counter_stack_.top().second;
}

/**
* @brief Push a pair of zero counters on the stack, which becomes the new
* counters returned by `get_bytes_counter()` and `get_allocations_counter()`
*
* @return pair of counters <bytes, allocations> from the stack _before_ the push
*/
std::pair<counter, counter> push_counters()
{
write_lock_t lock(mtx_);
// auto [bytes, allocations] = counter_stack_.top();
// bytes.
auto ret = counter_stack_.top();
counter_stack_.push(std::make_pair(counter{}, counter{}));
return ret;
}

/**
* @brief Pop a pair of counters from the stack
*
* @return pair of counters <bytes, allocations> from the stack _before_ the pop
*/
std::pair<counter, counter> pop_counters()
{
write_lock_t lock(mtx_);
if (counter_stack_.size() < 2) { throw std::out_of_range("cannot pop the last counter pair"); }
auto ret = counter_stack_.top();
counter_stack_.pop();
// The new top inherits the statistics
counter_stack_.top().first += ret.first;
counter_stack_.top().second += ret.second;
return ret;
}

private:
@@ -171,8 +227,8 @@ class statistics_resource_adaptor final : public device_memory_resource {
write_lock_t lock(mtx_);

// Increment the allocation_count_ while we have the lock
bytes_ += bytes;
allocations_ += 1;
counter_stack_.top().first += bytes;
counter_stack_.top().second += 1;
}

return ptr;
@@ -193,8 +249,8 @@ class statistics_resource_adaptor final : public device_memory_resource {
write_lock_t lock(mtx_);

// Decrement the current allocated counts.
bytes_ -= bytes;
allocations_ -= 1;
counter_stack_.top().first -= bytes;
counter_stack_.top().second -= 1;
}
}

@@ -213,10 +269,10 @@ class statistics_resource_adaptor final : public device_memory_resource {
return get_upstream_resource() == cast->get_upstream_resource();
}

counter bytes_; // peak, current and total allocated bytes
counter allocations_; // peak, current and total allocation count
std::shared_timed_mutex mutable mtx_; // mutex for thread safe access to allocations_
Upstream* upstream_; // the upstream resource used for satisfying allocation requests
// Stack of counter pairs <bytes, allocations>
std::stack<std::pair<counter, counter>> counter_stack_;
std::shared_mutex mutable mtx_; // mutex for thread safe access to counter_stack_
Upstream* upstream_; // the upstream resource used for satisfying allocation requests
};

/**
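The peak-inheritance rule in `operator+=(const counter&)` is easiest to check with concrete numbers. Below is a minimal pure-Python model of one counter and the counter stack. It is a sketch for reasoning about the semantics, not the RMM implementation: `Counter`, `CounterStack`, and their method names are made up for illustration, only the byte counter is modeled, and the behavior of `add` (current and total grow, peak is refreshed) is an assumption about the part of `operator+=(int64_t)` that this diff does not show.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass
class Counter:
    """Illustrative stand-in for the C++ `counter` struct."""

    value: int = 0
    peak: int = 0
    total: int = 0

    def add(self, n: int) -> None:
        # Assumed behavior of `operator+=(int64_t)`.
        self.value += n
        self.total += n
        self.peak = max(self.value, self.peak)

    def sub(self, n: int) -> None:
        # Mirrors `operator-=(int64_t)`: only the current value shrinks.
        self.value -= n

    def merge_inner(self, inner: "Counter") -> None:
        # Mirrors `operator+=(const counter&)`: the inner peak is measured
        # on top of what this (outer) level currently holds.
        self.peak = max(self.value + inner.peak, self.peak)
        self.value += inner.value
        self.total += inner.total


class CounterStack:
    """Illustrative model of `counter_stack_` for a single counter."""

    def __init__(self) -> None:
        self._stack: List[Counter] = [Counter()]

    def top(self) -> Counter:
        return self._stack[-1]

    def push(self) -> Counter:
        before = replace(self.top())  # copy of the counters before the push
        self._stack.append(Counter())
        return before

    def pop(self) -> Counter:
        if len(self._stack) < 2:
            raise IndexError("cannot pop the last counter")
        inner = self._stack.pop()
        self.top().merge_inner(inner)  # the new top inherits the statistics
        return inner
```

Under these assumptions, with 100 bytes live at the outer level, an inner level that allocates 50, frees 50, then allocates 30 is popped with value 30, peak 50 and total 80, and the outer level ends up with value 130, peak 150 and total 180.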
57 changes: 47 additions & 10 deletions python/rmm/rmm/_lib/memory_resource.pyx
@@ -1,4 +1,4 @@
# Copyright (c) 2020-2022, NVIDIA CORPORATION.
# Copyright (c) 2020-2024, NVIDIA CORPORATION.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -177,20 +177,20 @@ cdef extern from "rmm/mr/device/logging_resource_adaptor.hpp" \

cdef extern from "rmm/mr/device/statistics_resource_adaptor.hpp" \
namespace "rmm::mr" nogil:
cdef cppclass statistics_resource_adaptor[Upstream](
device_memory_resource):
cdef cppclass statistics_resource_adaptor[Upstream](device_memory_resource):
struct counter:
counter()

int64_t value
int64_t peak
int64_t total

statistics_resource_adaptor(
Upstream* upstream_mr) except +
statistics_resource_adaptor(Upstream* upstream_mr) except +

counter get_bytes_counter() except +
counter get_allocations_counter() except +
pair[counter, counter] pop_counters() except +
pair[counter, counter] push_counters() except +

cdef extern from "rmm/mr/device/tracking_resource_adaptor.hpp" \
namespace "rmm::mr" nogil:
@@ -793,6 +793,9 @@ cdef class StatisticsResourceAdaptor(UpstreamResourceAdaptor):
allocations/deallocations performed by an upstream memory resource.
Includes the ability to query these statistics at any time.

A stack of counters is maintained. Use `.push_counters()` and
`.pop_counters()` to track statistics at different nesting levels.

Parameters
----------
upstream : DeviceMemoryResource
@@ -812,12 +815,11 @@ cdef class StatisticsResourceAdaptor(UpstreamResourceAdaptor):
Returns:
dict: Dictionary containing allocation counts and bytes.
"""
cdef statistics_resource_adaptor[device_memory_resource]* mr = \
<statistics_resource_adaptor[device_memory_resource]*> self.c_obj.get()

counts = (<statistics_resource_adaptor[device_memory_resource]*>(
self.c_obj.get()))[0].get_allocations_counter()
byte_counts = (<statistics_resource_adaptor[device_memory_resource]*>(
self.c_obj.get()))[0].get_bytes_counter()

counts = deref(mr).get_allocations_counter()
byte_counts = deref(mr).get_bytes_counter()
return {
"current_bytes": byte_counts.value,
"current_count": counts.value,
@@ -827,6 +829,41 @@ cdef class StatisticsResourceAdaptor(UpstreamResourceAdaptor):
"total_count": counts.total,
}

def pop_counters(self) -> dict:
"""
Pop a counter pair (bytes and allocations) from the stack
"""
cdef statistics_resource_adaptor[device_memory_resource]* mr = \
<statistics_resource_adaptor[device_memory_resource]*> self.c_obj.get()

bytes_and_allocs = deref(mr).pop_counters()
return {
"current_bytes": bytes_and_allocs.first.value,
"current_count": bytes_and_allocs.second.value,
"peak_bytes": bytes_and_allocs.first.peak,
"peak_count": bytes_and_allocs.second.peak,
"total_bytes": bytes_and_allocs.first.total,
"total_count": bytes_and_allocs.second.total,
}

def push_counters(self) -> dict:
"""
Push a new counter pair (bytes and allocations) on the stack
"""

cdef statistics_resource_adaptor[device_memory_resource]* mr = \
<statistics_resource_adaptor[device_memory_resource]*> self.c_obj.get()

bytes_and_allocs = deref(mr).push_counters()
return {
"current_bytes": bytes_and_allocs.first.value,
"current_count": bytes_and_allocs.second.value,
"peak_bytes": bytes_and_allocs.first.peak,
"peak_count": bytes_and_allocs.second.peak,
"total_bytes": bytes_and_allocs.first.total,
"total_count": bytes_and_allocs.second.total,
}

cdef class TrackingResourceAdaptor(UpstreamResourceAdaptor):

def __cinit__(
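`pop_counters()` and `push_counters()` build the same six-key dictionary as `allocation_counts`. One way to keep the three in sync could be a single conversion helper. The sketch below is plain Python with a `NamedTuple` standing in for the C++ `counter`; `CounterView` and `counters_to_dict` are hypothetical names that do not exist in RMM, and a Cython version would presumably need to accept the `pair[counter, counter]` instead.

```python
from typing import Dict, NamedTuple


class CounterView(NamedTuple):
    """Hypothetical Python-side view of the C++ `counter` fields."""

    value: int
    peak: int
    total: int


def counters_to_dict(
    byte_counter: CounterView, alloc_counter: CounterView
) -> Dict[str, int]:
    """Flatten a <bytes, allocations> counter pair into the dict layout
    returned by the methods in this diff."""
    return {
        "current_bytes": byte_counter.value,
        "current_count": alloc_counter.value,
        "peak_bytes": byte_counter.peak,
        "peak_count": alloc_counter.peak,
        "total_bytes": byte_counter.total,
        "total_count": alloc_counter.total,
    }
```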
133 changes: 133 additions & 0 deletions python/rmm/rmm/statistics.py
@@ -0,0 +1,133 @@
# Copyright (c) 2024, NVIDIA CORPORATION.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from contextlib import contextmanager
from typing import Dict, Optional

import rmm.mr


def enable_statistics() -> None:
"""Enable allocation statistics

This function is idempotent: if statistics have already been enabled for
the current RMM resource stack, this is a no-op.

Warning
-------
This modifies the current RMM memory resource. StatisticsResourceAdaptor
is pushed onto the current RMM memory resource stack and must remain the
topmost resource throughout the statistics gathering.
"""

mr = rmm.mr.get_current_device_resource()
if not isinstance(mr, rmm.mr.StatisticsResourceAdaptor):
rmm.mr.set_current_device_resource(
rmm.mr.StatisticsResourceAdaptor(mr)
)


def get_statistics() -> Optional[Dict[str, int]]:
"""Get the current allocation statistics

Return
------
If enabled, returns the current tracked statistics.
If disabled, returns None.
"""
mr = rmm.mr.get_current_device_resource()
if isinstance(mr, rmm.mr.StatisticsResourceAdaptor):
return mr.allocation_counts
return None


def push_statistics() -> Optional[Dict[str, int]]:
"""Push new counters on the current allocation statistics stack

This returns the current tracked statistics and pushes a new set
of zero counters on the stack of statistics.

If statistics are disabled (the current memory resource is not an
instance of StatisticsResourceAdaptor), this function is a no-op.

Return
------
If enabled, returns the current tracked statistics _before_ the push.
If disabled, returns None.
"""
mr = rmm.mr.get_current_device_resource()
if isinstance(mr, rmm.mr.StatisticsResourceAdaptor):
return mr.push_counters()
return None


def pop_statistics() -> Optional[Dict[str, int]]:
"""Pop the counters of the current allocation statistics stack

This returns the counters of current tracked statistics and pops
them from the stack.

If statistics are disabled (the current memory resource is not an
instance of StatisticsResourceAdaptor), this function is a no-op.

Return
------
If enabled, returns the popped counters.
If disabled, returns None.
"""
mr = rmm.mr.get_current_device_resource()
if isinstance(mr, rmm.mr.StatisticsResourceAdaptor):
return mr.pop_counters()
return None


@contextmanager
def statistics():
"""Context to enable allocation statistics.

If statistics have already been enabled (the current memory resource is an
instance of StatisticsResourceAdaptor), new counters are pushed on the
current allocation statistics stack when entering the context and popped
again when exiting, using `push_statistics()` and `pop_statistics()`.

If statistics have not been enabled, StatisticsResourceAdaptor is set as
the current RMM memory resource when entering the context and removed
again when exiting.

Raises
------
ValueError
If the current RMM memory resource was changed while in the context.
"""

if push_statistics() is None:
# Save the current non-statistics memory resource for later cleanup
prior_non_stats_mr = rmm.mr.get_current_device_resource()
enable_statistics()
else:
prior_non_stats_mr = None

try:
current_mr = rmm.mr.get_current_device_resource()
yield
finally:
if current_mr is not rmm.mr.get_current_device_resource():
Contributor: Should this use `is not` (identity) or `!=` (equality)?

Member Author: To be safe, I think identity is good here.

Contributor: Yes, the definition of equality between memory resources is that they are interchangeable for the purposes of allocate/deallocate pairs, which is not sufficient for the usage here.

raise ValueError(
"RMM memory resource stack was changed "
"while in the statistics context"
)
if prior_non_stats_mr is None:
pop_statistics()
else:
rmm.mr.set_current_device_resource(prior_non_stats_mr)
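The review thread on the `is not` check can be illustrated without a GPU. The sketch below is a toy model, not RMM code: `FakeResource`, `get_current`, `set_current` and `guarded` are hypothetical names, and the `__eq__` rule (two adaptors are equal when they wrap the same upstream) is only an assumed stand-in for "interchangeable for allocate/deallocate pairs". It shows that swapping in an equal but distinct resource inside the context is caught by an identity check, whereas a `!=` comparison would let it through.

```python
from contextlib import contextmanager


class FakeResource:
    """Hypothetical stand-in for a memory resource (not an RMM class)."""

    def __init__(self, upstream=None):
        self.upstream = upstream

    def __eq__(self, other):
        # Assumed notion of equality: interchangeable when both wrap
        # the same upstream resource.
        return isinstance(other, FakeResource) and self.upstream is other.upstream


_current = [FakeResource()]  # stand-in for the current device resource


def get_current():
    return _current[0]


def set_current(mr):
    _current[0] = mr


@contextmanager
def guarded():
    """Raise if the current resource object is swapped inside the block."""
    entered_with = get_current()
    try:
        yield
    finally:
        # Identity, not equality: an equal-but-different adaptor would not
        # carry the counters that were pushed on `entered_with`.
        if entered_with is not get_current():
            raise ValueError("memory resource was changed inside the context")
```

In this model, two adaptors built over the same upstream compare equal, yet replacing one with the other inside `with guarded():` still raises `ValueError`.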