[ML] Report the "actual" memory usage of the autodetect process #2846
Determine the actual memory usage of the autodetect process as reported by the OS, e.g. on Linux this would be the value of the maximum resident set size returned by a call to `getrusage`. Add this value to the model size stats record returned to the ES Java process so it can be included in the `job counts` tab for anomaly detection jobs.
Hi Ed. I did the first pass.
We should discuss the naming of the new field. While "actual" conveys the intention of the value, it is confusing to the user.
Also, does maximum resident set size actually correspond to the actual current memory usage or is it the historical peak process memory usage?
include/model/CResourceMonitor.h
@@ -180,6 +181,8 @@ class MODEL_EXPORT CResourceMonitor {
    //! Returns the sum of used memory plus any extra memory
    std::size_t totalMemory() const;

    std::size_t actualMemoryUsage() const;
I think we can come up with something better than `actualMemoryUsage`. Maybe `systemMemoryUsage`?
The resident set size (RSS) represents the process's current RAM usage (so not counting pages that have been swapped out etc.), and the max RSS is the high water mark of that value. I think that reporting both would be useful for our purposes.
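To make the distinction between current RSS and max RSS concrete, here is a minimal Linux-only sketch. It is not taken from the PR: the function names are hypothetical, and reading `/proc/self/statm` for the current value is an assumption about one way to obtain it, not necessarily what ml-cpp does. The only call the PR description names is `getrusage`.

```cpp
#include <sys/resource.h>
#include <unistd.h>

#include <cstddef>
#include <fstream>

// Hypothetical helper: peak (high water mark) resident set size in bytes.
// Assumes Linux, where ru_maxrss is reported in kilobytes; macOS reports
// bytes, so the scaling below would be wrong there.
std::size_t maxResidentSetSizeBytes() {
    struct rusage usage {};
    if (getrusage(RUSAGE_SELF, &usage) != 0) {
        return 0; // treat failure as "unknown"
    }
    return static_cast<std::size_t>(usage.ru_maxrss) * 1024;
}

// Hypothetical helper: current resident set size in bytes.
// Assumes Linux: the second field of /proc/self/statm is the number of
// resident pages. Returns 0 if the file cannot be read.
std::size_t residentSetSizeBytes() {
    std::ifstream statm("/proc/self/statm");
    std::size_t totalPages = 0;
    std::size_t residentPages = 0;
    if (!(statm >> totalPages >> residentPages)) {
        return 0;
    }
    return residentPages * static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
}
```

The peak value only ever grows over the life of the process, while the current value can fall again when memory is released or swapped out, which is why reporting both gives a fuller picture.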
* ActualMemory -> SystemMemory
* Report current resident set size as well as max
include/model/ModelTypes.h
    E_AssignmentBasisSystemMemoryBytes = 4,   //!< Use the current system memory size
    E_AssignmentBasisMaxSystemMemoryBytes = 5 //!< Use the highest ever system memory size
I don't think we need these assignment basis reasons (at least so far).
bin/autodetect/Main.cc
    ml::counter_t::E_TSADResidentSetSize,
    ml::counter_t::E_TSADMaxResidentSetSize};
Should we call these counters `SystemMemoryUsage` for consistency?
lib/model/ModelTypes.cc
    case E_AssignmentBasisSystemMemoryBytes:
        return "system_memory_bytes";
    case E_AssignmentBasisMaxSystemMemoryBytes:
        return "max_system_memory_bytes";
Are those necessary?
Co-authored-by: Valeriy Khakhutskyy <1292899+valeriy42@users.noreply.github.com>
…mem_usage # Conflicts: # bin/autodetect/Main.cc # include/model/CResourceMonitor.h
* Address failing unit tests
* More accurate, meaningful description of new program counters
Looks good. I think the last missing piece is that we use the system memory usage in `CResourceMonitor` when calculating whether allocations are allowed, and that on Linux we report it back to Java as "model memory usage" and "peak model memory usage" instead of the estimated values.
… set size) for the "model memory usage" and "peak model memory usage" fields reported to Java.
…nto ad_real_mem_usage
On Linux, both adjusted usage and adjusted peak usage are set to the system memory usage (max resident set size). These are the values reported back to the Java process; they are not used for any other purpose.
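The reporting rule described in this commit message can be sketched as follows. This is an illustration only, not the PR's code: the struct, the function, and the explicit `isLinux` flag are hypothetical names chosen for this sketch, and the fallback to the estimated values on other platforms is an assumption based on the review comment above.

```cpp
#include <cstddef>

// Hypothetical container for the two values sent to the Java process.
struct SReportedMemory {
    std::size_t s_Usage;
    std::size_t s_PeakUsage;
};

// Hypothetical selection logic: on Linux both reported fields take the
// system memory usage (max resident set size); elsewhere the estimated
// values are passed through unchanged (an assumption, see lead-in).
SReportedMemory reportedMemory(bool isLinux,
                               std::size_t estimatedUsage,
                               std::size_t estimatedPeakUsage,
                               std::size_t maxResidentSetSize) {
    if (isLinux) {
        return {maxResidentSetSize, maxResidentSetSize};
    }
    return {estimatedUsage, estimatedPeakUsage};
}
```

Because both fields share one source on Linux, the reported usage and peak usage are always equal there, which is consistent with the downstream test values needing adjustment.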
The ml-cpp PR elastic/ml-cpp#2846 introduces changes to how memory values are calculated and reported for Linux platforms. This PR adjusts test case values accordingly.
LGTM
docs/CHANGELOG.asciidoc
@@ -33,6 +33,7 @@
=== Enhancements

* Track memory used in the hierarchical results normalizer. (See {ml-pull}2831[#2831].)
* Report the actual memory usage of the autodetect process. (See {ml-pull}2846[#2846])
You need to move it to 9.2.0 now
…#131981) The ml-cpp PR elastic/ml-cpp#2846 introduces changes to how memory values are calculated and reported for Linux platforms. This PR adjusts test case values accordingly.
Determine the actual memory usage of the autodetect process as reported by the OS, e.g. on Linux this would be the value of the maximum resident set size returned by a call to `getrusage`. Add this value to the model size stats record returned to the ES Java process so it can be included in the `job counts` tab for anomaly detection jobs.
Relates elastic/elasticsearch#131981