Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use system-level values for Windows virtual memory #2077

Merged
merged 2 commits into from
Oct 20, 2022

Conversation

dbwiddis
Copy link
Contributor

@dbwiddis dbwiddis commented Feb 23, 2022

Summary

Description

Submitting this as a draft PR as I believe it will address #2074. I do not have either a C or Python development environment to test these changes, so they are provided in the hopes someone else can test and confirm they fix the issue.

There's probably some room to simplify the code with temporary variables to reduce redundancy.

For background, I have implemented precisely this logic in my own (Java-based) project here and here following a detailed investigation in oshi/oshi#1182

@giampaolo
Copy link
Owner

Looking back at this, it seems Appveyor signals a problem: https://ci.appveyor.com/project/giampaolo/psutil/builds/42685706/job/pmmgoyuraqv158ca

build + import successful
cmd: C:\Python27\python.exe psutil\tests\runner.py
psutil-debug [psutil/arch/windows/disk.c:134]> DeviceIoControl -> ERROR_INVALID_FUNCTION; ignore PhysicalDrive0
psutil-debug [psutil/arch/windows/socks.c:61]> GetExtendedTcpTable: retry with different bufsize
Traceback (most recent call last):
  File "psutil\tests\runner.py", line 350, in <module>
    main()
  File "psutil\tests\runner.py", line 344, in main
    print_sysinfo()
  File "c:\projects\psutil\psutil\tests\__init__.py", line 1155, in print_sysinfo
    pinfo = psutil.Process().as_dict()
  File "c:\projects\psutil\psutil\__init__.py", line 531, in as_dict
    ret = meth()
  File "c:\projects\psutil\psutil\__init__.py", line 1109, in memory_percent
    % total_phymem)
ValueError: can't calculate process memory percent because total physical system memory is not positive (-877076026483515392L)
Command exited with code 1

@dbwiddis
Copy link
Contributor Author

Which is why it's in draft. :)

Looks like possible overflow, I'll look into it.

@dbwiddis
Copy link
Contributor Author

dbwiddis commented Oct 19, 2022

OK, I've checked and double checked and triple checked the values and they should be correct. I'm wondering if there's something different about accessing the values in a structure.

The -877076026483515392L value is probably not overflow; the value is way too large to even make sense. Looking at the binary of that is suspicious. Here it is split into 4 16-bit segments.

1111001111010100
0000000000000000
0011100001110000
0110000000000000

It seems odd that the second set of 16 bits are all 0's, and it makes me wonder if there's an alignment or endianness issue.

Digging around I see a calculation in Hadoop similar to what I did:

  PERFORMANCE_INFORMATION memInfo;
  // ... next 2 lines might be needed?
  ZeroMemory(&memInfo, sizeof(PERFORMANCE_INFORMATION));
  memInfo.cb = sizeof(PERFORMANCE_INFORMATION);
  if(!GetPerformanceInfo(&memInfo, sizeof(PERFORMANCE_INFORMATION)))
  {
    ReportErrorCode(L"GetPerformanceInfo", GetLastError());
    return EXIT_FAILURE;
  }
  vmemSize = memInfo.CommitLimit*memInfo.PageSize;
  vmemFree = vmemSize - memInfo.CommitTotal*memInfo.PageSize;
  memSize = memInfo.PhysicalTotal*memInfo.PageSize;
  memFree = memInfo.PhysicalAvailable*memInfo.PageSize;

which aligns with my change (some lines omitted and re-ordered to match, but no math changed):

    PERFORMANCE_INFORMATION perfInfo;
    if (! GetPerformanceInfo(&perfInfo, sizeof(PERFORMANCE_INFORMATION))) {
        PyErr_SetFromWindowsErr(0);
        return NULL;
    }
    return Py_BuildValue("(LLLLLL)",
                         perfInfo.CommitLimit * perfInfo.PageSize,   // total virtual
                         (perfInfo.CommitLimit - perfInfo.CommitTotal) * perfInfo.PageSize);  // avail virtual
                         perfInfo.PhysicalTotal * perfInfo.PageSize,      // total
                         perfInfo.PhysicalAvailable * perfInfo.PageSize,  // avail

The only difference seems to be that the Hadoop code zeros the memory prior to the call and then sets the cb value (similar to what was done for meminfo.dwLength). I wonder if that's necessary to do for the padding?

I don't have a windows machine to do much debugging here, but if it's possible for anyone to compare the two sets of numbers, the following values should be equal:

perfinfo.CommitLimit       = memInfo.ullTotalPageFile / perfinfo.PageSize;
perfinfo.PhysicalTotal     = memInfo.ullTotalPhys / perfinfo.PageSize;
perfinfo.PhysicalAvailable = memInfo.ullAvailPhys / perfinfo.PageSize;

I'm going to update the PR to include the setting of cb (and a DCO signoff) to see if that fixes things, but I really can't figure out why it's not working.

@dbwiddis
Copy link
Contributor Author

Now the magic failure number is -927284125454278656L.

Some parts of the binary representation match (the lowest 32 bits), others don't:

OLD              NEW
1111001111010100 1111001100100001
0000000000000000 1010000000000000
0011100001110000 0011100001110000
0110000000000000 0110000000000000

@dbwiddis
Copy link
Contributor Author

One thought: the values in PERFORMANCE_INFORMATION are of type SIZE_T. They can be 32-bit on a 32-bit system, which the failing Appveyor run is. I'm not sure how Py_BuildValue() works given a 32-bit SIZE_T value with an L format, but I'm guessing that might be the problem?

@dbwiddis
Copy link
Contributor Author

psutil/_psutil_windows.c(618) : error C2275: 'ULONGLONG' : illegal use of this type as an expression

You're making me learn C and/or Python. This is not fair. :)

On the bright side, I see handling of SIZE_T earlier in the file so I have an example to work from....

@dbwiddis dbwiddis force-pushed the patch-1 branch 2 times, most recently from 8748396 to 326496b Compare October 19, 2022 07:05
@dbwiddis
Copy link
Contributor Author

Good news is I figured out the appropriate syntax to get the values. Bad news is that the meaning assigned to those values differs here and at my own project, so tests are currently failing.

I'll do some experimenting over the next week to line up the numbers to report the same thing, and flip this into ready for review when I think I've solved it.

@giampaolo
Copy link
Owner

Thank you Daniel.

@dbwiddis dbwiddis force-pushed the patch-1 branch 2 times, most recently from f2b7e19 to 1fd5950 Compare October 20, 2022 02:40
@dbwiddis
Copy link
Contributor Author

OK, that was painfully long for a very simple fix. The root cause was in fact overflow (multiplying two 32-bit size_t values and getting a 33-bit result) and I was overcomplicating how to handle it.

Should be good to go now!

@dbwiddis
Copy link
Contributor Author

Oh, forgot to add, only the first 4 values of the MEMORYSTATUSEX struct had corresponding values in PERFORMANCE_INFORMATION, but the last two values are calling-process based and are never used, so I just removed them.

@dbwiddis dbwiddis marked this pull request as ready for review October 20, 2022 02:46
Submitting this as a draft PR as I believe it will address giampaolo#2074.  I do not have either a C or Python development environment to test these changes, so they are provided in the hopes someone else can test and confirm they fix the issue.

There's probably some room to simplify the code with temporary variables to reduce redundancy.

For background, I have implemented precisely this logic in my own (Java-based) project [here](https://github.com/oshi/oshi/blob/3bb9eafbe2062edad4108b7b12501b1f9be27361/oshi-core/src/main/java/oshi/hardware/platform/windows/WindowsVirtualMemory.java#L110-L118) and [here](https://github.com/oshi/oshi/blob/3bb9eafbe2062edad4108b7b12501b1f9be27361/oshi-core/src/main/java/oshi/hardware/platform/windows/WindowsGlobalMemory.java#L253-L263) following a detailed investigation in oshi/oshi#1182

Signed-off-by: Daniel Widdis <widdis@gmail.com>
Copy link
Owner

@giampaolo giampaolo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a couple of minor notes but overall looks good and appveyor tests are not complaining.

psutil/_psutil_windows.c Show resolved Hide resolved
psutil/_psutil_windows.c Outdated Show resolved Hide resolved
@giampaolo
Copy link
Owner

Also, can you please update HISTORY and CREDITS files?

Signed-off-by: Daniel Widdis <widdis@gmail.com>
@giampaolo giampaolo merged commit cd8827d into giampaolo:master Oct 20, 2022
@giampaolo
Copy link
Owner

Merged. Thanks Daniel.

@giampaolo
Copy link
Owner

Daniel, since this didn't fix the problem I think it's safer to revert this PR. The main reason is because this same C function is used also for virtual_memory(), and years ago I remember I checked virtual_memory() values with Windows task manager and they matched.

Do you agree on reverting this?

@dbwiddis
Copy link
Contributor Author

Daniel, since this didn't fix the problem I think it's safer to revert this PR. The main reason is because this same C function is used also for virtual_memory(), and years ago I remember I checked virtual_memory() values with Windows task manager and they matched.

Do you agree on reverting this?

I'm a little late on seeing this, but for completeness I'll say "no". My reasoning:

  1. This PR doesn't really change the problematic calculations. In the vast majority of cases it makes no difference; in some cases it does improve things. System-level is preferable to process-level.
  2. The PR causing the issue was [Windows] psutil.swap_memory() show swap instead of committed memory #1927 which fixed one problem but created another. If one were to be reverted it would be that one.
  3. I've outlined the situation in detail in this comment
  4. I've submitted another PR Get Windows percent swap usage from performance counters #2160, that I think finally solves this problem. Reverting this PR would require re-adding the portions of it into that new PR. Given it's between releases I don't see a problem with adding some code and then modifying some of it after testing... that's what snapshot releases are for!

ddelange added a commit to ddelange/psutil that referenced this pull request Oct 28, 2022
* 'master' of https://github.com/giampaolo/psutil:
  add windows test for free physical mem giampaolo#2074
  fix OSX tests broken by accident
  update HISTORY + give CREDITS for @arossert, @smoofra, @mayeut for giampaolo#2102, giampaolo#2156, giampaolo#2010
  build fix for Mac OS, Apple Silicon (giampaolo#2010)
  Linux: fix missing SPEED_UNKNOWN definition (giampaolo#2156)
  Use system-level values for Windows virtual memory (giampaolo#2077)
  feature: use ABI3 for cp36+ (giampaolo#2102)
  fix py2 compatibility
  improve API speed benchmark script giampaolo#2102
  fix: linter issues from giampaolo#2146 (giampaolo#2155)
  chore: skip test_cpu_freq on macOS arm64 (giampaolo#2146)
  pre-release + give CREDITS to @mayeut (PR giampaolo#2040) and @eallrich (new supporter)
  Fix a typo (giampaolo#2047)
  move isort and coverage config into pyproject.toml
  fix giampaolo#2021, fix giampaolo#1954, provide OSX arm64 bins + add pyproject.toml (giampaolo#2040)
  refactor git_log.py
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Windows] Swap used is bigger than swap total
2 participants