Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows Process.Name() getFromSnapProcess() performance #1368

Closed
xuyang2 opened this issue Oct 21, 2022 · 4 comments · Fixed by #1369
Closed

Windows Process.Name() getFromSnapProcess() performance #1368

xuyang2 opened this issue Oct 21, 2022 · 4 comments · Fixed by #1369

Comments

@xuyang2
Copy link
Contributor

xuyang2 commented Oct 21, 2022

Currently Process.Name() on Windows calls the CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, pid) syscall to get it's ppid and name:

func (p *Process) NameWithContext(ctx context.Context) (string, error) {
ppid, _, name, err := getFromSnapProcess(p.Pid)

func getFromSnapProcess(pid int32) (int32, int32, string, error) {
snap, err := windows.CreateToolhelp32Snapshot(windows.TH32CS_SNAPPROCESS, uint32(pid))

It seems CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, pid) will return all processes in the system despite passing the ProcessID parameter.

Although returning a snapshot of all processes is quite fast most of the time (<10ms), it can sometimes be slow (>100ms, sometimes even >1000ms).

Can we change the flags like below according to the documentation:
(not so sure about the exact flags, especially TH32CS_SNAPMODULE32)

func getFromSnapProcess(pid int32) (int32, int32, string, error) {
	// https://learn.microsoft.com/en-us/windows/win32/api/tlhelp32/nf-tlhelp32-createtoolhelp32snapshot
	// To enumerate the heap or module states for all processes, specify TH32CS_SNAPALL and set th32ProcessID to zero.
	// Then, for each additional process in the snapshot, call CreateToolhelp32Snapshot again,
	// specifying its process identifier and the TH32CS_SNAPHEAPLIST or TH32_SNAPMODULE value.
	snap, err := windows.CreateToolhelp32Snapshot(windows.TH32CS_SNAPHEAPLIST|windows.TH32CS_SNAPMODULE|windows.TH32CS_SNAPMODULE32, uint32(pid))
	if err != nil {
		return 0, 0, "", err
	}
	defer windows.CloseHandle(snap)
	var pe32 windows.ProcessEntry32
	pe32.Size = uint32(unsafe.Sizeof(pe32))
	if err = windows.Process32First(snap, &pe32); err != nil && err != syscall.ERROR_NO_MORE_FILES {
		return 0, 0, "", err
	}
	for {
		if pe32.ProcessID == uint32(pid) {
			szexe := windows.UTF16ToString(pe32.ExeFile[:])
			return int32(pe32.ParentProcessID), int32(pe32.Threads), szexe, nil
		}
		if err = windows.Process32Next(snap, &pe32); err != nil {
			break
		}
	}
	return 0, 0, "", fmt.Errorf("couldn't find pid: %d", pid)
}

or switch to NtQuerySystemInformation?

refs

https://learn.microsoft.com/en-us/windows/win32/api/tlhelp32/nf-tlhelp32-createtoolhelp32snapshot:

  • TH32CS_SNAPPROCESS 0x00000002 | Includes all processes in the system in the snapshot.

To enumerate the heap or module states for all processes, specify TH32CS_SNAPALL and set th32ProcessID to zero.
Then, for each additional process in the snapshot, call CreateToolhelp32Snapshot again, specifying its process identifier and the TH32CS_SNAPHEAPLIST or TH32_SNAPMODULE value.

see also

@Lomanic
Copy link
Collaborator

Lomanic commented Oct 22, 2022

NtQuerySystemInformation is really difficult to use, just check #862 (all the cases checking if the calling process and the inspected process are 32 or 64 bit). I won't go out of my way to use it if there's another API with decent performances.

Changing the flag used in CreateToolhelp32Snapshot to windows.TH32CS_SNAPHEAPLIST|windows.TH32CS_SNAPMODULE|windows.TH32CS_SNAPMODULE32 doesn't work unfortunately:

go test ./process
--- FAIL: Test_Process_Ppid (0.01s)
    process_test.go:161: getting ppid error Il n’y a plus de fichier.
    process_test.go:164: return value is 0 0
    process_test.go:168: return value is 0, expected 6956
--- FAIL: Test_Process_NumThread (0.00s)
    process_test.go:259: getting NumThread error Il n’y a plus de fichier.
--- FAIL: Test_Process_Threads (0.00s)
    process_test.go:272: getting NumThread error Il n’y a plus de fichier.
    process_test.go:28: not implemented
--- FAIL: Test_Process_Name (0.00s)
    process_test.go:294: getting name error could not get Name: Il n’y a plus de fichier.
    process_test.go:297: invalid Exe
--- FAIL: Test_Process_Long_Name_With_Spaces (1.28s)
    process_test.go:338: getting name error could not get Name: Il n’y a plus de fichier.
--- FAIL: Test_Process_Long_Name (1.35s)
    process_test.go:384: getting name error could not get Name: Il n’y a plus de fichier.
--- FAIL: Test_Parent (0.00s)
    process_test.go:476: error Il n’y a plus de fichier.
FAIL
FAIL    github.com/shirou/gopsutil/v3/process   6.129s
FAIL

In fact, psutil doesn't use CreateToolhelp32Snapshot on windows to get a process name, but takes the basename of the process executable path. Also, it is cached after first call.

@Lomanic Lomanic self-assigned this Oct 22, 2022
Lomanic added a commit to Lomanic/gopsutil that referenced this issue Oct 22, 2022
We align ourself with psutil
https://github.com/giampaolo/psutil/blob/8e4099d9f063ceb4ee3da5845562c5b934f83544/psutil/_pswindows.py#L749-L759

Benchmmarks show vast improvements

    go test -run=BenchmarkProcessName -bench=BenchmarkProcessName ./process
    goos: windows
    goarch: amd64
    pkg: github.com/shirou/gopsutil/v3/process
    cpu: Intel(R) Core(TM) i5-6300U CPU @ 2.40GHz
    BenchmarkProcessName-4               180           6564033 ns/op
    BenchmarkProcessNameViaExe-4       22111             51153 ns/op
    PASS
    ok      github.com/shirou/gopsutil/v3/process   3.914s

Fixes shirou#1368
Lomanic added a commit to Lomanic/gopsutil that referenced this issue Oct 22, 2022
We align ourself with psutil
https://github.com/giampaolo/psutil/blob/8e4099d9f063ceb4ee3da5845562c5b934f83544/psutil/_pswindows.py#L749-L759

Benchmarks show vast improvements

    go test -run=BenchmarkProcessName -bench=BenchmarkProcessName ./process
    goos: windows
    goarch: amd64
    pkg: github.com/shirou/gopsutil/v3/process
    cpu: Intel(R) Core(TM) i5-6300U CPU @ 2.40GHz
    BenchmarkProcessName-4               180           6564033 ns/op
    BenchmarkProcessNameViaExe-4       22111             51153 ns/op
    PASS
    ok      github.com/shirou/gopsutil/v3/process   3.914s

Fixes shirou#1368
@xuyang2
Copy link
Contributor Author

xuyang2 commented Oct 25, 2022

@Lomanic
Thanks for the Process.Name() optimize.

BTW, Process.NumThreads(), Process.Ppid() also use getFromSnapProcess().
Do they have alternative methods to retrieve other than CreateToolhelp32Snapshot (a snapshot of all processes in the system)?

@Lomanic
Copy link
Collaborator

Lomanic commented Oct 25, 2022

Ppid can be obtained as the last field in the ProcessBasicInformation struct. But that obviously uses NtQueryInformationProcess (in fact you were mentioning NtQuery*System*Information in OP, which doesn't fit our purpose as our API works for individual processes, not all of them) which is really hard to use. I even tried to adapt our current queryPebAddress() functions to not only return the process PEB address but the whole ProcessBasicInformation struct, unfortunately for a 64bit process inspecting a 32bit process the PEB address is obtained outside of this struct. Nevertheless, here is a quick and really dirty job (to be git-apply'd over the linked MR) at exposing the InheritedFromUniqueProcessId field from queryPEBAddress, with some benchmarks (with ppid caching removed for a real benchmark) and a regression test.

go test -run=BenchmarkProcessPpid -bench=BenchmarkProcessPpid ./process
goos: windows
goarch: amd64
pkg: github.com/shirou/gopsutil/v3/process
cpu: Intel(R) Core(TM) i5-6300U CPU @ 2.40GHz
BenchmarkProcessPpid-4               204           5769890 ns/op
BenchmarkProcessPpidViaPEB-4      150300              7822 ns/op
PASS
ok      github.com/shirou/gopsutil/v3/process   4.581s

This is functional and way faster this way, but a lot more thought needs to occur for this patch.


For the Process.NumThreads() function, the PEB struct doesn't even provide this information. Another alternative would be to snapshot all threads with CreateToolhelp32Snapshot and TH32CS_SNAPTHREAD and loop over all of them

To identify the threads that belong to a specific process, compare its process identifier to the th32OwnerProcessID member of the THREADENTRY32 structure when enumerating the threads.

but I'm not convinced it will be faster as it's a more granular snapshot.

@xuyang2
Copy link
Contributor Author

xuyang2 commented Oct 26, 2022

@Lomanic Thanks.
There doesn't seem to be a convenient API to query ppid and num_threads of a single process.

For ppid, giampaolo/psutil seems to use CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0) to get a full snapshot to build a {pid:ppid, ...} dict.

For num_threads, giampaolo/psutil also seems to use NtQuerySystemInformation(SystemProcessInformation, ...) to query all process information of the system and loop and compare.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants