Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

msSinceInput reports incorrectly #194

Closed
VoeSo opened this issue Jan 17, 2024 · 6 comments
Closed

msSinceInput reports incorrectly #194

VoeSo opened this issue Jan 17, 2024 · 6 comments
Assignees
Labels
PresentData Relates to the collection/analysis library (PresentData/) PresentMon-ConsoleApplication Relates to the console application (PresentMon/)

Comments

@VoeSo
Copy link

VoeSo commented Jan 17, 2024

I wanted to see the latency between ISR of USB and when the game registers input, so I ran a wpr trace and presentmon trace at the same time.
It seems that msSinceInput reports input as happening well before the USB ISR even ran.

Presentmon:
pmissue1
By substracting msSinceInput from TimeInSeconds we get 2024-1-17 13:49:56.8100508

WPR:
pmissue2
You can see that there is a fragment every 1s while there was no input and then every 1 ms when I moved the mouse as expected.
So the first ISR caused by input happened at 2024.01.17 13:49:56.8108321.
Thats 0.7813ms AFTER what presentmon reports.

To recreate:

  1. run presentmon and wpr at the same time while ingame. Stay clear of any inputs for a few seconds then move your mouse
    -run presentmon with -track_input and -date_time
    -run wpr like this
    pmissue3
  2. Look for the the first frame that received input in the presentmon report and calculate time of input by substracting msSinceInput from TimeInSeconds
    pmissue1
  3. Open the wpr trace and apply this profile https://github.com/VoeSo/Twix/blob/main/fet.wpaProfile
  4. Expand "Interrupt", then expand the correct driver and cpu.
  5. Scroll down to the approximate timepoint you calculated in step 2). You will see very little interrupts and then a lot of interrupts(every 1ms for 1khz mouse). In my example you can see that there is a fragment every 1s while there was no input and then every 1 ms when I moved the mouse
    pmissue2
  6. Write down the time of the first fragment caused by actual input and compare it to the time calculated in step 2)
@JeffersonMontgomery-Intel
Copy link
Collaborator

Hi, thanks for the report and sorry for the delay.

The main cause of the discrepancy here is that PresentMon is computing datetime differently than WPA. This is a PresentMon bug and I'll work on a fix, but for now this means that the msSinceInput delta is OK but you can't compare datetime's between PresentMon and WPA.

Note that there is another slight difference because PresentMon doesn't use the interrupt time, but the Win32K InputDeviceRead event which occurs slightly later (~100us later in a test I just ran).

@JeffersonMontgomery-Intel JeffersonMontgomery-Intel added PresentMon-ConsoleApplication Relates to the console application (PresentMon/) PresentData Relates to the collection/analysis library (PresentData/) labels Feb 21, 2024
JeffersonMontgomery-Intel added a commit that referenced this issue Feb 21, 2024
This commit has two changes to improve --datetime:

1) When processing an ETL, use full precision for the trace start time
   (previously was only using milliseconds).  This will result in the
   same present time for WPA and PresentMon when using --datetime and
   --etl_file.

2) When collecting in realtime with --datetime, this changes the ETW
   timestamp from QPC to SYSTEMTIME.  This ensures PresentMon's reported
   present start time will be the correct datetime, however it will not
   exactly match WPA.  WPA converts QPC into datetimes, but PresentMon
   cannot match that during realtime collection without using the same
   trace start time, which I can't find a way to query.

See issue #194
@JeffersonMontgomery-Intel
Copy link
Collaborator

I was able to fix a few things, but I couldn't find a way to make your above methodology work (capturing PresentMon and ETL at the same time). With the latest change the datetime in PresentMon and WPA should be closer, but the only way to get them to match would be to:

  1. Capture an ETL
  2. Process it afterwards with PresentMon --etl_file

You can use the Tools\start_etl_collection.cmd and Tools\stop_etl_collection.cmd scripts to ensure that all the events PresentMon needs will be collected.

@VoeSo
Copy link
Author

VoeSo commented Mar 14, 2024

Thanks for looking into this, I appreciate it!

You can use the Tools\start_etl_collection.cmd and Tools\stop_etl_collection.cmd scripts to ensure that all the events PresentMon needs will be collected.

I can confirm that this works, and using the cmd scripts, I get Win32K InputDeviceRead 20us after ISR.
One thing I have issue with is I can only open the etl file with WPA if I record the trace while using windowed mode ingame. If I use fullscreen, Presentmon can still read the etl fine, but opening in WPA gets stuck at 99% loading. Any idea how to fix that?

Note that there is another slight difference because PresentMon doesn't use the interrupt time, but the Win32K InputDeviceRead event which occurs slightly later (~100us later in a test I just ran).

I was wondering how do we know that Win32K InputDeviceRead is associatiated with the right frame? Is it just guessing based on the timing of the event and then associated to the next frame, or how does it work?

Thanks again, take care!

@JeffersonMontgomery-Intel
Copy link
Collaborator

I've seen WPA do that as well occasionally in different contexts. I'm not sure why or how to get around it, sorry. Sometimes it will eventually complete, so you could try waiting, but usually I'll recapture and hope it doesn't happen the next time 🤷.

I use the Win32k RetrieveInputMessage to associate the InputDeviceRead with a frame. This is when the input is available in the target processes' message queue, though it does make a couple assumptions:

  • that all InputDeviceRead events since the last RetrieveInputMessage get into the queue
  • that the application uses all the input events in the queue for that frame

AFAIK those are reasonable assumptions.

At some point in the future we can investigate using the ISR event instead to get a better measure. However, that requires a kernel trace session which is less flexible, and filtering all the interrupts will have more capture overhead so it's a trade off we'll have to evaluate.

@VoeSo
Copy link
Author

VoeSo commented Mar 22, 2024

I've seen WPA do that as well occasionally in different contexts. I'm not sure why or how to get around it, sorry. Sometimes it will eventually complete, so you could try waiting, but usually I'll recapture and hope it doesn't happen the next time 🤷.

I narrowed the problem down to the win32k provider flag. If you change it from 0x8400000440c01000 to 0x4000000402E2000, WPA opens fine even when the trace was made while running fullscreen and it still gives you RetrieveInputMessage and InputDeviceRead messages. But for some reason using that flag makes PM unable to output a csv.

I use the Win32k RetrieveInputMessage to associate the InputDeviceRead with a frame. This is when the input is available in the target processes' message queue, though it does make a couple assumptions:

How does PM determine the sample point/actual cpu frame start? With some framelimiters the game doesn't sample input right after present ends, but stalls and collects input at a later point.
(Getting a stat for this, like mssincecpustart would be a good addition to PM I think, though this might already be on the way with the new intel beta?)

Thank you for the help and insight, appreciate it!

@JeffersonMontgomery-Intel
Copy link
Collaborator

Interesting; thanks! I'll look into those keyword changes.

PresentMon assumes that the cpu frame start is the the time that the previous frame's Present() call returned, which is a good default assumption. To do better we need an API for applications to tell PresentMon their own timing, which we're planning for a future version.

msSinceInput is not relative to the cpu frame start though, it is relative to the start of this frame's Present() call.

The ClickToPhotonLatency metric in v2.0 is the time between input and the frame being displayed (msSinceInput + msUntilDisplayed in v1.* metrics).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PresentData Relates to the collection/analysis library (PresentData/) PresentMon-ConsoleApplication Relates to the console application (PresentMon/)
Projects
None yet
Development

No branches or pull requests

2 participants