Renamed pointerOrigin and enum values to reflect usage pattern, not device type. #342
Conversation
…ction vectors. pointerOrigin is now named 'targetRayMode' and can have the values { 'gazing', 'pointing', 'tapping' }. pointerPose is now a ray consisting of two properties: 'targetRayOrigin' and 'targetRayDirection'. This is to be in line with immersive-web/hit-test#8 until a final approach in immersive-web#339 is determined.
Since the raycast discussion is as yet unresolved, I'm changing …
````diff
@@ -415,17 +415,21 @@ xrSession.addEventListener('inputsourceschange', (ev) => {
 });
 ```

+The properties of an XRInputSource object are immutable. If a device can be manipulated in such a way that these properties can change, the `XRInputSource` will be removed and recreated.
````
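A minimal sketch of how a page might cope with an `XRInputSource` being removed and recreated: the `added`/`removed` lists on the event are an assumption here (they match the shape WebXR later standardized, but aren't shown in this hunk), and `dropHeldObject()` is an invented app helper.

```js
// Sketch only: ev.added / ev.removed are assumed event fields, and
// dropHeldObject() is a hypothetical application function.
let grabbingSource = null;

xrSession.addEventListener('inputsourceschange', (ev) => {
  for (const source of ev.removed) {
    if (source === grabbingSource) {
      // The input source "holding" the object vanished (e.g. a handedness
      // change caused it to be removed and recreated), so release the grab.
      dropHeldObject();
      grabbingSource = null;
    }
  }
  for (const source of ev.added) {
    // Newly (re)added sources arrive on the well-tested mainline path.
    console.log('New input source with handedness:', source.handedness);
  }
});
```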
---
I had to think about it for a while to decide if I'm OK with this, but I think I am. The most problematic scenario I can think of is a system that is not explicitly handed (like the Vive wands): if the system decides that you've switched hands in the middle of a grab-and-drag gesture, it would be common for the page implementation to drop the object as the "hand" holding it disappears.
The reality is, though, that I rarely see this happen outside of early system initialization, and if it's a large concern then the system can silently re-map the physical devices to the previously created left/right hand input sources to hide the transition. Plus there's a significant number of devices for which handedness is an explicit property of the input device, or only changed manually, and therefore will never be surprising to the user.
So TL;DR: I'm good with it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is precisely since this is rare that we believe it is safer to remove and re-add the sources - since most apps will experience effectively unchanging sources, they'll likely assume they don't change and would probably miss such changes. Removing and re-adding the source gets us back onto the mainline path such apps will have tested.
explainer.md
```diff
-Each input source can queried a `XRInputPose` using the `getInputPose()` function of any `XRPresentationFrame`. Getting the pose requires passing in the `XRInputSource` you want the pose for, as well as the `XRCoordinateSystem` the pose values should be given in, just like `getDevicePose()`. Similar to `getDevicePose()` the requested pose may return `null` in cases where tracking has been lost.
+Each input source can query a `XRInputPose` using the `getInputPose()` function of any `XRPresentationFrame`. Getting the pose requires passing in the `XRInputSource` you want the pose for, as well as the `XRCoordinateSystem` the pose values should be given in, just like `getDevicePose()`. `getInputPose()` may return `null` in cases where tracking has been lost (similar to `getDevicePose()`), or the given `XRInputSource` instance is no longer connected or available.

+If an input source can be tracked the `XRInputPose`'s `gripMatrix` will indicate the device's position and orientation. If only position or orientation is trackable (not both), the `gripMatrix` will return a transform matrix applying only the trackable pose. An example of this case is for physical hands on some AR devices, that only have a tracked position. The `gripMatrix` will be `null` if the input source isn't trackable.
```
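A hedged sketch of the per-frame polling pattern this diff describes; `xrFrameOfRef` and `drawGripModel()` are placeholder names for illustration, not part of the proposal.

```js
// Sketch of reading input poses from an XRPresentationFrame, per the
// proposal above. xrFrameOfRef and drawGripModel() are placeholders.
function handleInput(xrFrame) {
  for (const inputSource of xrSession.getInputSources()) {
    const inputPose = xrFrame.getInputPose(inputSource, xrFrameOfRef);
    if (!inputPose) {
      continue; // tracking lost, or the source is no longer connected
    }
    if (inputPose.gripMatrix) {
      // May be position-only, or may include an arm model on 3DoF devices.
      drawGripModel(inputPose.gripMatrix);
    }
  }
}
```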
---
"If only position or orientation is trackable (not both), the gripMatrix
will return a transform matrix applying only the trackable pose"
The wording on this is a little weird to me. I assume you mean that, for example, if the input source can only track position the transform matrix will not contain any rotation component? Not sure how to state that more clearly though.
Also, does this imply that the transform matrix should not apply an arm model for 3DoF controllers? Because if so I'm pretty strongly opposed to that.
---
Your interpretation is correct, but I'll have a think about wording to make this a bit clearer. However, it was not the intention to disallow arm modeling for 3DoF controllers; would changing "will return" -> "may return" be sufficient?
---
Will return -> may return is good enough for me, especially since this change does preserve the explicit mention of arm models later on.
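For illustration, a position-only `gripMatrix` under the "may return" wording could be an identity rotation carrying just the tracked translation; the values below are made up.

```js
// Illustrative only: a gripMatrix for an input source that tracks position
// but not orientation. Column-major 4x4; the rotation component is identity
// and only the translation (last column) carries tracked data.
const positionOnlyGripMatrix = new Float32Array([
  1, 0, 0, 0,
  0, 1, 0, 0,
  0, 0, 1, 0,
  0.1, 1.2, -0.3, 1  // example tracked hand position in meters (made up)
]);
```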
explainer.md
```diff
-* `'head'` indicates the pointer ray will originate at the user's head and follow the direction they are looking. (This is commonly referred to as a "gaze input" device.) There should be at most one `'head'` input source per session.
```
---
We lost the text indicating that there should be at most one `head` input source, which I think was originally requested by Microsoft? Is that intentional?
---
This was intentional, since we are now treating AR hands as `gazing` sources that have a `gripMatrix` (whereas in the initial prototypes, we used a `hand` `pointerOrigin`). Since there are multiple hands, there needs to be one distinct inputSource in the array for each.
```diff
-* `'screen'` indicates that the input source was an interaction with the 2D canvas of a non-exclusive session, such as a mouse click or touch event. See [Magic Window Input](#magic_window_input) for more details.
+* `'gazing'` indicates the target ray will originate at the user's head and follow the direction they are looking (this is commonly referred to as a "gaze input" device). While it may be possible for these devices to be tracked (and have a grip matrix), the head gaze is used for targeting. Example devices: 0DOF clicker, regular gamepad, voice command, tracked hands.
+* `'pointing'` indicates that the target ray originates from a handheld device and represents that the user is using that device for pointing. The exact orientation of the ray relative to the device should follow platform-specific guidelines if there are any. In the absence of platform-specific guidance, the target ray should most likely point in the same direction as the user's index finger if it was outstretched while holding the controller.
+* `'tapping'` indicates that the input source was an interaction with the 2D canvas of a non-exclusive session, such as a mouse click or touch event. See [Magic Window Input](#magic_window_input) for more details.
```
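To make the three modes concrete, here's a sketch of how an app might branch on `targetRayMode`. The ray uses the `targetRayOrigin`/`targetRayDirection` pair from this PR; the drawing helpers are invented for illustration.

```js
// Sketch: branching on the proposed targetRayMode values. The helper
// functions (drawGazeReticle, drawPointerRay) are hypothetical.
function processTargetRay(inputSource, inputPose) {
  const ray = {
    origin: inputPose.targetRayOrigin,
    direction: inputPose.targetRayDirection,
  };

  switch (inputSource.targetRayMode) {
    case 'gazing':
      // Head-gaze targeting: draw a reticle centered in the view.
      drawGazeReticle(ray);
      break;
    case 'pointing':
      // Handheld pointer: render a visible ray from the device.
      drawPointerRay(ray);
      break;
    case 'tapping':
      // 2D canvas interaction (magic window): no persistent ray to draw;
      // just hit-test along the projected ray.
      break;
  }
  return ray;
}
```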
---
Not sure if the term `tapping` feels accurate here. People typically associate it with touchscreens, and we want it to cover mouse and stylus input as well. Admittedly I don't have a better term in mind.
---
This is a tough one to name; some other ideas:

* `projecting` - as in, the pointing ray is projected into the scene from the camera, as defined by a point on the near plane
* `touching` - mouse, touch, stylus, etc. could maybe be considered forms of touch?
* `pressing` - same reasoning as `touching`
---
`projecting` is probably the most accurate, but it doesn't seem intuitive. Maybe `screen-projection` or something like that?
While it's not exactly more accurate than `tapping`, the term `touching` feels more comfortable for whatever reason. At the very least `tapping` sounds like an instantaneous event, while `touching` implicitly includes drag or long-press operations.
---
`screen-projection` seems to be the clearest, but `touching` feels more in line with the other enum values and is my preference by a narrow margin. Thoughts from other vendors welcome!
explainer.md
````diff
 }
 ```

 ### Complex input
````
---
"Complex input" is maybe not the phrase I would use here. Maybe "Direct manipulation" or "Grabbing and dragging"?
I've left a few comments, but in general I feel like this is a good clarification of the input system. Thanks for putting the time into it!
Updated section title and clarified gripMatrix wording
I'm happy with this now, and willing to merge it. (Still wish I knew a better term than 'tapping', but I don't.) I'd like to get input from at least one other vendor prior to doing so, however. Anyone else want to chime in? (If we don't hear back prior to the next call I'll bring it up then and try to get consensus.)
The code in the samples directory will also need to be modified to handle the new values. Considering that the Ray updates are potentially on the way, does it make sense to update the samples in one go as a separate PR?
Yes, I think the samples can be updated in a more bundled fashion. We'll also want to add some simple, temporary shims to ensure that the new names map correctly to existing implementations (like Chrome) which will take a couple of browser releases to cycle to the new verbiage.
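A hedged sketch of what such a temporary shim could look like; the old-to-new value mapping here is my reading of this thread, not a tested polyfill.

```js
// Hypothetical shim: expose the new targetRayMode name on implementations
// that still use pointerOrigin. The value mapping is assumed from this PR
// ('head' -> 'gazing', 'hand' -> 'pointing', 'screen' -> 'tapping').
const MODE_MAP = { head: 'gazing', hand: 'pointing', screen: 'tapping' };

function shimInputSource(inputSource) {
  if (!('targetRayMode' in inputSource) && 'pointerOrigin' in inputSource) {
    Object.defineProperty(inputSource, 'targetRayMode', {
      get() { return MODE_MAP[inputSource.pointerOrigin]; },
    });
  }
  return inputSource;
}
```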
The enums and some of the attributes involved in the WebXR input system were changed recently[1].

[1] immersive-web/webxr#342

Bug: 854382
Change-Id: I56fe5909d7015461cb7314d23a12b194f148d483
Reviewed-on: https://chromium-review.googlesource.com/1112881
Commit-Queue: Byoungkwon Ko <codeimpl@gmail.com>
Reviewed-by: Brandon Jones <bajones@chromium.org>
Reviewed-by: Kinuko Yasuda <kinuko@chromium.org>
Cr-Commit-Position: refs/heads/master@{#572756}
The main purpose of this PR is to clarify what input sources are returned from a call to XRSession.getInputSources(). These changes make it easier to expose tracked hands on AR devices (eg, the HoloLens).

I've renamed the `pointerOrigin` enum and members to `targetRayMode: { 'gazing', 'pointing', 'tapping' }` to better reflect the input intent, rather than the physical representation. This was primarily to reduce confusion: since a tracked hand actually uses the head gaze as a ray, it should be a `head`, not a `hand`, in the old nomenclature.

The implication of this is that while it is up to the user agent to determine whether it is appropriate to merge multiple 0DOF input sources (clicker, gamepad, etc.) into a single `gazing` input source, or expose them as multiple (eg, tracked hands that only have a position but not an orientation, and thus are still considered `gazing`), I've removed the user agent's responsibility for actually choosing which `XRInputSource`s are 'active'. All available input sources are returned.

I've also adopted the as-yet unfinalized raycast origin/target, which is under discussion in #339 and immersive-web/hit-test#14. This will likely need updating when those discussions are resolved.

Lastly, the controller rendering samples here still depend on the outcome of open issues: #336 and #330