Use visual viewport instead of layout viewport for root intersection rectangle? #95

Open
ChumpChief opened this issue Feb 8, 2016 · 21 comments

Comments

@ChumpChief
Member

The definition of the root intersection rectangle specifies that for intersection roots which are document viewports, it is specifically the layout viewport.

I think this should instead be the visual viewport to better align with the stated goal in the spec of "calculating element visibility".

There is a corresponding note that this choice was intentional, implying there was intent of keeping the root intersection rectangle closely related to layout dimensions (specifying that pinch zoom changes neither). If that is still true and this change is rejected, then the relationship between IntersectionObserver and layout dimensions should be better explained. The goals outlined in the spec currently sound more relevant to "what the user sees" than "how the page is laid out".

I would also be concerned that leaving this as the layout viewport would leave the "exotic plugin-based solutions" for ad impression tracking as more-effective than IntersectionObserver and discourage uptake.

Alternatively, if there are scenarios which require intersection against the layout viewport, then an option to specify layout vs. visual viewport as the root intersection rectangle could be interesting.
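To make the distinction concrete, here is a minimal sketch in TypeScript. It models both viewports as plain rectangles in layout-viewport coordinates. The `Rect` type, the `intersect` helper, and every number are illustrative assumptions rather than spec text. It shows a target that intersects the layout viewport while lying entirely outside a pinch-zoomed visual viewport.

```typescript
// Illustrative model only: rectangles in CSS pixels, measured from the
// top-left corner of the layout viewport.
interface Rect { x: number; y: number; width: number; height: number; }

// Returns the overlapping area of two rectangles, or null when they do not overlap.
function intersect(a: Rect, b: Rect): Rect | null {
  const x = Math.max(a.x, b.x);
  const y = Math.max(a.y, b.y);
  const right = Math.min(a.x + a.width, b.x + b.width);
  const bottom = Math.min(a.y + a.height, b.y + b.height);
  return right > x && bottom > y ? { x, y, width: right - x, height: bottom - y } : null;
}

// Hypothetical numbers: a 400x800 layout viewport, pinch-zoomed 2x onto its top-left quarter.
const layoutRoot: Rect = { x: 0, y: 0, width: 400, height: 800 };
const visualRoot: Rect = { x: 0, y: 0, width: 200, height: 400 };

// A hypothetical ad slot near the bottom of the layout viewport.
const adSlot: Rect = { x: 50, y: 600, width: 300, height: 100 };

const inLayout = intersect(adSlot, layoutRoot) !== null; // true
const inVisual = intersect(adSlot, visualRoot) !== null; // false
console.log({ inLayout, inVisual });
```

Under these assumed numbers, a layout-viewport root would report the ad slot as intersecting even though the user cannot see it.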

@szager-chromium
Collaborator

You are correct that the intention is visual viewport; marking this "needs spec".

@ChumpChief
Member Author

Sounds good, thanks. As a followup, it should probably be made explicit what the coordinate space is for rootMargin then.

I think it should be measured in the visual viewport's coordinate space, e.g. such that 100vh represents content that is one page away, or approximately one "page down" press away. As opposed to the layout viewport's coordinate space, where 100vh in that coordinate space may be multiple "page down" presses away depending on the user's pinch zoom level.
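The arithmetic behind this can be sketched as follows. The viewport height and zoom level are hypothetical, and "one viewport height" stands in for the margin described above.

```typescript
// Hypothetical numbers for illustration.
const layoutHeight = 800; // layout viewport height in CSS pixels
const pinchScale = 2;     // the user has pinch-zoomed to 2x

// When zoomed in, the visual viewport covers fewer CSS pixels than the layout viewport.
const visualHeight = layoutHeight / pinchScale; // 400

// A margin of "one viewport height", resolved in each coordinate space.
const marginInVisualSpace = visualHeight; // 400 CSS px
const marginInLayoutSpace = layoutHeight; // 800 CSS px

// How many visible screens (roughly, "page down" presses) each margin spans.
const pagesAheadVisual = marginInVisualSpace / visualHeight; // 1
const pagesAheadLayout = marginInLayoutSpace / visualHeight; // 2
console.log({ pagesAheadVisual, pagesAheadLayout });
```

At 2x zoom, the same nominal margin reaches two visible screens ahead when resolved against the layout viewport, and the gap grows with the zoom level.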

@szager-chromium
Collaborator

Actually, I'm sorry, my previous reply was incorrect. After clarifying this with some of the people involved:

At least for the first release, IntersectionObserver will track intersections with the layout viewport, not the visual viewport. The primary reason for this is that by tracking the layout viewport, we are not revealing any more information to cross-origin iframes than they already have. IntersectionObserver will provide the same information that is currently available, but it will do it more accurately and more efficiently (in terms of performance and battery consumption) by eliminating the need for pollers or scroll event handlers.
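For readers unfamiliar with the pattern being replaced, here is a minimal sketch of observer-based visibility tracking. The `EntryLike`, `ObserverLike`, and `ObserverCtor` types and the `watchVisibility` helper are hypothetical names introduced for illustration. The constructor is passed in as a parameter only so the sketch compiles and runs without DOM typings. In a browser one would presumably pass the built-in `IntersectionObserver`.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
interface EntryLike { isIntersecting: boolean; target: unknown; }
interface ObserverLike { observe(target: unknown): void; disconnect(): void; }
type ObserverCtor = new (
  callback: (entries: EntryLike[]) => void,
  options?: { root?: null; rootMargin?: string; threshold?: number | number[] },
) => ObserverLike;

// Reports visibility changes for `target` without polling or scroll handlers.
// `Ctor` is expected to behave like the browser's IntersectionObserver.
function watchVisibility(
  Ctor: ObserverCtor,
  target: unknown,
  onChange: (visible: boolean) => void,
): ObserverLike {
  const observer = new Ctor(
    (entries) => {
      for (const entry of entries) {
        if (entry.target === target) onChange(entry.isIntersecting);
      }
    },
    { root: null, threshold: 0 }, // a null root means the document viewport
  );
  observer.observe(target);
  return observer;
}
```

The callback only runs when the intersection state changes, so no work happens on every scroll event.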

Tracking intersections with the visual viewport, while unquestionably useful, is also likely to be more controversial, so we're punting on that for now, pending more discussion with browser vendors.

@ojanvafai
Collaborator

@ChumpChief that sound OK to you for now?

@KenjiBaheux KenjiBaheux added this to the Spec V2 milestone Mar 4, 2016
@ChumpChief
Member Author

> I would also be concerned that leaving this as the layout viewport would leave the "exotic plugin-based solutions" for ad impression tracking as more-effective than IntersectionObserver and discourage uptake.

@ojanvafai The impression I had from our previous in-person conversations is that these plugin-based solutions were already able to detect whether the ad (even cross-origin) was truly within the visual viewport based on paint timing. If that's correct, and IntersectionObserver is less-capable in that regard, is that not an argument against moving to IntersectionObserver from current implementations?

I think (please let me know if I'm wrong) @szager-chromium's comment was in the spirit of aligning with this behavior Chrome was experimenting with. However, that change has since been reverted, so the desire to hide visual viewport intersection seems less applicable. Does the Chrome team still feel it's interesting to hide the visual viewport from IntersectionObserver despite it being revealed by other functionality?

> Tracking intersections with the visual viewport, while unquestionably useful, is also likely to be more controversial, so we're punting on that for now, pending more discussion with browser vendors.

As far as I can tell, the intersection with the visual viewport is the only one that's actually interesting for the targeted scenarios for this feature - the layout viewport is never the useful one here. Were there any other objections to using visual viewport that I'm not aware of?

@mpb
Collaborator

mpb commented Mar 14, 2016

As far as I know, the visual viewport vs layout viewport distinction is mostly relevant for non-mobile-optimized pages on mobile devices where "pinch zoom" is used. In this situation, none of the "exotic plugin-based solutions" should be viable, since Flash would not be available in this scenario. So that would not be a reason for people to halt adoption of intersection observer for viewability use-cases.
Unless anyone can point out a situation where the visual/layout distinction is relevant and there are plugin solutions available.

@jacobrossi
Member

@mpb many mobile optimized pages still allow pinch zoom for various use cases and some browsers allow overriding a page that disables pinch-zoom for accessibility purposes.

@szager-chromium
Collaborator

@jacobrossi But plugin-based solutions are not available on mobile, so IntersectionObserver is at the very least no less useful than anything else available, right?

@szager-chromium
Collaborator

@ChumpChief the situation with pinch zoom is not so clear. This comment explains that reverting the behavior of window coordinates is temporary, until something more permanent is figured out. I think at the very least IntersectionObserver should not leapfrog that effort.

@ChumpChief
Member Author

> @ChumpChief the situation with pinch zoom is not so clear. This comment explains that reverting the behavior of window coordinates is temporary, until something more permanent is figured out. I think at the very least IntersectionObserver should not leapfrog that effort.

The debate over those APIs is one of web compat. The same arguments don't apply here, as it is a new API. If those properties still reflected the layout viewport I could understand (though not agree with) an argument for consistency, but at least with the change reverted that doesn't apply.

Regardless of where that discussion lands there remains the argument I initially put forth -- visual viewport accurately satisfies the explicitly stated scenarios listed in the spec ("calculating element visibility"); layout viewport does not. I think we can agree on that?

@szager-chromium
Collaborator

The main thrust of the pinch zoom issue is that pinch zoom should be essentially invisible; it should not trigger layout or events. Using the visual viewport here would run contrary to that.

One of the main drivers of this API is to reduce jank caused by unnecessary scroll event handlers. I would be very cautious about introducing a new jank opportunity in the form of intersection observer callbacks responding to changes in pinch zoom.

I don't think this is a settled issue, but I think it's prudent to use layout viewport in the first iteration of this spec, and revisit this issue for v2 (which we already know we have to do). By that time, hopefully the pinch zoom issue will be in a more settled state.

@ChumpChief
Member Author

> The main thrust of the pinch zoom issue is that pinch zoom should be essentially invisible; it should not trigger layout or events. Using the visual viewport here would run contrary to that.

This is not true. In fact there is strong feedback from web developers that the visual viewport should be detectable and easy to program against (hence the motivation for the Chrome revert and new proposal expanding developers' ability to detect and control the visual viewport).

Visual viewport matches the goals of this spec ("calculating element visibility"), layout viewport does not. I've not yet heard a compelling argument why layout viewport should be used. I'd like for this spec to specify visual viewport in v1, unless/until an argument can be made for layout viewport.

@bokand

bokand commented Apr 14, 2016

I agree it's critical to have the option to use the visual viewport for the ads type scenarios to adopt this. But I'd also lean towards making layout be the default. My thinking on the visual viewport is that, while not "essentially invisible", it's transient. It grows and shrinks regularly as part of browser UI (OSKs, zoom-in/out, etc) and users generally don't expect anything to change on the page as a result. There are some cases for the page to be able to see this but they're usually special cases. Additionally, I expect authors will be surprised by things that can cause the visual viewport to change.

To make this more concrete, you can imagine loading/unloading parts of a form using the intersection observer. As the user scrolls down, more boxes appear. If the user now tries to type in one of those boxes, a visual IntersectionObserver will make it look like the user scrolled up and the focused box would unload. These kinds of scenarios probably make more sense using the layout viewport. If visual is the default I'd expect this kind of bug to be not uncommon.
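A rough sketch of the geometry in this scenario, assuming (as the comment does) that the on-screen keyboard shrinks only the visual viewport. All names and numbers are hypothetical, and the sketch ignores any scrolling the browser performs to keep a focused field in view.

```typescript
// Vertical extent in CSS pixels, in layout-viewport coordinates.
interface Box { top: number; bottom: number; }

const overlaps = (a: Box, b: Box): boolean => a.top < b.bottom && b.top < a.bottom;

// Hypothetical numbers.
const layoutBox: Box = { top: 0, bottom: 800 };
const visualBeforeKeyboard: Box = { top: 0, bottom: 800 };
const visualAfterKeyboard: Box = { top: 0, bottom: 500 }; // keyboard covers the lower 300px

// A form section that was loaded while the user scrolled down.
const formSection: Box = { top: 600, bottom: 700 };

// A loader that unloads whatever stops intersecting its root.
const shouldStayLoaded = (root: Box): boolean => overlaps(formSection, root);

console.log(shouldStayLoaded(visualBeforeKeyboard)); // true
console.log(shouldStayLoaded(visualAfterKeyboard));  // false: unloaded although the user never scrolled
console.log(shouldStayLoaded(layoutBox));            // true: the layout viewport did not change
```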

Anyway, I don't feel too strongly about this and I'm only mildly familiar with IntersectionObserver so that's just my $0.02

@RByers
Contributor

RByers commented Apr 14, 2016

@ChumpChief how do you feel about the idea that it's layout by default with an option to choose visual? I think this is consistent with the direction we want other APIs to head. I.e. developers often don't consider the implications of pinch-zoom so better for all the APIs to deal in terms of the layout viewport by default (so pinch "just works") with opt-in to visual for those that understand the distinction.

@ChumpChief
Member Author

Re: providing a developer option: I'm not opposed to doing so as I mentioned above, though I'm not sure the layout viewport option will be particularly useful. Either way, the default behavior still needs discussion.

> My thinking on the visual viewport is that, while not "essentially invisible", it's transient. It grows and shrinks regularly as part of browser UI (OSKs, zoom-in/out, etc) and users generally don't expect anything to change on the page as a result. There are some cases for the page to be able to see this but they're usually special cases.

I think this is the core item we need to close on -- "For the scenarios of IntersectionObserver, which is the 'mainstream' case vs. 'special' case between the visual viewport and layout viewport?"

@bokand based on your comment I think we agree that ad scenarios want visual viewport. The other two scenarios called out in the spec are virtualized scrollers (growing/trimming content for performance) and defer-loading content. For both of these I think the English description of the algorithm sounds like "When the content is nearly in view, do X". For an example where the layout/visual distinction matters, perhaps an extremely complex SVG diagram viewer wants to trim and add content as the user pans and zooms around the diagram. Or perhaps it wants to defer loading imported raster assets embedded in that diagram until they are panned into view. Because "in view" is still the interesting qualifier, I still think the visual viewport is the correct choice for these two scenarios.

Re: the transience of the viewport: IntersectionObserver wouldn't exist if it weren't for the transience of the viewport and difficulty in detecting the changes. I don't think it's something to be afraid of in this case, rather the entire point of the API is to expose this transience and make it easy to detect and respond to.

> developers often don't consider the implications of pinch-zoom so better for all the APIs to deal in terms of the layout viewport by default (so pinch "just works") with opt-in to visual for those that understand the distinction.

Generally speaking I would consider usage of a brand new API to be opt-in already. But if the concern is more about new users of the API remaining ignorant of the viewport distinction (and therefore ignorant of exactly what they're opting into), then I would say they are only harmed if their scenario would have been better-served by the layout viewport. However, as I explained above, I feel that none of the scenarios that are currently called out meet this criterion.

In fact, I think it's more likely that developers who accidentally get the layout viewport would be harmed. As a counter-example, consider many feed-based sites (e.g. Twitter) which play/pause videos as they move through the center of the viewport. This functionality would be an excellent candidate to be rebuilt using IntersectionObserver. If the layout viewport were used then the user would be surprised that an out-of-view video was playing after they zoomed in on the video near the top or bottom of the layout viewport, while the video they can actually see remains paused.
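The video example can be sketched with made-up numbers. `clipToPlay` is a hypothetical helper that picks the clip closest to the center of whichever viewport it is given. It is not how any particular site implements autoplay.

```typescript
interface Span { top: number; bottom: number; } // vertical extent, CSS px
interface Clip extends Span { id: string; }

const center = (s: Span): number => (s.top + s.bottom) / 2;

// Picks the clip whose center is closest to the viewport's center.
// Assumes `clips` is non-empty.
function clipToPlay(clips: Clip[], viewport: Span): string {
  const target = center(viewport);
  return clips.reduce((best, c) =>
    Math.abs(center(c) - target) < Math.abs(center(best) - target) ? c : best
  ).id;
}

// Hypothetical feed: clip A near the top, clip B further down.
const feed: Clip[] = [
  { id: "A", top: 100, bottom: 300 },
  { id: "B", top: 450, bottom: 650 },
];
const layoutSpan: Span = { top: 0, bottom: 800 };
const zoomedSpan: Span = { top: 0, bottom: 400 }; // user pinch-zoomed 2x onto clip A

console.log(clipToPlay(feed, layoutSpan)); // "B": plays although it lies outside the zoomed view
console.log(clipToPlay(feed, zoomedSpan)); // "A": the clip the user is actually looking at
```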

An argument to make the layout viewport the default would be made stronger with a supporting scenario to back it up. One where providing the layout viewport would be a better fit for the developer's intent than the visual viewport would. So far all the scenarios seem to point to the visual viewport though, making it the more 'mainstream' case and therefore a better choice for the default.

Something along the lines of what you said is what I would be looking for:

> To make this more concrete, you can imagine loading/unloading parts of a form using the intersection observer. As the user scrolls down, more boxes appear. If the user now tries to type in one of those boxes, a visual IntersectionObserver will make it look like the user scrolled up and the focused box would unload.

But for this one, why would the box be out of view? To my knowledge all implementations ensure the focused box remains in-view, even after an OSK has been invoked. So I'm not sure this one is particularly real.

@bokand

bokand commented Apr 21, 2016

The way I think about the visual viewport is that it's more in the browser's domain while the layout viewport is the app's. The reason the visual viewport came about was because developers previously programmed against the one viewport and so once browser UI came about that needed to change that viewport (pinch, OSK) it resulted in a poor experience as the pages reacted to it in all sorts of ways. I worry that aggressively shepherding developers into programming against the visual viewport will put us in the same situation the next time browser UI is added or changed (we'll need a visualVisualViewport?).

That said, I agree with most of your points. In my experience, most web devs' mental model of the viewport comes from desktop, where there really isn't a visual/layout distinction, and that's generally where they develop (I may also have a bias here due to the bugs I see :P). The layout viewport matches that model more closely but perhaps this matters less for new APIs with no precedent. Also, IntersectionObserver seems to be particularly suited to the visual. At the least, I don't see my concerns as affecting the success of IntersectionObserver so I'm fine with making the null root case imply the visual viewport.

In which case, I don't feel we need to add the layout viewport option. I realize this may seem a strange position to take after advocating for making layout the default. I think if visual is the default, few if any people will bother choosing layout and it'll just add needless complexity. Visual is the easier model to understand and gives the author more detail. My reason for preferring default layout is to force developers to consider the difference or get the less detailed picture where pinch-zoom and UI are effectively invisible to the page (i.e. behave as if the user can't pinch-zoom). By analogy, start off with a tricycle but let them choose a bicycle. I don't think anyone would go the other way.

> But for this one, why would the box be out of view? To my knowledge all implementations ensure the focused box remains in-view, even after an OSK has been invoked. So I'm not sure this one is particularly real.

@RByers pointed this out to me shortly after I posted it. This is true but it does make the implementation a little trickier (at least in Chrome) since the viewport needs to be resized before the focused box is scrolled into view (since the box might be at the bottom of the page beyond the current max scroll, or the page might not be scrolling at all). The scroll is also animated so you get into some subtle timing issues.

@ojanvafai
Collaborator

I think it's critical that we have the option at least. I still think layout viewport is the better default though. I agree that layout viewport doesn't make sense for the ads use cases, but what I like about it is that it means that it's less likely for the non-ads use cases to do work during pinch-zoom or when the keyboard pops up, which is a good thing IMO. Those are times when you want to be especially careful to minimize the amount of extraneous work, so it's good to default into the fast behavior.

I think people who care about viewability metrics will all learn to use the visual viewport if they care about the level of accuracy. But I don't think people notice performance problems as easily as correctness ones.

@ChumpChief sorry it took me so long to respond on what I think the use case is. Basically, I think people will use this for infinite lists, and lazy load/unload of widgets and I think it'd be better if they didn't unload/reload all the time when you pinch zoom. I can see your counter argument that this might be what the developer wants in some cases (and certainly in the viewability metric case).

@bokand

bokand commented Apr 19, 2017

Sorry to revive this but rereading this after coming back to viewport issues I think I've articulated a clearer argument (in my head, at least).

> For an example where the layout/visual distinction matters, perhaps an extremely complex SVG diagram viewer wants to trim and add content as the user pans and zooms around the diagram. Or perhaps it wants to defer loading imported raster assets embedded in that diagram until they are panned into view. Because "in view" is still the interesting qualifier, I still think the visual viewport is the correct choice for these two scenarios.

I think the kinds of bugs you're likely to see with visual-as-default will come from developers relying on bad assumptions. So in this case (or the infinite scroller, resource loading, etc.), a common case might be: "I've made my layout responsive so there will never be any horizontal scrolling. Since there's no horizontal scrolling, if we loaded an item from row X it must mean we've loaded all items in row X, so mark row X as loaded." Now if the user zooms in to one side and scrolls down to row X, then zooms out, we won't load the rest of the items in row X.
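The bug being described might look something like this sketch. The grid, the `handleVisible` function, and the row-level bookkeeping are all hypothetical. They exist only to show how the "one item means the whole row" shortcut goes wrong once the visible width can shrink.

```typescript
interface Cell { id: string; row: number; left: number; right: number; }

// Row 7 has four cells laid out across a 400px-wide responsive layout (made-up numbers).
const cells: Cell[] = [
  { id: "a", row: 7, left: 0, right: 100 },
  { id: "b", row: 7, left: 100, right: 200 },
  { id: "c", row: 7, left: 200, right: 300 },
  { id: "d", row: 7, left: 300, right: 400 },
];

const loadedCells: string[] = [];
const rowsMarkedLoaded: number[] = [];

// Simulates one round of notifications for the horizontal range [viewLeft, viewRight).
function handleVisible(viewLeft: number, viewRight: number): void {
  const visible = cells.filter(
    (c) => c.left < viewRight && c.right > viewLeft && rowsMarkedLoaded.indexOf(c.row) === -1,
  );
  for (const c of visible) loadedCells.push(c.id);
  // The bad assumption: "no horizontal scrolling, so one visible cell means the whole row was visible".
  for (const c of visible) {
    if (rowsMarkedLoaded.indexOf(c.row) === -1) rowsMarkedLoaded.push(c.row);
  }
}

handleVisible(0, 200); // zoomed in 2x on the left half: loads a and b, marks row 7 as loaded
handleVisible(0, 400); // zoomed back out: c and d are visible now but get skipped
console.log(loadedCells.join(",")); // "a,b"
```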

I think your argument comes down to making the default what's convenient for the developer. I'd argue we should default to what provides a good user experience in the absence of care. I think this lines up the incentives so that a lack of attention by the developer punishes the developer, not the user.

@ChumpChief
Member Author

> So in this case (or the infinite scroller, resource loading, etc.), a common case might be: "I've made my layout responsive so there will never be any horizontal scrolling. Since there's no horizontal scrolling, if we loaded an item from row X it must mean we've loaded all items in row X, so mark row X as loaded." Now if the user zooms in to one side and scrolls down to row X, then zooms out, we won't load the rest of the items in row X.

This scenario doesn't make sense to me - the developer is observing individual items with IntersectionObserver to lazy-load them, but unobserves a collection of them when any individual one in that collection is intersected? They would have to go out of their way to set up a mapping of the individual elements to the rows, then iterate through each item in the row and unobserve it. Rather than the more natural approach of just unobserving the change.target. I don't think this is a realistic construction...
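The "more natural approach" could be sketched as below. The handler is shaped like an IntersectionObserver callback (entries plus the observer), but `LazyTarget`, `LazyEntry`, `Unobserver`, and `onLazyEntries` are hypothetical stand-ins so the sketch runs without a browser. The `load` parameter is an extra added for illustration.

```typescript
interface LazyTarget { id: string; }
interface LazyEntry { isIntersecting: boolean; target: LazyTarget; }
interface Unobserver { unobserve(target: LazyTarget): void; }

// Loads each item when it intersects, then stops watching only that item.
function onLazyEntries(
  entries: LazyEntry[],
  observer: Unobserver,
  load: (id: string) => void,
): void {
  for (const entry of entries) {
    if (!entry.isIntersecting) continue;
    load(entry.target.id);
    observer.unobserve(entry.target);
  }
}

// Simulated run: two items intersect while zoomed in, two more after zooming out.
const fetched: string[] = [];
const released: string[] = [];
const fakeObserver: Unobserver = { unobserve: (t) => { released.push(t.id); } };
const hit = (id: string): LazyEntry => ({ isIntersecting: true, target: { id } });

onLazyEntries([hit("a"), hit("b")], fakeObserver, (id) => { fetched.push(id); });
onLazyEntries([hit("c"), hit("d")], fakeObserver, (id) => { fetched.push(id); });
console.log(fetched.join(",")); // "a,b,c,d"
```

Because each item is tracked on its own, items that come into view later still load, regardless of which viewport is the root.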

> Basically, I think people will use this for infinite lists, and lazy load/unload of widgets and I think it'd be better if they didn't unload/reload all the time when you pinch zoom. I can see your counter argument that this might be what the developer wants in some cases (and certainly in the viewability metric case).

Similarly I think this seems unlikely -- I've never seen a lazy-loader that tried to un-lazy-load content once loaded as it runs counter to the performance goals of using a lazy-loader in the first place. On the other hand, virtualized scrollers may unload content but even then it's more appropriate to base on the visual viewport since the logic wants to load/unload content based on proximity to the currently viewed content (e.g. load content that is likely reachable by a single "flick").

The number one scenario this spec was designed for is ad visibility, which directly correlates to the visual viewport. And lazy loading of images, and autoplay of video as I mentioned in my comment above also correlate to the visual viewport. "Can the user see this element?" is a fundamentally interesting question for a developer to be able to answer. "Does this element intersect with the rectangle that fixed elements are positioned against?" is not frequently asked when building real scenarios.

Rick mentioned on a separate issue that Google is hoping to relegate pinch-zoom to be an accessibility tool rather than a concept web developers program against. I think it's a fine principle for the default position on viewport-related APIs to be that they are based on the layout viewport until proof is shown that it's not the right choice, but I think there are clear exceptions such as IntersectionObserver where the use cases obviously want the visual viewport.

@szager-chromium
Collaborator

Straw poll at TPAC 2018 WebPlat meeting:

  • 2 votes for making visual vs. layout viewport configurable
  • 4 votes for always using visual viewport

@miketaylr
Member

Closing #266 as a dupe of this issue, but @zcorpan points to a relevant discussion for <img loading=lazy> in #266 (comment).
