feat(ui): implement fullscreen and pip display modes for MCP Apps #7312
- Advertise ['inline', 'fullscreen', 'pip'] as available display modes
- Add internal activeDisplayMode state with host-side toggle controls
- Fullscreen: fixed overlay (z-1000) with Escape key to exit
- PiP: draggable floating panel (z-900) with placeholder in chat
- Intercept app-initiated ui/request-display-mode via postMessage, scoped to each instance's iframes to avoid cross-instance conflicts
- Update hostContext reactively so apps receive host-context-changed
- Add OnDisplayModeChange callback type for parent notification
Use the View Transitions API (document.startViewTransition) to animate between inline, fullscreen, and pip modes. Each instance gets a stable view-transition-name derived from its resourceUri so the browser can track the element across DOM position changes.
- Use flushSync inside startViewTransition so React commits DOM synchronously, giving the View Transitions API correct before/after snapshots for pip transitions
- Add no-drag + z-[60] to fullscreen controls so they're clickable above Electron's 32px title bar drag region
- Add cursor-pointer to all display mode control buttons
Add invisible placeholder in the chat flow when an app enters fullscreen, maintaining the same height as the inline container. This prevents the chat from reflowing and losing scroll position when the app overlay appears/disappears.
The SDK sets iframe height based on size-changed notifications, which works for inline mode but causes overflow clipping in fixed-size containers. Force iframe to 100% height in fullscreen and pip so the app content fills the container and handles its own scrolling.
Change PiP container from overflow-hidden to overflow-y-auto so the iframe content (which the SDK sizes to full content height) can be scrolled within the fixed-size PiP panel.
…divs The PiP container had overflow-y-auto but inner divs were all h-full, constraining them to the container height. The iframe (sized by the SDK to full content height) never overflowed. Remove h-full from the containerRef and appContent divs in PiP mode so the iframe's natural height flows through and triggers the scrollbar on the outer container.
Replace conditional render branches (if fullscreen... else if pip... else inline) with a single stable container that uses CSS classes to switch between positioning modes. The AppRenderer and its iframe are never unmounted, so app state (event logs, form inputs, etc.) is preserved when switching between inline, fullscreen, and pip.
Drop h-full from the inner content div in PiP mode so the iframe's natural height flows through and triggers overflow-y-auto on the fixed-size PiP container.
Add CSS rules for ::view-transition pseudo-elements to animate MCP app containers between display modes with a 300ms cubic-bezier easing. Disable the default full-page crossfade so only named elements animate. Includes prefers-reduced-motion support.
…de animation View Transitions API can't animate between static and fixed positioning. Replace with CSS transitions on the .mcp-app-container class that animate top/left/right/bottom/width/height/border-radius/box-shadow/opacity with a 300ms cubic-bezier easing. Remove unused flushSync import and viewTransitionName. Includes prefers-reduced-motion support.
Replace CSS transitions with FLIP (First, Last, Invert, Play) animation using getBoundingClientRect + element.animate(). Captures the element's position/size before the state change, lets React re-render, then animates from the old rect to the new rect with translate+scale. Includes prefers-reduced-motion support. Removes unused flushSync and viewTransitionName.
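As an illustration of the "Invert" step described in this commit, here is a minimal sketch of the math only. The names `BoxRect`, `computeFlipInvert`, and `flipKeyframes` are hypothetical and not taken from the PR, and the sketch assumes `transform-origin: top left`; the actual implementation may differ.

```typescript
// Minimal rect shape: the fields of getBoundingClientRect() that FLIP needs.
interface BoxRect {
  left: number;
  top: number;
  width: number;
  height: number;
}

interface FlipInvert {
  translateX: number;
  translateY: number;
  scaleX: number;
  scaleY: number;
}

// "Invert": the transform that makes an element already laid out at `last`
// appear to still sit at `first`. Assumes transform-origin: top left.
function computeFlipInvert(first: BoxRect, last: BoxRect): FlipInvert {
  return {
    translateX: first.left - last.left,
    translateY: first.top - last.top,
    // Guard against zero-size targets to avoid dividing by zero.
    scaleX: last.width === 0 ? 1 : first.width / last.width,
    scaleY: last.height === 0 ? 1 : first.height / last.height,
  };
}

// "Play": keyframes in the shape accepted by element.animate(), going from
// the inverted transform back to the identity transform.
function flipKeyframes(
  first: BoxRect,
  last: BoxRect,
): { transformOrigin: string; transform: string }[] {
  const d = computeFlipInvert(first, last);
  return [
    {
      transformOrigin: "top left",
      transform: `translate(${d.translateX}px, ${d.translateY}px) scale(${d.scaleX}, ${d.scaleY})`,
    },
    { transformOrigin: "top left", transform: "translate(0px, 0px) scale(1, 1)" },
  ];
}
```

In a component, `first` would be measured before the state change and `last` after React re-renders; when `prefers-reduced-motion` is set, the animation would simply be skipped.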
- Save inline height from DOM before leaving inline mode via inlineHeightRef
- Use saved height for fullscreen/pip placeholders so they match original size
- Read height from getBoundingClientRect instead of iframeHeight state to avoid stale values and dependency ordering issues
- Remove iframeHeight from changeDisplayMode deps
Intercept ui/initialize postMessage to extract the app's appCapabilities.availableDisplayModes. Only show fullscreen/pip host-side controls if the app declared support for those modes. Apps that don't declare any display modes get no controls (inline only).
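The capability check described here could look roughly like the sketch below. The message shape (`method` plus `params.appCapabilities.availableDisplayModes`) is an assumption made for illustration, and `readDeclaredModes` and `hostControlModes` are hypothetical names, not functions from the PR.

```typescript
type AppDisplayMode = "inline" | "fullscreen" | "pip";

// Modes the host itself supports.
const HOST_DISPLAY_MODES: readonly AppDisplayMode[] = ["inline", "fullscreen", "pip"];

function isAppDisplayMode(value: unknown): value is AppDisplayMode {
  return (
    typeof value === "string" &&
    (HOST_DISPLAY_MODES as readonly string[]).indexOf(value) !== -1
  );
}

// Extract the app's declared modes from a ui/initialize message.
// ASSUMPTION: the capabilities live under params.appCapabilities.
function readDeclaredModes(message: unknown): AppDisplayMode[] {
  if (typeof message !== "object" || message === null) return [];
  const msg = message as {
    method?: unknown;
    params?: { appCapabilities?: { availableDisplayModes?: unknown } };
  };
  if (msg.method !== "ui/initialize") return [];
  const declared = msg.params?.appCapabilities?.availableDisplayModes;
  // Unknown or malformed entries are dropped rather than trusted.
  return Array.isArray(declared) ? declared.filter(isAppDisplayMode) : [];
}

// Host-side buttons to render: only non-inline modes the app declared.
function hostControlModes(declared: AppDisplayMode[]): AppDisplayMode[] {
  return HOST_DISPLAY_MODES.filter(
    (m) => m !== "inline" && declared.indexOf(m) !== -1,
  );
}
```

An app that declares nothing yields an empty control list, which matches the "inline only" behavior described above.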
Change PiP mode controls from absolute to sticky positioning with float-right so they stay pinned at the top of the PiP panel as content scrolls.
Change drag handle from absolute to sticky positioning with mx-auto centering so it stays pinned at the top as PiP content scrolls.
Merge the drag handle and display mode controls into one sticky row at the top of the PiP panel, eliminating the stacked vertical spacing. Drag handle centered, controls right-aligned, all in a single h-6 bar.
Controls overlay with h-0 and opacity-0, only visible on hover via group-hover/pip. Drag handle always accepts pointer events for dragging. Zero vertical space taken when not hovering.
Match the drag handle to the same visual style as the fullscreen/close buttons: rounded-md, bg-black/50, p-1, backdrop-blur-sm. Placed inline in the same row with gap-1 for a cohesive toolbar feel.
Drag handle on the left, controls on the right with justify-between. Removed pt-1.5 padding.
Use p-1.5 for uniform top/right/bottom/left spacing on the toolbar so drag handle has equal space from top+left and controls have equal space from top+right.
…padding Remove padding from the toolbar container. Apply m-1 (4px) directly to the drag handle and controls wrapper so each element has equal margin from its nearest edge (top+left for drag, top+right for controls) with zero extra top padding on the container.
Use px-1 pt-1 on the container with no individual margins so both the drag handle and controls share the same top offset and are vertically aligned.
… div PiP controls were wrapped in a div with sticky/float/margin classes from the old approach, causing vertical misalignment with the drag handle. Now renders as a bare fragment so the parent toolbar flex row controls positioning for both elements equally.
* origin/main: (49 commits) add flag to hide select voice providers (#7406) New navigation settings layout options and styling (#6645) refactor: MCP-compliant theme tokens and CSS class rename (#7275) Redirect llama.cpp logs through tracing to avoid polluting CLI stdout/stderr (#7434) refactor: change open recipe in new window to pass recipe id (#7392) fix: handle truncated tool calls that break conversation alternation (#7424) streamline some github actions (#7430) Enable bedrock prompt cache (#6710) fix: use BEGIN IMMEDIATE to prevent SQLite deadlocks (#7429) Display working dir (#7419) dev: add cmake to hermitized env (#7399) refactor: remove allows_unlisted_models flag, always allow custom model entry (#7255) feat: expose context window utilization to agent via MOIM (#7418) Small model naming (#7394) chore(deps): bump ajv in /documentation (#7416) doc: groq models (#7404) Client settings (#7381) Fix settings tabs getting cut off in narrow windows (#7379) docs: voice dictation updates (#7396) [docs] Add Excalidraw MCP App Tutorial (#7401) ... # Conflicts: # ui/desktop/src/components/McpApps/McpAppRenderer.tsx
hey @zanesq, @DOsinga, @spencrmartin, this PR is ready for review now. If you'd like you can test drive it via
zanesq
left a comment
Tested locally, works great!
One thing I noticed is a bunch of new error logs in the server console. Do we need to log these?
09:46:08.217 › from renderer: [UNHANDLED REJECTION] Error: Not connected
Error: Not connected
at $h.notification (http://localhost:5173/node_modules/.vite/deps/@mcp-ui_client.js?v=6995c336:6986:13)
at $h.sendHostContextChange (http://localhost:5173/node_modules/.vite/deps/@mcp-ui_client.js?v=6995c336:15891:17)
at $h.setHostContext (http://localhost:5173/node_modules/.vite/deps/@mcp-ui_client.js?v=6995c336:15888:39)
at http://localhost:5173/node_modules/.vite/deps/@mcp-ui_client.js?v=6995c336:16197:17
at Object.react_stack_bottom_frame (http://localhost:5173/node_modules/.vite/deps/react-dom_client.js?v=6a1d43a9:18567:20)
at runWithFiberInDEV (http://localhost:5173/node_modules/.vite/deps/react-dom_client.js?v=6a1d43a9:997:72)
at commitHookEffectListMount (http://localhost:5173/node_modules/.vite/deps/react-dom_client.js?v=6a1d43a9:9411:163)
at commitHookPassiveMountEffects (http://localhost:5173/node_modules/.vite/deps/react-dom_client.js?v=6a1d43a9:9465:60)
at commitPassiveMountOnFiber (http://localhost:5173/node_modules/.vite/deps/react-dom_client.js?v=6a1d43a9:11040:29)
at recursivelyTraversePassiveMountEffects (http://localhost:5173/node_modules/.vite/deps/react-dom_client.js?v=6a1d43a9:11010:13)
09:46:11.276 › from renderer: [UNHANDLED REJECTION] Error: Not connected
Error: Not connected
at $h.notification (http://localhost:5173/node_modules/.vite/deps/@mcp-ui_client.js?v=6995c336:6986:13)
at $h.sendHostContextChange (http://localhost:5173/node_modules/.vite/deps/@mcp-ui_client.js?v=6995c336:15891:17)
at $h.setHostContext (http://localhost:5173/node_modules/.vite/deps/@mcp-ui_client.js?v=6995c336:15888:39)
at http://localhost:5173/node_modules/.vite/deps/@mcp-ui_client.js?v=6995c336:16197:17
at Object.react_stack_bottom_frame (http://localhost:5173/node_modules/.vite/deps/react-dom_client.js?v=6a1d43a9:18567:20)
at runWithFiberInDEV (http://localhost:5173/node_modules/.vite/deps/react-dom_client.js?v=6a1d43a9:997:72)
at commitHookEffectListMount (http://localhost:5173/node_modules/.vite/deps/react-dom_client.js?v=6a1d43a9:9411:163)
at commitHookPassiveMountEffects (http://localhost:5173/node_modules/.vite/deps/react-dom_client.js?v=6a1d43a9:9465:60)
at commitPassiveMountOnFiber (http://localhost:5173/node_modules/.vite/deps/react-dom_client.js?v=6a1d43a9:11040:29)
at recursivelyTraversePassiveMountEffects (http://localhost:5173/node_modules/.vite/deps/react-dom_client.js?v=6a1d43a9:11010:13)
09:46:18.217 › from renderer: [UNHANDLED REJECTION] Error: Timed out waiting for sandbox proxy iframe to be ready
Error: Timed out waiting for sandbox proxy iframe to be ready
at http://localhost:5173/node_modules/.vite/deps/@mcp-ui_client.js?v=6995c336:15939:30
09:46:21.276 › from renderer: [UNHANDLED REJECTION] Error: Timed out waiting for sandbox proxy iframe to be ready
Error: Timed out waiting for sandbox proxy iframe to be ready
at http://localhost:5173/node_modules/.vite/deps/@mcp-ui_client.js?v=6995c336:15939:30
Goose noticed a couple of issues that also make sense to me. We could follow up in a refactor, but the security concerns might be worth looking into now.
Concerns:
Component complexity growth: McpAppRenderer.tsx is now 1,114 lines. This single component handles resource fetching, iframe sandboxing, theme injection, tool call routing, sampling, error handling, size negotiation, and now three display mode state machines with drag handling, keyboard navigation, and animations. Consider extracting a useDisplayMode hook or a DisplayModeManager that encapsulates:
- activeDisplayMode / changeDisplayMode
- PiP drag state + handlers (pipPosition, handlePipPointerDown/Move/Up, handlePipKeyDown, clampPipPosition)
- Animation class management (enterAnimRef)
- The postMessage listener for ui/request-display-mode
This would keep the main component focused on rendering and MCP protocol concerns.
Hard-coded PiP dimensions — PIP_WIDTH = 400, PIP_HEIGHT = 300, PIP_MARGIN_RIGHT = 16, PIP_MARGIN_BOTTOM = 140 are top-level constants. The bottom margin of 140px seems tuned to avoid the chat input area, but this is fragile if the chat UI layout changes. Consider deriving it from the actual chat input height, or at minimum adding a comment explaining why 140px.
Multiple useEffect hooks for mode lifecycle — There are separate effects for: Escape key handling (fullscreen), PiP position reset, iframe cache, postMessage interception, and syncing displayMode prop → activeDisplayMode. Each is individually clean, but together they create a distributed state machine where the mode transition logic is spread across 5+ effects. A reducer-based approach or a single orchestrating effect could make the mode lifecycle more explicit.
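To make the reducer suggestion concrete, here is a minimal sketch of what a single mode-lifecycle reducer might look like. The state fields, action names, and `modeReducer` itself are hypothetical; they only illustrate the idea of keeping transitions in one place and are not the PR's code.

```typescript
type ModeName = "inline" | "fullscreen" | "pip";

interface ModeState {
  mode: ModeName;
  // null means "use the default PiP position".
  pipPosition: { x: number; y: number } | null;
  // Height captured when leaving inline mode, if any.
  savedInlineHeight: number | null;
}

type ModeAction =
  | { type: "request"; mode: ModeName; inlineHeight?: number }
  | { type: "escape" }
  | { type: "drag"; x: number; y: number };

function modeReducer(state: ModeState, action: ModeAction): ModeState {
  switch (action.type) {
    case "request": {
      if (action.mode === state.mode) return state;
      const leavingInline = state.mode === "inline";
      return {
        mode: action.mode,
        // Every mode change resets PiP to its default position.
        pipPosition: null,
        savedInlineHeight:
          leavingInline && action.inlineHeight !== undefined
            ? action.inlineHeight
            : state.savedInlineHeight,
      };
    }
    case "escape":
      // Escape only has an effect in fullscreen.
      return state.mode === "fullscreen" ? { ...state, mode: "inline" } : state;
    case "drag":
      // Drag updates are ignored outside PiP.
      return state.mode === "pip"
        ? { ...state, pipPosition: { x: action.x, y: action.y } }
        : state;
    default:
      return state;
  }
}
```

With this shape, the Escape key effect, the app-initiated request handler, and the drag handlers would each dispatch an action instead of coordinating through separate effects.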
Security
postMessage origin validation — The handler checks iframeWindowsRef.current.has(e.source as Window) to verify the message came from a known iframe. This is good — it prevents arbitrary windows from triggering mode changes. However, there's no e.origin check. If a malicious page is loaded in the sandboxed iframe and can navigate, it could still send ui/request-display-mode messages. The iframe sandbox restrictions (allow-scripts but presumably no allow-same-origin for untrusted apps) mitigate this, but an explicit origin allowlist would be defense-in-depth.
ui/request-display-mode validation — The requested mode is checked against AVAILABLE_DISPLAY_MODES (the host's full list), not against effectiveDisplayModes (the intersection with what the app declared). The code comment acknowledges this:
"effective modes aren't available yet during initialize, so fall back to the full host list"
This means an app that only declared ['inline'] could still request 'fullscreen' via postMessage and the host would honor it. This is a minor inconsistency — the UI controls would be hidden, but the programmatic path bypasses the capability check. Consider validating against effectiveDisplayModes when available, falling back to AVAILABLE_DISPLAY_MODES only during the initialize window.
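A rough sketch of the defense-in-depth check suggested above, combining the existing source check with an origin allowlist. `IncomingMessage`, `isTrustedAppMessage`, and the example origins are hypothetical. Note that an iframe sandboxed without allow-same-origin has an opaque origin, which a message event reports as the string "null", so an allowlist for such frames would have to include that value explicitly.

```typescript
// The two MessageEvent fields this check needs.
interface IncomingMessage {
  origin: string;
  source: object | null;
}

// Accept a message only if it comes from a known iframe window AND from an
// allowed origin. `knownSources` stands in for the set of iframe windows
// tracked per renderer instance (an array here, purely for illustration).
function isTrustedAppMessage(
  event: IncomingMessage,
  knownSources: readonly object[],
  allowedOrigins: readonly string[],
): boolean {
  if (event.source === null || knownSources.indexOf(event.source) === -1) {
    return false;
  }
  return allowedOrigins.indexOf(event.origin) !== -1;
}
```

For example, `isTrustedAppMessage(e, frames, ["null"])` would accept messages only from tracked frames that still have an opaque (sandboxed) origin.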
Tighten ui/request-display-mode to validate against effectiveDisplayModes (the intersection of host and app capabilities) once available after ui/initialize. Falls back to the full AVAILABLE_DISPLAY_MODES list only before initialize completes. Previously, an app declaring only ['inline'] could programmatically request 'fullscreen' because validation always used the full host list. Also moves the effectiveDisplayModes useMemo above the postMessage effect to avoid a temporal dead zone reference.
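The validation rule in this commit can be summarized with the following sketch. `AVAILABLE_DISPLAY_MODES` and `effectiveDisplayModes` are the names used in the review above; the function `isModeRequestAllowed` and the use of `null` to mean "initialize has not completed yet" are assumptions for illustration.

```typescript
type RequestedMode = "inline" | "fullscreen" | "pip";

// The host's full list of supported modes.
const AVAILABLE_DISPLAY_MODES: readonly RequestedMode[] = ["inline", "fullscreen", "pip"];

// `effectiveDisplayModes` is the host/app intersection, or null while
// ui/initialize has not completed (ASSUMPTION: null marks that window).
function isModeRequestAllowed(
  requested: string,
  effectiveDisplayModes: readonly RequestedMode[] | null,
): boolean {
  const allowed: readonly string[] =
    effectiveDisplayModes !== null ? effectiveDisplayModes : AVAILABLE_DISPLAY_MODES;
  return allowed.indexOf(requested) !== -1;
}
```

Under this rule an app that declared only ['inline'] can no longer escalate to 'fullscreen' once its capabilities are known, while requests arriving before initialize completes are still checked against the host list.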
Add comment clarifying that the 140px bottom margin keeps the PiP window above the chat input area (~120px) plus padding.
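For reference, a sketch of how the PiP constants mentioned in the review might feed a default position and a viewport clamp. The constant values come from the review comment; `defaultPipPosition`, the signature of `clampPipPosition`, and the choice of top-left coordinates are assumptions, not the PR's implementation.

```typescript
const PIP_WIDTH = 400;
const PIP_HEIGHT = 300;
const PIP_MARGIN_RIGHT = 16;
// Keeps the PiP window above the chat input area (~120px) plus padding.
const PIP_MARGIN_BOTTOM = 140;

interface PipPoint {
  x: number; // left edge of the panel, in viewport pixels (assumed)
  y: number; // top edge of the panel, in viewport pixels (assumed)
}

interface ViewportSize {
  width: number;
  height: number;
}

// Default spot: bottom-right corner, offset by the margins above.
function defaultPipPosition(vp: ViewportSize): PipPoint {
  return {
    x: vp.width - PIP_WIDTH - PIP_MARGIN_RIGHT,
    y: vp.height - PIP_HEIGHT - PIP_MARGIN_BOTTOM,
  };
}

// Keep the whole panel inside the viewport. If the viewport is smaller
// than the panel, pin it to the top-left corner.
function clampPipPosition(p: PipPoint, vp: ViewportSize): PipPoint {
  const maxX = Math.max(0, vp.width - PIP_WIDTH);
  const maxY = Math.max(0, vp.height - PIP_HEIGHT);
  return {
    x: Math.min(Math.max(p.x, 0), maxX),
    y: Math.min(Math.max(p.y, 0), maxY),
  };
}
```

Deriving `PIP_MARGIN_BOTTOM` from the measured chat input height, as the review suggests, would only change the constant's source; the clamp itself would stay the same.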
McpAppRenderer is ~1,100 lines. Display mode logic (state machine, PiP drag handlers, animations, postMessage listener, keyboard/escape effects) accounts for ~300 lines that could be extracted into a dedicated useDisplayMode hook.
Add onLostPointerCapture handler that clears pipDragRef. Without this, if pointer capture is lost unexpectedly (e.g. browser intervention, window focus change), the drag state stays non-null and the PiP window follows the pointer on subsequent moves.
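The failure mode described here is easier to see with the drag state modeled on its own. The sketch below is a simplified stand-in for `pipDragRef` and its handlers; the class `PipDragTracker` and its method names are hypothetical.

```typescript
interface DragOrigin {
  pointerId: number;
  // Offset between the pointer and the panel's top-left corner at grab time.
  grabDX: number;
  grabDY: number;
}

class PipDragTracker {
  private drag: DragOrigin | null = null;

  pointerDown(
    pointerId: number,
    pointerX: number,
    pointerY: number,
    panelX: number,
    panelY: number,
  ): void {
    this.drag = { pointerId, grabDX: pointerX - panelX, grabDY: pointerY - panelY };
  }

  // Returns the new panel position, or null when no drag is active.
  pointerMove(
    pointerId: number,
    pointerX: number,
    pointerY: number,
  ): { x: number; y: number } | null {
    if (this.drag === null || this.drag.pointerId !== pointerId) return null;
    return { x: pointerX - this.drag.grabDX, y: pointerY - this.drag.grabDY };
  }

  pointerUp(): void {
    this.drag = null;
  }

  // Same cleanup as pointerUp: without it, a capture lost mid-drag would
  // leave `drag` set and later moves would keep repositioning the panel.
  lostPointerCapture(): void {
    this.drag = null;
  }
}
```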
Without this, rapid mode changes could leave stale mcp-enter-* classes on the container if the CSS animation hadn't finished before the next mode change. The one-shot animationend listener removes the class as soon as the animation completes.
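A minimal sketch of the one-shot listener pattern this commit describes. `AnimatableTarget` is a hypothetical structural type covering the two element members used, and the exact class suffix after `mcp-enter-` is an assumption; a real HTMLElement would be passed in practice.

```typescript
// The subset of an element this helper touches.
interface AnimatableTarget {
  classList: { add(name: string): void; remove(name: string): void };
  addEventListener(
    type: string,
    listener: () => void,
    options?: { once?: boolean },
  ): void;
}

// Add the entrance class and remove it as soon as the animation completes.
// `once: true` makes the listener unregister itself after the first event.
function playEnterAnimation(el: AnimatableTarget, mode: "fullscreen" | "pip"): void {
  const cls = `mcp-enter-${mode}`; // ASSUMPTION: class naming scheme
  el.classList.add(cls);
  el.addEventListener("animationend", () => el.classList.remove(cls), { once: true });
}
```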
Move display mode state machine, capability negotiation, PiP drag handling, entrance animations, and postMessage interception into a dedicated useDisplayMode hook. McpAppRenderer drops from 1,114 to 903 lines. The hook is 335 lines with a clean interface: it takes displayMode, onDisplayModeChange, and containerRef, and returns all display mode state + handlers. No behavioral changes — this is a pure extraction refactor.
Ignore size-changed notifications while in fullscreen/pip mode — the app resizes to the detached container dimensions and would overwrite iframeHeight with the wrong value. When returning to inline, restore iframeHeight from the saved inlineHeight so the container snaps back to its pre-detach size without a visual jump.
Snapshot the container height as React state when leaving inline mode and restore it on return. This prevents incorrect sizing caused by size-changed notifications the app may send while in fullscreen or pip.
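The sizing behavior in these two commits can be sketched as a pair of pure state transitions. `SizingState`, `onSizeChanged`, and `onModeChange` are hypothetical names; the real hook keeps this in React state and refs.

```typescript
type SizingMode = "inline" | "fullscreen" | "pip";

interface SizingState {
  mode: SizingMode;
  iframeHeight: number;
  // Snapshot taken when leaving inline mode; null while inline.
  inlineHeight: number | null;
}

// size-changed notifications only apply while inline; in fullscreen/pip the
// app reports the detached container's size, which must not overwrite the
// inline height.
function onSizeChanged(state: SizingState, reportedHeight: number): SizingState {
  if (state.mode !== "inline") return state;
  return { ...state, iframeHeight: reportedHeight };
}

// `measuredHeight` is the container height read from the DOM at switch time.
function onModeChange(
  state: SizingState,
  next: SizingMode,
  measuredHeight: number,
): SizingState {
  if (next === state.mode) return state;
  if (state.mode === "inline") {
    // Leaving inline: snapshot the current height.
    return { mode: next, iframeHeight: state.iframeHeight, inlineHeight: measuredHeight };
  }
  if (next === "inline") {
    // Returning to inline: restore the snapshot so the container does not jump.
    return {
      mode: "inline",
      iframeHeight: state.inlineHeight !== null ? state.inlineHeight : state.iframeHeight,
      inlineHeight: null,
    };
  }
  // fullscreen <-> pip: keep the existing snapshot.
  return { ...state, mode: next };
}
```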
…m-cache * 'main' of github.com:block/goose: fix: replace unwrap() with graceful error in scheduler execute_job (#7436) fix: Dictation API error message shows incorrect limit (#7423) fix(acp): Use ACP schema types for session/list (#7409) fix(desktop): make bundle and updater asset naming configurable (#7337) fix(openai): preserve order in Responses API history (#7500) Use the correct Goose emoji 🪿 instead of Swan in README.md (#7485) feat(ui): implement fullscreen and pip display modes for MCP Apps (#7312) Disable tool pair summarization (#7481)
…ock#7312) Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…patible * origin/main: (70 commits) feat: allow goose askai bot to search goose codebase (#7508) Revert "Reapply "fix: prevent crashes in long-running Electron sessions"" Reapply "fix: prevent crashes in long-running Electron sessions" Revert "fix: prevent crashes in long-running Electron sessions" fix: replace unwrap() with graceful error in scheduler execute_job (#7436) fix: Dictation API error message shows incorrect limit (#7423) fix(acp): Use ACP schema types for session/list (#7409) fix(desktop): make bundle and updater asset naming configurable (#7337) fix(openai): preserve order in Responses API history (#7500) Use the correct Goose emoji 🪿 instead of Swan in README.md (#7485) feat(ui): implement fullscreen and pip display modes for MCP Apps (#7312) fix: prevent crashes in long-running Electron sessions Disable tool pair summarization (#7481) fix: New Recipe Warning does not close on cancel (#7524) The client is not the source of truth (#7438) feat: support Anthropic adaptive thinking (#7356) copilot instructions: reword no prerelease docs (#7101) fix(acp): don't fail session creation when model listing is unavailable (#7484) feat: simplify developer extension (#7466) feat: add goose-powered release notes generator workflow (#7503) ... # Conflicts: # Cargo.lock
Summary
Implements fullscreen and picture-in-picture (PiP) display modes for MCP Apps per the ext-apps spec.
Adds host-side support for the three standard MCP display modes: inline (default), fullscreen, and pip. The iframe is never remounted when switching modes — a single stable container swaps CSS classes so app state is fully preserved across transitions.
Host controls appear on hover or keyboard focus. Which buttons are shown depends on what the app declared in appCapabilities.availableDisplayModes during ui/initialize: the host only offers modes the app supports, per spec. Apps can also request mode changes via ui/request-display-mode.
Fullscreen is a fixed overlay with an Escape key listener and a close button that receives focus on enter. PiP is a draggable floating panel positioned above the chat input, with keyboard arrow key repositioning and viewport bounds clamping. Standalone mode (dedicated Electron windows) shares the fullscreen layout.
Mode transitions use lightweight CSS entrance animations (fade + subtle scale on the container shell) that avoid transforming the iframe contents. prefers-reduced-motion is respected.
To test: add https://mcp-app-bench.onrender.com/mcp as an extension and prompt "run app bench display mode"
Screenshots: inline, pip, fullscreen. Video: goose-display-modes.mov