Apple dropped the Foveated Streaming framework today (visionOS 26.4 beta) along with a reference implementation. It's a first-party API that gives streaming apps access to gaze-directed foveation data — the thing that's been blocked by Apple until now.
Directly relevant to #20, #133, and #157. As @shinyquagsire23 noted in #133: "Eye tracking is an Apple limitation." This framework is Apple's answer.
The whole thing is built around a session management protocol that's transport-agnostic:
- mDNS discovery via `_apple-foveated-streaming._tcp`
- TCP + JSON messaging for connection lifecycle and QR code pairing
- Session states: `WAITING → CONNECTING → CONNECTED → PAUSED → DISCONNECTED` (a discovery/state sketch follows this list)
- Auto pause/resume on headset removal
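For reference, the discovery half of this maps directly onto the standard Network framework. A minimal sketch of browsing for the service type and modeling the documented states (the enum and handler naming below are mine, not the framework's):

```swift
import Network

// Session states as listed in the protocol spec (enum is illustrative).
enum StreamingSessionState {
    case waiting, connecting, connected, paused, disconnected
}

// Browse for endpoints advertising the documented Bonjour service type.
// The endpoint would then be dialed over TCP and the JSON handshake
// (including QR code pairing) exchanged on that connection.
let browser = NWBrowser(
    for: .bonjour(type: "_apple-foveated-streaming._tcp", domain: nil),
    using: .tcp
)
browser.browseResultsChangedHandler = { results, _ in
    for result in results {
        print("Found foveated streaming endpoint: \(result.endpoint)")
    }
}
browser.start(queue: .main)
```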
On the visionOS side:
- `FoveatedStreamingSession` manages the connection to the streaming endpoint
- `FoveatedStreamingSpace` is a new `ImmersiveSpace` variant with native foveation support
- Bidirectional message channel for custom data exchange between client and endpoint
- Supports progressive and mixed immersion styles
The endpoint receives approximate gaze region data and renders high-res content only in that region, reducing bandwidth and compute while improving perceived quality.
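To make that concrete, here's the rough math the endpoint side implies, independent of any particular API: take an approximate gaze region and turn it into the rectangle that stays at full resolution. The types and normalization here are illustrative only:

```swift
import simd

// Illustrative gaze region: normalized center plus radius in the eye image.
struct GazeRegion {
    var center: SIMD2<Float>   // [0, 1] coordinates within the eye image
    var radius: Float          // normalized radius of the high-detail region
}

// Pixel rectangle that should be rendered/encoded at full resolution;
// everything outside it can be downscaled to save bandwidth and compute.
func highResRect(for gaze: GazeRegion, imageSize: SIMD2<Float>) -> (origin: SIMD2<Float>, size: SIMD2<Float>) {
    let center = gaze.center * imageSize
    let half = SIMD2<Float>(repeating: gaze.radius) * imageSize
    let minPt = simd_clamp(center - half, SIMD2<Float>(0, 0), imageSize)
    let maxPt = simd_clamp(center + half, SIMD2<Float>(0, 0), imageSize)
    return (minPt, maxPt - minPt)
}
```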
I spent some time reading through the alvr-visionos code. Here's where things stand and what would need to change:
Where ALVR is now:
- `FFR.swift` does fixed foveated rendering with static center coordinates from server config (`centerSizeX/Y`, `centerShiftX/Y`, `edgeRatioX/Y`). The fovea is always center-of-FOV.
- `RealityKitEyeTrackingSystem.swift` is the side-channel workaround: screen-recording broadcast extensions plus CFNotificationCenter shift registers to get an approximate eye position. Requires the RealityKit renderer; limited precision.
- `EventHandler.swift` handles mDNS/Bonjour discovery and connection management.
- `ALVRClientApp.swift` defines three immersive spaces (DummyImmersiveSpace, RealityKitClient, MetalClient) via `CompositorLayer`/CompositorServices.
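For context, the parameters `FFR.swift` works from look roughly like this (field names from the server config keys above; the struct itself and the default values are illustrative, not the actual type):

```swift
// Illustrative shape of ALVR's fixed-foveation config as consumed by FFR.swift.
struct FoveationSettings {
    var centerSizeX: Float = 0.4    // fraction of the frame kept at full resolution
    var centerSizeY: Float = 0.35
    var centerShiftX: Float = 0.0   // offset of the fovea from the frame center;
    var centerShiftY: Float = 0.0   // static today, so the fovea never follows gaze
    var edgeRatioX: Float = 4.0     // how aggressively the periphery is compressed
    var edgeRatioY: Float = 5.0
}
```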
Proposed integration (hybrid approach — keeps ALVR's transport):
- Add a `FoveatedStreamingSpace` alongside the existing immersive spaces in `ALVRClientApp.swift` as a new rendering mode.
- Implement the session management protocol on the streamer side. It's straightforward TCP + JSON, similar to what ALVR already does for its own connection management. The mDNS service type changes to `_apple-foveated-streaming._tcp` and the handshake adds QR code pairing.
- Use Apple's gaze region data to drive `FFR.swift`: dynamically update `centerShiftX/Y` from the framework's gaze callbacks instead of static values (see the sketch after this list). This replaces the `RealityKitEyeTrackingSystem` workaround entirely.
- Keep ALVR's video transport (H.264/HEVC/AV1 pipeline). The framework uses NVIDIA CloudXR as its reference transport, but the session management layer is independent. ALVR can implement the session protocol for foveation data while keeping its own streaming pipeline.
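A minimal sketch of what the `FFR.swift` change would amount to, assuming the framework delivers an approximate gaze point per frame. The callback shape, the [0, 1] coordinate convention, and the mapping to `centerShift` are all assumptions that would need to match how `FFR.swift` actually interprets those values:

```swift
// Stand-in for the two parameters that would become dynamic.
struct FoveationCenter {
    var centerShiftX: Float = 0
    var centerShiftY: Float = 0
}

// Assumed input: an approximate gaze point in normalized [0, 1] image
// coordinates. Convert it to a shift relative to the frame center so the
// high-detail region follows the eyes instead of sitting at center-of-FOV.
func applyGaze(_ gazeCenter: SIMD2<Float>, to foveation: inout FoveationCenter) {
    foveation.centerShiftX = (gazeCenter.x - 0.5) * 2.0
    foveation.centerShiftY = (gazeCenter.y - 0.5) * 2.0
}
```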
There's also the option of going full CloudXR, but that would drop ALVR's cross-platform support and require NVIDIA GPUs. Probably only interesting as a niche mode for users with the right hardware.
Client-side changes (alvr-visionos):
- New immersive space type wrapping `FoveatedStreamingSpace`
- `FoveatedStreamingSession` connection management alongside the existing ALVR connection
- Deprecate the `RealityKitEyeTrackingSystem.swift` broadcast extension workaround
- Forward gaze region data to the server via ALVR's existing data channel (sketched below)
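For the last item, a hedged sketch of what forwarding the gaze region could look like; the message shape and the `send` hook are placeholders for whatever per-frame channel ALVR's connection already exposes:

```swift
import Foundation

// Hypothetical per-frame gaze message; not an existing ALVR packet type.
struct GazeRegionMessage: Codable {
    var timestampNs: UInt64
    var centerX: Float   // normalized [0, 1] in the eye image
    var centerY: Float
    var radius: Float
}

// `send` stands in for ALVR's existing client-to-server data channel.
func forwardGaze(_ message: GazeRegionMessage, send: (Data) -> Void) {
    if let payload = try? JSONEncoder().encode(message) {
        send(payload)
    }
}
```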
Server-side changes (ALVR streamer):
- Implement Apple's session management protocol (mDNS + TCP/JSON)
- Accept dynamic foveation center coordinates from the client
- Update encoder foveation parameters per-frame based on gaze data
Worth noting: Apple's sample repo also includes a StreamingSession.xcframework for iOS, so this could eventually enable ALVR streaming to iPhones/iPads too.
- Foveated Streaming framework docs (visionOS 26.4+ beta)
- apple/StreamingSession sample code — C# Windows app + OpenXR sample
- Session management protocol spec
- Creating a foveated streaming client on visionOS
The framework is on visionOS 26.4 developer beta, so APIs could still change before release. Probably makes sense to develop this on a feature branch and gate it behind a setting in `GlobalSettings` (same pattern as `experimental40ppd`, `enablePersonaFaceTracking`, etc.). That way it can ship disabled by default and get turned on once the framework goes stable.
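That gate could look something like this (a trimmed sketch, not the actual `GlobalSettings` definition; the flag name is a suggestion and the existing fields are shown only for context):

```swift
// Trimmed illustration of the settings struct with the proposed flag added.
struct GlobalSettings: Codable {
    var experimental40ppd: Bool = false
    var enablePersonaFaceTracking: Bool = false
    var enableFoveatedStreaming: Bool = false   // new: off by default until 26.4 GA
}
```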
The session protocol itself is fully documented with complete message formats and is implementable independently of CloudXR. Given how long eye-tracked foveation has been blocked (#20 is from Feb 2024), seems worth starting the work now even if it can't ship until visionOS 26.4 goes GA.