Feature request draft: Integrate Apple's Foveated Streaming framework into ALVR for eye-tracked foveated encoding on Vision Pro

Apple Foveated Streaming framework support

Apple dropped the Foveated Streaming framework today (visionOS 26.4 beta) along with a reference implementation. It's a first-party API that gives streaming apps access to gaze-directed foveation data — the thing that's been blocked by Apple until now.

Directly relevant to #20, #133, and #157. As @shinyquagsire23 noted in #133: "Eye tracking is an Apple limitation." This framework is Apple's answer.

What the framework gives you

The framework is built around a session management protocol that is independent of the video transport (a minimal discovery and state sketch follows this list):

  • mDNS discovery via _apple-foveated-streaming._tcp
  • TCP + JSON messaging for connection lifecycle and QR code pairing
  • Session states: WAITING → CONNECTING → CONNECTED → PAUSED → DISCONNECTED
  • Auto pause/resume on headset removal
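
For orientation, here is a minimal sketch of the discovery and lifecycle side as seen from a client, assuming the standard Network framework is used for Bonjour browsing. The state names come straight from the list above; the enum and everything else around it are assumptions about shape, not the framework's actual API:

```swift
import Network

// Lifecycle states named by the protocol; how the framework actually exposes
// them (enum, delegate callbacks, etc.) is an assumption here.
enum StreamingSessionState {
    case waiting, connecting, connected, paused, disconnected
}

// Browse for endpoints advertising the documented Bonjour service type.
let browser = NWBrowser(
    for: .bonjour(type: "_apple-foveated-streaming._tcp", domain: nil),
    using: .tcp
)

browser.browseResultsChangedHandler = { results, _ in
    for result in results {
        // Each result is a discovered streaming endpoint; the TCP + JSON
        // handshake and QR-code pairing would continue from here.
        print("Discovered endpoint: \(result.endpoint)")
    }
}
browser.start(queue: .main)
```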

On the visionOS side:

  • FoveatedStreamingSession manages the connection to the streaming endpoint
  • FoveatedStreamingSpace is a new ImmersiveSpace variant with native foveation support
  • Bidirectional message channel for custom data exchange between client and endpoint (see the payload sketch after this list)
  • Supports progressive and mixed immersion styles
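
That message channel is the natural home for ALVR-specific data the foveation protocol doesn't carry itself. As a rough sketch (the payload name and fields below are made up for illustration; they are not part of Apple's protocol or of ALVR today), the client could define small Codable payloads and ship them as JSON:

```swift
import Foundation

// Hypothetical ALVR-specific payload for the framework's custom message
// channel. Field names and units are illustrative, not a defined schema.
struct GazeRegionUpdate: Codable {
    let centerU: Float      // normalized horizontal gaze center, 0...1
    let centerV: Float      // normalized vertical gaze center, 0...1
    let timestampNs: UInt64 // client timestamp for latency accounting
}

let update = GazeRegionUpdate(
    centerU: 0.62,
    centerV: 0.48,
    timestampNs: DispatchTime.now().uptimeNanoseconds
)

// Encode to JSON; the resulting Data would be handed to the session's
// message channel (that API isn't shown here, since its exact shape isn't
// documented in this draft).
let payload = try! JSONEncoder().encode(update)
```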

The endpoint receives approximate gaze region data and renders high-res content only in that region, reducing bandwidth and compute while improving perceived quality.

How this maps to ALVR's current architecture

I spent some time reading through the alvr-visionos code. Here's where things stand and what would need to change:

Where ALVR is now:

  • FFR.swift does fixed foveated rendering with static center coordinates from the server config (centerSizeX/Y, centerShiftX/Y, edgeRatioX/Y); the fovea is always at the center of the FOV (parameters sketched after this list).
  • RealityKitEyeTrackingSystem.swift is the current side-channel workaround: screen-recording broadcast extensions plus CFNotificationCenter shift registers to recover an approximate eye position. It requires the RealityKit renderer and has limited precision.
  • EventHandler.swift handles mDNS/Bonjour discovery and connection management.
  • ALVRClientApp.swift defines three immersive spaces (DummyImmersiveSpace, RealityKitClient, MetalClient) via CompositorLayer / CompositorServices.
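
For reference, the foveation parameters named above boil down to something like the following. This is a simplified stand-in for the config that FFR.swift consumes; the field names mirror the keys above, while the types and example values are assumptions:

```swift
// Simplified sketch of the server-supplied foveation settings used by FFR.swift.
struct FoveationSettings {
    var centerSizeX: Float   // width of the full-resolution region, as a fraction of the frame
    var centerSizeY: Float   // height of the full-resolution region
    var centerShiftX: Float  // horizontal offset of that region; 0 = centered
    var centerShiftY: Float  // vertical offset of that region; 0 = centered
    var edgeRatioX: Float    // horizontal compression ratio outside the center
    var edgeRatioY: Float    // vertical compression ratio outside the center
}

// Today these values are static for the whole session, so the high-resolution
// region stays pinned to the middle of the FOV regardless of where the user looks.
let staticConfig = FoveationSettings(
    centerSizeX: 0.4, centerSizeY: 0.4,
    centerShiftX: 0, centerShiftY: 0,
    edgeRatioX: 4, edgeRatioY: 4
)
```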

Proposed integration (hybrid approach — keeps ALVR's transport):

  1. Add a FoveatedStreamingSpace alongside the existing immersive spaces in ALVRClientApp.swift as a new rendering mode.

  2. Implement the session management protocol on the streamer side. It's straightforward TCP + JSON, similar to what ALVR already does for its own connection management. The mDNS service type changes to _apple-foveated-streaming._tcp and the handshake adds QR code pairing.

  3. Use Apple's gaze region data to drive FFR.swift: dynamically update centerShiftX/Y from the framework's gaze callbacks instead of using static values (see the sketch after this list). This replaces the RealityKitEyeTrackingSystem workaround entirely.

  4. Keep ALVR's video transport (H.264/HEVC/AV1 pipeline). The framework uses NVIDIA CloudXR as its reference transport, but the session management layer is independent. ALVR can implement the session protocol for foveation data while keeping its own streaming pipeline.
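
A minimal sketch of step 3, reusing the FoveationSettings shape from the earlier sketch. The gaze callback and its payload are assumptions about how the framework delivers updates; the mapping itself only shows where dynamic values would replace the static config:

```swift
// Convert a normalized gaze position (0...1 in frame coordinates, 0.5 = center)
// into updated centerShift values. The exact mapping and clamping depend on how
// ALVR defines centerShift, so treat this as illustrative.
func foveationSettings(forGazeU u: Float, gazeV v: Float,
                       base: FoveationSettings) -> FoveationSettings {
    var updated = base
    // Shift the high-resolution region toward the gaze point instead of
    // leaving it pinned to the center of the FOV.
    updated.centerShiftX = min(max((u - 0.5) * 2, -1), 1)
    updated.centerShiftY = min(max((v - 0.5) * 2, -1), 1)
    return updated
}

// Hypothetical wiring: whatever callback the framework provides for gaze
// region updates would feed the function above, and the result would be
// pushed to the renderer and forwarded to the server-side encoder (step 4).
//
// session.onGazeRegionUpdate { region in
//     let settings = foveationSettings(forGazeU: region.centerU,
//                                      gazeV: region.centerV,
//                                      base: staticConfig)
//     applyFoveation(settings) // hypothetical helper: update FFR.swift + notify streamer
// }
```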

There's also the option of going full CloudXR, but that would drop ALVR's cross-platform support and require NVIDIA GPUs. Probably only interesting as a niche mode for users with the right hardware.

Client-side changes (alvr-visionos):

  • New immersive space type wrapping FoveatedStreamingSpace
  • FoveatedStreamingSession connection management alongside existing ALVR connection
  • Deprecate RealityKitEyeTrackingSystem.swift broadcast extension workaround
  • Forward gaze region data to server via ALVR's existing data channel

Server-side changes (ALVR streamer):

  • Implement Apple's session management protocol (mDNS + TCP/JSON)
  • Accept dynamic foveation center coordinates from the client
  • Update encoder foveation parameters per-frame based on gaze data

Worth noting: Apple's sample repo also includes a StreamingSession.xcframework for iOS, so this could eventually enable ALVR streaming to iPhones/iPads too.


Beta status

The framework is on the visionOS 26.4 developer beta, so APIs could still change before release. It probably makes sense to develop this on a feature branch and gate it behind a setting in GlobalSettings, following the same pattern as experimental40ppd, enablePersonaFaceTracking, etc. That way it can ship disabled by default and be enabled once the framework goes stable.
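
Following that pattern, the gate could be a single boolean added next to the existing experimental toggles. The struct below is a simplified stand-in for the real GlobalSettings, and the new flag name is a placeholder:

```swift
// Simplified stand-in for ALVR's GlobalSettings; only the relevant flags shown.
struct GlobalSettings {
    var experimental40ppd: Bool = false
    var enablePersonaFaceTracking: Bool = false
    // New gate for the Foveated Streaming integration; ships disabled so
    // current behavior is unchanged until visionOS 26.4 goes stable.
    var enableAppleFoveatedStreaming: Bool = false
}
```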

The session protocol itself is fully documented with complete message formats and is implementable independently of CloudXR. Given how long eye-tracked foveation has been blocked (#20 is from Feb 2024), it seems worth starting the work now even if it can't ship until visionOS 26.4 goes GA.
