@auramagi
Last active June 16, 2025 00:38
Transcripts from all WWDC 2025 sessions

Automate your development process with the App Store Connect API

Learn how the new Webhooks API can provide you with real-time notifications from App Store Connect. We'll also introduce new APIs that can help you manage user feedback and build delivery, and discuss how to integrate these tools into your development workflow to improve efficiency and streamline your processes.

Transcript

Welcome. My name is Dajinsol Jeon, an engineer on the App Store Connect team. Today, I would like to introduce updates to the App Store Connect API. Let’s begin.

App Store Connect provides many APIs to automate development processes.

This automation frees your team to focus on what matters most, building great new features for your users.

This year, App Store Connect has significantly expanded its API across key areas like app management and TestFlight. Furthermore, App Store Connect now supports new sets of APIs, which include the long-awaited Webhooks API and the Apple-Hosted Background Assets API.

Today, I will go over some important and long-awaited updates, such as Webhooks, BuildUpload, and Feedback APIs. But first, let me quickly recap the typical app development process.

App development is an iterative process. You start with new features or bug fixes and upload a new build to App Store Connect.

Once the new build is distributed, your beta testers submit feedback and your next development phase incorporates the feedback from the previous build.

The key is you want to run this cycle as fast as possible. For example, when your users report a crash, you want to fix it as soon as possible for better user experience.

That's where today's updates come in.

The new App Store Connect APIs allow you to automate this loop, enabling much faster iteration.

Let's dive into how.

This year, App Store Connect is introducing the Build Upload API and the Feedback API to support better automation. On top of that, App Store Connect now supports webhook notifications. Your system can get notified when something happens on your app and react to those events.

For instance, App Store Connect will send a Build upload state event to your webhook listener when the processing of the uploaded build is complete. It signals that you can proceed to the next step.

Webhook notifications also support more events, such as Feedback events and Build beta state events, to inform you of important events about your app.

These additions will help make your development cycle faster than before.

Now, let me explore these enhancements in more detail. I will begin with Webhook notifications.

Traditional APIs are like constantly calling someone to ask, “Is there any news?” Your system has to keep asking App Store Connect for updates, but webhooks flip that around.

Webhooks are essentially push communication between servers. Instead of your system always asking, App Store Connect sends notifications about the events.

Technically speaking, it’s an HTTP callback from App Store Connect to your server when a specified event occurs related to your app.

This allows you to build event-driven workflows, which are far more efficient than constantly polling App Store Connect for updates. So, how does this work? Assuming that you have an HTTP server, which will be your webhook listener, you start by giving App Store Connect the URL of your webhook listener, basically telling App Store Connect where to send updates.

Then, whenever a relevant change occurs within App Store Connect, it sends a POST request to your registered URL.

The request payload contains information about the event. Based on that payload, your system can query the App Store Connect APIs for even more information or perform necessary actions. This year, App Store Connect is introducing webhook notifications for important events: new TestFlight feedback submissions, app version state changes, build upload state changes, build beta state changes, and Apple-Hosted Background Assets state changes. In order to receive those notifications from App Store Connect, you need to register your webhook listener first. Let me show you how to register your webhook listener on the App Store Connect website.

First, navigate to the Users and Access section. Then select Integrations.

Within Integrations, there is the Webhooks section on the sidebar. Click the plus button. This will open up the panel where you can configure your webhook details.

First, give your webhook a descriptive name. Then, enter the URL of your webhook listener. This is the endpoint where App Store Connect will send the notifications.

Now pay attention to the Secret field; this is important. App Store Connect uses this secret key to sign each webhook notification.

You can use any string for the secret, but it must be known only to you and App Store Connect.

This signature allows your system to verify that the notification indeed came from Apple.

Finally, you need to choose which events you want to subscribe to. For today’s session, let me enable the Build Upload, Build Beta State, and TestFlight Feedback events. Click Create. That’s it. Now your servers will receive webhook notifications whenever those events occur for your app.

You can also set up your webhook using the API. This approach is particularly useful if you manage many apps or want to register webhook listeners automatically for new apps.

Let me show you how to register a webhook listener with APIs.

To register a webhook listener with the APIs, you need to send a POST request to the webhook endpoint.

The attributes are very similar to what we saw in the UI. You need to send the event types you want to subscribe to, the secret App Store Connect will use to sign the event payload, and the URL of your webhook listener.

Upon successful webhook creation, you will receive a 201 CREATED response. The payload will include the webhook ID, which is necessary for managing the webhook later.
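
For teams that script this step, here is a minimal Swift sketch of that POST. The `/v1/webhooks` path, attribute names, and event-type strings are assumptions based on the description above and the API's usual JSON:API conventions; check the App Store Connect API documentation for the exact schema.

```swift
import Foundation

// Hedged sketch: register a webhook listener for an app and return the new webhook ID.
// Endpoint path, attribute names, and event-type strings are assumptions.
func registerWebhook(appID: String, listenerURL: String, secret: String, jwt: String) async throws -> String {
    var request = URLRequest(url: URL(string: "https://api.appstoreconnect.apple.com/v1/webhooks")!)
    request.httpMethod = "POST"
    request.setValue("Bearer \(jwt)", forHTTPHeaderField: "Authorization")
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")

    let body: [String: Any] = [
        "data": [
            "type": "webhooks",
            "attributes": [
                "name": "CI listener",            // descriptive name
                "url": listenerURL,               // where events are POSTed
                "secret": secret,                 // used to sign each payload
                "eventTypes": [                   // events to subscribe to (names assumed)
                    "BUILD_UPLOAD_STATE_UPDATED",
                    "BETA_FEEDBACK_SCREENSHOT_SUBMISSION_CREATED"
                ]
            ],
            "relationships": [
                "app": ["data": ["type": "apps", "id": appID]]
            ]
        ]
    ]
    request.httpBody = try JSONSerialization.data(withJSONObject: body)

    let (data, response) = try await URLSession.shared.data(for: request)
    guard (response as? HTTPURLResponse)?.statusCode == 201 else {
        throw URLError(.badServerResponse)
    }
    // On success, the payload contains the webhook ID needed to manage it later.
    let json = try JSONSerialization.jsonObject(with: data) as? [String: Any]
    let resource = json?["data"] as? [String: Any]
    return resource?["id"] as? String ?? ""
}
```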

So that’s what webhook notifications are, and how to register your webhook listener with App Store Connect.

Now it is time to automate uploading a new build to App Store Connect using the new Build Upload API. One of the major enhancements that I’m introducing today is the ability to automate build uploads with the App Store Connect API. So why is the Build Upload API useful? The new Build Upload API is part of the standardized App Store Connect APIs, which means you can add build uploads to the rest of your automation.

This also means that you can upload your build using any language or platform you prefer.

And if you run into issues when uploading, these new APIs provide well-formatted messages to help automate error handling. This gives you a new, more flexible way to automate build upload processes. Let me show you how it works. To upload your build using the new API, start by creating a BuildUpload. A BuildUpload contains build information such as the version and target platform.

Next, provide App Store Connect with the details of your build file using the BuildUploadFiles. App Store Connect will then provide instructions on how to upload your build.

Then, upload your build binary following the instructions provided.

The final step is to let App Store Connect know that the upload is complete so it can start processing the new build.

Now, let me walk you through the API details.

To create a BuildUpload, make a POST request. This request should include the bundle version of your build and the target platform. If your request is successful, App Store Connect will respond with a 201 CREATED status and a unique ID for this new BuildUpload. The next step is to create a BuildUploadFile to let App Store Connect know the file details. You will need to include the file name, size in bytes, and asset type.

If the BuildUploadFile is created successfully, you will get a 201 CREATED response. Inside the response body, you will find upload operations. This tells you exactly how to upload your binary. It gives you the URL to send your build to, instructs you to use the PUT method, and lists the required headers.

As with other App Store Connect APIs for uploading files, you might receive instructions to upload your build in multiple chunks if your build is large. In this case, you will need to make multiple HTTP calls with different parts of your binary. After you upload your build by following the instructions from the previous response, you need to tell App Store Connect that the upload is actually finished. This step signals App Store Connect to start processing your new build.

To do this, send a PATCH request with the uploaded property set to true.

If successful, you will get a 200 OK response with the state COMPLETE.
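
As a rough illustration, here is what that final PATCH might look like from a Swift automation script. The `/v1/buildUploads/{id}` path and attribute name are assumptions based on the flow described above.

```swift
import Foundation

// Hedged sketch: tell App Store Connect the binary upload is done so processing can begin.
// The endpoint path and the "uploaded" attribute name are assumptions.
func markUploadComplete(buildUploadID: String, jwt: String) async throws {
    var request = URLRequest(url: URL(string: "https://api.appstoreconnect.apple.com/v1/buildUploads/\(buildUploadID)")!)
    request.httpMethod = "PATCH"
    request.setValue("Bearer \(jwt)", forHTTPHeaderField: "Authorization")
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.httpBody = try JSONSerialization.data(withJSONObject: [
        "data": [
            "type": "buildUploads",
            "id": buildUploadID,
            "attributes": ["uploaded": true]   // signals that processing can begin
        ]
    ])

    let (_, response) = try await URLSession.shared.data(for: request)
    guard (response as? HTTPURLResponse)?.statusCode == 200 else {
        throw URLError(.badServerResponse)
    }
    // A successful response reports the BuildUpload state, e.g. COMPLETE.
}
```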

Now, let me open the App Store Connect website to see the new build. The website indicates that it has started processing the new build.

But this raises a key question. When will it finish? And how will we know? That’s exactly where the webhook notification comes in, the one we configured earlier. It will notify you the moment this processing is complete.

Here is an example of what your server receives from App Store Connect when your build is processed successfully.

The key change is the state transition from PROCESSING to COMPLETE. Once you see that COMPLETE status, you have confirmation that the build is processed and you are ready to proceed to the next step. You will also get the message signature in the X-Apple-SIGNATURE header. This tells you that App Store Connect uses the HMAC-SHA256 algorithm.

You can calculate the signature using the secret you set up earlier and the payload body you received, and compare your calculated value and the header value to verify that the notification came from App Store Connect.
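
Here is a small sketch of that verification using CryptoKit. The HMAC-SHA256 computation itself is standard; the hex encoding of the X-Apple-SIGNATURE header value is an assumption to confirm against the webhook documentation.

```swift
import CryptoKit
import Foundation

// Recompute the HMAC-SHA256 of the raw request body with the shared secret and
// compare it to the value from the X-Apple-SIGNATURE header.
func isAuthenticNotification(body: Data, signatureHeader: String, secret: String) -> Bool {
    let key = SymmetricKey(data: Data(secret.utf8))
    let mac = HMAC<SHA256>.authenticationCode(for: body, using: key)
    let computedHex = mac.map { String(format: "%02x", $0) }.joined()
    // Header encoding (hex) is assumed; use a constant-time comparison in production code.
    return computedHex == signatureHeader.lowercased()
}
```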

That’s how you automate the build upload process with the new APIs.

And now let me move on to beta testing with TestFlight. Once your build is processed by App Store Connect, you can use TestFlight to distribute your new build to beta testers. You can assign the build directly to your specific beta tester groups. For external testers, you will need to submit the build for Beta App Review first. Finally, you will want to notify testers that the new build is available. The great news is that all of these steps can be automated using the TestFlight APIs.

These are some of the TestFlight APIs provided by App Store Connect.

You can find comprehensive information in the App Store Connect API documentation. However, the key point I want you to remember is that these APIs empower you to fully automate build distribution to your testers. While those APIs are well-established, I do want to highlight a useful new addition this year: the build beta state webhook event. This new webhook is designed to notify you immediately when TestFlight beta review is complete.

Here is an example payload of a build beta state notification. The payload will show the updated state and include a specific build ID.

When you receive this notification, you know that your build has passed review and is ready for external testing. Okay, so that’s how you can use TestFlight to distribute your new build.

Now, let me talk about getting TestFlight feedback from testers. Feedback is a key part of using TestFlight. Testers can submit screenshot feedback to make suggestions or report UI issues, and crash feedback when they experience app crashes. TestFlight feedback helps you discover insights that drive your app’s evolution.

A fast reaction to feedback is important for a better user experience. That’s why knowing instantly when new feedback arrives is important.

Also, since you don’t want to miss any feedback from your testers, tracking feedback with your development tools is very useful.

To address this, App Store Connect is introducing the new Feedback API along with corresponding webhook events. This is a long-awaited capability requested by many developers. Let me explore the details.

When there is new screenshot feedback, you will get a webhook notification with a payload similar to this.

The webhook notification contains minimal information. It has the feedback type, which is screenshot in this case, and a related instance.

The instance tells you that it is a beta feedback screenshot submission and gives its ID.

You can use this ID to retrieve the details of this feedback.

When you call the Feedback API to retrieve screenshot feedback, the response you receive will contain various details, including device information and screenshot URL. You can use these URLs to download the screenshot images in separate calls.

Getting crash feedback is essentially identical to getting screenshot feedback, but when the crash log is available, you can download it programmatically by calling the crashLog endpoint. So that’s how to retrieve TestFlight feedback using the API.
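
As an illustration, a hedged sketch of fetching one screenshot submission and downloading its images might look like the following. The endpoint path and field names are assumptions inferred from the description above, not the confirmed schema.

```swift
import Foundation

// Hedged sketch of the screenshot feedback resource referenced by a webhook payload.
struct ScreenshotFeedback: Decodable {
    struct Attributes: Decodable {
        let comment: String?
        let deviceModel: String?
        let osVersion: String?
        let screenshotURLs: [URL]?    // field name assumed
    }
    struct Resource: Decodable {
        let id: String
        let attributes: Attributes
    }
    let data: Resource
}

// Retrieve the feedback details using the ID from the webhook notification.
func fetchScreenshotFeedback(id: String, jwt: String) async throws -> ScreenshotFeedback {
    let url = URL(string: "https://api.appstoreconnect.apple.com/v1/betaFeedbackScreenshotSubmissions/\(id)")!
    var request = URLRequest(url: url)
    request.setValue("Bearer \(jwt)", forHTTPHeaderField: "Authorization")
    let (data, _) = try await URLSession.shared.data(for: request)
    return try JSONDecoder().decode(ScreenshotFeedback.self, from: data)
}

// Each screenshot URL is downloaded in a separate call.
func downloadScreenshots(from feedback: ScreenshotFeedback) async throws -> [Data] {
    var images: [Data] = []
    for url in feedback.data.attributes.screenshotURLs ?? [] {
        let (bytes, _) = try await URLSession.shared.data(from: url)
        images.append(bytes)
    }
    return images
}
```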

And that’s how we complete this app development process using the App Store Connect API: the Webhooks API and events for getting notifications, the Build Upload API for uploading a new build, and the Feedback API for retrieving feedback. But there is more.

For example, if you use Apple-Hosted Background Assets, new APIs are available to automate asset management.

And App Store Connect will send app version state webhook events to notify you about state changes of your app in the App Store, completing your process from development, through testing, and finally to release. Moreover, App Store Connect provides many existing APIs to automate different stages of your development process. Again, automating your daily tasks is important because it allows you to focus more on a better user experience.

To wrap up today’s session, here are some recommendations.

First, build webhook listeners so that you can receive webhook notifications from App Store Connect. Next, implement reactive behaviors based on webhook events to automate your processes. Finally, explore what else the App Store Connect API offers and use these APIs in your automation for an even faster development cycle.

Also, don’t forget to check the “Discover Apple-Hosted Background Assets” and “What’s new in App Store Connect” sessions for more information about this year’s additions to App Store Connect. That's all I have for today. I hope you found this session useful. Thanks for watching.

Summary

  • 0:00 - Introduction

  • App Store Connect now has an enhanced API suite to streamline the app development process. The new APIs, including the Webhooks API, BuildUpload API, and Feedback API, enable you to automate key tasks such as app management, TestFlight distribution, build uploads, and feedback collection. By automating these iterative steps — from build uploads to feedback incorporation — you can respond more swiftly to user feedback, fix bugs faster, and enhance the overall user experience, ultimately accelerating your app-development cycle.

  • 1:56 - Webhook notifications

  • App Store Connect now has Webhook notifications, revolutionizing how you receive updates about your apps. Webhooks enable push communication between servers, replacing the need for constant polling. You can set up webhook listeners by providing App Store Connect with the URL of your server endpoint. When specific events occur, such as build upload completion, feedback submissions, or beta state changes, App Store Connect sends a POST request to this URL that contains relevant information. This new feature allows you to build event-driven workflows, making the development cycle more efficient. By registering webhooks through the App Store Connect website or API, you can ensure your systems receive timely and secure notifications, enabling you to take immediate actions and streamline your app-management processes.

  • 6:50 - Build upload API

  • The new build upload API for App Store Connect helps you automate the build upload process using any language or platform. This API provides standardized instructions and well-formatted error messages for efficient automation. The process involves creating a 'BuildUpload' with version and target platform details, then specifying the build file details. App Store Connect responds with instructions for uploading the build binary, which may be split into chunks for large files. Once the binary is uploaded, notify App Store Connect, which then processes the build. Webhook notifications inform you when the build processing is complete, transitioning from the 'PROCESSING' to 'COMPLETE' state, allowing you to proceed to the next step.

  • 11:18 - Beta testing builds

  • After a build is processed by App Store Connect, you can distribute it to beta testers through TestFlight. Internal testers can receive builds directly, while external testers require a Beta App Review submission. The TestFlight APIs automate this distribution process. A build beta-state webhook event notifies you immediately when TestFlight beta review is complete, indicating the build is ready for external testing. The API documentation provides comprehensive details on these features.

  • 12:51 - Feedback API

  • The Feedback API and webhook events enhance TestFlight feedback management. You now receive real-time webhook notifications for new screenshot and crash feedback. These notifications include basic feedback type and instance information, with the full details accessible via the Feedback API. The API allows you to retrieve device information, screenshot URLs, and crash logs programmatically. This new capability enables swift responses to feedback, improves user experience, and streamlines the app-development process, complementing existing APIs for build uploads and other functionalities.

  • 15:05 - Additional development APIs

  • App Store Connect's new APIs and webhook events automate app version state changes, streamlining the development process from creation to release. Build webhook listeners and implement reactive behaviors to automate tasks, so you can focus on enhancing user experience. Explore the full range of App Store Connect APIs to further expedite your development cycle.

Better together: SwiftUI and RealityKit

Discover how to seamlessly blend SwiftUI and RealityKit in visionOS 26. We'll explore enhancements to Model3D, including animation and ConfigurationCatalog support, and demonstrate smooth transitions to RealityView. You'll learn how to leverage SwiftUI animations to drive RealityKit component changes, implement interactive manipulation, use new SwiftUI components for richer interactions, and observe RealityKit changes from your SwiftUI code. We'll also cover how to use unified coordinate conversion for cross-framework coordinate transformations.

Transcript

Hi. I'm Amanda, and I'm a RealityKit engineer. And I'm Maks. I'm a SwiftUI engineer. Today we'll share some great enhancements to both SwiftUI and RealityKit that help them work even better together! Check out this adorable scene! We've got a charming SwiftUI robot, hovering in mid-air, and a grounded RealityKit robot – both yearning for connection. When they get close, sparks fly! But how can they get close enough to truly interact? Maks and I will share how to combine the worlds of traditional UI and interactive 3D content. First, I'll share some enhancements to Model3D.

Then, I'll demonstrate how to transition from using Model3D to using RealityView, and talk about when to choose one versus the other.

I'll tell you about the new Object Manipulation API.

RealityKit gets new Component types, integrating more aspects of SwiftUI.

Information can now flow both ways between SwiftUI and RealityKit - we'll explain.

Coordinate space conversion is easier than ever.

Drive RealityKit component changes with SwiftUI animations.

Let's wire it up! Display 3D models in your apps with just one line of code using Model3D. In visionOS 26, two enhancements let you do even more with Model3D - playing animations, and loading from a ConfigurationCatalog. Since Model3D is a SwiftUI view, it participates in the SwiftUI layout system. I'll use that layout system to make a little sign that displays the robot's name.

Now the sign says that this robot's name is Sparky.

Sparky also has some sweet dance moves! The artist bundled this animation with the robot model asset. New in visionOS 26 is the Model3DAsset type. Load and control animations on your 3D content by constructing a Model3D using a Model3DAsset.

The model loads the animations from the asset, and lets you choose which one to play.

"Model" is an overloaded term, especially in this session as we're converging a UI framework and a 3D game framework. In UI frameworks, a "model" refers to the data structure that represents the information your app uses.

The model holds the data and business logic, allowing the view to display this information. In 3D frameworks like RealityKit, a model refers to a 3D object that can be placed in a scene. You access it via the ModelComponent, which consists of a mesh resource that defines its shape, and materials that determine its appearance. Sometimes that happens. Two worlds collide, bringing their terminologies with them, and sometimes there's overlap. Now, back to Sparky and its animation.

I'm placing the Model3D above a Picker, a Play button, and a time scrubber. In my RobotView, I'm displaying the animated robot itself, and under that, I'm placing a Picker to choose which animation to play, plus the animation playback controls.

First, I'm initializing a Model3DAsset with the scene name to load from my bundle.

Then, once the asset is present, I pass it to the Model3D initializer. Underneath that in the VStack, I'm presenting a customized Picker that lists the animations that are available in this model asset. When an item is chosen from the list, the Picker sets the asset's `selectedAnimation` to the new value. Then the Model3DAsset creates an AnimationPlaybackController to control playback of that chosen animation.

The asset vends an `animationPlaybackController`. Use this object to pause, resume, and seek in the animation.

I'm passing that animationController into my RobotAnimationControls view, which we'll also look at shortly.

In visionOS 26, the existing RealityKit class `AnimationPlaybackController` is now Observable. In my SwiftUI view, I observe the `time` property to display the animation's progress.

I have a @Bindable property called `controller`, which means I'm using the `AnimationPlaybackController` as my view's data model.

When the controller's isPlaying value changes, SwiftUI will re-evaluate my RobotAnimationControls view. I've got a Slider that shows the current time in the animation, relative to the total duration of the animation. You can drag this slider and it will scrub through the animation. Here's Sparky doing its celebration animation! I can fast forward and rewind using the Slider. Go Sparky, it's your birthday!

With its dance moves down, Sparky wants to dress up before it heads to the greenhouse to meet the other robot there. I can help it do that with enhancements to RealityKit's ConfigurationCatalog type. This type stores alternative representations of an entity, such as different mesh geometries, component values, or material properties.

In visionOS 26, you can initialize a Model3D with a ConfigurationCatalog and switch between its various representations.

To allow Sparky to try on different outfits, my artist bundled a reality file with several different body types. I load this file as a ConfigurationCatalog from my app's main Bundle. Then, I create my Model3D with the configuration.

This popover presents the configuration options. Choosing from the popover changes Sparky's look.

Dance moves? Check. Outfit? Check. Sparky is ready to meet its new friend in the RealityKit greenhouse. Sparks are going to fly! To make those sparks fly, I'll use a Particle Emitter. But - that’s not something I can do at runtime with the Model3D type. Particle Emitter is a component that I add to a RealityKit entity. More on that in a moment. Importantly, Model3D doesn't support adding components.

So, to add a particle emitter, I'll switch to RealityView. I'll share how to smoothly replace my Model3D with a RealityView without changing the layout. First, I switch the view from Model3D to RealityView.

I load the botanist model from the app bundle inside the `make` closure of the RealityView, creating an entity.

I add that entity to the contents of the RealityView so Sparky appears on screen.

But... now the name sign is pushed too far over to the side. That wasn't happening before when we were using a Model3D.

It's happening now because, by default, RealityView takes up all the available space that the SwiftUI layout system gives it. By contrast, the Model3D sizes itself based on the intrinsic size of the underlying model file. I can fix this! I apply the new `.realityViewLayoutBehavior` modifier with `.fixedSize` to make the RealityView tightly wrap the model's initial bounds.

Much better.
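
For reference, the swap might look roughly like this; the scene name is a placeholder, and realityViewLayoutBehavior is used as named in this session.

```swift
import SwiftUI
import RealityKit

// Hedged sketch of replacing Model3D with RealityView while keeping the layout.
struct RobotRealityView: View {
    var body: some View {
        RealityView { content in
            // Load the robot model from the app bundle and add it to the scene.
            if let robot = try? await Entity(named: "sparky") {
                content.add(robot)
            }
        }
        // Tightly wrap the model's initial visual bounds, like Model3D does,
        // instead of expanding to all the space SwiftUI offers.
        .realityViewLayoutBehavior(.fixedSize)
    }
}
```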

RealityView will use the visual bounds of the entities in its contents to figure out its size.

This sizing is only evaluated once - right after your `make` closure is run.

The other options for `realityViewLayoutBehavior` are .flexible and .centered.

In all three of these RealityViews, I have the bottom of the Sparky model sitting on the origin of the scene, and I've marked that origin with a gizmo, the little multicolored cross showing the axes and origin.

On the left, with the `.flexible` option, the RealityView acts as if it doesn't have the modifier applied. The origin remains in the center of the view.

The `.centered` option moves the origin of the RealityView so that the contents are centered in the view. `.fixedSize` makes the RealityView tightly wrap the contents' bounds, and makes your RealityView behave just like Model3D.

None of these options re-position or scale your entities with respect to the RealityViewContent; they just re-position the RealityView's own origin point. I've sorted out Sparky's sizing in the RealityView. Next I'll get Sparky animating again.

I'll move from Model3D's new animation API to a RealityKit animation API directly on the Entity.

For more detail on the many ways of working with animation in RealityKit, check out the session "Compose interactive 3D content in Reality Composer Pro". I switched from Model3D to RealityView so I could give Sparky a ParticleEmitterComponent, because sparks need to fly when these two robots get close to each other.

Particle Emitters let you make effects that involve hundreds of tiny particles animating at once, like fireworks, rain, and twinkles.

RealityKit provides preset values for these, and you can adjust those presets to get the effect you're after. You can use Reality Composer Pro to design them, and you can configure them in code.

You add the ParticleEmitter to an entity as a Component. Components are a central part of RealityKit, which is based on the "Entity Component System" paradigm. Each object in your scene is an Entity, and you add components to it to tell it what traits and behaviors it has. A Component is the type that holds data about an Entity. A System processes entities that have specific components, performing logic involving that data. There are built-in systems for things like animating particles, for handling physics, for rendering, and many more. You can write your own custom system in RealityKit to do custom logic for your game or app. Watch Dive into RealityKit 2 for a more in-depth look at the Entity Component System in RealityKit. I'll add a particle emitter to each side of Sparky's head. First I make two invisible entities that serve as containers for the sparks effect. I designed my sparks emitter to point to the right. I'll add it directly to my invisible entity on Sparky's right side.

On the other side, I rotate the entity 180 degrees about the y axis so it's pointing leftward.
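
A sketch of that setup, with the preset, offsets, and entity names as illustrative assumptions:

```swift
import RealityKit
import simd

// Hedged sketch: attach spark emitters to both sides of the robot's head.
func addSparkEmitters(to robotHead: Entity) {
    // Invisible container entities that hold the particle effect.
    let rightSparks = Entity()
    let leftSparks = Entity()

    // The emitter is authored pointing to the right, so it can be used as-is
    // on the right side. The specific preset is an assumption.
    let emitter = ParticleEmitterComponent.Presets.sparks
    rightSparks.components.set(emitter)
    rightSparks.position = [0.05, 0, 0]

    // Mirror it on the left by rotating the container 180° about the y axis.
    leftSparks.components.set(emitter)
    leftSparks.position = [-0.05, 0, 0]
    leftSparks.orientation = simd_quatf(angle: .pi, axis: [0, 1, 0])

    robotHead.addChild(rightSparks)
    robotHead.addChild(leftSparks)
}
```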

Putting it all together in the RealityView, here's Sparky with its animation, its name sign in the right position, and sparks flying! RealityKit is great for detailed creation like this! If you're making a game or play-oriented experience, or need fine-grained control over the behavior of your 3D content, choose RealityView.

On the other hand, use Model3D to display a self-contained 3D asset on its own. Think of it like SwiftUI's Image view but for 3D assets.

Model3D's new animation and configuration catalogs let you do more with Model3D. And if your design evolves and you need direct access to the entities, components, and systems, transition smoothly from Model3D to RealityView using realityViewLayoutBehavior. Next I'll share details about the new Object Manipulation API in visionOS 26, which lets people pick up the virtual objects in your app! Object manipulation works from both SwiftUI and RealityKit. With Object manipulation you move the object with a single hand, rotate it with one or both hands, and scale it by pinching and dragging with both hands. You can even pass the object from one hand to the other.

There are two ways to enable this, depending on whether the object is a RealityKit Entity or SwiftUI View.

In SwiftUI, add the new `manipulable` modifier.

To disallow scaling, but keep the ability to move and rotate the robot with either hand, I specify what Operations are supported.

To make the robot feel super heavy, I specify that it has high inertia.

The .manipulable modifier works when Sparky is displayed in a Model3D view. It applies to the whole Model3D, or to any View it's attached to.
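
As a hedged sketch, the modifier usage might look like this; the parameter labels and option names are assumptions based on the description above.

```swift
import SwiftUI
import RealityKit

// Hedged sketch: enable manipulation on a Model3D view without scaling,
// and make the robot feel heavy. "sparky" is a placeholder asset name.
struct ManipulableRobot: View {
    var body: some View {
        Model3D(named: "sparky")
            // Parameter labels assumed: allow moving and rotating, not scaling.
            .manipulable(operations: [.translation, .rotation], inertia: .high)
    }
}
```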

When Sparky's in a RealityView, I want to enable manipulation on just the robot entity itself, not the whole RealityView.

In visionOS 26, ManipulationComponent is a new type that you can set on an entity to enable Object Manipulation.

The static function `configureEntity` adds the ManipulationComponent to your entity.

It also adds a CollisionComponent so that the interaction system knows when you've tapped on this entity. It adds an InputTargetComponent which tells the system that this entity responds to gestures. And finally, it adds a HoverEffectComponent which applies a visual effect when a person looks at or hovers their mouse over it.

This is the only line you need to enable manipulation of an entity in your scene. To customize the experience further, there are several parameters you can pass. Here, I'm specifying a purple spotlight effect. I'm allowing all types of input: direct touch and indirect gaze and pinch. And I'm supplying collision shapes that define the outer dimensions of the robot. To respond when a person interacts with an object in your app, the object manipulation system raises events at key moments, such as when the interaction starts and stops, gets updated as the entity is moved, rotated, and scaled, when it is released, and when it is handed off from one hand to another.
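
A minimal sketch of that call, with parameter labels assumed from the description above; the customizations mentioned here (such as the purple spotlight effect) would be passed as additional parameters.

```swift
import RealityKit

// Hedged sketch: enable manipulation on a single entity inside a RealityView.
// configureEntity is the function named in the session; labels and values are assumptions.
func enableManipulation(on robot: Entity) {
    // One call adds ManipulationComponent plus the CollisionComponent,
    // InputTargetComponent, and HoverEffectComponent the interaction needs.
    ManipulationComponent.configureEntity(
        robot,
        allowedInputTypes: .all,                                // direct touch and indirect gaze-and-pinch
        collisionShapes: [.generateBox(size: [0.3, 0.5, 0.3])]  // rough outer dimensions of the robot
    )
}
```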

Subscribe to these events to update your state. By default, standard sounds play when the interaction begins, a handoff occurs, or the object is released.

To apply custom sounds, I first set the audioConfiguration to `none`. That disables the standard sounds. Then I subscribe to the ManipulationEvent DidHandOff, which is delivered when a person passes the robot from one hand to the other.

In that closure, I play my own audio resource. Well, Maks. Sparky's journey has been exciting: animating in Model3D, finding its new home in a RealityView, showing its personality with sparks, and letting people reach out and interact with it. It's come a long way on its path towards the RealityKit greenhouse. It sure has! But for Sparky to truly connect with the robot waiting there, the objects in their virtual space need new capabilities. They need to respond to gestures, present information about themselves, and trigger actions in a way that feels native to SwiftUI.

Sparky's journey toward the RealityKit greenhouse is all about building connection. Deep connection requires rich interactions. That's exactly what the new SwiftUI RealityKit components are designed to enable. The new components in visionOS 26 bring powerful, familiar SwiftUI capabilities directly to RealityKit entities.

RealityKit introduces three key components: First, the ViewAttachmentComponent allows you to add SwiftUI views directly to your entities. Next, the GestureComponent makes your entities responsive to touch and gestures. And finally, the PresentationComponent, which presents SwiftUI views, like popovers, from within your RealityKit scene.

visionOS 1 let you declare attachments up front as part of the RealityView initializer. After evaluating your attachment view builder, the system called your update closure with the results as entities. You could add these entities to your scene and position them in 3D space.

In visionOS 26, this is simplified. Now you create attachments using a RealityKit component from anywhere in your app.

Create your ViewAttachmentComponent by giving it any SwiftUI View. Then, add it to an entity's components collection.
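
A sketch of that, assuming a simple initializer that wraps a SwiftUI view (the exact signature may differ):

```swift
import SwiftUI
import RealityKit

// Hedged sketch: move the name sign from a SwiftUI attachment to a RealityKit component.
func addNameSign(to robot: Entity, name: String) {
    // Wrap any SwiftUI view in a ViewAttachmentComponent and attach it to the entity.
    let nameSign = ViewAttachmentComponent(rootView: NameSign(name: name))
    robot.components.set(nameSign)
}

struct NameSign: View {
    let name: String
    var body: some View {
        Text(name)
            .font(.largeTitle)
            .padding()
            .glassBackgroundEffect()
    }
}
```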

And just like that I moved our NameSign from SwiftUI to RealityKit. Let’s explore gestures next! You can already attach gestures to your RealityView using `targetedToEntity` gesture modifiers.

New in visionOS 26 is GestureComponent. Just like ViewAttachmentComponent, you add GestureComponent to your entities directly, passing regular SwiftUI gestures to it. The gesture values are by default reported in the entity’s coordinate space. Super handy! I use GestureComponent with a tap gesture to toggle the name sign on and off.

Check it out. This robot's name is... Bolts! Pro tip: on any entity that's the target of a gesture, also add both InputTargetComponent and CollisionComponent. This advice applies to both GestureComponent and the targeted gestures API.
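
Here is a hedged sketch of that tap setup; the GestureComponent initializer shape is an assumption, and the collision shape is a placeholder.

```swift
import SwiftUI
import RealityKit

// Hedged sketch: toggle the name sign with a tap gesture attached directly to the entity.
func makeNameSignToggleable(on robot: Entity, isNameSignVisible: Binding<Bool>) {
    // Gesture values attached this way are reported in the entity's coordinate space.
    let tap = TapGesture().onEnded {
        isNameSignVisible.wrappedValue.toggle()
    }
    robot.components.set(GestureComponent(tap))

    // Pro tip from the session: gesture targets also need these two components.
    robot.components.set(InputTargetComponent())
    robot.components.set(CollisionComponent(shapes: [.generateBox(size: [0.3, 0.5, 0.3])]))
}
```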

GestureComponent and ViewAttachmentComponent let me create a name sign for Bolts. But, Bolts is getting ready for a special visitor: Sparky! Bolts wants to look its absolute best for their meeting in the greenhouse. Time for another outfit change! I'll replace Bolts' name sign with UI to pick what Bolts will wear. Truly, a momentous decision.

To emphasize that, I'll show this UI in a popover, using PresentationComponent, directly from RealityKit.

First, I replace `ViewAttachmentComponent` with `PresentationComponent`.

The component takes a boolean binding to control when the popover is presented, and to notify you when someone dismisses the popover.

The `configuration` parameter is the type of presentation to be shown. I'm specifying `popover`.

Inside the popover, I'll present a view with configuration catalog options to dress up Bolts.
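
A rough sketch of that setup, with the initializer labels assumed from the description above:

```swift
import SwiftUI
import RealityKit

// Hedged sketch: present a popover from an entity with PresentationComponent.
func addOutfitPicker(to robot: Entity, isPresented: Binding<Bool>) {
    let presentation = PresentationComponent(
        isPresented: isPresented,     // controls showing, and reports dismissal
        configuration: .popover,      // the type of presentation to show
        content: OutfitPicker()       // any SwiftUI view
    )
    robot.components.set(presentation)
}

struct OutfitPicker: View {
    var body: some View {
        // Placeholder for the configuration-catalog options described above.
        Text("Pick an outfit for Bolts")
            .padding()
    }
}
```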

Now, I can help Bolts pick its best color for when Sparky comes to visit.

Hey Maks, do you think Bolts is more of a summer? Or an autumn? That's a fashion joke.

Bolts is dressed to impress. But first, it has to go to work. Bolts waters plants in the greenhouse. I'll make a mini map, like on a heads-up display in a game, to track Bolts' position in the greenhouse. For that, I need to observe the robot's Transform component.

In visionOS 26, entities are now observable. They can notify other code when their properties change. To be notified, just read an entity’s "observable" property.

From the “observable” property you can watch for changes to the entity's position, scale, and rotation, its Children collection, and even its Components, including your own custom components! Observe these properties directly using a `withObservationTracking` block.

Or lean on SwiftUI's built-in observation tracking. I’ll use SwiftUI to implement my Minimap.

To learn more about Observation, watch "Discover Observation in SwiftUI".

In this view, I display my entity's position on a MiniMap. I'm accessing this observable value on my entity. This tells SwiftUI that my view depends on this value.
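
A sketch of such a minimap view, assuming the entity's `observable` property exposes `position` as described; the map scaling and layout are illustrative.

```swift
import SwiftUI
import RealityKit

// Hedged sketch of a minimap that observes a RealityKit entity directly.
struct MiniMap: View {
    let robot: Entity

    var body: some View {
        // Reading this observable value makes the view depend on it, so SwiftUI
        // re-runs body whenever the entity moves.
        let position = robot.observable.position

        ZStack(alignment: .topLeading) {
            RoundedRectangle(cornerRadius: 8)
                .fill(.thinMaterial)
            Image(systemName: "circle.fill")
                // Map greenhouse x/z coordinates into the 2D minimap (scale assumed).
                .offset(x: CGFloat(position.x) * 40,
                        y: CGFloat(position.z) * 40)
        }
        .frame(width: 120, height: 120)
    }
}
```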

As Bolts moves about the greenhouse, watering the plants, its position will change. Each time it does, SwiftUI will call my View's body again, moving its counterpart symbol in the minimap! For a deeper explanation of SwiftUI's data flow, check out the session "Data Essentials in SwiftUI." Our robot friends are really coming together! That's the dream! I liked your description of the difference between "model" and "model" earlier. And sometimes you need to pass data from your data model to your 3D object model, and vice versa. In visionOS 26, observable entities give us a new tool to do that.

Since the beginning, you could pass information from SwiftUI to RealityKit in the `update` closure of RealityView. Now with the entity's `observable` property, you can send information the other way. RealityKit entities can act like model objects to drive updates to your SwiftUI views! So information can flow both ways now: from SwiftUI to RealityKit and from RealityKit to SwiftUI. But... does this create the potential for an infinite loop? Yes! Let's look at how to avoid creating infinite loops between SwiftUI and RealityKit.

When you read an observable property inside the body of a view, you create a dependency; your view depends on that property. When the property’s value changes, SwiftUI will update the view and re-run its body. RealityView has some special behavior. Think of its… update closure as an extension of the containing view's body.

SwiftUI will call the closure whenever any of that view's state changes, not only when state that is explicitly observed in that closure changes.

Here in my RealityView's update closure, I'm changing that position. This will write to the position value, which will cause SwiftUI to update the view and re-run its body, causing an infinite loop.

To avoid creating an infinite loop don’t modify your observed state within your update closure.

You are free to modify entities that you're not observing. That won't create an infinite loop because changes to them won't trigger SwiftUI to re-evaluate the view body.

If you do need to modify an observed property, check the existing value of that property and avoid writing that same value back.

This breaks the cycle and avoids an infinite loop.
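
In code, that guard might look like the following sketch; the `observable` property usage follows the session, and the positions are illustrative.

```swift
import SwiftUI
import RealityKit

// Hedged sketch of the "check before you write" pattern: only modify an observed
// property inside the update closure when the value actually changes.
struct RobotContainer: View {
    let robot: Entity

    var body: some View {
        let observedPosition = robot.observable.position   // creates the dependency

        RealityView { content in
            content.add(robot)
        } update: { _ in
            let target: SIMD3<Float> = [0, 0.5, 0]
            // Writing the same value back on every body evaluation would re-trigger
            // observation and loop forever; guard against that first.
            if observedPosition != target {
                robot.position = target
            }
        }
    }
}
```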

Note that the RealityView's make closure is special.

When you access an observable property in the make closure, that doesn't create a dependency. It's not included in the containing view's observation scope.

Also, the `make` closure is not re-run on changes. It only runs when the containing view first appears.

You can also update the properties on an observed entity from within your own custom system. A system's update function is not inside the scope of the SwiftUI view body evaluation, so this is a good place to change my observed entities' values.

The closures of Gestures are also not inside the scope of the SwiftUI view body evaluation. They are called in response to user input.

You can modify your observed entities' values here, too.

To recap, it's cool to modify your observed entities in some places, and not in others.

If you do find that you have an infinite loop in your app, here's a tip for fixing it: Split up your larger views into smaller, self-contained views, each one having only its own necessary state. That way, a change in some unrelated entity won't cause your small view to be re-evaluated. This is also great for performance! You know, Maks, you might find that you don't need to use your `update` closure at all anymore. Since your Entity can now be your view's state, you can modify it in the normal places that you're used to modifying state, and forgo the update closure altogether. Yeah! I feel like avoiding infinite loops is something I have to learn over and over again. But, if I don't use an update closure, I'm less likely to run into one.

I think it's time to bring Bolts and Sparky together.

Bolts is finally done with work - time for its date with Sparky! As I pick up Sparky to bring it over, and the two robots get closer together, I want to make sparks fly as a function of the decreasing distance between them. I'll use our new Unified Coordinate Conversion API to enable this.

Sparky is in a Model3D SwiftUI view now, and Bolts is an Entity in the RealityKit greenhouse. I need to get the absolute distance between these two robots, even though they're in different coordinate spaces.

To solve this, the Spatial framework now defines a `CoordinateSpace3D` protocol that represents an abstract coordinate space. You can easily convert values between any two types that conform to CoordinateSpace3D, even from different frameworks.

RealityKit's `Entity` and `Scene` types conform to CoordinateSpace3D. On the SwiftUI side, GeometryProxy3D has a new .coordinateSpace3D() function that gives you its coordinate space. Additionally, several Gesture types can provide their values relative to any given CoordinateSpace3D. The CoordinateSpace3D protocol works by first converting a value in Sparky’s coordinate space to a coordinate space shared by both RealityKit and SwiftUI. After that, it converts from the shared space into Bolts’ coordinate space, while taking low-level details like points-to-meter conversion and axis direction into account.

In Sparky's Model3D view, whenever the view geometry changes, the system calls my `onGeometryChange3D` function. It passes in a GeometryProxy3D which I use to get its coordinate space. Then, I can convert my view's position to a point in the entity's space so I know how far apart my two robots are from each other. Now as Amanda brings Bolts and Sparky together, the sparks increase. As she pulls them apart, the sparks decrease.

Next, I'll teach these robots to move together and to coordinate their actions. I'll use SwiftUI driven animation for RealityKit components.

SwiftUI already comes with great animation APIs to implicitly animate changes to your view properties.

Here, I animate the Model3D view that Sparky is in. I send it over to the left when I toggle, and then it bounces back to the original position when I toggle again.

I’m adding an animation to my `isOffset` binding, and I'm specifying that I want an extra bouncy animation for it.

In visionOS 26, you can now use SwiftUI animation to implicitly animate changes to RealityKit components.

All you need to do is set a supported component on your entity within a RealityKit animation block and the framework will take care of the rest. There are two ways to associate an animation with a state change. From within a RealityView, you can use `content.animate()` to set new values for your components inside the animate block. RealityKit will use the animation associated with the SwiftUI transaction that triggered the `update` closure, which, in this case, is an extra bouncy animation.

The other way is to call the new Entity.animate() function, passing a SwiftUI animation, and a closure that sets new values for your components. Here, whenever the isOffset property changes, I send Sparky left or right using the entity’s position.
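
A minimal sketch of that call, with the exact parameter labels assumed:

```swift
import SwiftUI
import RealityKit

// Hedged sketch: drive a RealityKit transform change with a SwiftUI animation
// via the new Entity.animate(); the signature shown here is an assumption.
func toggleOffset(for sparky: Entity, isOffset: Bool) {
    sparky.animate(.bouncy) {
        // Setting position inside the animate block implicitly animates the
        // Transform component to the new value.
        sparky.position.x = isOffset ? -0.5 : 0
    }
}
```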

Setting the position inside the animate block will begin an implicit animation of the Transform component, causing the entity to move smoothly to the new position. The power of implicit animation really shines when I combine it with the Object Manipulation API that Amanda introduced. I can use a SwiftUI animation to implement a custom release behavior for Bolts. I’m first going to disable the default release behavior for object manipulation by setting it to .stay.

Then, I will subscribe to the WillRelease event for the manipulation interaction.

And when the object is about to be released, I will snap Sparky back by setting its transform to identity, which resets the scale, translation, and rotation of the entity. Since I’m modifying Sparky’s transform inside the animate block, Sparky will bounce back to its default position. Now Sparky's animation back to its original position is much more fun! All these built-in RealityKit components support implicit animations, including the Transform, the Audio components, and the Model and Light components, which have color properties! Sparky and Bolts had quite a journey. It's so great to see the power of SwiftUI and RealityKit working together.

With this connection, you're also empowered to develop truly exceptional spatial apps, fostering a real connection between the virtual and the physical! Imagine the possibilities as you seamlessly integrate SwiftUI components into your RealityKit scenes, and as entities dynamically drive changes to your SwiftUI state.

Just like Sparky and Bolts, we hope you're inspired to connect SwiftUI and RealityKit in ways we haven't even imagined yet. Let's build the future together!

Bring advanced speech-to-text to your app with SpeechAnalyzer

Discover the new SpeechAnalyzer API for speech to text. We'll learn about the Swift API and its capabilities, which power features in Notes, Voice Memos, Journal, and more. We'll dive into details about how speech to text works and how SpeechAnalyzer and SpeechTranscriber can enable you to create exciting, performant features. And you'll learn how to incorporate SpeechAnalyzer and live transcription into your app with a code-along.

Transcript

Hello! I’m Donovan, an engineer on the Speech framework team and I’m Shantini, an engineer on the Notes team. This year, we’re excited to bring you the next evolution of our speech-to-text API and technology: SpeechAnalyzer. In this session, we’ll discuss the SpeechAnalyzer API and its most important concepts. We’ll also briefly discuss some of the new capabilities of the model behind the API. And finally, we’ll show a live-coding demo of how to use the API. SpeechAnalyzer is already powering features across many system apps, such as Notes, Voice Memos, Journal, and more.

And when we combine SpeechAnalyzer with Apple Intelligence, we create incredibly powerful features such as Call Summarization. Later, I’ll show you how to use the API to build your own live transcription feature. But first, Donovan will give you an overview of the new SpeechAnalyzer API. Speech-to-text, also known as automatic speech recognition or ASR, is a versatile technology that allows you to create great user experiences using live or recorded speech by converting it to a textual form that a device can easily display or interpret. Apps can store, search, or transmit that text in real time or pass it on to a text-based large language model.

In iOS 10, we introduced SFSpeechRecognizer. That class gave you access to the speech-to-text model powering Siri. It worked well for short-form dictation and it could use Apple servers on resource-constrained devices, but it didn’t address some use cases as well as we, or you, would have liked, and it relied on the user to add languages.

So now, in iOS 26, we’re introducing a new API for all our platforms called SpeechAnalyzer that supports more use cases and supports them better. The new API leverages the power of Swift to perform speech-to-text processing and manage model assets on the user’s device with very little code. Along with the API, we’re providing a new speech-to-text model that is already powering application features across our platforms. The new model is both faster and more flexible than the one previously available through SFSpeechRecognizer. It’s good for long-form and distant audio, such as lectures, meetings, and conversations. Because of these improvements, Apple is using this new model (and the new API) in Notes and the other applications we mentioned earlier. You can use these new capabilities to build your own application with the same sort of speech-to-text features that Notes and our other applications provide. But first, let’s check out the design of the API.

The API consists of the SpeechAnalyzer class along with several other classes. The SpeechAnalyzer class manages an analysis session. You can add a module class to the session to perform a specific type of analysis. Adding a transcriber module to the session makes it a transcription session that performs speech-to-text processing. You pass audio buffers to the analyzer instance, which then routes them through the transcriber and its speech-to-text model. The model predicts the text that corresponds to the spoken audio and returns that text, along with some metadata, to your application.

This all works asynchronously. Your application can add audio as it becomes available in one task and display or further process the results independently in another task. Swift’s async sequences buffer and decouple the input and results.

The “Meet AsyncSequence” session from WWDC21 covers how to provide an input sequence and how to read the results sequence.

To correlate the input with the results, the API uses the timecode of the corresponding audio. In fact, all API operations are scheduled using timecodes on the audio timeline, which makes their order predictable and independent of when they’re called. The timecodes are precise down to an individual audio sample.

Note how the transcriber delivers results in sequence, each of which covers its own range of audio without overlapping. This is normally how it works but, as an optional feature, you can make transcription iterative within a range of audio. You may want to do this to provide more immediate feedback in your application’s UI. You can show a rough result immediately and then show better iterations of that result over the next few seconds. We call the immediate rough results "volatile results". They’re delivered almost as soon as they’re spoken but they are less accurate guesses. However, the model improves its transcription as it gets more audio with more context. Eventually, the result will be as good as it can be, and the transcriber delivers one last finalized result. Once it does that, the transcriber won’t deliver any more results for this range of audio, and it moves on to the next range. Note how the timecodes show that later, improved results replace earlier results. This only happens when you enable volatile results. Normally, the transcriber only delivers finalized results, and none of them replace earlier results.

You can build a transcription feature in just one function if all you need to do is read the file and return the transcription. That’s a job that doesn’t need volatile result handling or much concurrency. Here’s the function. Here, we create the transcriber module. We tell it the locale that we want to transcribe into. It doesn’t have any results yet but we’ll read them as they come in and use the AsyncSequence version of reduce to concatenate them. We’ll do this in the background using async let. Here, we create the analyzer and add the transcriber module, then we start analyzing the file. The analyzeSequence method reads from the file and adds its audio to an input sequence. When the file has been read, we tell the analyzer to finish up because we aren’t planning to work on any additional audio. Finally, we return the transcription that we’ve been assembling in the background. That’ll be the spoken words in the file, in the form of a single attributed string, and we’re done.
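
Reconstructed from that walkthrough, the function might look roughly like the sketch below. SpeechAnalyzer, SpeechTranscriber, and analyzeSequence are the names used in the session, but the initializers, preset name, and finishing call are assumptions, and a child Task stands in for the async let described above.

```swift
import AVFoundation
import Foundation
import Speech

// Hedged sketch of the single-function, file-based transcription flow.
func transcribe(file url: URL, locale: Locale) async throws -> AttributedString {
    // The transcriber module, configured with the locale to transcribe into (preset name assumed).
    let transcriber = SpeechTranscriber(locale: locale, preset: .offlineTranscription)

    // Concatenate results in the background as they arrive.
    let transcript = Task {
        try await transcriber.results.reduce(AttributedString()) { partial, result in
            partial + result.text
        }
    }

    // The analyzer session routes the file's audio through the transcriber module.
    let analyzer = SpeechAnalyzer(modules: [transcriber])
    let audioFile = try AVAudioFile(forReading: url)

    // Feed the file into the session, then tell the analyzer to finish up,
    // since no more audio is coming.
    if let lastSample = try await analyzer.analyzeSequence(from: audioFile) {
        try await analyzer.finalizeAndFinish(through: lastSample)
    }

    // The spoken words in the file, as a single attributed string.
    return try await transcript.value
}
```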

So now I’ve covered the concepts and basic usage of the API. You add modules to an analysis session to perform, say, transcription. It works concurrently and asynchronously, decoupling audio input from results. You correlate audio, results, and operations using the session’s audio timeline. Some of those results are volatile, if you want them to be, and the rest are final and won’t change. And I showed how the pieces fit together in a one-function use case.

Later, Shantini will demonstrate how you can expand that one function’s work across different views, models, and view models. She’ll show you several methods and properties of the SpeechAnalyzer and Transcriber classes that you can use to handle some common needs, and you can read about these in the documentation as well. But now, we’d like to describe some of the advantages of the SpeechTranscriber class’s new speech-to-text model.

SpeechTranscriber is powered by a brand new model engineered by Apple to support a broad spectrum of use cases. We wanted to create a model that could support long-form and conversational use cases where some speakers might not be close to the mic, such as recording a meeting. We also wanted to enable live transcription experiences that demand low latency without sacrificing accuracy or readability, and we wanted to keep speech private. Our new, on-device model achieves all of that. We worked closely with internal partners to design a great developer experience for you, and now you can support the same use cases in your own applications.

With SpeechTranscriber, you gain a powerful speech-to-text model that you don’t have to procure and manage yourself. Simply install the relevant model assets via the new AssetInventory API. You can download them when needed. The model is retained in system storage and does not increase the download or storage size of your application, nor does it increase the run-time memory size. It operates outside of your application’s memory space, so you don’t have to worry about exceeding the size limit. And we constantly improve the model, so the system will automatically install updates as they become available.

SpeechTranscriber can currently transcribe these languages, with more to come, and is available for all platforms but watchOS, with certain hardware requirements. If you need an unsupported language or device, we also offer a second transcriber class: DictationTranscriber. It supports the same languages, speech-to-text model, and devices as iOS 10’s on-device SFSpeechRecognizer. But, improving on SFSpeechRecognizer, you will NOT need to tell your users to go into Settings and turn on Siri or keyboard dictation for any particular language.

So that’s your introduction to the new API and model. It was pretty abstract, but we’ll get concrete now. Let’s go to Shantini, who will show you how to integrate SpeechAnalyzer into your app.

Thanks for that great overview, Donovan! You may have seen the amazing features that we added to Notes in iOS 18 to record and transcribe phone calls, live audio, and recorded audio. Additionally, we integrated these features with Apple Intelligence, resulting in useful and time-saving summaries. We worked closely with the Speech team to ensure that SpeechAnalyzer and SpeechTranscriber would enable us to deliver a high-quality Notes feature. SpeechTranscriber is a great fit because it’s fast, accurate even at longer distances, and on device.
One of our additional goals was to enable you, the developer, to build features just like the ones we added to Notes and customize them to meet the needs of your users. I’d love to get you started with that. Let’s check out an app I’m building with a live transcription feature. My app is meant for kids and records and transcribes bedtime stories, allowing you to play them back. Here are the transcription results in real time.

And, when you play the audio back, the corresponding segment of text is highlighted, so that they can follow along with the story. Let’s check out the project setup.

In my sample app code, I have a Recorder class, and a SpokenWordTranscriber class. I’ve made them both observable.

I also made this Story model to encapsulate our transcript information and other relevant details for display. Finally, I have our transcript view, with live transcription and playback views, and recording and playback buttons. It also handles recording and playback state. Let’s first check out transcription. We can set up live transcription in 3 easy steps: configure the SpeechTranscriber; ensure the model is present; and handle the results. We set up our SpeechTranscriber by initializing it with a locale object and options that we need. The locale’s language code corresponds to the language in which we want to receive transcription. As Donovan highlighted before, volatile results are realtime guesses, and finalized results are the best guesses. Here, both of those are used, with the volatile results in a lighter opacity, replaced by the finalized results when they come in. To configure this in our SpeechTranscriber, we’re going to set these option types. I'm adding the audioTimeRange option so that we get timing information.

That will allow us to sync the playback of the text to the audio.

There are also pre-configured presets that offer different options.

We’re now going to set up the SpeechAnalyzer object with our SpeechTranscriber module.

This unlocks the ability for us to get the audio format that we need to use.

We're also now able to ensure our speech-to-text model is in place.

To finish up our SpeechTranscriber setup, we want to save references to the AsyncStream input and start the analyzer.

Now that we’re done setting up the SpeechTranscriber, let’s check out how we get the models. In our ensure model method, we're going to add checks for whether SpeechTranscriber supports transcription for the language we want.

We’ll also check whether the language is downloaded and installed.

If the language is supported but not downloaded, we can go ahead and make a request to AssetInventory to download support.

Remember that transcription is entirely on device but the models need to be fetched. The download request includes a progress object that you can use to let your user know what’s happening.

Your app can have language support for a limited number of languages at a time. If you exceed the limit, you can ask AssetInventory to deallocate one or more of them to free up a spot.

Now that we’ve gotten our models, let’s get to the fun part - the results.

Next to our SpeechTranscriber setup code, I'm creating a task and saving a reference to it.

I'm also creating two variables to track our volatile and finalized results.

The SpeechTranscriber returns results via AsyncStream. Each result object has a few different fields.

The first one we want to get is text, which is represented by an AttributedString. This is your transcription result for a segment of audio. Each time we get a result back in the stream, we'll want to check whether it’s volatile or finalized by using the isFinal property.

If it’s volatile, we’ll save it to volatileTranscript.

Whenever we get a finalized result, we clear out the volatileTranscript and add the result to finalizedTranscript.

If we don’t clear out our volatile results, we could end up with duplicates.

Whenever we get a finalized result, I'm also going to write that out to our Story model to be used later.

I'm also setting some conditional formatting using the SwiftUI AttributedString APIs.

This will allow us to visualize the transcription results as they transition from volatile to finalized.

If you’re wondering how I’ll get the timing data of the transcript, it’s conveniently part of the attributed string.

Each run has an audioTimeRange attribute represented as CMTimeRange. I’ll use that in my view code to highlight the correct segment. Let’s next check out how to set up our audio input.

In my record function, which I call as the user presses Record, I'm going to request audio permission and start the AVAudioSession. We should also ensure that the app is configured to use the microphone in the project settings.

I am then going to call my setUpTranscriber function that we created before.

Finally, I'm going to handle input from my audio stream. Let’s check out how I set that up. A few things are happening here. We’re configuring the AVAudioEngine to return an AsyncStream and passing the incoming buffers to the stream.

We’re also writing the audio to disk.

Finally, we’re starting the audioEngine.

Back in my Record function, I am passing that AsyncStream input to the transcriber.

Audio sources have different output formats and sample rates. SpeechTranscriber gave us a bestAvailableAudioFormat that we can use.

I'm passing our audio buffers through a conversion step to ensure that the format matches bestAvailableAudioFormat.
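
The conversion helper itself isn't part of the session's code listing, so here is a minimal sketch of what that step could look like with AVAudioConverter. The BufferConverter name and convertBuffer(_:to:) signature are assumptions for illustration, not the sample's actual implementation; it reuses the sample's TranscriptionError type.

import AVFoundation

// Hypothetical helper: converts an incoming buffer to the analyzer's preferred format.
final class BufferConverter {
    func convertBuffer(_ buffer: AVAudioPCMBuffer, to format: AVAudioFormat) throws -> AVAudioPCMBuffer {
        // Nothing to do if the formats already match.
        if buffer.format == format { return buffer }

        let ratio = format.sampleRate / buffer.format.sampleRate
        let capacity = AVAudioFrameCount(Double(buffer.frameLength) * ratio) + 1
        guard let converter = AVAudioConverter(from: buffer.format, to: format),
              let output = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: capacity) else {
            throw TranscriptionError.invalidAudioDataType   // error type from the sample project
        }

        var conversionError: NSError?
        var providedInput = false
        _ = converter.convert(to: output, error: &conversionError) { _, inputStatus in
            // Feed the single source buffer once, then report that no more data is available for now.
            if providedInput {
                inputStatus.pointee = .noDataNow
                return nil
            }
            providedInput = true
            inputStatus.pointee = .haveData
            return buffer
        }
        if let conversionError { throw conversionError }
        return output
    }
}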

I'll then route the AsyncStream to the inputBuilder object from the SpeechTranscriber. When we stop recording, we want to do a few things. I stop the audio engine and the transcriber. It’s important to cancel your tasks and also to call finalize on your analyzer stream. This will ensure that any volatile results get finalized. Let's check out how we can connect all of this to our view.
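
First, though, here is a rough sketch of that stop sequence, collapsed into one function for illustration. In the sample these pieces live across the Recorder and SpokenWordTranscriber classes, so treat the property names here as assumptions borrowed from the listings below.

func stopRecording() async throws {
    // Stop capturing audio and end the buffer stream feeding the transcriber.
    audioEngine.stop()
    audioEngine.inputNode.removeTap(onBus: 0)
    outputContinuation?.finish()

    // Tell the analyzer there is no more input so any remaining volatile results get finalized.
    inputBuilder?.finish()
    try await analyzer?.finalizeAndFinishThroughEndOfInput()

    // Finally, cancel the task that was reading results.
    recognizerTask?.cancel()
    recognizerTask = nil
}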

My TranscriptView has a binding to the current story, and one to our SpokenWordTranscriber. If we’re recording, we show a concatenation of the finalized transcript with the volatile transcript that we’re observing from our SpokenWordTranscriber class. For playback, we show the final transcript from the data model. I’ve added a method to break up the sentences to make it visually less cluttered.
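
As a simplified sketch (not the sample's exact view code), the body could switch between the live and playback text like this; the isRecording binding and the Story's text property are assumptions, and the sentence-splitting method is omitted.

import SwiftUI

struct TranscriptView: View {
    @Binding var story: Story
    @Binding var isRecording: Bool
    let transcriber: SpokenWordTranscriber

    var body: some View {
        ScrollView {
            if isRecording {
                // Live: finalized text followed by the lighter-styled volatile text still in flux.
                Text(transcriber.finalizedTranscript + transcriber.volatileTranscript)
            } else {
                // Playback: the final transcript stored in the data model ("text" is a placeholder name).
                Text(story.text)
            }
        }
        .padding()
    }
}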

A key feature I mentioned was highlighting each word as it’s played back. I’m using some helper methods to calculate whether each run should be highlighted, based on its audioTimeRange attribute and the current playback time.
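
Those helpers aren't shown in the code listing, but a minimal sketch of the check could look like this, assuming each run exposes the audioTimeRange attribute directly; the function name and parameters are illustrative.

import Foundation
import Speech
import CoreMedia

// True when this run's audio range contains the current playback time.
func shouldHighlight(_ run: AttributedString.Runs.Run, at currentTime: CMTime) -> Bool {
    guard let range = run.audioTimeRange else { return false }   // runs without timing are never highlighted
    return range.containsTime(currentTime)
}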

SpeechTranscriber’s accuracy is great for so many reasons, not least of which is the ability to use Apple Intelligence to do useful transformations on the output.

Here, I’m using the new FoundationModels API to generate a title for my story when it’s done. The API makes it super simple to create a clever title, so I don’t have to think of one! To learn more about the FoundationModels APIs, check out the session titled ‘Meet the Foundation Models Framework’ .
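
As a rough sketch of that call, assuming a plain text prompt built from the finished transcript is enough (the prompt wording and the suggestTitle name are illustrative):

import FoundationModels

// Ask the on-device model for a short title once the story is finished.
func suggestTitle(for transcript: String) async throws -> String {
    let session = LanguageModelSession()
    let response = try await session.respond(
        to: "Suggest a short, whimsical title for this bedtime story: \(transcript)"
    )
    return response.content
}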

Let's see our feature in action! I'm going to tap the Plus button to create a new story.

Then, I’ll start recording. "Once upon a time in the mystical land of Magenta, there was a little girl named Delilah who lived in a castle on the hill. Delilah spent her days exploring the forest and tending to the animals there." When the user is done, they can play it back and each word is highlighted in time with the audio.

"Once upon a time, in the mystical land of Magenta, there was a little girl named Delilah who lived in a castle on the hill.

Delilah spent her days exploring the forest and tending to the animals there." SpeechAnalyzer and SpeechTranscriber enabled us to build a whole app with very little startup time. To learn more, check out the Speech Framework documentation, which includes the sample app that we just created. And that’s SpeechAnalyzer! We know you’ll build amazing features with it. Thanks so much for joining us!

Code

5:21 - Transcribe a file

// Set up transcriber. Read results asynchronously, and concatenate them together.
let transcriber = SpeechTranscriber(locale: locale, preset: .offlineTranscription)
async let transcriptionFuture = try transcriber.results
    .reduce("") { str, result in str + result.text }

let analyzer = SpeechAnalyzer(modules: [transcriber])
if let lastSample = try await analyzer.analyzeSequence(from: file) {
    try await analyzer.finalizeAndFinish(through: lastSample)
} else {
    await analyzer.cancelAndFinishNow()
}
    
return try await transcriptionFuture

11:02 - Speech Transcriber setup (volatile results + timestamps)

func setUpTranscriber() async throws {
        transcriber = SpeechTranscriber(locale: Locale.current,
                                        transcriptionOptions: [],
                                        reportingOptions: [.volatileResults],
                                        attributeOptions: [.audioTimeRange])
    }

11:47 - Speech Transcriber setup (volatile results, no timestamps)

// transcriber = SpeechTranscriber(locale: Locale.current, preset: .progressiveLiveTranscription)

11:54 - Set up SpeechAnalyzer

func setUpTranscriber() async throws {
    transcriber = SpeechTranscriber(locale: Locale.current,
                                    transcriptionOptions: [],
                                    reportingOptions: [.volatileResults],
                                    attributeOptions: [.audioTimeRange])
    
    guard let transcriber else {
        throw TranscriptionError.failedToSetupRecognitionStream
    }

    analyzer = SpeechAnalyzer(modules: [transcriber])
}

12:00 - Get audio format

func setUpTranscriber() async throws {
    transcriber = SpeechTranscriber(locale: Locale.current,
                                    transcriptionOptions: [],
                                    reportingOptions: [.volatileResults],
                                    attributeOptions: [.audioTimeRange])
    
    guard let transcriber else {
        throw TranscriptionError.failedToSetupRecognitionStream
    }

    analyzer = SpeechAnalyzer(modules: [transcriber])
    
    self.analyzerFormat = await SpeechAnalyzer.bestAvailableAudioFormat(compatibleWith: [transcriber])
}

12:06 - Ensure models

func setUpTranscriber() async throws {
    transcriber = SpeechTranscriber(locale: Locale.current,
                                    transcriptionOptions: [],
                                    reportingOptions: [.volatileResults],
                                    attributeOptions: [.audioTimeRange])
    
    guard let transcriber else {
        throw TranscriptionError.failedToSetupRecognitionStream
    }

    analyzer = SpeechAnalyzer(modules: [transcriber])
    
    self.analyzerFormat = await SpeechAnalyzer.bestAvailableAudioFormat(compatibleWith: [transcriber])
    
    do {
        try await ensureModel(transcriber: transcriber, locale: Locale.current)
    } catch let error as TranscriptionError {
        print(error)
        return
    }
}

12:15 - Finish SpeechAnalyzer setup

func setUpTranscriber() async throws {
    transcriber = SpeechTranscriber(locale: Locale.current,
                                    transcriptionOptions: [],
                                    reportingOptions: [.volatileResults],
                                    attributeOptions: [.audioTimeRange])
    
    guard let transcriber else {
        throw TranscriptionError.failedToSetupRecognitionStream
    }

    analyzer = SpeechAnalyzer(modules: [transcriber])
    
    self.analyzerFormat = await SpeechAnalyzer.bestAvailableAudioFormat(compatibleWith: [transcriber])
    
    do {
        try await ensureModel(transcriber: transcriber, locale: Locale.current)
    } catch let error as TranscriptionError {
        print(error)
        return
    }
    
    (inputSequence, inputBuilder) = AsyncStream<AnalyzerInput>.makeStream()
    
    guard let inputSequence else { return }
    
    try await analyzer?.start(inputSequence: inputSequence)
}

12:30 - Check for language support

public func ensureModel(transcriber: SpeechTranscriber, locale: Locale) async throws {
        guard await supported(locale: locale) else {
            throw TranscriptionError.localeNotSupported
        }
    }
    
    func supported(locale: Locale) async -> Bool {
        let supported = await SpeechTranscriber.supportedLocales
        return supported.map { $0.identifier(.bcp47) }.contains(locale.identifier(.bcp47))
    }

    func installed(locale: Locale) async -> Bool {
        let installed = await Set(SpeechTranscriber.installedLocales)
        return installed.map { $0.identifier(.bcp47) }.contains(locale.identifier(.bcp47))
    }

12:39 - Check for model installation

public func ensureModel(transcriber: SpeechTranscriber, locale: Locale) async throws {
        guard await supported(locale: locale) else {
            throw TranscriptionError.localeNotSupported
        }
        
        if await installed(locale: locale) {
            return
        } else {
            try await downloadIfNeeded(for: transcriber)
        }
    }
    
    func supported(locale: Locale) async -> Bool {
        let supported = await SpeechTranscriber.supportedLocales
        return supported.map { $0.identifier(.bcp47) }.contains(locale.identifier(.bcp47))
    }

    func installed(locale: Locale) async -> Bool {
        let installed = await Set(SpeechTranscriber.installedLocales)
        return installed.map { $0.identifier(.bcp47) }.contains(locale.identifier(.bcp47))
    }

12:52 - Download the model

func downloadIfNeeded(for module: SpeechTranscriber) async throws {
        if let downloader = try await AssetInventory.assetInstallationRequest(supporting: [module]) {
            self.downloadProgress = downloader.progress
            try await downloader.downloadAndInstall()
        }
    }

13:19 - Deallocate an asset

func deallocate() async {
        let allocated = await AssetInventory.allocatedLocales
        for locale in allocated {
            await AssetInventory.deallocate(locale: locale)
        }
    }

13:31 - Speech result handling

recognizerTask = Task {
            do {
                for try await case let result in transcriber.results {
                    let text = result.text
                    if result.isFinal {
                        finalizedTranscript += text
                        volatileTranscript = ""
                        updateStoryWithNewText(withFinal: text)
                        print(text.audioTimeRange)
                    } else {
                        volatileTranscript = text
                        volatileTranscript.foregroundColor = .purple.opacity(0.4)
                    }
                }
            } catch {
                print("speech recognition failed")
            }
        }

15:13 - Set up audio recording

func record() async throws {
        self.story.url.wrappedValue = url
        guard await isAuthorized() else {
            print("user denied mic permission")
            return
        }
#if os(iOS)
        try setUpAudioSession()
#endif
        try await transcriber.setUpTranscriber()
                
        for await input in try await audioStream() {
            try await self.transcriber.streamAudioToTranscriber(input)
        }
    }

15:37 - Set up audio recording via AVAudioEngine

#if os(iOS)
    func setUpAudioSession() throws {
        let audioSession = AVAudioSession.sharedInstance()
        try audioSession.setCategory(.playAndRecord, mode: .spokenAudio)
        try audioSession.setActive(true, options: .notifyOthersOnDeactivation)
    }
#endif
    
    private func audioStream() async throws -> AsyncStream<AVAudioPCMBuffer> {
        try setupAudioEngine()
        audioEngine.inputNode.installTap(onBus: 0,
                                         bufferSize: 4096,
                                         format: audioEngine.inputNode.outputFormat(forBus: 0)) { [weak self] (buffer, time) in
            guard let self else { return }
            writeBufferToDisk(buffer: buffer)
            self.outputContinuation?.yield(buffer)
        }
        
        audioEngine.prepare()
        try audioEngine.start()
        
        return AsyncStream(AVAudioPCMBuffer.self, bufferingPolicy: .unbounded) {
            continuation in
            outputContinuation = continuation
        }
    }

16:01 - Stream audio to SpeechAnalyzer and SpeechTranscriber

func streamAudioToTranscriber(_ buffer: AVAudioPCMBuffer) async throws {
        guard let inputBuilder, let analyzerFormat else {
            throw TranscriptionError.invalidAudioDataType
        }
        
        let converted = try self.converter.convertBuffer(buffer, to: analyzerFormat)
        let input = AnalyzerInput(buffer: converted)
        
        inputBuilder.yield(input)
    }

16:29 - Finalize the transcript stream

try await analyzer?.finalizeAndFinishThroughEndOfInput()

Summary

  • 0:00 - Introduction

  • Apple is introducing SpeechAnalyzer, a new speech-to-text API and technology in iOS 26, replacing SFSpeechRecognizer introduced in iOS 10. SpeechAnalyzer, built with Swift, is faster, more flexible, and supports long-form and distant audio, making it suitable for various use cases such as lectures, meetings, and conversations. The new API enables you to create live transcription features and is already powering system apps like Notes, Voice Memos, and Journal. When combined with Apple Intelligence, it facilitates powerful features like Call Summarization.

  • 2:41 - SpeechAnalyzer API

  • The API design centers around the SpeechAnalyzer class, which manages analysis sessions. By adding a transcriber module, the session becomes a transcription session capable of performing speech-to-text processing. Audio buffers are passed to the analyzer instance, which routes them through the transcriber's speech-to-text model. The model predicts text and metadata, which are returned asynchronously to the application using Swift's async sequences. All API operations are scheduled using timecodes on the audio timeline, ensuring predictable order and independence. The transcriber delivers results in sequence, covering specific audio ranges. An optional feature allows iterative transcription within a range, providing immediate, though less accurate, "volatile results" for faster UI feedback, which are later refined into finalized results. Looking forward in this presentation, a use case is discussed that demonstrates how to create a transcriber module, set the locale, read audio from a file, concatenate results using async sequences, and return the final transcription as an attributed string. The API enables concurrent and asynchronous processing, decoupling audio input from results, and can be expanded to handle more complex needs across different views, models, and view models, which is demonstrated later.

  • 7:03 - SpeechTranscriber model

  • Apple developed a new speech-to-text model for the SpeechTranscriber class, designed to handle various scenarios such as long-form recordings, meetings, and live transcriptions with low latency and high accuracy. The model operates entirely on-device, ensuring privacy and efficiency. It does not increase the app's size or memory usage and automatically updates. You can easily integrate the model into your applications using the AssetInventory API. The SpeechTranscriber class currently supports several languages and is available across most Apple platforms, with a fallback option, DictationTranscriber, provided for unsupported languages or devices.

  • 9:06 - Build a speech-to-text feature

  • In iOS 18, the Notes app has been enhanced with new features that allow people to record and transcribe phone calls, live audio, and recorded audio. These features are integrated with Apple Intelligence to generate summaries. The Speech team developed SpeechAnalyzer and SpeechTranscriber, enabling high-quality, on-device transcription that is fast and accurate, even at distances. You can now use these tools to create your own customized transcription features. An example app is designed for kids; it records and transcribes bedtime stories. The app displays real-time transcription results and highlights the corresponding text segment during audio playback. To implement live transcription in an app, follow three main steps: configure the SpeechTranscriber with the appropriate locale and options, ensure the necessary speech-to-text model is downloaded and installed on the device, and then handle the transcription results as they are received via an AsyncStream. The results include both volatile (realtime guesses) and finalized text, allowing for smooth syncing between text and audio playback. When a finalized result is obtained, the 'volatileTranscript' is cleared, and the result is added to the 'finalizedTranscript' to prevent duplicates. The finalized result is also written to the Story model for later use and visualized with conditional formatting using SwiftUI AttributedString APIs. Set up audio input by requesting permission, starting the 'AVAudioSession', and configuring the AVAudioEngine to return an AsyncStream. The audio is written to disk and passed to the transcriber after being converted to the best available audio format. Upon stopping recording, the audio engine and transcriber are stopped, and any volatile results are finalized. The 'TranscriptView' displays the concatenation of finalized and volatile transcripts during recording and the final transcript from the data model during playback, with words highlighted in time with the audio. In the example app, Apple Intelligence is utilized to generate a title for the story using the FoundationModels API showing how you can use Apple Intelligence to perform useful transformations on the speech-to-text output. The Speech Framework enables the development of this app with minimal startup time, and further details can be found in its documentation.

Bring Swift Charts to the third dimension

Learn how to bring your 2D Swift Charts to the third dimension with Chart3D and visualize your data sets from completely new perspectives. Plot your data in 3D, visualize mathematical surfaces, and customize everything from the camera to the materials to make your 3D charts more intuitive and delightful. To get the most out of this session, we recommend being familiar with creating 2D Swift Charts.

Chapters

Resources

Related Videos

WWDC24

WWDC22

Transcript

Hi, I’m Mike, and I'm an engineer on the System Experience team. Today, I'd like to discuss an exciting new feature in Swift Charts. Swift Charts is a framework for creating approachable and stunning charts.

Charts are used across Apple platforms for things like checking the temperature in Weather, showing battery usage in Settings, and graphing mathematical functions in Math Notes.

Using the building blocks available in Swift Charts, you can create 2D charts using components such as axis marks, labels, and line plots. But, the plot thickens! New in iOS, macOS, and visionOS 26, Swift Charts now supports 3D charts. That's right, you can now enable people to explore and understand datasets from completely new perspectives. In this session, I'll show how to bring a set of 2D charts to the third dimension by plotting in 3D, I'll cover how surface plots can be used for graphing three-dimensional math functions, and lastly, I'll go over ways you can customize your charts to make them more intuitive and delightful. But first, I have an important announcement: I love penguins! In fact, one of my favorite datasets contains measurements for hundreds of penguins across the Palmer Archipelago in Antarctica. This data includes the beak length, flipper length, and weight of each penguin, and is grouped by the species of penguin: Chinstrap, Gentoo, and Adélie.

I'll use Swift Charts to fish for insights in this data, and show how plotting in 3D can help visualize the differences between the penguin species. So, I built a 2D chart here to show the relationship between the flipper lengths and weights of the penguins. PointMark is used to plot each penguin's flipper length and weight, and foregroundStyle colors the points by species, and creates the legend in the corner. This is great, and Swift Charts made it easy to construct a chart. This chart shows that Chinstrap and Adélie penguins have similar flipper lengths and weights, while Gentoo penguins have longer flippers and weigh more.

The penguin dataset also includes beak length, so I also made a chart that plots the beak length and weight. This one shows that Chinstrap and Gentoo penguins have similar beak lengths, while Adélie penguins have shorter beaks.

Lastly, I made a chart for the beak lengths and flipper lengths, where it seems penguins with longer beaks tend to have longer flippers too.

Each of these 2D charts are great, and they provide good insights into the relationships between two properties at a time. However, Swift Charts can now take these charts to the next level, by making a single chart that contains all of this data.

And it's called Chart3D. Chart3D takes familiar concepts from 2D charts, such as scatter plots, and brings them into full 3D. To use a 3D chart, I'll change Chart to Chart3D.

PointMark works great in Chart3D, and it now takes a Z value. Here, I use the Beak Length.

I'll set the Z-axis label to "Beak Length" as well. That's it! With a few lines of code and Chart3D, I can immediately see the differences between the penguin species in a fun and interactive way.

I can use simple gestures to rotate the chart to precise angles and see three clusters of data points.

I can also view the chart from the sides to compare two properties at a time, similar to if I was viewing the chart in 2D. 3D charts work great when the shape of the data is more important than the exact values. This can happen naturally when the data itself is three-dimensional, especially if it represents a physical position in 3D space. Also, interactivity is key to understanding three-dimensional datasets, so only consider 3D charts if requiring interaction enhances the experience in your app. When it comes to the best representation for your dataset, the choice between 2D and 3D isn't black and white.

PointMarks, RuleMarks, and RectangleMarks have all been updated for 3D charts. And now, unique to Chart3D, is SurfacePlot. Next, I'll take a deep dive into how SurfacePlot works. SurfacePlot is the three-dimensional extension of LinePlot. It plots a mathematical surface containing up to two variables in three dimensions.

The new SurfacePlot API is similar to the LinePlot API. It accepts a closure that takes two doubles, and returns a double.

After entering a math expression in the closure, SurfacePlot evaluates the expression for different values of X and Z and creates a continuous surface of the computed Y values. These surfaces can be as intricate, or as simple, as you want.

To learn more about function plots via the LinePlot API, check out "Swift Charts: Vectorized and function plots" from WWDC24.

You know what? Now that I'm looking at the penguin dataset again, there appears to be a linear relationship between the beak length and flipper length of a penguin, and how much it weighs. That seems like a reasonable guess, but instead of winging it, I can use SurfacePlot to show a linear regression of the data.

I've defined a LinearRegression class that estimates a y value based on the independent x and z variables. More specifically, I set the linear regression to estimate the weight of a penguin based on the flipper length and beak length. This linear regression is then used in SurfacePlot to plot the estimated weights as a continuous surface.

Great! It does seem like there's a linear relationship in this data. The SurfacePlot shows a positive correlation between the flipper length and weight, and there's a slight positive correlation between the beak length and weight as well. Now, I'll go over some of the great ways to customize the style and behavior of Chart3D.

While I was interacting with the penguin dataset earlier, I noticed that changing the angle of the chart also changes the appearance of the data. This angle is great for showing the clusters of data points, while this angle works well for showing a linear relationship. These angles are known as the pose of the chart.

It's important to choose an initial pose that provides a good representation for your data.

For dynamic data where you don't know the values beforehand, try to choose an initial pose that works well for typical datasets of that type.

The pose of the chart is adjusted using the Chart3DPose modifier, and it takes a Chart3DPose.

I can set the pose to specific values, such as front, or I can define a custom pose. This initializer takes two parameters: the azimuth, which rotates the chart left and right, and the inclination, which tilts the chart up and down.

Next, notice how points near the back of the chart are the same size as points near the front of the chart. This can make it easier to compare sizes and distances across the chart regardless of depth.

It's also great for viewing charts from the side, as it effectively turns a 3D chart into a 2D one.

This is known as an orthographic camera projection.

Chart3D offers two camera projections: orthographic, which is the default behavior, and perspective. With Perspective projection, data points farther away appear smaller, and parallel lines converge. This allows for an immersive experience and can help with depth perception.

The camera projection is set using the chart3DCameraProjection modifier.

SurfacePlots have a few customization options for the surface styles as well. ForegroundStyle accepts gradients such as LinearGradient or EllipticalGradient, and it supports two new values: heightBased, which colors points on the surface based on the height of the surface at that point, and normalBased, which colors points on the surface based on the angle of the surface at that point.

There are many other modifiers available for Chart3D, some of which may be familiar from 2D charts. Use these to customize surface styles, PointMark symbols, the chart domain and axis marks, or the behavior of selection. By combining these view modifiers along with PointMark, RuleMark, RectangleMark, and SurfacePlot, there are all sorts of interesting charts that can be achieved. This is just the tip of the iceberg.

And, 3D charts work and look great on Vision Pro, where it's a natural fit for three dimensional data sets! That's some of the new 3D features coming to Swift Charts. Once you've decided 3D is a good representation for your data, try plotting with Chart3D to bring a new level of depth to your charts, and use Swift Charts' customization APIs to design your own approachable and stunning charts.

To learn about best practices for incorporating Swift Charts into your apps, check out "Design app experiences with charts" from WWDC22.

Thank you. I can't wait to see the types of charts that you bring to the third dimension.

Code

2:03 - A scatterplot of a penguin's flipper length and weight

// A scatterplot of a penguin's flipper length and weight

import SwiftUI
import Charts

struct PenguinChart: View {
  var body: some View {
    Chart(penguins) { penguin in
      PointMark(
        x: .value("Flipper Length", penguin.flipperLength),
        y: .value("Weight", penguin.weight)
      )
      .foregroundStyle(by: .value("Species", penguin.species))
    }
    .chartXAxisLabel("Flipper Length (mm)")
    .chartYAxisLabel("Weight (kg)")
    .chartXScale(domain: 160...240)
    .chartYScale(domain: 2...7)
    .chartXAxis {
      AxisMarks(values: [160, 180, 200, 220, 240]) {
        AxisTick()
        AxisGridLine()
        AxisValueLabel()
      }
    }
    .chartYAxis {
      AxisMarks(values: [2, 3, 4, 5, 6, 7]) {
        AxisTick()
        AxisGridLine()
        AxisValueLabel()
      }
    }
  }
}

2:39 - A scatterplot of a penguin's beak length and weight

// A scatterplot of a penguin's beak length and weight

import SwiftUI
import Charts

struct PenguinChart: View {
  var body: some View {
    Chart(penguins) { penguin in
      PointMark(
        x: .value("Beak Length", penguin.beakLength),
        y: .value("Weight", penguin.weight)
      )
      .foregroundStyle(by: .value("Species", penguin.species))
    }
    .chartXAxisLabel("Beak Length (mm)")
    .chartYAxisLabel("Weight (kg)")
    .chartXScale(domain: 30...60)
    .chartYScale(domain: 2...7)
    .chartXAxis {
      AxisMarks(values: [30, 40, 50, 60]) {
        AxisTick()
        AxisGridLine()
        AxisValueLabel()
      }
    }
    .chartYAxis {
      AxisMarks(values: [2, 3, 4, 5, 6, 7]) {
        AxisTick()
        AxisGridLine()
        AxisValueLabel()
      }
    }
  }
}

2:51 - A scatterplot of a penguin's beak length and flipper length

// A scatterplot of a penguin's beak length and flipper length

import SwiftUI
import Charts

struct PenguinChart: View {
  var body: some View {
    Chart(penguins) { penguin in
      PointMark(
        x: .value("Beak Length", penguin.beakLength),
        y: .value("Flipper Length", penguin.flipperLength)
      )
      .foregroundStyle(by: .value("Species", penguin.species))
    }
    .chartXAxisLabel("Beak Length (mm)")
    .chartYAxisLabel("Flipper Length (mm)")
    .chartXScale(domain: 30...60)
    .chartYScale(domain: 160...240)
    .chartXAxis {
      AxisMarks(values: [30, 40, 50, 60]) {
        AxisTick()
        AxisGridLine()
        AxisValueLabel()
      }
    }
    .chartYAxis {
      AxisMarks(values: [160, 180, 200, 220, 240]) {
        AxisTick()
        AxisGridLine()
        AxisValueLabel()
      }
    }
  }
}

3:28 - A scatterplot of a penguin's flipper length, beak length, and weight

// A scatterplot of a penguin's flipper length, beak length, and weight

import SwiftUI
import Charts

struct PenguinChart: View {
  var body: some View {
    let xLabel = "Flipper Length (mm)"
    let yLabel = "Weight (kg)"
    let zLabel = "Beak Length (mm)"

    Chart3D(penguins) { penguin in
      PointMark(
        x: .value("Flipper Length", penguin.flipperLength),
        y: .value("Weight", penguin.weight),
        z: .value("Beak Length", penguin.beakLength)
      )
      .foregroundStyle(by: .value("Species", penguin.species))
    }
    .chartXAxisLabel(xLabel)
    .chartYAxisLabel(yLabel)
    .chartZAxisLabel(zLabel)
    .chartXScale(domain: 160...240, range: -0.5...0.5)
    .chartYScale(domain: 2...7, range: -0.5...0.5)
    .chartZScale(domain: 30...60, range: -0.5...0.5)
    .chartXAxis {
      AxisMarks(values: [160, 180, 200, 220, 240]) {
        AxisTick()
        AxisGridLine()
        AxisValueLabel()
      }
    }
    .chartYAxis {
      AxisMarks(values: [2, 3, 4, 5, 6, 7]) {
        AxisTick()
        AxisGridLine()
        AxisValueLabel()
      }
    }
    .chartZAxis {
      AxisMarks(values: [30, 40, 50, 60]) {
        AxisTick()
        AxisGridLine()
        AxisValueLabel()
      }
    }
  }
}

5:19 - A surface plot showing mathematical functions (x * z)

// A surface plot showing mathematical functions

import SwiftUI
import Charts

struct SurfacePlotChart: View {
  var body: some View {
    Chart3D {
      SurfacePlot(x: "X", y: "Y", z: "Z") { x, z in
        // (Double, Double) -> Double
        x * z
      }
    }
  }
}

5:43 - A surface plot showing mathematical functions

// A surface plot showing mathematical functions

import SwiftUI
import Charts

struct SurfacePlotChart: View {
  var body: some View {
    Chart3D {
      SurfacePlot(x: "X", y: "Y", z: "Z") { x, z in
        // (Double, Double) -> Double
        (sin(5 * x) + sin(5 * z)) / 2
      }
    }
  }
}

5:46 - A surface plot showing mathematical functions (-z)

// A surface plot showing mathematical functions

import SwiftUI
import Charts

struct SurfacePlotChart: View {
  var body: some View {
    Chart3D {
      SurfacePlot(x: "X", y: "Y", z: "Z") { x, z in
        // (Double, Double) -> Double
        -z
      }
    }
  }
}

6:19 - Present a linear regression of the penguin data

// Present a linear regression of the penguin data

import SwiftUI
import Charts
import CreateML
import TabularData

final class LinearRegression: Sendable {
  let regressor: MLLinearRegressor

  init<Data: RandomAccessCollection>(
    _ data: Data,
    x xPath: KeyPath<Data.Element, Double>,
    y yPath: KeyPath<Data.Element, Double>,
    z zPath: KeyPath<Data.Element, Double>
  ) {
    let x = Column(name: "X", contents: data.map { $0[keyPath: xPath] })
    let y = Column(name: "Y", contents: data.map { $0[keyPath: yPath] })
    let z = Column(name: "Z", contents: data.map { $0[keyPath: zPath] })
    let data = DataFrame(columns: [x, y, z].map { $0.eraseToAnyColumn() })
    regressor = try! MLLinearRegressor(trainingData: data, targetColumn: "Y")
  }

  func callAsFunction(_ x: Double, _ z: Double) -> Double {
    let x = Column(name: "X", contents: [x])
    let z = Column(name: "Z", contents: [z])
    let data = DataFrame(columns: [x, z].map { $0.eraseToAnyColumn() })
    return (try? regressor.predictions(from: data))?.first as? Double ?? .nan
  }
}

let linearRegression = LinearRegression(
  penguins,
  x: \.flipperLength,
  y: \.weight,
  z: \.beakLength
)

struct PenguinChart: View {
  var body: some View {
    let xLabel = "Flipper Length (mm)"
    let yLabel = "Weight (kg)"
    let zLabel = "Beak Length (mm)"

    Chart3D {
      ForEach(penguins) { penguin in
        PointMark(
          x: .value("Flipper Length", penguin.flipperLength),
          y: .value("Weight", penguin.weight),
          z: .value("Beak Length", penguin.beakLength),
        )
        .foregroundStyle(by: .value("Species", penguin.species))
      }

      SurfacePlot(x: "Flipper Length", y: "Weight", z: "Beak Length") { flipperLength, beakLength in
        linearRegression(flipperLength, beakLength)
      }
      .foregroundStyle(.gray)
    }
    .chartXAxisLabel(xLabel)
    .chartYAxisLabel(yLabel)
    .chartZAxisLabel(zLabel)
    .chartXScale(domain: 160...240, range: -0.5...0.5)
    .chartYScale(domain: 2...7, range: -0.5...0.5)
    .chartZScale(domain: 30...60, range: -0.5...0.5)
    .chartXAxis {
      AxisMarks(values: [160, 180, 200, 220, 240]) {
        AxisTick()
        AxisGridLine()
        AxisValueLabel()
      }
    }
    .chartYAxis {
      AxisMarks(values: [2, 3, 4, 5, 6, 7]) {
        AxisTick()
        AxisGridLine()
        AxisValueLabel()
      }
    }
    .chartZAxis {
      AxisMarks(values: [30, 40, 50, 60]) {
        AxisTick()
        AxisGridLine()
        AxisValueLabel()
      }
    }
  }
}

7:50 - Adjust the initial chart pose (Default)

// Adjust the initial chart pose

import SwiftUI
import Charts

struct PenguinChart: View {
  @State var pose: Chart3DPose = .default

  var body: some View {
    let xLabel = "Flipper Length (mm)"
    let yLabel = "Weight (kg)"
    let zLabel = "Beak Length (mm)"

    Chart3D(penguins) { penguin in
      PointMark(
        x: .value("Flipper Length", penguin.flipperLength),
        y: .value("Weight", penguin.weight),
        z: .value("Beak Length", penguin.beakLength)
      )
      .foregroundStyle(by: .value("Species", penguin.species))
    }
    .chart3DPose($pose)
    .chartXAxisLabel(xLabel)
    .chartYAxisLabel(yLabel)
    .chartZAxisLabel(zLabel)
    .chartXScale(domain: 160...240, range: -0.5...0.5)
    .chartYScale(domain: 2...7, range: -0.5...0.5)
    .chartZScale(domain: 30...60, range: -0.5...0.5)
    .chartXAxis {
      AxisMarks(values: [160, 180, 200, 220, 240]) {
        AxisTick()
        AxisGridLine()
        AxisValueLabel()
      }
    }
    .chartYAxis {
      AxisMarks(values: [2, 3, 4, 5, 6, 7]) {
        AxisTick()
        AxisGridLine()
        AxisValueLabel()
      }
    }
    .chartZAxis {
      AxisMarks(values: [30, 40, 50, 60]) {
        AxisTick()
        AxisGridLine()
        AxisValueLabel()
      }
    }
  }
}

8:02 - Adjust the initial chart pose (Front)

// Adjust the initial chart pose

import SwiftUI
import Charts

struct PenguinChart: View {
  @State var pose: Chart3DPose = .front

  var body: some View {
    let xLabel = "Flipper Length (mm)"
    let yLabel = "Weight (kg)"
    let zLabel = "Beak Length (mm)"

    Chart3D(penguins) { penguin in
      PointMark(
        x: .value("Flipper Length", penguin.flipperLength),
        y: .value("Weight", penguin.weight),
        z: .value("Beak Length", penguin.beakLength)
      )
      .foregroundStyle(by: .value("Species", penguin.species))
    }
    .chart3DPose($pose)
    .chartXAxisLabel(xLabel)
    .chartYAxisLabel(yLabel)
    .chartZAxisLabel(zLabel)
    .chartXScale(domain: 160...240, range: -0.5...0.5)
    .chartYScale(domain: 2...7, range: -0.5...0.5)
    .chartZScale(domain: 30...60, range: -0.5...0.5)
    .chartXAxis {
      AxisMarks(values: [160, 180, 200, 220, 240]) {
        AxisTick()
        AxisGridLine()
        AxisValueLabel()
      }
    }
    .chartYAxis {
      AxisMarks(values: [2, 3, 4, 5, 6, 7]) {
        AxisTick()
        AxisGridLine()
        AxisValueLabel()
      }
    }
    .chartZAxis {
      AxisMarks(values: [30, 40, 50, 60]) {
        AxisTick()
        AxisGridLine()
        AxisValueLabel()
      }
    }
  }
}

8:09 - Adjust the initial chart pose (Custom)

// Adjust the initial chart pose

import SwiftUI
import Charts

struct PenguinChart: View {
  @State var pose = Chart3DPose(
    azimuth: .degrees(20),
    inclination: .degrees(7)
  )

  var body: some View {
    let xLabel = "Flipper Length (mm)"
    let yLabel = "Weight (kg)"
    let zLabel = "Beak Length (mm)"

    Chart3D(penguins) { penguin in
      PointMark(
        x: .value("Flipper Length", penguin.flipperLength),
        y: .value("Weight", penguin.weight),
        z: .value("Beak Length", penguin.beakLength)
      )
      .foregroundStyle(by: .value("Species", penguin.species))
    }
    .chart3DPose($pose)
    .chartXAxisLabel(xLabel)
    .chartYAxisLabel(yLabel)
    .chartZAxisLabel(zLabel)
    .chartXScale(domain: 160...240, range: -0.5...0.5)
    .chartYScale(domain: 2...7, range: -0.5...0.5)
    .chartZScale(domain: 30...60, range: -0.5...0.5)
    .chartXAxis {
      AxisMarks(values: [160, 180, 200, 220, 240]) {
        AxisTick()
        AxisGridLine()
        AxisValueLabel()
      }
    }
    .chartYAxis {
      AxisMarks(values: [2, 3, 4, 5, 6, 7]) {
        AxisTick()
        AxisGridLine()
        AxisValueLabel()
      }
    }
    .chartZAxis {
      AxisMarks(values: [30, 40, 50, 60]) {
        AxisTick()
        AxisGridLine()
        AxisValueLabel()
      }
    }
  }
}

9:15 - Adjust the initial chart pose and camera projection

// Adjust the initial chart pose and camera projection

import SwiftUI
import Charts

struct PenguinChart: View {
  @State var pose = Chart3DPose(
    azimuth: .degrees(20),
    inclination: .degrees(7)
  )

  var body: some View {
    let xLabel = "Flipper Length (mm)"
    let yLabel = "Weight (kg)"
    let zLabel = "Beak Length (mm)"

    Chart3D(penguins) { penguin in
      PointMark(
        x: .value("Flipper Length", penguin.flipperLength),
        y: .value("Weight", penguin.weight),
        z: .value("Beak Length", penguin.beakLength)
      )
      .foregroundStyle(by: .value("Species", penguin.species))
    }
    .chart3DPose($pose)
    .chart3DCameraProjection(.perspective)
    .chartXAxisLabel(xLabel)
    .chartYAxisLabel(yLabel)
    .chartZAxisLabel(zLabel)
    .chartXScale(domain: 160...240, range: -0.5...0.5)
    .chartYScale(domain: 2...7, range: -0.5...0.5)
    .chartZScale(domain: 30...60, range: -0.5...0.5)
    .chartXAxis {
      AxisMarks(values: [160, 180, 200, 220, 240]) {
        AxisTick()
        AxisGridLine()
        AxisValueLabel()
      }
    }
    .chartYAxis {
      AxisMarks(values: [2, 3, 4, 5, 6, 7]) {
        AxisTick()
        AxisGridLine()
        AxisValueLabel()
      }
    }
    .chartZAxis {
      AxisMarks(values: [30, 40, 50, 60]) {
        AxisTick()
        AxisGridLine()
        AxisValueLabel()
      }
    }
  }
}

9:24 - Customize the surface styles for a sinc function

// Customize the surface styles for a sinc function

import SwiftUI
import Charts

struct SurfacePlotChart: View {
  var body: some View {
    Chart3D {
      SurfacePlot(x: "X", y: "Y", z: "Z") { x, z in
        let h = hypot(x, z)
        return sin(h) / h
      }
    }
    .chartXScale(domain: -10...10, range: -0.5...0.5)
    .chartZScale(domain: -10...10, range: -0.5...0.5)
    .chartYScale(domain: -0.23...1, range: -0.5...0.5)
  }
}

9:29 - Customize the surface styles for a sinc function (EllipticalGradient)

// Customize the surface styles for a sinc function

import SwiftUI
import Charts

struct SurfacePlotChart: View {
  var body: some View {
    Chart3D {
      SurfacePlot(x: "X", y: "Y", z: "Z") { x, z in
        let h = hypot(x, z)
        return sin(h) / h
      }
      .foregroundStyle(EllipticalGradient(colors: [.red, .orange, .yellow, .green, .blue, .indigo, .purple]))
    }
    .chartXScale(domain: -10...10, range: -0.5...0.5)
    .chartZScale(domain: -10...10, range: -0.5...0.5)
    .chartYScale(domain: -0.23...1, range: -0.5...0.5)
  }
}

9:38 - Customize the surface styles for a sinc function (heightBased)

// Customize the surface styles for a sinc function

import SwiftUI
import Charts

struct SurfacePlotChart: View {
  var body: some View {
    Chart3D {
      SurfacePlot(x: "X", y: "Y", z: "Z") { x, z in
        let h = hypot(x, z)
        return sin(h) / h
      }
      .foregroundStyle(.heightBased)
    }
    .chartXScale(domain: -10...10, range: -0.5...0.5)
    .chartZScale(domain: -10...10, range: -0.5...0.5)
    .chartYScale(domain: -0.23...1, range: -0.5...0.5)
  }
}

9:47 - Customize the surface styles for a sinc function (normalBased)

// Customize the surface styles for a sinc function

import SwiftUI
import Charts

struct SurfacePlotChart: View {
  var body: some View {
    Chart3D {
      SurfacePlot(x: "X", y: "Y", z: "Z") { x, z in
        let h = hypot(x, z)
        return sin(h) / h
      }
      .foregroundStyle(.normalBased)
    }
    .chartXScale(domain: -10...10, range: -0.5...0.5)
    .chartZScale(domain: -10...10, range: -0.5...0.5)
    .chartYScale(domain: -0.23...1, range: -0.5...0.5)
  }
}

Bring your SceneKit project to RealityKit

Understand SceneKit deprecation and explore how to transition your 3D projects to RealityKit, Apple's recommended high-level 3D engine. We'll clarify what SceneKit deprecation means for your projects, compare key concepts between the two engines, and show you how to port a sample SceneKit game to RealityKit. We'll also explore the potential of RealityKit across all supported platforms to help you create amazing 3D experiences with your apps and games.

Chapters

Resources

Related Videos

WWDC25

WWDC24

WWDC23

Transcript

Hi, welcome to the session. I’m Max Cobb, a software engineer here at Apple. I want to start the session off by taking a short look into the past. If you’re a SceneKit developer, you might recognize a super fun and inspiring sample game that Apple shared many years ago. In this project, a red panda, whose name is also Max, runs around a volcanic scene, solving puzzles to save his friends while fighting off enemies. This sample was built using a framework called SceneKit, which allowed developers to create native 3D apps without needing to inflate their bundles with an entire third-party game engine. SceneKit has been around for many years. Actually since OS X Mountain Lion. That’s 13 years ago. A lot has changed in the Apple developer ecosystem since then. New coding paradigms, new hardware, and new platforms. In other words, there's been a big shift in the way that people build and interact with apps. SceneKit was designed with an architecture that made a lot of sense at the time. But as the ecosystem moved on, that made it very challenging to keep SceneKit up to date without introducing breaking changes to existing 3D applications. That's why this year Apple is officially deprecating SceneKit across all platforms.

But what does that mean to existing projects? Is my app still going to work? Do I need to rewrite it? What can I use instead? Let’s break down what this deprecation means to SceneKit developers. First, let me clarify, there’s no need for you to rewrite your apps. This is a soft deprecation, meaning that existing applications that use SceneKit will continue to work. However, if you're planning a new app or a significant update, SceneKit is not recommended.

Secondly, SceneKit is now entering maintenance mode. Apple will only fix critical bugs, so don't expect new features or optimizations moving forward.

At this time, there’s no plan to hard deprecate SceneKit, and Apple will give developers ample notice if this ever were to change.

But if you want to use Apple’s cutting-edge technology and industry-leading combination of hardware and software, the best option is RealityKit.

RealityKit is a modern, general-purpose, high-level 3D engine. It's both powerful and approachable, giving access to industry standards, empowering you to create beautiful, realistic renders with many advanced features that can make your app really shine. It’s the technology powering many third-party apps and system features, like Quick Look on the Mac, for previewing 3D models with the tap of a button. The brand new App Store Tags on iOS 26 is one of the big changes to make your apps even more discoverable. App Store Tags have stylized 3D icons that are rendered with RealityKit. Here’s an example where the App Store surfaced a list of games that were tagged for having great 3D graphics. Also new in iOS, Swift Charts use RealityKit to deliver a third dimension to your data visualizations.

And RealityKit is not just used on visionOS, it's the backbone of that platform. Everything on visionOS leverages RealityKit, including the buttons in your applications and the windows that hold them. RealityKit makes your app content feel like it’s really there alongside your real-world environment.

RealityKit also puts SwiftUI in the front seat, so SwiftUI developers can feel right at home. It’s supported on visionOS, iOS, macOS, and iPadOS, and new this year, RealityKit is making its way to a new platform: tvOS, bringing another destination for apps and other content built with RealityKit. This framework is packed with advanced and exciting new features this year too. To learn more, please check out the session “What’s New in RealityKit” from my colleague Lawrence Wong.

In today’s session, I will help you understand how RealityKit works when compared to SceneKit, what’s different, the new possibilities, and how you can get started coming from the SceneKit world. I also want to make my game more modern, ready for exciting new features and the platforms I’m thinking about. So during the session, I’ll be explaining the main steps I took porting this fun SceneKit game over to RealityKit. The full sample code for this project is available for you to download and explore.

So here's the agenda. I’ll start by explaining the conceptual differences between these two rendering engines and how to interact with them. Next, there’s no 3D game without some cool assets. I’ll explore ways to convert existing SceneKit assets into the format of choice for RealityKit. I’ll show you the tools you can use to compose RealityKit scenes, and start comparing the features in SceneKit and RealityKit side by side, starting with animations. I’ll then give my scene a stylish look by adding dynamic lights, add immersion and personality with custom audio, and bring it home with visual effects like particles and post-processing. That’s everything I need to bring a game like mine from SceneKit to RealityKit. Alright, let’s dive in.

In terms of concepts, I’ll focus on four key areas: architecture, coordinate systems, assets, and views.

Starting with architecture, SceneKit is node-based. That means that every object in the scene is a node, and these nodes have predefined behaviors in the form of properties.

Each node has properties for features, including geometry, animations, audio, physics, and lights.

For example, when I create an empty node, it has no geometry, no special properties, and is positioned at its parent's origin.

When I want to render Max in my game, I can assign the model to a node’s geometry property.

When Max walks around the scene, the app is assigning an animation player to the node and playing it.

The footsteps that you hear are coming from an audio player, also assigned to the same node.

So that’s how a node-based architecture works. Everything revolves around the node and its properties.
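
To make that concrete, here is a minimal SceneKit sketch, not taken from the sample, of a single node picking up geometry, an animation, and an audio player through its properties; the asset and key names are placeholders.

import SceneKit
import QuartzCore

let maxNode = SCNNode()

// Geometry lives on the node's geometry property (a sphere stands in for the Max model).
maxNode.geometry = SCNSphere(radius: 0.1)

// Animations are attached to the same node.
let spin = CABasicAnimation(keyPath: "rotation")
spin.toValue = SCNVector4(0, 1, 0, 2 * Float.pi)
spin.duration = 2
maxNode.addAnimation(spin, forKey: "spin")

// Audio playback is also attached to the node.
if let footsteps = SCNAudioSource(fileNamed: "footsteps.wav") {   // placeholder file name
    maxNode.addAudioPlayer(SCNAudioPlayer(source: footsteps))
}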

In contrast, RealityKit uses a design pattern called Entity Component System, or ECS for short.

This means that every object in the scene is an Entity, and I modify its behavior by attaching components to it. Every behavior in RealityKit is a component.
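
Here is the same idea as a minimal RealityKit sketch, again not from the sample: an entity starts out empty and gains behavior only through the components you attach.

import RealityKit

let maxEntity = Entity()

// Rendering comes from a ModelComponent (a sphere stands in for the Max model).
maxEntity.components.set(
    ModelComponent(mesh: .generateSphere(radius: 0.1), materials: [SimpleMaterial()])
)

// The transform is itself a component; position is a convenience over it.
maxEntity.position = [0, 0, -1]

// More behavior means more components, for example opting the entity in to input and gestures.
maxEntity.components.set(InputTargetComponent())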

Components range from the entity’s transform to advanced behaviors like physics, particles, and even portals, and the list of components keeps growing as RealityKit evolves. New components this year include image presentation and gesture components. These are the architectural differences to keep in mind when bringing SceneKit apps over to RealityKit. Next is an easy one. When coming to a new rendering engine, you can’t do much without understanding the coordinate system.

Coordinate systems are easy to translate between SceneKit and RealityKit because they are the same. In both systems the x-axis is pointing to the right, y-axis is pointing up, and z is directly toward the camera.

For assets, SceneKit is flexible in the model formats it accepts, and the engine serializes them into SCN files. This is convenient but it’s not a unified standard across the industry. Most of these formats are proprietary with varying feature support, bringing extra complexity to asset pipelines.

RealityKit on the other hand is designed around an open industry standard called Universal Scene Description or USD.

It’s a format introduced by Pixar in 2012 with the goal to solve for a few difficulties in the asset creation pipeline, including data exchange and collaboration. This is the standard of choice for Apple across all platforms. I’ll need to convert some SCN files to USD for my game, so I’ll dig into those details in just a moment. Before that, the last core difference I want to highlight is views. Views are fundamental building blocks that represent a portion of an app’s user interface. In the case of both SceneKit and RealityKit, it’s a viewport that renders 3D content.

With SceneKit, I can render content through an SCNView or a SceneView if using SwiftUI. There’s also an ARSCNView, which lets me render virtual objects in the real world with ARKit. With RealityKit the path is simple: content renders through a RealityView, which was designed from the ground up for all the conveniences we’re so used to with SwiftUI. I can render entirely virtual scenes or place objects in the real world with just this one view.

The same content deploys and adapts across all supported Apple platforms, and even performs stereoscopic rendering in visionOS automatically, without any changes to the code. Great, so those are the main core concepts you should have in mind when transitioning to RealityKit: architecture, coordinate systems, asset support, and views.

Next, every great game has to have some nice assets. So let’s take a look at what I currently have in my game. Inside the art asset catalog, I have a collection of 3D models. These models are in the SCN file format. This is great for a SceneKit project, but I need to convert all these assets to USD so I can use them in RealityKit. Let me show you some options. If you have the models in their original format, that would be the best choice. Chances are your digital content creation tool, or DCC, offers good support for USD. I’m using Blender, so I can export the asset to this format straight away.

But if you don’t have the original files, there are a few other options for converting your existing SCN asset directly to USD. One method, and probably the easiest, is right in Xcode. To do so, select an asset. I’ll choose enemy1 this time. Then go to File, Export..., and in the Export options, choose a Universal Scene Description Package, which is a zipped USD file type.

The results of exporting this way may vary from asset to asset, but for most models that I have in my game, this works great. You might also have animations in separate SCN files, which is a common pattern in SceneKit. For instance, in my SceneKit game, animations for the main character like walking, jumping, and spinning are each in separate files with no geometry. But how do I export the animations to USD and apply them back to the geometry? Well, lucky for me, Apple updated a CLI that ships with Xcode 26 called SCN tool to help with this process. Let me show you how to use it. I can invoke SCN tool by typing xcrun scntool.

This displays a list of options available. To convert, I can type xcrun scntool --convert specifying the file max.scn in this case. And --format usdz as the output type.

This alone would convert the SCN file to USD in the same way as I did earlier in Xcode. To append the animation, I use --append-animation for each SCN animation file I want to export, max_spin in this case.

And save to desktop.

Let’s take a look at the converted file.

Great, my geometry has the animation information in USD format. I did this for all my assets and organized them in a way that works great for my preferred workflow. Now I’m ready to start piecing the game together. Which brings me to the next topic, scene composition. With the SceneKit version, the SceneKit editor in Xcode helped to put all the pieces together. RealityKit has a tool to help me do this too, and it’s called Reality Composer Pro. Reality Composer Pro sits between Xcode and my DCC of choice, such as Blender or Maya. I can use it to compose my scene, add components to entities, create and modify shaders, and prepare the scene for Xcode. I can bring all my newly created USD assets in and begin putting the game back together. Reality Composer Pro ships with Xcode. I’ll open it now.

I’ll create my project with the name PyroPanda.

Reality Composer Pro gives me a default scene without any content.

Next, I can drag all those newly converted assets into my project.

To add these assets to my scene, I can either right click and choose Add to Scene.

Or I can drag in an asset such as Max from the project browser into the viewport directly.

Once in, repositioning entities is straightforward. I can use this gizmo to put Max at the game’s starting point. More or less right there. Reality Composer Pro is a great tool to compose my scene in a visual way allowing me to edit materials, shaders, particles, lights, and more. Remember I said Reality Composer Pro sat in between my digital content creation tool and Xcode? Well, now I need to bring the content into my app. So that's the next task. The Reality Composer Pro project is a Swift package. I can add it as a local dependency in Xcode by going to my project package dependencies here, clicking on Add Local..., and choosing my app as the target.

Next, in my content view Swift file, I need to import RealityKit and my new package, PyroPanda, at the top here.

Within my ContentView, I’ll add a RealityView.

Then I need to load the scene as an entity, specifying that it comes from that package’s bundle.

And finally, add the new entity to my RealityView content.

I’ll also add a camera control just to show you the result.
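
As a minimal sketch of that setup: the bundle accessor and scene name below are assumptions, the real ones come from the generated Reality Composer Pro package, and the camera-controls modifier is only there to inspect the result.

// ContentView.swift (a sketch, not the full sample)
import SwiftUI
import RealityKit
import PyroPanda

struct ContentView: View {
    var body: some View {
        RealityView { content in
            // Load the composed scene from the Reality Composer Pro package bundle.
            if let scene = try? await Entity(named: "Scene", in: pyroPandaBundle) {
                content.add(scene)
            }
        }
        .realityViewCameraControls(.orbit)
    }
}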

I spent some time earlier building the scene up with Reality Composer Pro. Here's the fully composed result. I added the remaining models, assigned the textures, and created the dynamic shaders for the lava, plants, and one of the enemies, adding more personality to the volcanic environment.

You can check out the full sample code to inspect how each piece of this scene was built.

There are so many things you can do with Reality Composer Pro. To learn more, I’d recommend checking out one of these two sessions from previous WWDC years. This is starting to come together. Now I’ll make little Max come to life with some animation. When I converted Max earlier, I also appended an animation. When a USD file has animations, RealityKit exposes them in an AnimationLibraryComponent. This makes it easy to access and play the animations on my entities. I reference all the different animations from a single USD file called “Max”. In Reality Composer Pro, I can see the references to all the animations in the inspector here.

I can trigger each animation in my project by the name specified in this component.

In the SceneKit version of this game, this is how I played the spin animation. First, I found the Max_rootNode in my scene and loaded the spinAnimation separately. Then, I traversed through the animation scene until I found the SCN animation player and saved a reference to it. Then added the animationPlayer to Max, with the reference “spin”, and played the animation on the node. In RealityKit, accessing the animation via the AnimationLibraryComponent makes this really convenient. First, I find the Max entity by name, just “Max” in this case. From there, grab the AnimationLibraryComponent from Max’s component set, and select the desired animation by name. And finally, play it back on the entity. As Max navigates around the scene, my completed app plays different animations that represent the movement. Check out the source code for the app to see how this all connects. Something that adds an element of realism and mood to any scene is lighting. And when well applied, the difference can be night and day. Lighting in my application can be completely achieved through Reality Composer Pro without any additional code. Let's see how that looks. I can add a light by tapping the insert icon here, at the bottom of my entity hierarchy, and selecting a directional light.

This is an empty entity with just a directional light component. For this light type, only the orientation changes how it illuminates other entities in the scene.

I’ll position it up here just for visual clarity, and rotate around the x-axis as such.

The light looks good, but it’s missing something. There are no shadows! In the component list, I can also add a directional light shadow component by checking this box.

From the starting point, I can now see how the terrain and Max are casting shadows onto the rest of the scene. Achieving the same through code is very similar. For SceneKit, I create an SCNLight, set the type to directional, and set castsShadow to true.

I then create a new node and assign the light to the node’s light property.

For RealityKit, I create an entity with two components; a directional light component and a directional light shadow component. A directional light is one of the light options available in RealityKit. Others include point lights, spotlights, and image-based lights, which use textures to illuminate your scene. My game is looking a little more visually dynamic now, so next I’ll add some audible elements to increase engagement a little more. Let’s take a look at the ambient audio track that's constantly looping in the scene. In my SceneKit project, I first load the ambient audio file as an audio source. I can then modify properties of that audio source to change how it plays. In this case, I want it to loop, and I don’t want it to be positional in the scene or spatial, meaning that the audio playback volume does not change based on the main camera’s distance to the source node. And finally, add an audio player to the terrain node, starting the playback. I can access audio files in RealityKit in a very similar way to how I access animations: through components. The component this time is called AudioLibraryComponent. I can configure the audio playback completely from Reality Composer Pro, rather than doing everything at my app’s runtime. Let's see how that setup looks. I’ve already attached an AudioLibraryComponent to the terrain entity with a reference to the ambient audio file. Since I don’t want the playback to sound like it’s coming from a specific source, I can add an ambient audio component to the same entity.

This audio resource is quite long, and RealityKit’s default behavior will be to preload the whole track to memory at runtime. Instead, I can change that behavior to stream the audio resource as it plays.

And when the track finishes, I want the playback to start again from the beginning, so I'll select the Loop checkbox.

Everything’s now wired up, and the audio is ready to be played in RealityKit. There are two ways I can do this. The first is through the AudioLibraryComponent.

I can start by fetching the AudioLibraryComponent from the terrain’s component set, reference the ambient audioResource by name, and play it back on the same terrain entity. RealityKit sees the settings I added with Reality Composer Pro, so it will automatically loop and stream the audio as an ambient track.
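
A minimal sketch of that first approach; the entity and resource names are assumed from the sample.

// RealityKit
if let terrain = scene.findEntity(named: "Terrain"),
   let audioLibrary = terrain.components[AudioLibraryComponent.self],
   let ambientResource = audioLibrary.resources["ambient"] {
    // Looping, streaming, and the ambient (non-spatial) behavior all come
    // from the settings applied in Reality Composer Pro.
    terrain.playAudio(ambientResource)
}

Alternatively, I can use a little trick from a built-in entity action called PlayAudioAction.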

With this approach, the PlayAudioAction looks at the target entity’s AudioLibraryComponent for me and finds the named audio file.

I convert that action into an animation and play it on the same terrain entity.

Entity actions are really helpful for minimizing the actual code in my application.

I use this action and some others for various events in my game. For example, whenever the main character jumps or attacks.

For the final step I’ll cover in this session, let’s take a look at the visual effects included in my game. Visual effects can turn a 3D scene from something that’s accurate to a truly emotive experience. Starting with particles, I have some really nice particle effects that were put together for the original game from right inside the SceneKit editor in Xcode. These particles are saved as SCN particle files. There’s no tool to directly convert these files into something that’s compatible with RealityKit, but I can make particle effects with very similar settings through Reality Composer Pro. Let’s go to Reality Composer Pro and see what editing particles looks like. I prefer to keep my particles as separate USD files, so I can add a new file by clicking here in Reality Composer Pro. I'll name it volcano_smoke.

I can add a Particle Emitter component right to the root entity of this file.

From there, by pressing the Play button above the components, the default particles begin to appear.

There are a few presets I can choose from, including one of my favorites, Impact.

This particle preset has a good texture for smoke, so it’s a great starting point for this effect. These settings may be familiar when coming from SceneKit, with some small differences.

I compared the settings in Reality Composer Pro to the ones in my original game’s particles in SceneKit, and have come up with a RealityKit Volcano Smoke effect similar to the SceneKit version.

Now that it’s done, I’ll drag that into my main scene and see what it looks like with all the other models.

Great, that is exactly what I was aiming for. The final step is post-processing. This is where you can add the finishing touches with a final pass of your rendered output before it appears in your app.

In the original game, the camera had a strong bloom effect, making bright lights bleed through the rest of the scene, adding a soft, radiant glow that enhances the game’s otherworldly atmosphere.

This was achieved by modifying a few properties on the scene’s active camera. While this brings a lot of convenience, developers, and especially game developers, often prefer to have a very tight control over effects like this, for performance as well as artistic preferences.

But what about RealityKit? While a simple property to create this effect is deliberately not available, starting this year you can instead add post-processing effects to RealityViews on iOS, iPadOS, macOS, and tvOS.

This does require some setup, but it doesn’t mean you necessarily need to write all the Metal shaders from scratch yourself. Apple has some highly optimized Metal Performance Shaders you can use to get started. I’ll create a bloom effect using the post-processing API, which I can add to my game. Starting with the original texture from my game, I want to extract the highlights and write those to a separate texture. I then blur it to feather the edges, and finally, composite that blurred texture on top of the original to create the bloom effect.

To do this in my app, I first need to define a type, BloomPostProcess, which conforms to PostProcessEffect.

Within the postProcess method, I create a temporary Metal texture to write some of the bloom data to. Next, I can use a performance shader to extract only the brightest parts of my image, which is where I want the blooming to come from. I blur that area using a Gaussian blur. And finally, place that blurred image on top of the original texture. The full code for these steps is available for you in the download.

To apply this effect to my RealityView, I can create an instance of my new BloomPostProcess class and apply it to the rendering effects.

This adds a really nice final touch to my game. The environment becomes more vibrant and is really an amazing experience to play. My app has really come together now with RealityKit. It runs exactly the same on iOS, iPadOS, macOS, and tvOS. With one core code base and a RealityKit scene, I can launch my app across all these platforms right away. Let me show you this game running on RealityKit’s latest platform, tvOS, with controller support.

Now I can play my RealityKit game on Apple TV at home.

And on visionOS, there’s something even more special I can do. By placing my scene inside a progressive immersion view, I can add a portal into the PyroPanda world that renders in full 3D right in front of me.

This kind of experience is only possible with RealityKit and visionOS.

Let’s take a look at what’s been covered today. SceneKit is deprecated. And while that’s important to know, it doesn’t mean you need to worry about your existing SceneKit applications anytime soon. The best path forward is RealityKit, bringing with it unique possibilities for your apps and games.

I’ve discussed the core differences in terms of concepts and tooling available for RealityKit developers, and some of the major steps I took migrating a game from SceneKit over to RealityKit.

I’d encourage you to check out these sessions to help you on your journey to making amazing apps and games with RealityKit, along with other sessions from previous years, as well as the RealityKit documentation. In the download for this sample app, there are even more details about areas I didn’t have time to cover today, such as camera motion, driving character movement, and game controllers. Technology is always evolving. Apple wants the SceneKit deprecation to be as smooth as possible for developers like yourselves. We’re excited about the future of RealityKit and can’t wait to see what SceneKit developers do with it next. Thank you for watching and enjoy the rest of the conference.

Code

16:33 - Animations in RealityKit

// RealityKit
guard let max = scene.findEntity(named: "Max") else { return }

guard let library = max.components[AnimationLibraryComponent.self],
      let spinAnimation = library.animations["spin"]
else { return }

max.playAnimation(spinAnimation)

18:18 - Directional Light Component in RealityKit

// RealityKit

let lightEntity = Entity(components:
    DirectionalLightComponent(),
    DirectionalLightComponent.Shadow()
)

24:37 - Create Bloom effect using RealityKit Post processing API

final class BloomPostProcess: PostProcessEffect {

    let bloomThreshold: Float = 0.5
    let bloomBlurRadius: Float = 15.0

    func postProcess(context: borrowing PostProcessEffectContext<any MTLCommandBuffer>) {

        // Create metal texture of the same format as 'context.sourceColorTexture'.
        var bloomTexture = ...

        // Write brightest parts of 'context.sourceColorTexture' to 'bloomTexture'
        // using 'MPSImageThresholdToZero'.

        // Blur 'bloomTexture' in-place using 'MPSImageGaussianBlur'.

        // Combine original 'context.sourceColorTexture' and 'bloomTexture'
        // using 'MPSImageAdd', and write to 'context.targetColorTexture'.
    }
}

// RealityKit

content.renderingEffects.customPostProcessing = .effect(
    BloomPostProcess()
)

Build a SwiftUI app with the new design

Explore the ways Liquid Glass transforms the look and feel of your app. Discover how this stunning new material enhances toolbars, controls, and app structures across platforms, providing delightful interactions and seamlessly integrating your app with the system. Learn how to adopt new APIs that can help you make the most of Liquid Glass.

Chapters

Resources

Related Videos

WWDC25

Transcript

Hi. I'm Franck, an engineer on the SwiftUI team. And in this video, you will learn how to build a great app with the new design. iOS 26 and macOS Tahoe introduce significant updates to the look and feel of apps and system experiences. At the heart of these updates is a brand new, adaptive material for controls and navigational elements that we call Liquid Glass. It takes inspiration from the optical properties of glass and the fluidity of liquid to create a lightweight, dynamic material that helps elevate the underlying content across various components.

As you scroll through content, the glass automatically adapts to the content underneath, changing from light to dark. With a new and refreshed look across all platforms, controls come alive during interaction. Controls like toggles, segmented pickers, and sliders now transform into liquid glass during interaction, creating a delightful experience! These updates apply across all the platforms that your app runs on. Watch “Meet Liquid Glass” for a deeper dive into the design of this new material.

Then, check out “Get to know the new design system” to learn best practices with the new design.

Sometimes, in life, to gain clarity and focus on what’s truly important, you may need to re-invent yourself and look at the bigger picture. Today, I’m applying this wisdom to the Landmarks app, a sample project available on the Apple Developer website. I'll showcase elements of the new design system across Apple platforms and I’ll bring even greater clarity and focus to the Landmarks app by adopting the new APIs. When you build your app with the Xcode 26 SDKs, you'll notice changes throughout the UI. I’ll begin with updates to structural app components like TabView and NavigationSplitView. Then, I’ll cover the new look and behavior of toolbars. After that, I’ll share updates to the search experience that enhance consistency and ease of use. Then, I’ll show how controls come to life with Liquid Glass! I’ll finish by describing how you can adopt glass into your own custom UI elements.

In each of these five areas there are improvements you will get automatically and new APIs to customize the experience even further. App structure refers to the family of APIs that define how people navigate your app.

These include views and modifiers like NavigationSplitView, TabView and Sheets! Every one of these members is refined for the new design. NavigationSplitView allows navigating through a well-defined hierarchy of possibly many root categories.

They now have a Liquid Glass sidebar that floats above your content.

This breathtaking hero banner in Landmarks illustrates this, with the pink blossoms refracting against the sidebar. But the Landmarks team didn't travel to all these fantastic destinations, framing spectacular pictures only to have them appear clipped, no matter how gorgeous the sidebar is! With the new backgroundExtensionEffect modifier, views can extend outside the safe area, without clipping their content.

If I hide the sidebar for a moment, you’ll see what’s happening behind it.

The image is mirrored and blurred outside of the safe area, extending the artwork while leaving all its content visible. The new design makes inspectors shine, with more Liquid Glass! Opposite the sidebar in Landmarks, the inspector hosts content with a more subtle layering. This associates the inspector with its related selection.

TabViews provide persistent, top-level navigation. They provide an overview of possibilities at a glance and optimize for switching from section to section, maintaining context within each section.

With the new design, the tab bar on iPhone floats above the content, and can be configured to minimize on scroll.

This lets your app’s content remain the star of the show.

To adopt this behavior, use the tabBarMinimizeBehavior modifier. In this example, the TV app uses the onScrollDown behavior.

With this configuration, the tab bar re-expands when scrolling in the opposite direction.
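
A minimal sketch of that configuration; the tab content views here are placeholders.

TabView {
    Tab("Watch Now", systemImage: "play") {
        WatchNowView()
    }
    Tab("Library", systemImage: "books.vertical") {
        LibraryView()
    }
}
.tabBarMinimizeBehavior(.onScrollDown)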

Now, suppose your app has additional controls that you want close at hand, like this playback view in Music.

Place a view above the bar with the tabViewBottomAccessory modifier. This takes advantage of the extra space provided by the tab bar’s collapsing behavior.

Inside your accessory view, read the tabViewBottomAccessoryPlacement from the environment. Then, adjust the content of your accessory when it collapses into the tab bar area.
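
Sketched out, that could look something like this; the placement case name and the player views are assumptions.

TabView {
    // ... tabs ...
}
.tabViewBottomAccessory {
    MiniPlayerView()
}

struct MiniPlayerView: View {
    @Environment(\.tabViewBottomAccessoryPlacement) private var placement

    var body: some View {
        if placement == .inline {
            CompactPlayerControls()   // collapsed into the tab bar area
        } else {
            ExpandedPlayerControls()  // full accessory above the tab bar
        }
    }
}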

Alright, I showed you how NavigationSplitView is beautiful in Landmarks with the new design, and I shared ways you can adapt TabView-based apps too. Next, I’ll present sheets! When creating a collection of landmarks, a sheet of landmark options gets presented. On iOS 26, partial height sheets are inset by default with a Liquid Glass background.

At smaller heights, the bottom edges pull in, nesting in the curved edges of the display.

When transitioning to a full height sheet, the glass background gradually transitions, becoming opaque and anchoring to the edge of the screen.

If you’ve used the presentationBackground modifier to apply a custom background to your sheets, consider removing that and let the new material shine.

Sheets can also directly morph out of buttons that present them. To have the presentation content morph out of the source view, make the presenting toolbar item a source for a navigation zoom transition. And mark the content of your sheet as the destination of the transition.
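
Here is one way that might look, sketched with SwiftUI’s zoom transition APIs; the identifiers and the toolbar and sheet content views are placeholders, not the exact Landmarks code.

struct CollectionsToolbar: View {
    @State private var isAddingCollection = false
    @Namespace private var zoomNamespace

    var body: some View {
        CollectionsList()  // placeholder content
            .toolbar {
                ToolbarItem {
                    Button("Add", systemImage: "plus") {
                        isAddingCollection = true
                    }
                    .matchedTransitionSource(id: "addCollection", in: zoomNamespace)
                }
            }
            .sheet(isPresented: $isAddingCollection) {
                AddCollectionSheet()  // placeholder sheet content
                    .navigationTransition(.zoom(sourceID: "addCollection", in: zoomNamespace))
            }
    }
}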

Like sheets, other presentations such as menus, alerts, and popovers flow smoothly out of liquid glass controls, drawing focus from their action to the presentation’s content.

In the new design, dialogs also automatically morph out of the buttons that present them! With our app structure standing pretty strong, let’s move on to toolbars! In the new design, toolbar items are placed on a Liquid Glass surface that floats above your app’s content and automatically adapts to what’s beneath it. Toolbar items are automatically grouped. When I build the Landmarks app with Xcode 26, my custom toolbar items are grouped separately from the system-provided back button.

I want to emphasize that the “favorite” and “add to collection” buttons are related actions. So, I used the new ToolbarSpacer API with fixed spacings to split them into their own group.

This provides visual clarity that the grouped actions are related, while the separated actions, like the share link and inspector, have distinct behavior. ToolbarSpacer can also be used to create a flexible space that expands between toolbar items.

The Mail app uses this technique to create a leading-aligned filter item and a trailing-aligned group with the search and compose items.
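
In code, that grouping might look like this; the button views are placeholders, and .fixed is the fixed-spacing variant described above.

.toolbar {
    ToolbarItem { FavoriteButton() }
    ToolbarItem { AddToCollectionButton() }

    ToolbarSpacer(.fixed)

    ToolbarItem { ShareLink(item: landmarkURL) }
    ToolbarItem { InspectorToggleButton() }
}

A ToolbarSpacer(.flexible) between two items would instead push them apart, as in the Mail example.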

Some toolbar items can do without this visual grouping, like this item from Books showing my avatar. Apply the sharedBackgroundVisibility modifier to separate an item into its own group without a background.
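
A sketch of that, assuming the modifier attaches to the toolbar item; the avatar view is a placeholder.

.toolbar {
    ToolbarItem(placement: .primaryAction) {
        ProfileAvatarButton()
    }
    .sharedBackgroundVisibility(.hidden)
}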

In the Landmarks app, I added a feature that allows friends to react to my landmarks collection. I would like to add an indicator on my notification item when there is a new reaction. I don’t want to miss that sweet, sweet external validation.

By using the badge modifier on toolbar items that sweet validation is just one line of code away! I applied the badge modifier to my toolbar item’s content to display this indicator.
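
For example, something along these lines; the button label and action are placeholders.

ToolbarItem {
    Button("Reactions", systemImage: "bell") {
        showReactions()
    }
    .badge(1)  // indicator for the new reaction
}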

In addition to grouping and badging items in toolbars, the new design introduces a few other changes. Icons use monochrome rendering in more places, including in toolbars.

The monochrome palette reduces visual noise, emphasizes your app’s content, and maintains legibility.

You can still tint icons with a tint modifier, but use this to convey meaning, like a call to action or next step, but not just for visual effect.

In the new design, an automatic scroll edge effect keeps controls legible.

It is a subtle blur and fade effect applied to content under system toolbars. If your app has any extra backgrounds or darkening effects behind the bar items, make sure to remove them, as these will interfere with the effect.

For denser UIs with a lot of floating elements, like in the calendar app, tune the sharpness of the effect on your content with the scrollEdgeEffectStyle modifier.
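
As a sketch: the .hard style value and the edge parameter label are assumptions, and the content view is a placeholder.

ScrollView {
    DenseCalendarGrid()
}
.scrollEdgeEffectStyle(.hard, for: .top)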

Turning from toolbars, next I’ll show how to craft canny search experiences with the new design. There are some big updates to two key patterns for search across all platforms.

Search in the toolbar places the field at the bottom of the screen, within easy reach.

And on iPad and Mac, it appears in the top-trailing position of the toolbar.

The second pattern is to treat it as a dedicated page in a multi-tab app. For the Landmarks app, I placed the search in the top trailing corner. When using this placement, you should make as much of your app’s content as possible available through search.

The search field appears on its own Liquid Glass surface.

A tap activates it and shows the keyboard. To get this variant in Landmarks, I applied the searchable modifier on the NavigationSplitView. Declaring the modifier here indicates that search applies to the entire NavigationSplitView, not just one of the columns.

On iPhone, this variant automatically adapts to bring the search field at the bottom of the display.
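
In code, that looks roughly like this; the column views are placeholders, and a searchText state property is assumed.

NavigationSplitView {
    LandmarksSidebar()
} detail: {
    LandmarkDetailView()
}
.searchable(text: $searchText)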

Depending on device size, number of toolbar buttons, and other factors, the system may choose to minimize the search field into a toolbar button, like the one shown here in Mail.

When I tap on the button, a full-width search field appears above the keyboard.

If you want to explicitly opt-in to the minimized behavior, say because search isn’t a main part of your app’s experience, use the new searchToolbarBehavior modifier.
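
That opt-in is a one-line addition next to the searchable modifier; the .minimize value shown here is an assumption about the behavior’s name.

.searchable(text: $searchText)
.searchToolbarBehavior(.minimize)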

Searching in multi-tab apps is often done in a dedicated search page. The pattern is used by apps across all our platforms, such as the Health app to check my fitness trends.

To do this in your app, set a search role on one of your tabs and place a searchable modifier on your TabView.

When someone selects this tab, a search field takes the place of the tab bar, and the content of the tab is shown.

People can interact with your browsing suggestions, or tap on the search field to bring up the keyboard and continue with specific search terms.
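
A sketch of that pattern; the tab content views are placeholders, and a searchText state property is assumed.

TabView {
    Tab("Summary", systemImage: "heart") {
        SummaryView()
    }
    Tab("Trends", systemImage: "chart.line.uptrend.xyaxis") {
        TrendsView()
    }
    Tab(role: .search) {
        SearchSuggestionsView()
    }
}
.searchable(text: $searchText)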

On iPad and Mac, when someone selects the search tab, the search field appears centered above your apps browsing suggestions. These patterns give you flexibility and control over the search experience in your app! Speaking of control, next I’ll turn to updates to standard controls. The new design creates a strong family resemblance across platforms for controls like buttons, sliders, menus, and more.

I'm going to start with updates to buttons, one of the most common controls. Bordered buttons now have a capsule shape by default, harmonious with the curved corners of the new design. Mini, small, and medium size controls on macOS retain a rounded-rectangle shape, which preserves horizontal density.

And the existing button border shape modifier enables you to specify the shape for any size.

Control heights are updated for the new design.

Most controls on macOS are slightly taller, providing a little more breathing room around the control label, and enhancing the size of the click targets.

For compatibility with existing high-density layouts, like complex inspectors and popovers, the existing controlSize modifier can be applied to a single control or across an entire set of controls. And for your most important, prominent actions there is now support for extra large sized buttons. Last but not least, the new glass and glass prominent button styles bring Liquid Glass to any button in your app.
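
For example, something like this; the prominent style name is inferred from the description above, and the labels and actions are placeholders.

Button("Get Directions") {
    startNavigation()
}
.buttonStyle(.glass)

Button("Book Tour") {
    bookTour()
}
.buttonStyle(.glassProminent)
.controlSize(.extraLarge)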

Let's move onto Sliders, which have learned a few tricks too.

They now support tick marks! The tick marks appear automatically when initializing a slider with a step parameter.

You can even manually place individual ticks.

Use the ticks closure to specify their location, like I’m doing here for ticks at 60% and 90%.

Sliders also let you start their track fill at a particular place. This is useful for values that may adjust left or right from a non-leading default value, like selecting faster or slower speed values on playback.

Specify the starting point with the neutralValue parameter. Menus across platforms have a new design and more consistent layout. Icons are consistently on the leading edge and are now used on macOS too. The same API using Label or standard control initializers now create the same result on both platforms.
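
Back to sliders for a moment, here is a minimal sketch of both additions; brightness and playbackSpeed are placeholder bindings, and the exact placement of the neutralValue parameter in the initializer is an assumption.

// Tick marks appear automatically when a step is provided.
Slider(value: $brightness, in: 0...1, step: 0.1)

// The track fill anchors at the neutral value instead of the leading edge.
// (The position of neutralValue in the initializer is assumed here.)
Slider(value: $playbackSpeed, in: 0.5...2.0, neutralValue: 1.0)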

In addition to updates to SwiftUI's controls, there are new APIs to update your controls for the new design.

Many of our controls have their corners aligned perfectly within their containers, even if the container is your iPhone! This is called corner concentricity. For example, a button that is positioned at the bottom of a sheet should share the same corner center with the corners of the sheet.

To build views that automatically maintain concentricity with their container, use the concentric rectangle shape. Pass the containerConcentric configuration to the corner parameter of a rectangle and the shape will automatically match its container across different displays and window shapes. The best way to adopt the new design is to use standard app structures, toolbars, search placements, and controls. But sometimes, your app might need a bit more customization. Next, I’ll share how to build custom Liquid Glass elements for your app.

Maps is a great example for this use-case with their custom glass controls that gracefully float above the map content.

In a similar fashion, I’m going to add badges to the Landmarks app for each landmark people visit. Let’s start by creating a custom badge view with the Liquid Glass effect! To add glass to your custom views, use the glassEffect modifier. By default, a glass effect will be applied within a capsule shape behind your content.

SwiftUI automatically uses a vibrant text color that adapts to maintain legibility against colorful backgrounds.

Customize the shape of the glass effect by providing a shape to the modifier.

For especially important views, use a tint modifier.

Similar to toolbar buttons, only use this to convey meaning and not just for visual effect.
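
Putting those pieces together for the badge: BadgeLabel is a placeholder custom view, and treat the exact Glass configuration and shape spellings as a sketch of the modifiers described above.

// Capsule-shaped glass by default.
BadgeLabel(badge: badge)
    .glassEffect()

// A custom shape for the glass.
BadgeLabel(badge: badge)
    .glassEffect(.regular, in: .rect(cornerRadius: 16))

// A tint, used sparingly to convey meaning.
BadgeLabel(badge: badge)
    .glassEffect(.regular.tint(.green))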

And just like text within a glass effect, the tint also uses a vibrant color that adapts to the content behind it. On iOS, for custom controls or for containers with interactive elements, add the interactive modifier to the glass effect. Glass reacts to user interaction by scaling, bouncing, and shimmering, matching the effect provided by toolbar buttons and sliders. Now that we have our custom badge, let’s bring multiple badges together so they interact and blend with each other. To combine multiple glass elements, use the GlassEffectContainer. This grouping is essential for visual correctness. The glass material reflects and refracts lights, picking colors from nearby content.

This effect is achieved by sampling content from an area larger than itself.

However, glass can not sample other glass, so having nearby glass elements in different containers will result in inconsistent behavior.

Using a glass container allows these elements to share their sampling region, providing a consistent visual result.

In the Landmarks app, I am using the GlassEffectContainer to group my badges. When expanding my badges, I get this wonderful fluid morphing! Add these transitions to your own glass container by using the glassEffectID modifier.

To configure this, I first declare a local namespace. Then, I associate the namespace with each of the glassEffect elements in my expanded stack of badges and with my toolbar button. Now, when I tap the button again, the award badges are re-absorbed gracefully! The liquid glass effect offers an excellent way to highlight the functionality that makes your app truly unique.
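
A sketch of that grouping; the badge model, the badge and button views, and the spacing value are placeholders.

@Namespace private var badgeNamespace

var body: some View {
    GlassEffectContainer(spacing: 20) {
        HStack(spacing: 20) {
            ForEach(badges) { badge in
                BadgeIcon(badge: badge)
                    .glassEffect()
                    .glassEffectID(badge.id, in: badgeNamespace)
            }

            ExpandBadgesButton()
                .glassEffect()
                .glassEffectID("expand", in: badgeNamespace)
        }
    }
}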

I hope you enjoyed this quick tour of applying the new design and using Liquid Glass. Now it's your turn! Adopt the new design in your app by building it with Xcode 26. I think you’ll appreciate how much you get automatically with standard controls.

Audit the flow of your app and identify whether any views need changes, paying special attention to background colors behind sheets and toolbars that you can remove. Finally, build expressive components with Liquid Glass that truly make your app stand out. I hope you have a brilliant time playing with the new design! Keep on shining!

Build a UIKit app with the new design

Update your UIKit app to take full advantage of the new design system. We'll dive into key changes to tab views, split views, bars, presentations, search, and controls, and show you how to use Liquid Glass in your custom UI. To get the most out of this video, we recommend first watching “Get to know the new design system” for general design guidance.

Chapters

Resources

Related Videos

WWDC25

WWDC24

Transcript

Hi, welcome to “Build a UIKit app with the new design”. I’m Sanaa, an engineer on the UIKit team. In this video, I will show you how to adopt the new design, and bring Liquid Glass to your apps! iOS 26 introduces a beautiful new design, updating the look and feel of materials and controls across the system. Central to this design is a new material called Liquid Glass.

It is translucent, dynamic and alive. Throughout the system, UIKit’s components and materials have been updated with Liquid Glass.

Your apps get this new appearance as soon as you recompile with the new SDK. If you haven’t already, I highly recommend watching the design videos “Meet Liquid Glass” and “Design with the Liquid Glass design system” to gain an overview, and learn the best practices of the new design system.

I will start with how tab views and split views adopt the new design system, and float above the content.

Then, I will cover the new look and behavior of navigation bars and toolbars, which are now transparent, contain liquid glass buttons, and give more space to your content.

After that, I will go over the new updates to presentations, including an updated zoom transition, and new behavior for alerts and action sheets.

Then, I will describe how the Search experience has been updated, with more options for the search bar positioning.

I will go over Controls, like buttons, switches, and sliders, and finish with how to adopt the Liquid Glass look and feel in your custom UI elements.

I will start with Tab views and split views.

UITabBarController and UISplitViewController have been updated with the new liquid glass appearance.

Tab bars provide persistent, top-level navigation within your app. They provide an overview of your app at a glance, and help people quickly switch from section to section. With the new design, the tab bar on iPhone floats above the content, and can be configured to minimize on scroll, keeping the focus on your content.

To allow the tab bar to minimize on scroll set tabBarMinimizeBehavior to the desired direction. Here, the TV app is setting it to .onScrollDown. The tab bar re-expands when scrolling in the opposite direction.

Above the tab bar, you can have an accessory view like the mini player in the Music app. UITabBarController displays the accessoryView above the tab bar, matching its appearance. When the tab bar is minimized, the accessory view animates down to display inline with the tab bar.

To set a bottom accessory, create a UITabAccessory with your contentView. Then, set the bottomAccessory property on UITabBarController.

When the accessory is inline with the tab bar, there is less space available to display it. Here, the Music app is accommodating the reduced space by hiding some of the media controls in the Mini player.

To adjust the accessory view, register to observe changes to the tabAccessoryEnvironment trait. Check if the accessory view is inline with the tab bar, and update the view if needed.

You can also use updateProperties to update your accessory view with the automatic trait tracking behavior.

To learn more about automatic trait tracking and the new updateProperties method, check out “What’s new in UIKit”.

On iPad, the tab bar and sidebar are also lifted into Liquid Glass. They float above your app’s content when using UITabBarController. By adopting UITab and UITabGroup, you get automatic adaptivity for your app, allowing people to switch between tab bar and sidebar on iPad. To learn more, check out the videos “Elevate your tab and sidebar experience in iPadOS” from WWDC24, and “Make your UIKit app more flexible” from WWDC25.

With the new design, Sidebars look best when there is vibrant content underneath matching the main scroll view. The TV app on iOS 26 is a great example of this. The artwork appears to extend across the entire screen and underneath the sidebar. It remains centered, and fully visible. This effect is used across many apps in iOS 26, and your app will also look great using the new UIBackgroundExtensionView! The ExtensionView should cover the entire width, including the leading safe area inset for the sidebar. The input of the effect is a content view you provide. For example, an image view. It is placed in the view hierarchy and seamlessly extended to fill the empty space.

This is a special effect that should be used with purpose. Sections like the list of episodes at the bottom naturally scroll underneath the sidebar and don't need to be extended.

Elements on top of the artwork, like the show name and description, also shouldn't be extended. Make sure to add these as siblings of the extension view, not as subviews.

The show view controller already has an image view for the poster artwork. To add the effect, create a BackgroundExtensionView, and assign the image view to its contentView property. Add the extensionView to your hierarchy. And finally, add the detailsView as a sibling of the extensionView. By default, the content view fills the safe area of the extension view. All edges with a positive safe area inset get extended to fill the empty space. In this example, these are the top edge for the navigation and status bar, and the leading edge for the sidebar. Because the TV app has very little content in the navigation bar that could cover the artwork, it doesn't need the extension effect at the top. Instead, the content view is manually positioned outside the safe area at the top.

I will go back to the code example to manually adjust the effect layout. First, set automaticallyPlacesContentView to false on the extensionView. Use AutoLayout constraints to position the image view at the top of the screen. And to extend the image view underneath the sidebar, add a constraint for the leadingAnchor equal to the extensionView's safeAreaLayoutGuide.

And don't forget to add constraints for the trailing and bottom anchors as well.

Now I’ll go over how navigation bars and toolbars look and behave in the new design. In iOS 26, navigation bars and toolbars also adopt the new glass appearance and float above the content. When you specify toolbar or navigation barButton items, the system automatically separates them into visual groups of items. Each group shares a glass background.

By default, bar button items using images share the background with other image buttons. Button groups with multiple items, also share their backgrounds.

Text buttons, the system “Done” and “Close” buttons, and prominent style buttons have separate glass backgrounds. This example shows these rules in action, where the “Select” button at the beginning and the “Done” button at the end don’t share the glass background with the 4 image buttons in the middle. To set up the navigationBar, assign all BarButtonItems directly to the navigationItem’s rightBarButtonItems. This gets the default system behavior that I described previously. To further break items into separate groups, use a fixedSpace item. In this example, I want to separate the “Share” button, so I insert a fixedSpace between the share and info buttons.

Bar buttons use labelColor by default to improve legibility. If color is needed to communicate information about the action, you can specify a different tint color.

For example, here I’m going to update the flag button to use systemOrange as the tintColor.

Only the Flag symbol will get colored.

To tint the button background, set the style to prominent.

Previously, you might have used flexible spaces to evenly distribute the items in your toolbar. With Liquid Glass, by default, each flexibleSpace item separates the background between items.

To evenly distribute the items and group them in a single background, use flexible spaces with hidesSharedBackground set to false.

In iOS 26, UINavigationItem provides more control over the title and large title areas in the navigationBar. This includes the addition of a new subtitle that is rendered below the title.

Use attributed strings for fine grained adjustments to both the title and subtitle.

Specify custom views to add interactive elements. Large titles are now placed at the top of the content scroll view, and scroll with the content underneath the bar.

To keep the large title visible, extend the scroll view fully under the navigationBar.

In this example, Mail places the search field in the toolbar, and shows the number of unread emails in the navigation bar using the new subtitle API.

When filtering emails, Mail shows the current filter in a button below the large title. The button is set as the largeSubtitleView on the navigationItem, appearing below the large title in the navigationBar. As part of the new design system in iOS 26, the bar background is now transparent by default. Remove any background customization from your navigation and toolbars. Using UIBarAppearance or backgroundColor interferes with the glass appearance.

Bar buttons use a glass background. Lay out your customView contents using the layout margins to get the correct spacing.

All scroll views underneath navigation bars or toolbars automatically apply a visual treatment. This ensures legibility of overlapping content in the bars. This is called an edge effect. This new edge effect isn’t just for system bars. You can also use it with custom containers of views that overlay an edge of a scroll view! This example shows two buttons overlaying the bottom edge of the scroll view. To insert an edge effect behind this stack of buttons, create a ScrollEdgeElementContainerInteraction, assign the contentScrollView and the edge, then add it to the buttonsContainerView. For denser UIs with a lot of floating elements, opt in to a hard edge style on any edge of a scroll view. This has a similar appearance to the standard bar backgrounds in iOS 18. Enable it by setting the style of the edge effect to .hard. iOS 18 introduced an always-interactive, interruptible zoom transition.

Here, I’m using the Notes app. I can open a note and the app stays responsive during the transition.

I can immediately swipe back if I selected the wrong note, or start scrolling while the transition is still settling. Similarly, I can immediately tap the back button multiple times to go even further back quickly.

This is great because people can interact with your app at any time. But of course, that also means that your app needs to be ready to interact at any time. To learn how to correctly handle interruptible transitions, check out the video “Enhance your UI animations and transitions”. I can now also swipe back anywhere within the content area, not just the leading edge. The new content backswipe gesture automatically checks for other competing interactions.

For example, Swipe actions prevent content backswipe. However, non-interactive areas would allow it.

To gain priority over content backswipe, custom gestures need to set failure requirements on interactiveContentPopGestureRecognizer.

The new design extends to presentations as well, including the new dynamic zoom transition. When a presentation, like a menu or a popover is originated from a glass button, the button morphs into the overlay.

This maintains visual continuity between the source and the presentation throughout the animation.

Menus get this behavior automatically. Popovers also get this new animation when their source is a barButtonItem. Sheets can adopt this effect by using the updated zoom transition. Set the preferred transition on the presented viewController to .zoom and return the source barButtonItem in the closure.

Sheets have an updated design in iOS 26. They adapt their appearance from smaller to larger heights. To take advantage of their new glass appearance, remove any custom backgrounds.

ActionSheets on iPad are anchored to their source views. Starting in iOS 26, they behave the same on iPhone, appearing directly over the originating view.

On the alertController, make sure to set the sourceItem or the sourceView on popoverPresentationController, regardless of which device it’s displayed on. Assigning the source view automatically applies the new transitions to action sheets as well! Action sheets presented inline don’t have a cancel button because the cancel action is implicit by tapping anywhere else. If you don’t specify a source, the action sheet will be centered, and you will have a cancel button. iOS 26 provides a new, more integrated search experience, letting you position the search field where it best suits the needs of your app.

On iPhone, the search bar moves automatically to the toolbar, this keeps the search field easily accessible.

If you already have a toolbar, include a searchBarPlacementBarButtonItem along with other bar buttons. This will position search exactly where you want.

It will appear either as an expanded field or a toolbar button, depending on available space.

On iPad, for universally accessible search, follow the macOS toolbar pattern. Place search at the trailing edge of the navigationBar. This is ideal for split views. To enable this behavior, set searchBarPlacementAllowsExternalIntegration on the navigation item to true.

To have search available while switching between views, use a UITabBarController. It can now include a distinct tab for Search on the trailing side. When tapped, the search button expands into a search field, and the other buttons collapse.

This search tabView is a great place for placing search suggestions. By default, one of these suggestions can be selected, or the search field can be tapped to start searching.

To have the search field activate automatically when the search tab is tapped, set automaticallyActivatesSearch to true on that tab.

For dedicated search views, consider including search as a section in the sidebar or tab bar. The search bar can be integrated in the trailing edge of the navigation bar, stacked, or placed centered in the regular width on iPad. To center the searchBar, use integratedCentered as the preferredSearchBarPlacement. When the tab bar is visible, the search bar is placed below it. Now I’ll talk about the updated look of controls.

Controls on iOS are redesigned with a new look and feel, while remaining completely familiar. Sizes are updated slightly for controls like UISwitch. Check that your layouts are set up to accommodate size updates.

Control thumbs, like those on switch and segmentedControl, automatically have a new liquid glass appearance for interactions.

In addition to the existing button styles, two new glass appearances are available with UIButtonConfiguration. Use the .glass() configuration to get standard glass. And .prominentGlass() to get glass tinted with your app’s tint color.

And with sliders, in addition to the liquid glass effects on the thumb, they now preserve momentum and stretch when they are moved. On iOS 26, sliders now support tick marks with a TrackConfiguration. This configuration is used to set up the look and behavior for the slider. For example, to limit this speed slider to only 5 values, set a track configuration with allowsTickValuesOnly and 5 tick marks.

Sliders can also be configured to use a neutral value, to anchor the slider fill at any location along the track, instead of just the minimum end. This lets the slider fill show the difference between the selected value and the neutral value. In this example, the slider fill shows a higher speed selected than the default one. Sliders can also take a thumbless style to look like a progress bar when not interactive. This is great for media playback, to not distract with a large thumb while the media is playing. Those are places where system controls have adopted liquid glass. For your special use cases, UIKit also offers APIs to adopt the new Liquid Glass look and feel. When using Liquid Glass in your UI, it is crucial to keep the design intent of liquid glass in mind. Liquid Glass is distinct from other visual effects, like UIBlurEffect. As such, it has specific places where it is appropriate to use. Liquid Glass is designed to be an interactive layer. It floats above your content, right below your fingertips, and provides the main controls that the user touches. For that reason, limit Liquid Glass to the most important elements of your app. Where possible, use the system views and controls for the best experience.

Maps uses Liquid Glass for custom buttons that are floating above the map. They feel natural as a distinct control layer. This makes them a great candidate, to use the glass effect for a floating illusion. And when the sheet expands, Maps removes the buttons. This prevents glass elements from overlapping other glass elements, and keeps the illusion of a single floating layer of glass intact.

To use glass with custom views, create a UIVisualEffectView, create a new UIGlassEffect, and then, in an animation block, set the effect.

Glass appears using a special materialize animation.

By default, the glass is in a capsule shape. To customize the shape, use the new cornerConfiguration API.

Glass has a dark and a light appearance. It adapts to the selected userInterfaceStyle. When adding glass to an existing glass container, it adapts its appearance automatically. To have corners automatically adapt to their container, use .containerRelative cornerConfiguration. When moving the view closer to the container’s corner, its corner radius adapts automatically.

When moving further away the corner radius decreases, to maintain concentricity automatically. Glass adapts the appearance based on its size.

A larger size is more opaque.

A smaller size is clearer, and switches between light and dark mode automatically, to increase contrast.

To add content, like labels and images, use the visualEffectView’s contentView. The label automatically becomes vibrant, based on its textColor. This ensures legibility against a wide variety of backgrounds.

Depending on the colors behind, the glass and its content will switch to light or dark mode automatically, when using dynamic colors.

To highlight prominent views, set .tintColor on the glass, and animate it alongside any other glass properties. Animate changes to your content, like textColor, here at the same time.

To use a custom tint color with glass, create a new UIGlassEffect, assign a custom tintColor, and animate the effect to the new UIGlassEffect. Tinted glass color, automatically adapts to a vibrant version. To remove content on top of your glass, animate the content’s alpha to zero.

Interactive system elements, like buttons, react to user interactions. When tapping the button, it scales and bounces. To get that same kind of interactivity in your custom views, set isInteractive to true on the glassEffect.

And finally, when you no longer need the glass on screen animate it out by setting the effect to nil.

Always prefer setting the effect property over the alpha to ensure that the glass dematerializes or materializes with the appropriate animation. In these examples, there was only a single view using Liquid Glass. Glass has additional, built-in behavior, when multiple elements interact. Liquid Glass can seamlessly blend between different shapes.

To dynamically merge glass views, use a UIGlassContainerEffect: configure a UIVisualEffectView with it, create your glass views, and add them as subviews to your container’s contentView. As long as there is space between them, they appear as two separate views.

Only if they get closer, they start merging like small droplets of water.

To control the distance at which they start affecting each other, use the spacing property on UIGlassContainerEffect.

When animating into an overlapping frame, glass views combine into a single shape.

To split glass into multiple elements, first, add them to the same position without animation. Then, animate them out together! UIGlassContainerEffect does more than just enabling animations. It enforces a uniform adaptation! Glass dynamically adapts to its background, but still gets a consistent appearance.

I went over the UIKit components and materials updated in Liquid Glass. UIKit gives you all the tools you need to update your app to the new design. Going from here, start by building your app with Xcode 26. Much of the new design will work in your app immediately. Audit your app screen by screen, and identify which views stand out.

If you have custom controls, decide whether standard UIKit controls may be a better fit. And lastly, determine how you can make your special use cases stand out with Liquid Glass.

I am looking forward to checking out your app after you adopt the new design system. Thank you for watching!

Code

2:31 - Minimize tab bar on scroll

// Minimize tab bar on scroll

tabBarController.tabBarMinimizeBehavior = .onScrollDown

3:08 - Add a bottom accessory

// Add a bottom accessory

let nowPlayingView = NowPlayingView()
let accessory = UITabAccessory(contentView: nowPlayingView)
tabBarController.bottomAccessory = accessory

3:35 - Update the accessory with the tabAccessoryEnvironment trait

// Update the accessory with the trait

registerForTraitChanges([UITraitTabAccessoryEnvironment.self]) { (view: MiniPlayerView, _) in
    let isInline = view.traitCollection.tabAccessoryEnvironment == .inline
    view.updatePlayerAppearance(inline: isInline)
}

// Automatic trait tracking with updateProperties()
override func updateProperties() {
    super.updateProperties()
    let isInline = traitCollection.tabAccessoryEnvironment == .inline
    updatePlayerAppearance(inline: isInline)
}

5:51 - Extend content under the sidebar

// Extend content underneath the sidebar

let posterImageView = UIImageView(image: ...)

let extensionView = UIBackgroundExtensionView()
extensionView.contentView = posterImageView
view.addSubview(extensionView)

let detailsView = ShowDetailsView()
view.addSubview(detailsView)

6:51 - Adjust the effect layout

// Adjust the effect layout

let posterImageView = UIImageView(image: ...)

let extensionView = UIBackgroundExtensionView()
extensionView.contentView = posterImageView
extensionView.automaticallyPlacesContentView = false
view.addSubview(extensionView)

posterImageView.translatesAutoresizingMaskIntoConstraints = false
NSLayoutConstraint.activate([
    posterImageView.topAnchor.constraint(equalTo: extensionView.topAnchor),
    posterImageView.leadingAnchor.constraint(equalTo: extensionView.safeAreaLayoutGuide.leadingAnchor),
    posterImageView.trailingAnchor.constraint(equalTo: extensionView.safeAreaLayoutGuide.trailingAnchor),
    posterImageView.bottomAnchor.constraint(equalTo: extensionView.safeAreaLayoutGuide.bottomAnchor),
])

8:38 - Custom grouping

// Custom grouping

navigationItem.rightBarButtonItems = [
    doneButton,
    flagButton,
    folderButton,
    infoButton,
    .fixedSpace(0),
    shareButton,
    selectButton
]

8:53 - UIBarButtonItem tint color and style

// Tint color and style

let flagButton = UIBarButtonItem(image: UIImage(systemName: "flag.fill"))
flagButton.tintColor = .systemOrange
flagButton.style = .prominent

9:10 - Toolbar with evenly distributed items in a single background

// Toolbar with evenly distributed items, grouped in a single background.

let flexibleSpace = UIBarButtonItem.flexibleSpace()
flexibleSpace.hidesSharedBackground = false

toolbarItems = [
   .init(image: UIImage(systemName: "location")),
   flexibleSpace,
   .init(image: UIImage(systemName: "number")),
   flexibleSpace,
   .init(image: UIImage(systemName: "camera")),
   flexibleSpace,
   .init(image: UIImage(systemName: "trash")),
]

10:15 - Titles and subtitles

// Titles and subtitles

navigationItem.title = "Inbox"
navigationItem.subtitle = "49 Unread"

10:27 - Large subtitle view

// Titles and subtitles

navigationItem.title = "Inbox"
navigationItem.largeSubtitleView = filterButton

11:20 - Edge effect for a custom container

// Edge effect’s custom container

let interaction = UIScrollEdgeElementContainerInteraction()
interaction.scrollView = contentScrollView
interaction.edge = .bottom

buttonsContainerView.addInteraction(interaction)

11:48 - Hard edge effect style

// Hard edge effect style

scrollView.topEdgeEffect.style = .hard

13:55 - Morph popover from its source button

// Morph popover from its source button

viewController.popoverPresentationController?.sourceItem = barButtonItem

14:07 - Morph sheet from bar button

// Morph sheet from bar button

viewController.preferredTransition = .zoom { _ in 
     folderBarButtonItem
}

14:46 - Source item for action sheets

// Setting source item for action sheets

alertController.popoverPresentationController?.sourceItem = barButtonItem

15:36 - Placing search in the toolbar

// Place search bar in a toolbar

toolbarItems = [
    navigationItem.searchBarPlacementBarButtonItem,
    .flexibleSpace(),
    addButton
]

16:01 - Universally accessible search on iPad

// Place search at the trailing edge of the navigation bar

navigationItem.searchBarPlacementAllowsExternalIntegration = true

16:47 - Activate the search field when search bar is tapped

// Activate the search field when search bar is tapped

searchTab.automaticallyActivatesSearch = true

17:03 - Search as a dedicated view

// Search as a dedicated view

navigationItem.preferredSearchBarPlacement = .integratedCentered

17:52 - Buttons

// Standard glass
button.configuration = .glass()

// Prominent glass
tintedButton.configuration = .prominentGlass()

18:16 - Neutral slider with 5 ticks and a neutral value

// Neutral slider with 5 ticks and a neutral value
slider.trackConfiguration = .init(allowsTickValuesOnly: true,
                                  neutralValue: 0.2,
                                  numberOfTicks: 5)

18:59 - Thumbless slider

// Thumbless slider
slider.sliderStyle = .thumbless

20:28 - Glass for custom views

// Adopting glass for custom views

let effectView = UIVisualEffectView()
addSubview(effectView)

let glassEffect = UIGlassEffect()
// Animating setting the effect results in a materialize animation
UIView.animate {
    effectView.effect = glassEffect
}

20:49 - Custom corner configuration

// Custom corner configuration

UIView.animate {
    effectView.cornerConfiguration = .fixed(8)
}

20:54 - Dark mode

// Adapting to dark mode

UIView.animate {
    view.overrideUserInterfaceStyle = .dark
}

21:02 - Adding glass to an existing glass container

// Adding glass to an existing glass container

let container = UIVisualEffectView()
container.effect = UIGlassEffect()

container.contentView.addSubview(effectView)

21:08 - Container relative corners

// Container relative corners

UIView.animate {
    effectView.cornerConfiguration = .containerRelative()
    effectView.frame.origin = CGPoint(x: 10, y: 10)
}

21:23 - Container relative corners, animated

// Container relative corners

UIView.animate {
    effectView.frame.origin = CGPoint(x: 30, y: 30)
}

21:30 - Glass adapts based on its size

// Glass adapts based on its size

UIView.animate {
    view.overrideUserInterfaceStyle = .light
    effectView.bounds.size = CGSize(width: 250, height: 88)
}

UIView.animate {
    effectView.bounds.size = CGSize(width: 150, height: 44)
}

21:49 - Adding content to glass views

// Adding content to glass views

let label = UILabel()
label.text = "WWDC25"
label.textColor = .secondaryLabel

effectView.contentView.addSubview(label)

22:15 - Applying tint color to glass

// Applying tint color to glass

let glassEffect = UIGlassEffect()
glassEffect.tintColor = .systemBlue

UIView.animate {
    effectView.effect = glassEffect
    label.textColor = .label
}

22:33 - Using custom colors with glass

// Using custom colors with glass

let glassEffect = UIGlassEffect()
glassEffect.tintColor = UIColor(displayP3Red: r,
                                green: g,
                                blue: b,
                                alpha: 1)

UIView.animate {
    effectView.effect = glassEffect
    // Animate out the label
    label.alpha = 0
}

23:03 - Enabling interactive glass behavior

// Enabling interactive glass behavior

let glassEffect = UIGlassEffect()
glassEffect.isInteractive = true

effectView.effect = glassEffect

23:20 - Animating glass out using dematerialize animation

// Animating glass out using dematerialize animation

UIView.animate {
    effectView.effect = nil
}

23:52 - Adding glass elements to a container

// Adding glass elements to a container

let container = UIGlassContainerEffect()
let containerEffectView = UIVisualEffectView(effect: container)

let glassEffect = UIGlassEffect()
let view1 = UIVisualEffectView(effect: glassEffect)
let view2 = UIVisualEffectView(effect: glassEffect)

containerEffectView.contentView.addSubview(view1)
containerEffectView.contentView.addSubview(view2)

24:12 - Adjusting the container spacing

// Adjusting the container spacing

let containerEffect = UIGlassContainerEffect()
containerEffect.spacing = 20
containerEffectView.effect = containerEffect

24:27 - Merging two glass views

// Merging two glass views

UIView.animate {
    view1.frame = finalFrame
    view2.frame = finalFrame
}

24:33 - Dividing glass into multiple views

// Dividing glass into multiple views

UIView.performWithoutAnimation {
    for view in finalViews {
        containerEffectView.contentView.addSubview(view)
        view.frame = startFrame
    }
}

UIView.animate {
    for view in finalViews {
        view.frame = finalFrame(for: view)
    }
}

Build an AppKit app with the new design

Update your AppKit app to take full advantage of the new design system. We'll dive into key changes to tab views, split views, bars, presentations, search, and controls, and show you how to use Liquid Glass in your custom UI. To get the most out of this video, we recommend first watching “Get to know the new design system” for general design guidance.

Chapters

Resources

Related Videos

WWDC25

Transcript

Hi, I'm Jeff Nadeau, a frameworks engineering manager at Apple, and you’re watching “Build an AppKit app with the new design”.

The new design of macOS establishes a common foundation for the look and feel of Mac apps, with refreshed materials and controls throughout the system.

A key element of this new design is the Liquid Glass material, a translucent surface that reflects and refracts light, creating a sense of depth and dynamism within the user interface. AppKit has everything you need to adapt to this new design. I’ll take you through the important changes to the framework, outlining the behaviors that you can expect on macOS Tahoe and the new APIs that you can use to fine-tune your adoption of the new design.

I’ll go through these changes from the top down, starting with the basic structural components of your application. Then, I’ll introduce the scroll edge effect, a visual effect that provides legibility atop edge-to-edge scrolling content.

The new design also includes a big update to the appearance and layout of controls.

Finally, I’ll dig into the Liquid Glass material, how it works and the AppKit APIs that you can use to adopt glass in your custom UI elements.

I'll get started with app structure.

The new design system transforms the appearance of a Mac window, altering the window shape and framing its key structural regions in glass.

One of those regions is the toolbar. In the new design system, toolbar elements are placed on a glass material, and the entire toolbar appears to float above the content, enhancing the sense of hierarchy within the window.

The glass also brings controls together in logical groups. Since they all represent singular actions, AppKit automatically groups multiple toolbar buttons together on one piece of glass. Different types of controls are separated out into their own glass elements, like segmented controls, pop-up buttons and the search control. AppKit determines this grouping automatically based on the type of each item’s control view.

To override the automatic behavior, use NSToolbarItemGroup to group items together or insert spacers to separate items. The Liquid Glass material is adaptive, which means that it reacts intelligently to its context, changing its appearance to suit the brightness of the content behind it. The toolbar glass will even switch between a light and dark appearance if the scrolled content is especially bright or dark.
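
As a minimal sketch of that override, the snippet below groups two buttons onto one piece of glass; the identifier and the flagItem/archiveItem toolbar items are hypothetical, and only NSToolbarItemGroup and its subitems property come from the session.

// Sketch: explicit toolbar grouping with NSToolbarItemGroup

let markupGroup = NSToolbarItemGroup(itemIdentifier: NSToolbarItem.Identifier("markupGroup"))
markupGroup.subitems = [flagItem, archiveItem] // assumed existing NSToolbarItems

// Return markupGroup from your NSToolbarDelegate for its identifier,
// or insert spacer items to separate items instead.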

This appearance change is communicated to the toolbar’s content using the NSAppearance system so any work that you’ve done to support Dark Mode applies here as well. NSToolbar automatically puts the glass material behind every toolbar item but not every item should appear over glass. Non-interactive items like custom titles and status indicators should avoid the glass material.

The informational text in the Photos toolbar is a great example. With the glass material backing, it almost looks like a button. You can remove the glass from an NSToolbarItem by setting the isBordered property to false.

Now that's looking much better.

For the rest of your toolbar items, the glass material has one other neat feature - tinting. Use the new style property on NSToolbarItem to specify a prominent style. The prominent style tints the glass using the accent color, which is perfect for displaying state or emphasizing an important action.

To further customize the appearance of a prominent toolbar item, use the backgroundTintColor property to choose a specific color for the glass. There’s one other way to call attention to a toolbar item - badging.

Use the NSItemBadge API to indicate that a toolbar item navigates to new or pending content. For example, you can use a badge to indicate a number of unread messages or the presence of new notifications. With glass toolbars handled, I’ll move on to the main content of the window, which is often organized using a split view. In the new design, sidebars appear as a pane of glass that floats above the window’s content, whereas inspectors use an edge-to-edge glass that sits alongside the content. To get this effect in your application, use NSSplitViewController. When you create split items with a sidebar or inspector behaviors, AppKit presents them with the appropriate glass material automatically.

Now that the sidebar sits atop glass, the legacy sidebar material is no longer necessary. If you’re using an NSVisualEffectView to display that material inside of your sidebar, it will prevent the glass material from showing through. You should remove these visual effect views from your view hierarchy.

Since the sidebar glass appears to float above the window, it can appear over content from the adjacent split. This works great if you have horizontal scrolling content, list items which slide over to reveal swipe actions or rich content like a map or a movie poster that can extend into the sidebar region.

To allow your split content to appear underneath the sidebar, set the automaticallyAdjustsSafeAreaInsets property to true. Be sure to set this on the content that you want to extend under the sidebar and not on the sidebar itself. When this property is true, NSSplitView will extend that item’s frame beneath the sidebar and then apply a safe area layout guide to help you position your content within the unobscured area.

Rich content, like photographs or artwork, really showcase the floating glass material in the sidebar but it’s often undesirable to cover up some portion of the content to get that effect. This App Store poster creates a striking effect when displayed edge-to-edge but the artwork doesn’t include any extra negative space to accommodate the size of the sidebar. Hiding the sidebar reveals what’s really happening here. The content is being mirrored and blurred, extending the appearance of the artwork without actually obscuring any of its content. AppKit has a new API that provides this effect.

It's called NSBackgroundExtensionView. This view uses the safe area to position your content in the unobscured portion of the view, while extending its appearance edge-to-edge using a visual effect.
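
Before the detailed walkthrough in the next paragraph, here is a minimal sketch of the setup, assuming artworkImageView is an existing content view and splitContentView is the split item's root view; only the class name and its contentView property come from the session, the rest is an assumption.

// Sketch: extend artwork edge-to-edge beneath the floating sidebar

let extensionView = NSBackgroundExtensionView()
extensionView.contentView = artworkImageView // positioned within the safe area

extensionView.frame = splitContentView.bounds
extensionView.autoresizingMask = [.width, .height]
splitContentView.addSubview(extensionView)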

To put this into practice, create a NSBackgroundExtensionView and position it to fill the entire frame of the split item. Assign it a content view, which it positions in the safe area, avoiding the floating sidebar. And that's it. The background extension view will automatically create a replica of the content view to fill the space outside of the safe area. The floating sidebar, along with the toolbar, demonstrate a key element of the new design system: concentricity. Each element is designed with a curvature that sits neatly within the corner radius of its container, in this case the window itself. And this relationship goes both ways. In the new design system, windows now have a softer, more generous corner radius, which varies based on the style of window. Windows with toolbars now use a larger radius, which is designed to wrap concentrically around the glass toolbar elements, scaling to match the size of the toolbar. Titlebar-only windows retain a smaller corner radius, wrapping compactly around the window controls. These larger corners provide a softer feel and elegant concentricity to the window but they can also clip content that sits close to the edge of the window. To position content that nests into a corner, use the new NSView.LayoutRegion API. A layout region describes an area of a view, like the safe area, but with features like corner avoidance built in. You can inset a region either horizontally or vertically by the size of the corner. I'll take you through the API.

You can obtain a region for either the safe area or the area with standard layout margins.

The region includes corner adaptation, which can apply either a horizontal or vertical inset to the region. From a layout region, use the layoutGuide method to obtain a guide for applying auto layout constraints.

You can also obtain the raw geometry of the region in the form of edge insets or its current rectangle.

Here’s an example of the new API in action. My new folder button is colliding with this corner, and I want to constrain it to a region that avoids this collision.

So, in an updateConstraints method, I obtain a layout guide for the safe area, including horizontal corner adaptation. This layout guide is just like the typical safe area layout guide but it includes an extra inset on the edge with the corner.

Then, I create a few layout constraints to tie the button’s geometry to the safe area guide.

It only took a few lines of code, and now my button sits nicely alongside the corner. Next, I’ll introduce the scroll edge effect.

The new design encourages flowing your content edge-to-edge, with Liquid Glass elements floating atop. To provide separation between the glass and the content, the system applies a visual effect in the areas where these two overlap. This effect comes in two variants: a soft-edge-style, which progressively fades and blurs the content, and a hard-edge-style, which uses a more opaque backing to provide greater separation between the content and the floating elements.

For scrollable content, the scroll edge effect lives inside NSScrollView. The scroll view varies the size and shape of the effect based on the content floating above it. The effect adapts automatically as floating elements come and go. The scroll edge effect is applied automatically underneath toolbar items, titlebar accessories, and a new type of accessory, split item accessories. Split item accessories are very similar to titlebar accessories, except they only span one split within a split view controller, and they can be placed at either the top or bottom edge of the split. To add a split item accessory, create a NSSplitViewItemAccessoryViewController, and attach it to the split view item using the addTopAligned- or addBottomAlignedAccessoryViewController method.
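
Sketched out, attaching an accessory might look like the following; FilterBarAccessoryViewController is a hypothetical subclass of NSSplitViewItemAccessoryViewController, and the method signature is assumed from the names mentioned above.

// Sketch: add a bottom-aligned split item accessory

let accessory = FilterBarAccessoryViewController() // assumed NSSplitViewItemAccessoryViewController subclass
splitViewItem.addBottomAlignedAccessoryViewController(accessory)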

Split item accessories, along with titlebar accessories, are the best way to incorporate floating content into the scroll edge effect. They influence the size and shape of the effect, and they inset the content safe area, simplifying content layout.

Now, no design system is complete without my personal favorite - controls. Controls have an all new look in macOS Tahoe. The new design creates a stronger family resemblance across devices, unifying the appearance of elements like buttons, switches and sliders between macOS, iOS and iPadOS. These changes have been thoughtfully applied to retain the character and capability that you expect from Mac controls.

macOS controls are available in a variety of standard sizes, ranging from mini all the way up to large. These sizes establish varying levels of density and hierarchy among your controls. macOS Tahoe adds one more size to this list - extra large - for emphasizing your most important actions.

The extra large size is ideal for showcasing the most prominent actions in your application. These are the actions that people launch your app to get done, like queuing up some music in a media player or placing a call in a communications app. In addition to the new size, we’ve also taken the opportunity to rethink the heights of controls.

Compared to previous releases of macOS, the mini, small, and medium controls are now slightly taller, providing a little more breathing room around the control label and enhancing the size of the click target. To adapt to varying control heights, use Auto Layout and avoid hard-coding the heights of controls. For compatibility with existing high-density layouts, like complex inspectors and popovers, AppKit provides an API to request control sizes that match previous releases of macOS.

Use the new prefersCompactControlSizeMetrics property on NSView. This property is inherited down the view hierarchy, and when it's set to true, AppKit controls will revert to sizing that is compatible with previous releases of macOS.
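
For example, a dense inspector could opt into the compatibility sizing with a single line; inspectorRootView is a stand-in for your own container view.

// Sketch: keep a dense inspector at pre-Tahoe control heights

inspectorRootView.prefersCompactControlSizeMetrics = true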

The new design system introduces some new control shapes as well. The mini through medium sizes retain a rounded-rectangle shape, which enables greater horizontal density, while the large and extra-large sizes round out into a capsule shape, making use of all that extra space. To achieve concentricity in your custom designs, you can override the preferred shape of a control.

In this example, I’ve built a custom call-out bar for spell-checking using medium-sized controls. The container for the bar has a capsule shape but it doesn’t fit well with the rounded rectangle controls inside.

This is a perfect use case for the new borderShape property. This API allows you to override the shapes of buttons, pop-up buttons and segmented controls.

By overriding these controls to use a capsule shape, they fit nicely within my custom container.
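
As a sketch, that override is just a property assignment per control; the controls here are hypothetical, and the capsule case name is an assumption based on the session.

// Sketch: capsule-shaped controls inside a capsule call-out bar

replaceButton.borderShape = .capsule // assumed case name
suggestionsPopUp.borderShape = .capsule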

In addition to the shape, you can also customize the material of a button using the new glass bezel style.

This bezel style replaces the standard button backing with the Liquid Glass material, which is perfect for buttons that need to float atop other content. The glass bezel style is compatible with the existing bezelColor property, which tints the glass using the provided color.
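
A rough sketch of adopting the glass bezel, assuming the bezel style case is named glass as the session implies; the button and color are placeholders.

// Sketch: a floating button with a tinted Liquid Glass bezel

floatingButton.bezelStyle = .glass // assumed case name
floatingButton.bezelColor = .systemIndigo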

The new design system also introduces the idea of control prominence to AppKit. By varying the prominence of a button, you can control the level of visual weight given to its tint color. This allows you to add color to a button without upstaging higher-prominence controls inside the same interface, such as the default button. This technique is used for destructive buttons. The distinctive red color is a helpful hint that the action is destructive but with a level of prominence that doesn’t overpower nearby controls.

The tint prominence type has four cases: automatic, which indicates that the control should choose a level of prominence appropriate for its style and configuration; none, which indicates minimal or no tint color; secondary, which indicates a more subdued application of the tint color; and primary, which applies the tint at the most prominent level.

To apply a lower prominence tint to a button, set the tintProminence property to secondary. By default, this will display using the accent color.

In this example, I’m treating the Play button a little differently because I want it to behave as the default button, so I’ve given it the return key equivalent. This ensures that the button responds to the keyboard in a predictable way, and since it’s the default button, it’ll automatically apply the most prominent level of tint.

Tint prominence also has a function with sliders.

The tintProminence API allows you to choose whether the track is filled with the accent color. A slider set to none will avoid filling its track, whereas a slider set to secondary or primary will fill it.

The slider fill has learned one more trick in macOS Tahoe. It can anchor itself at any location along the track, rather than just the minimum end. Use the new neutralValue property to set a value that serves as the anchor for the track fill.
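
A sketch of the playback-speed slider described next, combining a secondary tint with the new neutralValue anchor; the value range is illustrative and the property types are assumed.

// Sketch: fill the slider track relative to the 1x default speed

speedSlider.minValue = 0.5
speedSlider.maxValue = 2.0
speedSlider.doubleValue = 1.0
speedSlider.neutralValue = 1.0 // fill anchors at 1x
speedSlider.tintProminence = .secondary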

In this example of a playback speed control, I’ve set the neutralValue to 1x, so that when the speed is made slower or faster, the blue fill helps communicate the difference between the selected value and the default value. The new design system also brings an update to menus with a refreshed appearance and a significant expansion in the use of icons.

Both menu bar menus and context menus now use icons to represent their key actions.

Within each section of a menu, the icons form a single column that’s easy to scan through. Adding clear, recognizable symbols to your menu items helps people quickly find the most important actions in the menu.
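
Adding a symbol takes only a line; the menu item and symbol name here are placeholders.

// Sketch: give a menu item a recognizable symbol

duplicateMenuItem.image = NSImage(systemSymbolName: "plus.square.on.square",
                                  accessibilityDescription: "Duplicate")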

The “Get to know the new design system” video provides a ton of additional guidance for choosing symbols for your menu items. Be sure to check it out. Finally, integrating Liquid Glass elements into your app.

Before you integrate the Liquid Glass material into your custom UI elements, it’s important to think about the design intent behind this new material.

Liquid Glass elements float at the top level of the UI, elevating the controls and navigation into a distinct functional layer.

With that in mind, limit your use of Liquid Glass to the most important elements in your app, the controls that belong in this top level of hierarchy. Freeform’s inline editing controls are a great example. They float above the content rather than sitting alongside, and they work beautifully with the Liquid Glass material.

To place your content on glass, use the NSGlassEffectView API. Setting a contentView allows AppKit to apply all of the necessary visual treatments to keep your content legible as the glass adapts to its surroundings.

So avoid placing the NSGlassEffectView behind your content as a sibling view.

You can customize the appearance of the glass using the corner radius and tint color properties. I’ll take you through an example of adopting NSGlassEffectView for an existing element.

In this example, I have a fitness app which shows daily training stats and a custom control for picking the type of workout. I’m displaying them using a horizontal NSStackView. Now, this is a prominent part of my UI, so I’m going to put both parts of it on glass.

Adopting the Liquid Glass material takes just a few new lines of code.

First, create an NSGlassEffectView for each glass element that you want to display and set each one’s contentView property to the desired view. The glass effect view ties its geometry to the contentView using Auto Layout, so you don’t have to worry about keeping them in sync.

Then, put the glass effect views into the view hierarchy. In this example, I updated the stack view to swap in the new glass effect views.

If you have multiple glass shapes in close proximity, group them together using NSGlassEffectContainerView. The glass effect container view combines multiple glass elements together into a single rendering effect. This has a few benefits.

First, grouped glass elements can fluidly join and separate using a liquid visual effect. The glass shapes meld together based on their proximity and the value of the spacing property, which is available on NSGlassEffectContainerView.

Second, the adaptive appearance of the glass is shared across the grouped elements, which ensures that they maintain a uniform appearance as the underlying content changes.

And grouping is important for visual correctness. The Liquid Glass material reflects and refracts light, picking color from nearby content.

To create this effect, the glass material samples content from an area larger than itself. But what happens if that sampling region includes another glass element? Well, glass can’t directly sample other glass, so the visual results in this case will not be consistent.

Using a glass effect container allows these elements to share their sampling region. Not only does this provide a more consistent visual result but it also improves the performance of the glass effect, since it only needs one sampling pass for the entire group.

Revisiting the sample from earlier, these two glass effects are part of a logical group, so they need to be inside a glass effect container. It’s straightforward to set one up. In this example, I create an NSGlassEffectContainerView and set the stack view as its content view. The container and its content view are also constrained together using Auto Layout, so I can cleanly swap this container into my existing layout. The Liquid Glass material is a powerful tool for elevating your app’s key controls and enabling your content to flow seamlessly from edge to edge. It’s a great way to highlight the functionality that makes your app unique.

So what's next? As a first step, build your app with Xcode 26. A lot of the new design will start working right away. Extend your content edge-to-edge wherever possible, taking full advantage of the floating glass toolbar and sidebar.

Then, adapt to the new control sizes by auditing your app for hard-coded control heights or inflexible layout constraints. Enhance your menu actions with symbol icons and identify key elements of your interface to elevate with the Liquid Glass material.

Thanks for watching and thanks for making great Mac apps.

Code

3:11 - Removing toolbar item glass

// Removing toolbar item glass

toolbarItem.isBordered = false

3:30 - Tinted toolbar controls

// Tints the glass with the accent color.
toolbarItem.style = .prominent

// Tints the glass with a specific color.
toolbarItem.backgroundTintColor = .systemGreen

3:58 - Toolbar badges

// Numeric badge
NSItemBadge.count(4)

// Text badge
NSItemBadge.text("New")

// Badge indicator
NSItemBadge.indicator

5:25 - Content under the sidebar

// Content under the sidebar

splitViewItem.automaticallyAdjustsSafeAreaInsets = true

8:47 - Avoiding a window corner

// Avoiding a window corner


override func updateConstraints() {
    defer { super.updateConstraints() }
    guard !installedButtonConstraints else { return }

    let safeArea = layoutGuide(for: .safeArea(cornerAdaptation: .horizontal))

    NSLayoutConstraint.activate([
        safeArea.leadingAnchor.constraint(equalTo: button.leadingAnchor),
        safeArea.trailingAnchor.constraint(greaterThanOrEqualTo: button.trailingAnchor),
        safeArea.bottomAnchor.constraint(equalTo: button.bottomAnchor)
    ])
    installedButtonConstraints = true
}

15:31 - Levels of prominence

// Create buttons with varying levels of prominence

// Prefer a “secondary” tinted appearance for the shuffle and enqueue buttons
shuffleButton.tintProminence = .secondary
playNextButton.tintProminence = .secondary

// The "play" will automatically use primary prominence because it is the default button
playButton.keyEquivalent = "\r"

18:42 - Adopting NSGlassEffectView

// Adopting NSGlassEffectView

let userInfoView = UserInfoView()
let activityPickerView = ActivityPickerView()

let userInfoGlass = NSGlassEffectView()
userInfoGlass.contentView = userInfoView

let activityPickerGlass = NSGlassEffectView()
activityPickerGlass.contentView = activityPickerView

let stack = NSStackView(views: [userInfoGlass, 
                                activityPickerGlass])
stack.orientation = .horizontal

21:03 - Adopting NSGlassEffectContainerView

// Adopting NSGlassEffectContainerView

let userInfoView = UserInfoView()
let activityPickerView = ActivityPickerView()

let userInfoGlass = NSGlassEffectView()
userInfoGlass.contentView = userInfoView
userInfoGlass.cornerRadius = 999

let activityPickerGlass = NSGlassEffectView()
activityPickerGlass.contentView = activityPickerView
activityPickerGlass.cornerRadius = 999

let stack = NSStackView(views: [userInfoGlass, 
                                activityPickerGlass])
stack.orientation = .horizontal

let glassContainer = NSGlassEffectContainerView()
glassContainer.contentView = stack

Capture cinematic video in your app

Discover how the Cinematic Video API enables your app to effortlessly capture cinema-style videos. We'll cover how to configure a Cinematic capture session and introduce the fundamentals of building a video capture UI. We'll also explore advanced Cinematic features such as applying a depth of field effect to achieve both tracking and rack focus.

Chapters

Resources

Related Videos

WWDC25

WWDC24

WWDC23

Transcript

Hi, I'm Roy. I’m an engineer on the Camera Software team. Today, I’m excited to talk about how your apps can easily capture pro-level cinema-style videos with a Cinematic Video API.

With iPhone 13 and 13 Pro, we introduced Cinematic mode. With its intuitive user interface and powerful algorithms, it transformed iPhone into a cinematography powerhouse. In this talk, we will have a look at what makes Cinematic video magical and walk through some code together to see how to build a great Cinematic capture experience.

So, what is Cinematic video? At its heart are classic storytelling tools like rack focus and tracking focus. With a shallow depth of field, the director guides viewers’ attention to the key subjects in the scene, enhancing narrative impact. When subjects move, as they often do in films, tracking focus keeps them sharply in view.

Though powerful, in the real world, these focus techniques require a great deal of expertise, which is why on a movie set, there are focus pullers whose main responsibility is to carry out these powerful but challenging shots. Cinematic video drastically simplifies this by intelligently driving focus decisions. For example, when a subject enters the frame, the algorithm automatically racks the focus to them and starts tracking. When a subject looks away, the focus automatically transitions to another point, returning to the subject when appropriate. This year, we're making these amazing capabilities available as the Cinematic Video API, so your apps can easily capture these amazing cinema-style videos. Let’s explore how we can build a great capture experience for Cinematic videos using the new API. Let’s start with a typical capture session for a video app.

Firstly, we select the device from which we want to capture movies.

Then we add it to a device input. Depending on the use cases, multiple outputs can be added to the session. Connection objects will be created when these outputs are added to the capture session.

This is not a trivial setup, but enabling Cinematic video capture is really easy. In iOS 26, we're adding a new property, isCinematicVideoCaptureEnabled, on the AVCaptureDeviceInput class. By setting it to true, we configure the whole capture session to output Cinematic video, and each of the outputs will now receive the Cinematic treatment.

The movie file produced by the movie file output will be Cinematic. It contains the disparity data, metadata, and the original video that enables non-destructive editing. To play it back with the bokeh rendered or edit the bokeh effect, you can use the Cinematic Framework we introduced in 2023. To learn more about this framework, please check out the WWDC23 session Support Cinematic mode videos in your app. The video data output will produce frames with a shallow depth of field effect baked in. This is useful when you need direct access to the frames, such as when sending them to a remote device.

Similarly, the preview layer will have the bokeh rendered into it in real time. It's an easy way to build a viewfinder. With this high-level architecture in mind, let’s walk through some code in these following areas.

We will configure an AVCaptureSession with all its components required for Cinematic capture.

Then we build an interface for video capture using SwiftUI.

We will walk through how to get metadata like face detections and how to draw them on the screen.

With different ways to manually drive focus, we tap into the full power of Cinematic video.

And we finish off with some advanced features to make our app more polished.

Let’s get started with the capture session. First, let’s find the video device from which we want to capture the movie. To find the device, we create an AVCaptureDevice.DiscoverySession object.

Cinematic video is supported on both the Dual Wide camera in the back and the TrueDepth camera in the front. In this case, we specify .builtInDualWideCamera in the array of device types. Since we’re shooting a movie, we use .video as the mediaType.

And we request the camera in the back of the device.

As we’re only requesting a single device type, we can just get the first element in the discovery session's devices array.

In order to enable Cinematic video capture, a format that supports this feature must be used.

To find such formats, we can iterate through all the device’s formats and use the one whose isCinematicVideoCaptureSupported property returns true.

Here are all the supported formats.

For both Dual Wide and TrueDepth cameras, both 1080p and 4K are supported at 30 frames per second.

If you are interested in recording SDR or EDR content, you can use either 420 video range or full range. If we prefer 10-bit HDR video content, use x420 instead.

Since we’re not making a silent film, we want sound as well. We will use the same DiscoverySession API to find the microphone.

With our devices in hand, we create the inputs for each one of them. Then we add these inputs to the capture session. At this point, we can turn on Cinematic video capture on the video input. To enhance the Cinematic experience, we can capture spatial audio by simply setting first order ambisonics as the multichannelAudioMode.

To learn more about spatial audio, please check out this year's session, “Enhance your app’s audio content creation capabilities.” Moving on to the outputs, we create an AVCaptureMovieFileOutput object and add it to the session.

Our hands are never as steady as a tripod, so we recommend enabling video stabilization. To do so, we first find the video connection of the movieFileOutput and set its preferredVideoStabilizationMode. In this case, we use cinematicExtendedEnhanced.

Lastly, we need to associate our preview layer with the capture session. We’re done with the capture session for now. Let's move on to the user interface.

Since AVCaptureVideoPreviewLayer is a subclass of CALayer, which is not part of SwiftUI, to make them interoperate, we need to wrap the preview layer into a struct that conforms to the UIViewRepresentable protocol. Within this struct, we make a UIView subclass CameraPreviewUIView.

We override its layerClass property to make the previewLayer the backing layer for the view.

And we make a previewLayer property to make it easily accessible as an AVCaptureVideoPreviewLayer type.

We can then put our preview view into a ZStack, where it can be easily composed with other UI elements like camera controls.

As mentioned in the intro, shallow depth of field is an important tool for storytelling. By changing the simulatedAperture property on the device input, we can adjust the global strength of the bokeh effect. Displayed on the right, driving this property with a slider, we change the global strength of the blur.

This value is expressed in the industry standard f-stops, which is simply the ratio between the focal length and the aperture diameter.

Rearranging that ratio, the aperture diameter is the focal length divided by the f-number.

Therefore, the smaller the f number, the larger the aperture, and the stronger the bokeh will be.

We can find the minimum, maximum, and default simulated aperture on the format.

We use them to populate the appropriate UI elements, like a slider.

Now, let’s build some affordances that allow the user to manually interact with Cinematic video. For users to manually drive focus, we need to show visual indicators for focus candidates like faces. And to do that, we need some detection metadata.

We will use an AVCaptureMetadataOutput to get these detections so we can draw their bounds on the screen for users to interact with. The Cinematic video algorithm requires certain metadataObjectTypes to work optimally. And they are communicated with the new property requiredMetadataObjectTypesForCinematicVideoCapture. An exception is thrown if the metadataObjectTypes provided differ from this list when Cinematic video is enabled.

Lastly, we need to provide a delegate to receive the metadata and a queue on which the delegate is called.

We receive metadata objects in the metadata output delegate callback.

To easily communicate this metadata to our view layer in SwiftUI, we use an observable class.

When we update its property, the observing view will automatically refresh.

In our view layer, whenever our observable object is updated, the view is automatically redrawn. And we draw a rectangle for each metadataObject.

When creating these rectangles, it’s important that we transform the metadata’s bounds into the preview layer’s coordinate space, using the layerRectConverted(fromMetadataOutputRect:) method.

Note that X and Y in the position method refer to the center of the view, instead of the upper left corner used by AVFoundation. So we need to adjust accordingly by using the midX and midY of the rect.

With metadata rectangles drawn on the screen, we can use them to manually drive focus.

The Cinematic Video API offers three ways to manually focus. Let's now walk through them one by one. The setCinematicVideoTrackingFocus(detectedObjectID:focusMode:) method can be used to rack focus to a particular subject identified by the detectedObjectID, which is available on the AVMetadataObject instances that you get from the metadata output. focusMode configures Cinematic video’s tracking behavior. The CinematicVideoFocusMode enum has three cases: none, strong, and weak. Strong tells Cinematic video to keep tracking a subject even when there are focus candidates that would have been otherwise automatically selected.

In this case, although the cat became more prominent in the frame, the strong focus, as indicated by the solid yellow rectangle, stayed locked on the subject in the back. Weak focus, on the other hand, lets the algorithm retain focus control. It automatically racks the focus when it sees fit. In this case, as the cat turned around, he was considered more important, and the weak focus shifted automatically to him, as indicated by the dashed rectangle.

The none case is only useful when determining whether a metadata object currently has the focus, so it should not be used when setting the focus.

The second focus method takes a different first parameter. Instead of a detected object ID, it takes a point in a view.

It tells Cinematic video to look for any interesting object at the specified point. When it finds one, it will create a new metadata object with the type salient object. So we can draw the rectangle around it on the screen.

The third focus method is setCinematicVideoFixedFocus, which takes a point and the focus mode. It sets the focus at a fixed distance that is computed internally using signals such as depth. Paired with a strong focus mode, this method effectively locks the focus at a particular plane in the scene, ignoring other activity even in the foreground. Any app can implement the focus logic that makes sense for its use case. In our app, we do the following: tapping on a detection rectangle not in focus, we rack the focus to it with a weak focus. With this, we can switch the focus back and forth between subjects in and out of focus.

Tapping on a metadata object already being weakly focused on turns it into a strong focus, indicated by the solid yellow rectangle.

Tapping at a point where there are no existing detections, we want Cinematic video to try to find any salient object and weakly focus on that. With a long press, we set a strong fixed focus. Here is how we can implement this logic in code. Firstly, we need to make two gestures.

The regular tap gesture can be easily done with a SpatialTapGesture, which provides the tap location that we need to set focus.

When tapped, we call the focusTap method on our camera model object, where we have access to the underlying AVCaptureDevice.

Long press, on the other hand, is a bit more complicated: since the built-in longPressGesture doesn’t provide the tap location, we need to simulate a long press with a DragGesture.

When pressed, we start a 0.3-second timer.

When it fires, we call the focusLongPress method on the camera model.

Then we create a rectangle view to receive the gestures. It’s inserted at the end of the ZStack, which puts it on top of all the detection rectangles so the user’s gesture input is not blocked.

As we already saw in the previous videos, it's important to visually differentiate the focused rectangles between weak focus, strong focus, and no focus to help the user take the right action.

We do this by implementing a method that takes an AVMetadataObject and returns a focused rectangle view. Let’s not forget that we need to transform the bounds of the metadata from the metadata output’s coordinate space to that of the preview layer.

By setting different stroke styles and colors, we can easily create visually distinct rectangles for each focus mode.

With the point passed from the view layer, we can determine which focus method to use. First, we need to figure out whether the user has tapped on a metadata rectangle. And we do this in the helper method, findTappedMetadataObject.

Here, we iterate through all the metadata that we cache for each frame and check whether the point specified falls into one of their bounds. Again, we make sure the point and the rect are in the same coordinate space.

Coming back to the focusTap method, if a metadata object is found and is already in weak focus, then we turn it into a strong focus.

If it’s not already in focus, we focus on it weakly.

If the user didn’t tap on a metadata rectangle, then we tell the framework to try to find a salient object at this point. With a long press, we simply set a strong fixed focus at the specified point. At this point, we have a fully functional app that can capture Cinematic video. Let’s polish it up with a few more details. Currently, our video capture graph looks like this. We have three outputs to capture the movie, receive metadata, and the preview. If we want to support still image capture during the recording, we can do so by simply adding an AVCapturePhotoOutput to the session.

Since our graph is already configured to be Cinematic, the photo output will get a Cinematic treatment automatically.
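
A minimal sketch of that addition, using the same pattern as the other outputs in this session; photoOutput is our own name for the new output.

// Add a photo output for stills during Cinematic recording

let photoOutput = AVCapturePhotoOutput()
guard captureSession.canAddOutput(photoOutput) else {
    print("Can't add the photo output to the session")
    return
}
captureSession.addOutput(photoOutput)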

The image returned by the photo output will have the bokeh effect burned in.

Lastly, the Cinematic video algorithm requires a sufficient amount of light to function properly. So when the room is too dark or the camera is covered, we want to inform users of the problem in the UI. In order to be notified when this condition occurs, you can key-value observe the new property cinematicVideoCaptureSceneMonitoringStatuses on the AVCaptureDevice class. Currently, the only supported status for Cinematic video is not enough light.

In the KVO handler, we can update the UI properly when we see insufficient light.

An empty set means that everything is back to normal.

In today’s talk, we had a recap on how Cinematic video enables our users to capture gorgeous pro-level movies, even for everyday moments like hanging out with their pets. And we had a detailed walkthrough on how to build a great Cinematic capture experience with the Cinematic Video API. We can’t wait to see how your apps can tap into these capabilities to deliver richer, more cinematic content. Thank you for watching.

Code

4:26 - Select a video device

// Select a video device

let deviceDiscoverySession = AVCaptureDevice.DiscoverySession(deviceTypes: [.builtInDualWideCamera], mediaType: .video, position: .back)
        
guard let camera = deviceDiscoverySession.devices.first else {
    print("Failed to find the capture device")
    return
}

5:07 - Select a format that supports Cinematic Video capture

// Select a format that supports Cinematic Video capture

for format in camera.formats {

    if format.isCinematicVideoCaptureSupported {

       try! camera.lockForConfiguration()
       camera.activeFormat = format
       camera.unlockForConfiguration()

       break
    }

}

5:51 - Select a microphone

// Select a microphone

let audioDeviceDiscoverySession = AVCaptureDevice.DiscoverySession(deviceTypes: [.microphone], mediaType: .audio, position: .unspecified)

guard let microphone = audioDeviceDiscoverySession.devices.first else {
    print("Failed to find a microphone")
    return
}

6:00 - Add devices to input & add inputs to the capture session & enable Cinematic Video capture

// Add devices to inputs

let videoInput = try! AVCaptureDeviceInput(device: camera)
guard captureSession.canAddInput(videoInput) else {
    print("Can't add the video input to the session")
    return
}

let audioInput = try! AVCaptureDeviceInput(device: microphone)
guard captureSession.canAddInput(audioInput) else {
    print("Can't add the audio input to the session")
    return
}

// Add inputs to the capture session

captureSession.addInput(videoInput)
captureSession.addInput(audioInput)

// Enable Cinematic Video capture

if (videoInput.isCinematicVideoCaptureSupported) {
  videoInput.isCinematicVideoCaptureEnabled = true
}

6:17 - Capture spatial audio

// Configure spatial audio

if audioInput.isMultichannelAudioModeSupported(.firstOrderAmbisonics) {
    audioInput.multichannelAudioMode = .firstOrderAmbisonics
}

6:33 - Add outputs to the session & configure video stabilization & associate the preview layer with the capture session

// Add outputs to the session

let movieFileOutput = AVCaptureMovieFileOutput()
guard captureSession.canAddOutput(movieFileOutput) else {
    print("Can't add the movie file output to the session")
    return
}
captureSession.addOutput(movieFileOutput)
        

// Configure video stabilization

if let connection = movieFileOutput.connection(with: .video), 
    connection.isVideoStabilizationSupported {
    connection.preferredVideoStabilizationMode = .cinematicExtendedEnhanced
}

// Add a preview layer as the view finder

let previewLayer = AVCaptureVideoPreviewLayer()
previewLayer.session = captureSession

7:11 - Display the preview layer with SwiftUI

// Display the preview layer with SwiftUI

struct CameraPreviewView: UIViewRepresentable {

    func makeUIView(context: Context) -> CameraPreviewUIView {
        return CameraPreviewUIView()
    }

    class CameraPreviewUIView: UIView {

        override class var layerClass: AnyClass {
            AVCaptureVideoPreviewLayer.self
        }

        var previewLayer: AVCaptureVideoPreviewLayer {
            layer as! AVCaptureVideoPreviewLayer
        }

        ...
    }

    ...
}

7:54 - Display the preview layer with SwiftUI

// Display the preview layer with SwiftUI

@MainActor
struct CameraView: View {       

    var body: some View {
        ZStack {
            CameraPreviewView()  
          	CameraControlsView()
        }
    }
}

8:05 - Adjust bokeh strength with simulated aperture

// Adjust bokeh strength with simulated aperture


open class AVCaptureDeviceInput : AVCaptureInput {

	open var simulatedAperture: Float

	...

}

8:40 - Find min, max, and default simulated aperture

// Adjust bokeh strength with simulated aperture


extension AVCaptureDeviceFormat {

	open var minSimulatedAperture: Float { get }

	open var maxSimulatedAperture: Float { get }

	open var defaultSimulatedAperture: Float { get }

	...

}

9:12 - Add a metadata output

// Add a metadata output

let metadataOutput = AVCaptureMetadataOutput()

guard captureSession.canAddOutput(metadataOutput) else {
    print("Can't add the metadata output to the session")
    return
}
captureSession.addOutput(metadataOutput)

metadataOutput.metadataObjectTypes = metadataOutput.requiredMetadataObjectTypesForCinematicVideoCapture

metadataOutput.setMetadataObjectsDelegate(self, queue: sessionQueue)

9:50 - Update the observed manager object

// Update the observed manager object

func metadataOutput(_ output: AVCaptureMetadataOutput, didOutput metadataObjects: [AVMetadataObject], from connection: AVCaptureConnection) {

   self.metadataManager.metadataObjects = metadataObjects

}

// Pass metadata to SwiftUI

@Observable
class CinematicMetadataManager {
    
    var metadataObjects: [AVMetadataObject] = []
    
}

10:12 - Observe changes and update the view

// Observe changes and update the view

struct FocusOverlayView: View {

    var body: some View {

        ForEach(
            metadataManager.metadataObjects, id: \.objectID
        ) { metadataObject in

            rectangle(for: metadataObject)

        }
    }
}

10:18 - Make a rectangle for a metadata

// Make a rectangle for a metadata

private func rectangle(for metadata: AVMetadataObject) -> some View {
    
    let transformedRect = previewLayer.layerRectConverted(fromMetadataOutputRect: metadata.bounds)
    
    return Rectangle()
        .frame(width:transformedRect.width,
               height:transformedRect.height)
        .position(
            x:transformedRect.midX,
            y:transformedRect.midY)
}

10:53 - Focus methods

open func setCinematicVideoTrackingFocus(detectedObjectID: Int, focusMode: AVCaptureDevice.CinematicVideoFocusMode)

open func setCinematicVideoTrackingFocus(at point: CGPoint, focusMode: AVCaptureDevice.CinematicVideoFocusMode)

open func setCinematicVideoFixedFocus(at point: CGPoint, focusMode: AVCaptureDevice.CinematicVideoFocusMode)

10:59 - Focus method 1 & CinematicVideoFocusMode

// Focus methods

open func setCinematicVideoTrackingFocus(detectedObjectID: Int, focusMode: AVCaptureDevice.CinematicVideoFocusMode)


public enum CinematicVideoFocusMode : Int, @unchecked Sendable {

    case none = 0

    case strong = 1

    case weak = 2
}

extension AVMetadataObject {

   open var cinematicVideoFocusMode: Int32 { get }

}

12:19 - Focus method no.2

// Focus method no.2

open func setCinematicVideoTrackingFocus(at point: CGPoint, focusMode: AVCaptureDevice.CinematicVideoFocusMode)

12:41 - Focus method no.3

// Focus method no.3

open func setCinematicVideoFixedFocus(at point: CGPoint, focusMode: AVCaptureDevice.CinematicVideoFocusMode)

13:54 - Create the spatial tap gesture

var body: some View {

let spatialTapGesture = SpatialTapGesture()
    .onEnded { event in
        Task {
            await camera.focusTap(at: event.location)
        }
     }

...
}

14:15 - Simulate a long press gesture with a drag gesture

@State private var pressLocation: CGPoint = .zero
@State private var isPressing = false
private let longPressDuration: TimeInterval = 0.3

var body: some View {
  
  ...
  
	let longPressGesture = DragGesture(minimumDistance: 0).onChanged { value in
		if !isPressing {
			isPressing = true
			pressLocation = value.location
			startLongPressTimer()
		}
	}.onEnded { _ in
		isPressing = false
	}
  
	...
  
}

private func startLongPressTimer() {
	DispatchQueue.main.asyncAfter(deadline: .now() + longPressDuration) {
		if isPressing {
			Task {
				await camera.focusLongPress(at: pressLocation)
			}
		}
	}
}

14:36 - Create a rectangle view to receive gestures.

var body: some View {

    let spatialTapGesture = ...
    let longPressGesture = ...

    ZStack {
        ForEach(
            metadataManager.metadataObjects,
            id: \.objectID
        ) { metadataObject in

            rectangle(for: metadataObject)

        }
        Rectangle()
            .fill(Color.clear)
            .contentShape(Rectangle())
            .gesture(spatialTapGesture)
            .gesture(longPressGesture)
    }
}

15:03 - Create the rectangle view

private func rectangle(for metadata: AVMetadataObject) -> some View {
    
    let transformedRect = previewLayer.layerRectConverted(fromMetadataOutputRect: metadata.bounds)
    var color: Color
    var strokeStyle: StrokeStyle
    
    switch metadata.cinematicVideoFocusMode {
    case .weak:
        color = .yellow
        strokeStyle = StrokeStyle(lineWidth: 2, dash: [5,4])
    case .strong:
        color = .yellow
        strokeStyle = StrokeStyle(lineWidth: 2)
    case .none:
        color = .white
        strokeStyle = StrokeStyle(lineWidth: 2)
    }
    
    return Rectangle()
        .stroke(color, style: strokeStyle)
        .contentShape(Rectangle())
        .frame(width: transformedRect.width, height: transformedRect.height)
        .position(x: transformedRect.midX, 
                  y: transformedRect.midY)
}

15:30 - Implement focusTap

func focusTap(at point:CGPoint) {
    
   try! camera.lockForConfiguration()
        
    if let metadataObject = findTappedMetadataObject(at: point) {
        if metadataObject.cinematicVideoFocusMode == .weak {
            camera.setCinematicVideoTrackingFocus(detectedObjectID: metadataObject.objectID, focusMode: .strong)
            
        }
        else {
            camera.setCinematicVideoTrackingFocus(detectedObjectID: metadataObject.objectID, focusMode: .weak)
        }
    }
    else {
        let transformedPoint = previewLayer.metadataOutputRectConverted(fromLayerRect: CGRect(origin:point, size:.zero)).origin
        camera.setCinematicVideoTrackingFocus(at: transformedPoint, focusMode: .weak)
    }
    
    camera.unlockForConfiguration()
}

15:42 - Implement findTappedMetadataObject

private func findTappedMetadataObject(at point: CGPoint) -> AVMetadataObject? {
    
    var metadataObjectToReturn: AVMetadataObject?
    
    for metadataObject in metadataObjectsArray {
        let layerRect = previewLayer.layerRectConverted(fromMetadataOutputRect: metadataObject.bounds)
        if layerRect.contains(point) {
            metadataObjectToReturn = metadataObject
            break
        }
    }
    
    return metadataObjectToReturn
}

16:01 - focusTap implementation continued

func focusTap(at point:CGPoint) {
    
   try! camera.lockForConfiguration()
        
    if let metadataObject = findTappedMetadataObject(at: point) {
        if metadataObject.cinematicVideoFocusMode == .weak {
            camera.setCinematicVideoTrackingFocus(detectedObjectID: metadataObject.objectID, focusMode: .strong)
            
        }
        else {
            camera.setCinematicVideoTrackingFocus(detectedObjectID: metadataObject.objectID, focusMode: .weak)
        }
    }
    else {
        let transformedPoint = previewLayer.metadataOutputRectConverted(fromLayerRect: CGRect(origin:point, size:.zero)).origin
        camera.setCinematicVideoTrackingFocus(at: transformedPoint, focusMode: .weak)
    }
    
    camera.unlockForConfiguration()
}

16:23 - Implement focusLongPress

func focusLongPress(at point: CGPoint) {
    
    try! camera.lockForConfiguration()

    let transformedPoint = previewLayer.metadataOutputRectConverted(fromLayerRect: CGRect(origin: point, size: .zero)).origin
    camera.setCinematicVideoFixedFocus(at: transformedPoint, focusMode: .strong)
    
    camera.unlockForConfiguration()
}

17:10 - Introduce cinematicVideoCaptureSceneMonitoringStatuses

extension AVCaptureDevice {

   open var cinematicVideoCaptureSceneMonitoringStatuses: Set<AVCaptureSceneMonitoringStatus> { get }

}

extension AVCaptureSceneMonitoringStatus {

   public static let notEnoughLight: AVCaptureSceneMonitoringStatus

}

17:42 - KVO handler for cinematicVideoCaptureSceneMonitoringStatuses

private var observation: NSKeyValueObservation?

observation = camera.observe(\.cinematicVideoCaptureSceneMonitoringStatuses, options: [.new, .old]) { _, value in
    
    if let newStatuses = value.newValue {
        if newStatuses.contains(.notEnoughLight) {
            // Update UI (e.g., "Not enough light")
        }
        else if newStatuses.count == 0 {
            // Back to normal.
        }
    }
}

Summary

  • 0:00 - Introduction

  • Use the Cinematic Video API to capture pro-level cinema-style videos in your apps. iPhone 13 and 13 Pro introduced Cinematic mode, which transformed iPhone into a cinematography powerhouse.

  • 0:33 - Cinematic video

  • Cinematic video uses shallow depth of field and tracking focus to guide viewer attention, mimicking film techniques. The Cinematic Video API in iOS 26 simplifies this process, enabling apps to automatically rack and track focus. To build a Cinematic capture experience, set up a capture session, select a device, and then enable Cinematic video capture by setting 'isCinematicVideoCaptureEnabled' to true on the 'AVCaptureDeviceInput' class. This configures the session to output Cinematic video with disparity data, metadata, and the original video, allowing for non-destructive editing. You can play back or edit the bokeh rendering with the Cinematic Framework.

  • 3:44 - Build a great cinematic capture experience

  • The example begins by setting up an 'AVCaptureSession' to enable Cinematic video capture on compatible devices, such as the Dual Wide camera on the back and the TrueDepth camera on the front. The example selects an appropriate video format, with 'isCinematicVideoCaptureSupported' returning as true, and then adds audio input from the microphone to the session. Spatial audio capture is enabled to enhance the Cinematic experience. To learn more about Spatial Audio, see "Enhance your app's audio content creation capabilities". Next, video stabilization is enabled, to enhance the user experience, and the capture session is previewed using a SwiftUI view. Then, the example creates a custom representable struct to wrap the 'AVCaptureVideoPreviewLayer', allowing it to be integrated seamlessly into the SwiftUI interface. The example then delves into controlling the Cinematic video effect, specifically the shallow depth of field. By adjusting the 'simulatedAperture', the bokeh effect can be strengthened or weakened, providing more creative control over the video. To enable manual focus control, the example implements metadata detection to identify focus candidates, such as faces. Then, it draws rectangles on the screen to represent these candidates, allowing users to tap and focus on specific subjects. The Cinematic Video API provides several methods to control focus during video recording. The API outputs metadata objects that include information about detected subjects in the frame. The 'focusMode' configuration parameter determines the tracking behavior of Cinematic video. There are three cases for this enum: 'none', 'strong', and 'weak'. Strong focus locks onto a subject, ignoring other potential focus candidates. Weak focus allows the algorithm to automatically rack focus based on the scene. The none case is primarily used to determine focus status rather than set it. The API offers three focus methods: 'setCinematicVideoTrackingFocus' method takes a detected object ID as input and sets the focus to that object. 'setCinematicVideoTrackingFocus' method takes a point in the view as input. Cinematic video then searches for an interesting object at that point and creates a new metadata object of type 'salient object', which can then be focused on. 'setCinematicVideoFixedFocus' sets a fixed focus at a specific point in the scene, computing the distance internally using depth signals. When paired with a strong focus mode, this locks the focus on a particular plane, ignoring other activities in the scene. You can implement custom focus logic in your apps. For example, tapping on a detection rectangle can switch focus between subjects, and a long press can set a strong fixed focus. The app visually differentiates between weak, strong, and no focus to guide the user. Additionally, the API allows for still image capture during recording, which will automatically receive the Cinematic treatment with the bokeh effect. The app can also use key-value observation to observe the 'cinematicVideoCaptureSceneMonitoringStatuses' property, and inform the user when there is insufficient light for proper Cinematic video capture.

Code-along: Bring on-device AI to your app using the Foundation Models framework

Develop generative AI features for your SwiftUI apps using the Foundation Models framework. Get started by applying the basics of the framework to create an awesome feature. Watch step-by-step examples of how to complement the models with tools you build, stream results, and apply further optimizations for great performance.

Chapters

Resources

Related Videos

WWDC25

Transcript

Hi, I’m Naomy! Let’s dive into the world of SwiftUI and on-device intelligence. In this code-along, we’ll explore how to add exciting new features to your apps using the FoundationModels framework. I’ll take you through a step-by-step example as I create an app that’ll plan my next trip. The FoundationModels framework gives you direct access to Apple’s on-device large language model, and the possibilities of what you can create are truly endless! All of this runs on-device, so your users’ data can stay private. The model is offline, and it’s already embedded into the operating system. It won’t increase the size of your apps. With this framework, we can build features that are powerful, private and performant, on macOS, iPadOS, iOS, and visionOS. My friends and I want to go on a trip - but we need some inspiration of where to go and what to do. Planning can be hard, but using the FoundationModels framework is easy. Let’s use it to create an app that will do all the planning work for us! Here’s what we’ll build today. In my app, the landing page displays several featured landmarks, that we can browse through. Oh, those trees look interesting! Let’s select Joshua Tree. We can tap the Generate button, and that’s where the model will do the work for us! It will create an itinerary, and in the process, it will also use tool calling, to autonomously choose the best points of interest for our landmark. Ok, I think my friends will like this app, so let’s start building it! There are a few steps to using the framework that we’ll go over today. It all starts with prompt engineering. I’ll show you how to perfect your prompts using Xcode’s playground! We’ll use tool calling to complement the model, by letting it reach out to external sources to fetch points of interest for our landmarks. And we’ll stream output to start showing our itinerary as the model is generating it! To polish things off, we’ll profile our app and apply optimizations to get great performance with the FoundationModels framework. A great prompt is the key to getting great results. Prompting can often be a tedious task, requiring lots of iteration. I could run my app, examine the output, change my prompt, re-run and repeat this all over again, but then I’d be here all day, and I won’t actually have time for my trip! Luckily, Xcode’s updated Playground feature is here to help! To get started, ensure that the canvas is enabled.

This is like Previews for SwiftUI, but it will give us live feedback for any Swift code. Now, in any file in my project, I can import Playgrounds, and write a #Playground to iterate on my code.

Let’s start with the basics. I’ll import FoundationModels, create a session, and make a request.

In my prompt I’ll say: “Create an itinerary.” And just as I typed that, this automatically executed, and we can see the results in the canvas. I was a bit vague, wasn’t I? Let me start by adding a location.
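
To make those first steps concrete, here is a minimal sketch of what that playground might look like. It assumes the Foundation Models names used later in the session (LanguageModelSession and its respond(to:) method); the prompt text is only an illustration.

import FoundationModels
import Playgrounds

#Playground {
    // A bare-bones session with no instructions yet.
    let session = LanguageModelSession()

    // The canvas re-runs this automatically as the prompt is edited.
    let response = try await session.respond(
        to: "Create an itinerary for a weekend in Joshua Tree."
    )

    // response.content is the plain String produced by the model.
    print(response.content)
}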

Our prompt is much better. We got an itinerary this time. I’m satisfied for now, and I can always come back and tweak my prompt later on. We just discovered how easy it is to get started with the FoundationModels framework, by giving a prompt and receiving a string as output. But for my itinerary, I really want more structured output. I want to represent the result using my own data structures, but I don’t want to worry about parsing the model’s output. Guided generation enables the model to automatically create the data structures I want, as long as I annotate them as Generable. Let’s put this into practice. I have an Itinerary structure here that I’d like the model to generate. Let’s start by importing FoundationModels, and add the Generable annotation! The only requirement of Generable is that the types of your properties are also Generable. Luckily, common types like String are Generable out of the box. If you have nested types, like DayPlan here, then you can just make those Generable as well. Now that we have a fully Generable type hierarchy, we can proceed and then let the model generate Itinerary structures as our response. If you want greater control over the output, the Guide macro lets you put constraints on the values of your properties. You can add a guide with a description for example, like I'll do for the title here.


We can also constrain a property to a known subset of values.

I can add a count, to make sure my days array always has exactly three elements, and even use multiple guides for a property. I’ll go ahead and add a few more guides to my properties.
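
As a rough sketch of the kind of type this walkthrough describes, the Generable and Guide annotations might look like the following. The property names and guide values here are assumptions based on the narration, not the sample project's actual code.

import FoundationModels

@Generable
struct Itinerary {
    @Guide(description: "An exciting name for the trip.")
    var title: String

    var description: String

    @Guide(description: "Why this itinerary is a good fit.")
    var rationale: String

    @Guide(.count(3))   // always generate exactly three days
    var days: [DayPlan]
}

@Generable
struct DayPlan {
    var title: String
    var activities: [String]
}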

Guide descriptions are really another way of prompting the model. I highly recommend watching the “Deep Dive” video, where Louis will tell you all about Guided generation. And now that we have our prompt and our Generable type, we can put it all together.

Here’s our Itinerary Planner, where we’ll store all our Foundation Models logic. Let’s go ahead and create a session and give it instructions. I can use the builder API here to create my instructions with ease. In this closure, I can pass in multiple strings, and even instances of my Generable type. Instructions are a higher-level form of prompting. Here, I’ll define what the model’s job is.

We would like an itinerary, and I’ll give the model some information about the landmark selected.

Including an example is great, because it gives the model a better idea of the type of response I’m looking for.

And I can pass in an instance of my Itinerary struct, which is defined below using a trip to Japan.

Because my Itinerary struct is Generable, Foundation Models will automatically convert it to text that the model can understand. Now, we’re ready to make our request, with our Generable type as output.

In our prompt, we’ll explicitly ask the model for an itinerary.

And to tie everything together - let’s set our itinerary to the response from the model.
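
Pulling those pieces together, an ItineraryPlanner along these lines could own the session and make the request. The Landmark type, the example instance passed into the instructions, and the prompt wording are placeholders for whatever the sample project defines; the session setup and respond(to:generating:) call follow the Foundation Models API described in this session.

import FoundationModels
import Observation

@Observable
@MainActor
final class ItineraryPlanner {
    var itinerary: Itinerary?
    private let session: LanguageModelSession
    private let landmark: Landmark   // hypothetical model type from the sample app

    init(landmark: Landmark) {
        self.landmark = landmark
        self.session = LanguageModelSession {
            "Your job is to create an itinerary for the user."
            "Here is the landmark they selected: \(landmark.name)"
            // A Generable instance can be included as a one-shot example;
            // `exampleTripToJapan` is a hypothetical static property.
            Itinerary.exampleTripToJapan
        }
    }

    func suggestItinerary() async throws {
        let response = try await session.respond(
            to: "Generate a 3-day itinerary to \(landmark.name).",
            generating: Itinerary.self
        )
        itinerary = response.content
    }
}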

Our final step is to display this in our UI. Let’s make our ItineraryPlanner Observable, so that our UI can know when our itinerary is generated.

And let's go ahead and add it as a state property to our LandmarkTripView, so that our view updates as the contents of the planner change.

If we initialize it here, it would be unnecessarily recreated even when the view doesn’t appear on screen, which has an undesirable performance cost. It’s better to defer the creation of the object using a task modifier. So let’s add a task and initialize our planner here.

This will only be called once when the view appears. When we receive an itinerary from the model, we can then display it.
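
On the SwiftUI side, the deferred creation described above might look roughly like this. LandmarkTripView, ItineraryView, and the Landmark type mirror the names mentioned in the walkthrough, but their exact shapes are assumptions.

import SwiftUI

struct LandmarkTripView: View {
    let landmark: Landmark
    @State private var planner: ItineraryPlanner?

    var body: some View {
        ScrollView {
            if let itinerary = planner?.itinerary {
                ItineraryView(itinerary: itinerary)   // hypothetical view that renders the result
            }
            Button("Generate Itinerary") {
                Task { try? await planner?.suggestItinerary() }
            }
        }
        .task {
            // Created only once, when the view first appears.
            planner = ItineraryPlanner(landmark: landmark)
        }
    }
}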

I’m going to use another view here, called ItineraryView. In it, I’ll display my title, and then I’ll add some styling.

I’ll do the same for my description, and rationale.

I’ll go ahead and display my remaining itinerary properties in a similar manner using my other views. That’s a pretty good start, and the model will use the description we provided in the instructions to generate a basic itinerary. Let’s take it a step further. The framework offers a flexible tool protocol that enables the model to include external information in its responses. You can get super creative - from using people in your phone’s contacts, to events in your calendar, or even online content. The model autonomously decides when it makes sense to invoke your tools, and how often. To specialize my planner app ever further, I’ll create a tool that calls out to MapKit to fetch the best points of interest for a landmark. To create a tool, you’ll need to conform to the tool protocol. This includes a unique name to identify your tool, a natural language description of when to call the tool, and a call function - which is how the model invokes your tool, with arguments that you define yourself. Let’s start composing our tool. We’ll import FoundationModels, and MapKit.

I have a data structure that conforms to the tool protocol, with a name, and description.

The framework puts these strings in the instructions automatically, so that the model can understand what your tool does, and decide when to call it. A tool can also take input from the user, like the landmark that they selected.

We want our tool to fetch different kinds of points of interest, so let’s add an enum.

The model will use its world knowledge to decide which categories are most promising for a certain landmark. For example, it would be more likely to find a marina in the Great Barrier Reef than somewhere dry, like the desert at Joshua Tree. The model will generate this enum, so it needs to be Generable.

We’ll then define our Arguments struct, that uses our enum, together with a natural language query.

For the implementation, there’s the call method, which is how the model will invoke your tool when it decides to do so. I previously wrote some MapKit logic that makes a request to MapKit, using the model-generated natural language query as input, alongside the category selected by the model, and fetches points of interest within a 20 km range of my landmark coordinates.

We’ll make a search with our requested constraints and return the results.

We can then implement the call method. Let’s reach out to MapKit, filter our results, and return our output.

And that’s really all it takes to define a tool that fetches information from MapKit. To wire it up, let’s go back to our ItineraryPlanner.

And here’s the session that we created before. We’ll create an instance of our tool with the landmark that the user picked as input.

Then we can pass our tool into the session initializer.

This is enough to make the model call our tool, but we can even do additional prompting if we’d like it to be invoked more often. We can explicitly ask the model to use our tool and categories.
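
Putting the tool together, a sketch of the MapKit-backed tool and its wiring might look like the following. The Tool conformance follows the protocol as presented here (a name, a description, a Generable Arguments type, and a call method); the ToolOutput return type and the Landmark type are assumptions and may differ in the shipping SDK or sample project.

import FoundationModels
import MapKit

struct FindPointsOfInterestTool: Tool {
    let name = "findPointsOfInterest"
    let description = "Finds points of interest near the landmark the user selected."

    let landmark: Landmark   // hypothetical model type with a name and coordinate

    @Generable
    enum Category {
        case campground, hotel, restaurant, museum, marina, beach
    }

    @Generable
    struct Arguments {
        @Guide(description: "The kind of place to look up.")
        let pointOfInterest: Category

        @Guide(description: "A natural language query to search with.")
        let naturalLanguageQuery: String
    }

    func call(arguments: Arguments) async throws -> ToolOutput {
        // Search within roughly 20 km of the landmark, as in the walkthrough.
        let request = MKLocalSearch.Request()
        request.naturalLanguageQuery = arguments.naturalLanguageQuery
        request.region = MKCoordinateRegion(
            center: landmark.coordinate,
            latitudinalMeters: 20_000,
            longitudinalMeters: 20_000
        )

        let response = try await MKLocalSearch(request: request).start()
        let names = response.mapItems.prefix(5).compactMap(\.name)

        return ToolOutput("Nearby \(arguments.pointOfInterest) results: \(names.joined(separator: ", "))")
    }
}

// Wiring it up in the planner: pass the tool when creating the session.
// session = LanguageModelSession(tools: [FindPointsOfInterestTool(landmark: landmark)]) {
//     "Use the findPointsOfInterest tool to look up good places near \(landmark.name)."
// }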

And there you go! We’re ready to test this out! In fact, if you don’t have a testing device at hand, that’s alright. If your development machine is running the latest macOS, and has Apple Intelligence enabled and ready, then we can conveniently run this in the iPhone and Vision Pro simulators. Let’s select Joshua Tree, and ask for an itinerary.

Now, you may notice this is taking some time. This is because the model is returning us all the output at once - so we wait for each activity to be generated before receiving any results. Don’t worry - later on I’ll show you how to speed this up! And there we go, we received a fun itinerary! However, we actually forgot something pretty important. We just assumed the on-device Foundation Model is always available, but this isn’t always the case. The model’s availability depends on Apple Intelligence’s availability, and Apple Intelligence may not be supported, enabled, or ready on a given device! So, it’s important to check the status of the model, and handle that accordingly in our UI! Now, instead of having to test this with physical devices, or, the unthinkable - disabling Apple Intelligence just for testing purposes, we can use a nice scheme option in Xcode.

In our scheme, we can see the Foundation Models Availability override. Currently, it’s turned off. But any of these first three options are reasons why the models wouldn’t be on the device. So let’s go ahead and choose one, try to generate an itinerary, and see what happens in our app.

Oh, wow, that’s not good. We’re just showing an error here, and it’s not really actionable either. I need to go back and consider how I’ll integrate availability in my app. Let’s consider the three cases I showed earlier. If the device is not eligible to get Apple Intelligence, it doesn’t make sense to show the generate itinerary button. When selecting a landmark, let’s just let the user see a short description of it, using the offline data we have in our app. In the second case, the device is capable, but simply hasn’t been opted into Apple Intelligence. We should let the user know this is why our itinerary planner is unavailable. They can decide if they’d like to opt in and access this feature. Finally, Model Not Ready just means that more time is needed for the model to finish downloading. We can simply tell the user to try again later, as our feature will soon be available. Now that we’ve designed our app behavior, we can take advantage of the availability API to determine what availability state our device is in! Here in my view, I’ll add a new variable for the model I’m using - in this case, the system language model. We can then switch on the availability state: if the model is available, we can just continue with the same behavior as before.

When Apple Intelligence is not enabled, let’s let the user know.

And if the model is not ready, we’ll tell them to try again later.

Otherwise - we’ll hide our itinerary button and simply display our fun fact.
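
A sketch of that availability handling, assuming SystemLanguageModel.default and the availability states described above; the surrounding view, the GenerateItineraryButton, and the funFact property are hypothetical placeholders.

import FoundationModels
import SwiftUI

struct TripPlanningSection: View {
    private let model = SystemLanguageModel.default
    let landmark: Landmark

    var body: some View {
        switch model.availability {
        case .available:
            GenerateItineraryButton(landmark: landmark)   // hypothetical button view
        case .unavailable(.appleIntelligenceNotEnabled):
            Text("Turn on Apple Intelligence to generate itineraries.")
        case .unavailable(.modelNotReady):
            Text("The model is still downloading. Please try again later.")
        default:
            // Device not eligible: fall back to the offline description.
            Text(landmark.funFact)
        }
    }
}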

I already have the Foundation Models override in my scheme set to Device Not Eligible. Let’s try this case again.

Ok - much better. Now, we just see a fun fact and the Generate Itinerary button has been removed to prevent the user from going down a path their device can’t support. If we look back now, our app waits for the entire itinerary to be generated, before showing it in the UI. Luckily, if we stream the itinerary as the model produces it, we can start reading recommendations right away! To use streaming, we’ll change the respond method that we call.

Instead of getting a full itinerary, we get a partial version. We can use the PartiallyGenerated data structure that’s automatically created for us. This is your Generable structure, but with all the properties made optional. So, I’ll go ahead and change the expected type of my itinerary.

Our result will now be a new async sequence with our PartiallyGenerated type as output.

Each element in our stream is an incrementally updated version of our itinerary. For example, the first element might have a title, but the other itinerary properties would be nil. And then the second element could have a title and description, and so on, until we received a fully generated itinerary. Now, I need to unwrap these properties in my views. Here it’s also good to think about which of your properties are ok to show without the others. In my case, itinerary has a title, then description, and then day plans, and this order makes sense. So, I’ll make my itinerary partially generated, and I'll unwrap my title, description, and rationale.
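
Continuing the hypothetical ItineraryPlanner from earlier, the streaming change might look like this compact version (instructions and tools omitted for brevity). It assumes streamResponse(to:generating:) yields Itinerary.PartiallyGenerated snapshots directly, as described here; newer SDK releases may wrap the element type differently.

import FoundationModels
import Observation

@Observable
@MainActor
final class ItineraryPlanner {
    // The stored property becomes the partially generated form so the UI
    // can render fields as they arrive.
    var itinerary: Itinerary.PartiallyGenerated?
    private let session: LanguageModelSession
    private let landmark: Landmark

    init(landmark: Landmark) {
        self.landmark = landmark
        self.session = LanguageModelSession()
    }

    func streamItinerary() async throws {
        let stream = session.streamResponse(
            to: "Generate a 3-day itinerary to \(landmark.name).",
            generating: Itinerary.self
        )
        for try await partial in stream {
            // Every property on `partial` is optional and fills in incrementally.
            itinerary = partial
        }
    }
}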

Now for my list of days: I also have to display my partially generated day plans. It’s great that PartiallyGenerated structures are automatically identifiable, so I don’t have to manage IDs myself. I can simply use SwiftUI’s forEach with my partially generated structures.

It’s really that easy! Let’s add an animation based on our itinerary.

And some content transitions to our properties, so that our results stream smoothly into our views. I’ll go ahead and unwrap all the other properties, and now - we’re very close to the final product! Let’s test this out on my phone! Let’s make the same request as before. And this time, we’ll immediately start to see the output streaming in our UI! As a user, I can go ahead and read the first day, as the content is being generated.

However, you may have noticed a bit of a delay before that first field appeared on screen. To fix this, it would really help to understand what’s going on behind the scenes. This is a great time to use the new Foundation Models Instrument, to get a deeper understanding of the factors influencing our performance. Let’s see what we can find by profiling our app! Earlier, we talked about running our app on the simulator. This is great for testing functionality - but it may not produce accurate performance results. For example, a simulator on an M4 Mac may yield faster results than an older iPhone. When looking at performance, it’s important to keep these differences in mind. I’ll be profiling on a physical iPhone. To get started, I have the Instruments app open on my Mac, and I’ll connect my phone.

We can go ahead and add the new Foundation Models instrument, and start recording, and then create an itinerary.

The Asset Loading track looks at the time taken to load the models. The default system language model, and the safety guardrail were loaded. The Inference track is also present in blue. Finally, the purple bar is the portion of time we spent tool calling. We can track the total time it took to generate the itinerary, as well as the input token count, which is proportionate to our instruction and prompt sizes. And this delay at the beginning was the portion of time it took to load in the system language model. There are a few ways we can make this faster. We just observed that part of that initial latency was captured in the Asset Loading track. The on-device language model is managed by the operating system, and it may not be held in memory if the system is serving other critical features, or if it’s been unused for some time. When I call session.respond, the operating system will load the model if it’s not already in memory. Prewarming can give your session a head start, by loading the model before you even make your request. It's best to do this when your app is relatively idle, and just after the user gives a strong hint that they'll use the session. A good example of when to prewarm is just after a user starts typing in a text field that will result in a prompt. In our app, when the user taps on a landmark, it’s pretty likely that they’ll make a request soon. We can prewarm before they press the Generate Itinerary button to proactively load the model. By the time they finish reading the description, our model will be ready to go! The second optimization can be added at request time. Remember the generating argument of the response functions? We used Itinerary here. Well, the framework automatically inserts the generation schemas of your data structures into your prompts. But this adds more tokens, increasing latency and context size. If the model already has a complete understanding of the response format before the request is made, then we can set IncludeSchemaInPrompt to false, and gain some performance improvements. When can we apply this optimization? The first case is if you’re making subsequent, same-type requests on a multi-turn conversation. The first request on the session already gave context for guided generation by including the schema in the prompt. So, we don’t need to do so for subsequent requests on that session. The second case is if your instructions include a full example of the schema. Remember how we passed in an example itinerary in our Instructions? In our case, this is sufficient - because our Itinerary structure has no optional properties. If you do have optional properties in your schema, then you’ll need to provide examples with all of the optional properties both populated, and nil. A final consideration - setting IncludeSchemaInPrompt to false means that we’ll lose the descriptions we added to our guides, although, if you’re using a thorough example, this shouldn’t be a problem. Let’s test these optimizations out! We’ll go ahead and set the IncludeSchemaInPrompt option to false in our request. We’ll also prewarm our session while the user is on the landmark description page. Let’s make a quick wrapper, and then invoke it on our session.
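
As a sketch of those two optimizations on the same hypothetical planner: prewarm() loads the model ahead of the request, and includeSchemaInPrompt: false skips re-sending the schema because the instructions already contain a complete example. The lowercase parameter spelling follows Swift convention; double-check it against the current SDK. This assumes the extension lives in the same file as the planner so it can reach its private session.

import FoundationModels

extension ItineraryPlanner {
    // Call when the user lands on the landmark page, before they tap Generate.
    func prewarmModel() {
        session.prewarm()
    }

    func streamItineraryOmittingSchema() async throws {
        let stream = session.streamResponse(
            to: "Generate a 3-day itinerary to \(landmark.name).",
            generating: Itinerary.self,
            includeSchemaInPrompt: false   // the instructions already show a full Itinerary example
        )
        for try await partial in stream {
            itinerary = partial
        }
    }
}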

Now for the results! I went ahead and recorded this again and we can take a look at the results! The asset loading track already had some activity, before I tapped the Generate button! We can see a substantial reduction in input token count and the total response time is shorter now! Given the seconds we saved with these optimizations, we can rest assured we'll make our flight on time. And with that, I’m ready for my trip! Oh, but before I leave, here are some other sessions that may interest you. If you haven’t seen it yet, make sure to watch the "Meet" session to learn all about the framework. To go deep, there’s the "Deep dive" video and for more prompting best practices, check out "Prompt Design & Safety". Thanks for watching!

Summary

  • 0:00 - Introduction

  • Learn how to use Apple's FoundationModels framework to build an app that utilizes on-device intelligence to plan trips. The framework helps you create powerful, private, and performant features across macOS, iPadOS, iOS, and visionOS. The app generates itineraries for selected landmarks, autonomously choosing points of interest using tool calling. The process involves prompt engineering, utilizing Xcode's playground, streaming output, and profiling the app for optimal performance.

  • 2:30 - Prompt engineering

  • Xcode's updated Playground feature streamlines the code-iteration process for Swift developers. It provides live feedback, similar to SwiftUI Previews, so you can write and test code in real-time. Using the FoundationModels framework, you can interact with a model through prompts. The Playground automatically executes the code as prompts are typed, enabling quick feedback on the output. To enhance the output structure, the Guided generation feature allows you to annotate data structures as 'Generable', enabling the model to automatically create and populate these structures. Further refine the model's output using the 'Guide' macro, which provides constraints and descriptions for properties. This allows for more control over the generated data, ensuring it meets specific requirements. The framework also offers a flexible tool protocol that enables the model to include external information in its responses. By leveraging these features, you can create an Itinerary Planner app that generates structured itineraries based on user inputs and preferences. The app's UI updates dynamically as the itinerary is generated, providing a seamless user experience.

  • 11:19 - Tool calling

  • The example creates a specialized planner app that utilizes an on-device Foundation Model to enhance its functionality. To achieve this, the example defines custom tools that conform to a specific protocol. These tools have unique names, descriptions, and call functions. One tool fetches points of interest from MapKit based on a landmark selected by the person. The tool can take input from the person and generate an enum of different categories of points of interest, such as restaurants, museums, or marinas, using the model's world knowledge to determine the most promising categories for a specific landmark. You implement the call method for this tool, which interacts with MapKit, using a natural language query generated by the model and the selected category. The tool then filters and returns the relevant points of interest within a specified range. To integrate the tool into the planner app, create an instance of the tool with the user-selected landmark and pass it into the session initializer of the model. The model can then autonomously decide when to invoke the tool and how often. The example also demonstrates how to handle scenarios where the on-device Foundation Model is not available, such as when the device is not eligible for Apple Intelligence, the user has not opted in, or the model is not ready. The example implements appropriate UI updates and error messages to guide the user in these cases. The example also explores the possibility of streaming the itinerary as the model produces it, allowing the person to start reading recommendations immediately rather than waiting for the entire itinerary to be generated.

  • 20:32 - Streaming output

  • The code uses a 'PartiallyGenerated' data structure, a version of the 'Generable' structure with all properties made optional, to handle an incrementally updated itinerary. As new data arrives, the UI is updated with each partial version, showing available properties first (for example, title, then description, then day plans). SwiftUI's 'forEach' displays the partially generated day plans. Animations and content transitions are added for smooth updates. Performance optimization is possible using the Foundation Models Instrument to reduce initial delay.

  • 24:32 - Profiling

  • To optimize the app's performance, the example conducts profiling on a physical iPhone using the Instruments app on a Mac. The Foundation Models instrument is added, and the time taken to load models and generate an itinerary is analyzed. Two main optimizations are applied. The first is prewarming the session: loading the on-device language model before the user makes a request, such as when they tap on a landmark, reduces initial latency. The second is setting 'includeSchemaInPrompt' to 'false': this avoids inserting generation schemas into prompts, decreasing token count and latency, especially for subsequent requests or when instructions include full examples of the schema. After implementing these optimizations, the example app shows a substantial reduction in input token count and total response time, significantly improving its efficiency.

Code-along: Cook up a rich text experience in SwiftUI with AttributedString

Learn how to build a rich text experience with SwiftUI's TextEditor API and AttributedString. Discover how you can enable rich text editing, build custom controls that manipulate the contents of your editor, and customize the formatting options available. Explore advanced capabilities of AttributedString that help you craft the best text editing experiences.

Chapters

Resources

Related Videos

WWDC25

WWDC22

WWDC21

Transcript

Hi, I'm Max, an Engineer on the SwiftUI team, and I’m Jeremy, an engineer on the Swift standard libraries team! We’re both delighted to share how you can build rich text-editing experiences using the power of SwiftUI and AttributedString. Together with Jeremy’s help, I’ll cover all the important aspects of rich text experiences in SwiftUI: first, I’ll discuss upgrading TextEditor to support rich text using AttributedString. Then, I’ll build custom controls, enhancing my editor with unique features. And finally, I’ll create my own text formatting definition so my editor - and its contents - always look great! Now, while I may be an engineer by day, I am also… a home cook on a mission to make the perfect croissant! So recently, I’ve been cooking up a little recipe editor to keep track of all my attempts! It has a list of recipes to the left, a TextEditor for editing the recipe text in the middle, and a list of ingredients in the inspector to the right. I’d love to make the most important parts of my recipe text stand out so I can easily catch them at a glance while cooking. Today I’ll make that possible by upgrading this editor to support rich text! Here is the editor view, implemented using SwiftUI’s TextEditor API. As indicated by the type of my text state, String, it currently only supports plaintext. I can just change String into AttributedString, dramatically increasing the view’s capabilities.

Now that I have support for rich text, I can use the system UI for toggling boldness and applying all kinds of other formatting, such as increasing the font size.

I can now also insert Genmoji from the keyboard, and thanks to the semantic nature of SwiftUI’s Font and Color attributes, the new TextEditor can support Dark Mode and Dynamic Type as well! In fact, TextEditor supports boldness, italics, underline, strikethrough, custom fonts, point size, foreground and background colors, kerning, tracking, baseline offset, Genmoji, and… important aspects of paragraph styling! Line height, text alignment, and base writing direction are available as separate independent AttributedString attributes for SwiftUI as well.

All these attributes are consistent with SwiftUI’s non-editable Text, so you can just take the content of a TextEditor, and display it with Text later! Just like with Text, TextEditor substitutes the default value calculated from the environment for any AttributedStringKeys with a value of nil. Jeremy, I got to be honest here… I've managed my way through working with AttributedStrings so far, but I could really use a refresher to make sure all my knowledge is super solid before I get into building controls. Also, I really gotta make that croissant dough for later, so would you mind sharing a refresher while I do that? Sure thing Max! While you get started on preparing the croissant dough, I’ll take a moment to discuss some AttributedString basics that will come in handy when working with rich text editors. In short, AttributedStrings contain a sequence of characters and a sequence of attribute runs. For example, this AttributedString stores the text, “Hello! Who’s ready to get cooking?” It also has three attribute runs: a starting run with just a font, a run with an additional foreground color applied, and a final run with just the font attribute. The AttributedString type fits right in alongside other Swift types you use throughout your apps. It’s a value type, and just like String from the standard library, it stores its contents using the UTF-8 encoding. AttributedString also provides conformances to many common Swift protocols such as Equatable, Hashable, Codable, and Sendable. The Apple SDKs ship with a variety of pre-defined attributes - like those Max shared earlier - grouped into attribute scopes. AttributedString also supports custom attributes that you define in your app for styling personalized to your UI.

I’ll create my AttributedString from earlier using a few lines of code. First, I create an AttributedString with the start of my text. Next, I create the string “cooking,” set its foreground color to orange, and append it to the text. Then, I complete the sentence with the ending question mark. Lastly I set the entire text’s font to the largeTitle font. Now, I’m ready to display it in my UI on my device! For more on the basics of creating AttributedStrings, as well as creating and using your own custom attributes and attribute scopes, check out the What’s new in Foundation session from WWDC 2021.

Max, it looks like you’re done making the dough! Are you ready to dive into the details of using AttributedString in your recipe app? For sure, Jeremy, that sounds like just what I… kneaded. I’ve really been wanting to offer better controls connecting my text editor to the rest of my app. I want to build a button that allows me to add ingredients to the list in the inspector to the right, but without having to type out the name of the ingredient manually again. For example, I just want to be able to select the word “butter” that is already in my recipe, and mark it as an ingredient with a single press of a button.

My inspector already defines a preference key that I can use in my editor to suggest a new ingredient for the list.

The value I pass to the preference view modifier will bubble up the view hierarchy and can be read via the name “NewIngredientPreferenceKey” by any view that uses my recipe editor.

Let me define a computed property for this value below my view body.

All I need to provide for the suggestion is the name, as an AttributedString. Of course, I don’t just want to suggest the entire text that is in my editor. Instead, I want to suggest the text that is currently selected, like I showed with "butter." TextEditor communicates what is selected via the optional selection argument.

Let me bind that to a local state of type AttributedTextSelection.

Ok, now that I have all the context I need available in my view, let me head back to the property computing the ingredient suggestion.

Now, I need to get the text that is selected.

Let me try subscripting text with the result of this indices function on selection.

Hmm, that doesn’t seem to be the right type.

It returns AttributedTextSelection.Indices. Let me look that up.

Oh, that’s interesting, I just have one selection, but the Indices type’s second case is represented by a set of ranges.

Jeremy, can you explain why that is while I get to folding my croissant dough? Ha, that’s funny. I’m also folding - under the anticipation of these tasty croissants. But no worries Max, I’ll explain why this API doesn’t use the Range type you expected. To explain why this API uses a RangeSet and not a single Range, I’ll dive into AttributedString selections. I’ll discuss how multiple ranges can form selections and demonstrate how to use RangeSets in your code. You’ve likely used a single range in AttributedString APIs or other collection APIs before. A single range allows you to slice a portion of an AttributedString and perform an action over that single chunk. For example, AttributedString provides APIs that allow you to quickly apply an attribute to a portion of text all at once. I’ve used the .range(of:) API to find the range of the text “cooking” in my AttributedString. Next I use the subscript operator to slice the AttributedString with that range to make the entire word “cooking” orange. However, an AttributedString sliced with just one range isn’t enough to represent selections in a text editor that works for all languages. For example, I might use this recipe app to store my recipe for Sufganiyot that I plan to cook during the holidays which includes some Hebrew text. My recipe says to “Put the Sufganiyot in the pan,” which uses English text for the instructions and Hebrew text for the traditional name of the food. In the text editor, I’ll select a portion of the word “Sufganiyot” and the word “in” with just one selection. However, this is actually multiple ranges in the AttributedString! Since English is a left-to-right language, the editor lays out the sentence visually from the left side to the right side. However, the Hebrew portion, Sufganiyot, is laid out in the opposite direction since Hebrew is a right-to-left language. While the bidirectional nature of this text affects the visual layout on the screen, the AttributedString still stores all text in a consistent ordering. This ordering breaks apart my selection into two ranges: the start of the word “Sufganiyot” and the word “in,” excluding the end of the Hebrew text. This is why the SwiftUI text selection type uses multiple ranges rather than a single range.

To learn more about localizing your app for bidirectional text, check out the Get it right (to left) session from WWDC 2022 and this year’s Enhance your app’s multilingual experience session.

To support these types of selections, AttributedString supports slicing with a RangeSet, the type Max noticed earlier in the selection API. Just like you can slice an AttributedString with a singular range, you can also slice it with a RangeSet to produce a discontiguous substring. In this case I’ve created a RangeSet using the .indices(where:) function on the character view to find all uppercase characters in my text. Setting the foreground color to blue on this slice will make all uppercase characters blue, leaving the other characters unmodified. SwiftUI also provides an equivalent subscript that slices an AttributedString with a selection directly. Hey Max, if you finished folding that beautiful croissant dough, I think using the subscript API that accepts a selection might resolve the build error in your code! Let me give that a try! I can just subscript text with the selection directly, and then transform the discontiguous AttributedSubstring into a new AttributedString.

That’s awesome! Now, when I run this on device, and select the word “butter," SwiftUI automatically calls my property newIngredientSuggestion to compute the new value, which bubbles up to the rest of my app. My inspector then automatically adds the suggestion at the bottom of the ingredient list. From there, I can commit it to the ingredient list with a single tap! Features like that can elevate an editor, to a beautiful experience! I’m really happy with this addition, but with everything Jeremy has shown me so far, I think I can go even further! I want to better visualize the ingredients in the text itself! The first thing I need for that is a custom attribute that marks a range of text as an ingredient. Let me define this in a new file.

This attribute’s Value will be the ID of the ingredient it refers to. Now, I’ll head back to the property computing the IngredientSuggestion in the RecipeEditor file.

IngredientSuggestion allows me to provide a closure as a second argument.

This closure gets called when I press the plus button and the ingredient is added to the list. I will use that closure to mutate the editor text, marking up occurrences of the name with my Ingredient attribute. I get the ID of the newly created ingredient passed into the closure.

Next, I need to find all the occurrences of the suggested ingredient name in my text.

I can do this by calling ranges(of:) on AttributedString’s characters view.

Now that I have the ranges, I can just update the value of my ingredient attribute for each range! Here, I’m using a short name for the IngredientAttribute I had already defined.

Let’s give it a try! I don’t expect anything new here, after all, my custom attribute doesn’t have any formatting associated with it. Let me select “yeast” and press the plus button! Wait, what is that?! My cursor was at the top, not at the end! Let me try again! I select "salt," press the plus button, and my selection jumps to the end! Jeremy, I have to roll out my croissant dough, so I can’t debug this right now… do you know why my selection is getting reset? That’s definitely not an experience we want for the cooks using your app! Why don’t you get started on rolling out the dough, and I’ll dive into this unexpected behavior.

In order to demonstrate what’s happening here and how to fix it, I’ll explain the details of AttributedString indices which form the ranges and text selections used by a rich TextEditor.

AttributedString.Index represents a single location within your text. To support its powerful and performant design, AttributedString stores its contents in a tree structure, and its indices store paths through that tree. Since these indices form the building blocks of text selections in SwiftUI, the unexpected selection behavior in the app stems from how AttributedString indices behave within these trees. You should keep two key points in mind when working with AttributedString indices. First, any mutation to an AttributedString invalidates all of its indices, even those not within the bounds of the mutation. Recipes never turn out well when you use expired ingredients, and I can assure you that you’ll feel the same way about using old indices with an AttributedString. Second, you should only use indices with the AttributedString from which they were created.

Now I’ll explore indices from the example AttributedString I created earlier to explain how they work! Like I mentioned, AttributedString stores its contents in a tree structure, and here I have a simplified example of that tree. Using a tree allows for better performance and avoids copying lots of data when mutating your text. AttributedString.Index references text by storing a path through the tree to the referenced location. This stored path allows AttributedString to quickly locate specific text from an index, but it also means that the index contains information about the layout of the entire AttributedString’s tree. When you mutate an AttributedString, it might adjust the layout of the tree. This invalidates any previously recorded paths, even if the destination of that index still exists within the text.

Additionally, even if two AttributedStrings have the same text and attribute content, their trees may have different layouts making their indices incompatible with each other.

Using an index to traverse these trees to find information requires using the index within one of AttributedString’s views. While indices are specific to a particular AttributedString, you can use them in any view from that string. Foundation provides views over the characters, or grapheme clusters, of the text content, the individual Unicode scalars that make up each character, and the attribute runs of the string. To learn more about the differences between the character and Unicode scalar views, check out Apple’s developer documentation for the Swift Character type. You might also want to access lower level contents when interfacing with other string-like types that don’t use Swift’s Character type, such as NSString. AttributedString now also provides views into both the UTF-8 scalars and the UTF-16 scalars of the text. These two views still share the same indices as all of the existing views.

Now that I’ve discussed the details of indices and selections, I’ll revisit the problem that Max encountered with the recipe app. The onApply closure in the IngredientSuggestion mutates the attributed string, but it doesn’t update the indices in the selection! SwiftUI detects that these indices are no longer valid and moves the selection to the end of the text to prevent the app from crashing. To fix this, use AttributedString APIs to update your indices and selections when mutating text. Here, I have a simplified example of code that has the same problem as the recipe app. First, I find the range of the word "cooking" in my text. Then, I set the range of “cooking” to an orange foreground color and I also insert the word “chef” into my string to add some more recipe theming.

Mutating my text can change the layout of my AttributedString’s tree. Using the cookingRange variable after I’ve mutated my string is not valid. It might even crash the app. Instead, AttributedString provides a transform function which takes a Range, or an array of Ranges, and a closure which mutates the provided AttributedString in-place. At the end of the closure, the transform function will update your provided range with new indices to ensure you can correctly use the range in the resulting AttributedString. While the text may have shifted in the AttributedString, the range still points to the same semantic location - in this case, the word “cooking.” SwiftUI also provides an equivalent function that updates a selection instead of a range.

Wow, Max, those croissants are really shaping up great! If you’re ready to get back to your app, I think using this new transform function will help get your code into shape too! Thank you! That sounds like just what I was looking for! Let me see if I can apply this in code. First, I shouldn’t loop over the ranges like that. By the time I reach the last range, the text has been mutated many times, and the indices are outdated. I can avoid that problem entirely by first converting my Ranges to a RangeSet.

Then I can just slice with that and remove the loop.

This way everything is one change, and I don’t need to update the remaining ranges after each mutation.

Second, next to the ranges I want to change, there is also the selection representing my cursor position. I need to always make sure it matches the transformed text. I can do that using SwiftUI’s transform(updating:) overload on AttributedString.

Nice, now my selection is updated right as the text gets mutated! Let’s see if that worked! I can select “milk,” it appears in the list, and - when I add it - my selection remains intact! To double-check, when I press Command+B on the keyboard now, I can see the word “milk” turning bold - just as expected! Now that I have all the information in my recipe text, I want to emphasize the ingredients with some color! Thankfully, TextEditor provides a tool for that: the attributed text formatting definition protocol. A custom text formatting definition is all structured around which AttributedStringKeys your text editor responds to, and what values they might have. I already declared a type conforming to the AttributedTextFormattingDefinition protocol here. By default, the system uses the SwiftUIAttributes scope, with no constraints on the attributes’ values.

In the scope for my recipe editor, I only want to allow foreground color, Genmoji, and my custom ingredient attribute.

Back on my recipe editor, I can use the attributedTextFormattingDefinition modifier to pass my custom definition to SwiftUI’s TextEditor.

With this change, my TextEditor will allow any ingredient, any Genmoji, and any foreground color.

All of the other attributes now will assume their default value. Note that you can still change the default value for the entire editor by modifying the environment. Based on this change, TextEditor has already made some important changes to the system formatting UI. It no longer offers controls for changing the alignment, the line height, or font properties, since the respective AttributedStringKeys are not in my scope. However, I can still use the color control to apply arbitrary colors to my text, even if those colors don’t necessarily make sense.

Oh no, the milk is gone! I really only want ingredients to be highlighted green, and everything else to use the default color. I can use SwiftUI’s AttributedTextValueConstraint protocol to implement this logic.

Let me head back to the RecipeFormattingDefinition file and declare the constraint.

To conform to the AttributedTextValueConstraint protocol, I first specify the scope of the AttributedTextFormattingDefinition it belongs to, and then the AttributedStringKey I want to constrain, in my case the foreground color attribute. The actual logic for constraining the attribute lives in the constrain function. In that function, I set the value of the AttributeKey - the foreground color - to what I consider valid.

In my case, the logic all depends on whether the ingredient attribute is set.

If so, the foreground color should be green; otherwise it should be nil. This indicates TextEditor should substitute the default color.

Now that I have defined the constraint, I just need to add it to the body of the AttributedTextFormattingDefinition.

From here, SwiftUI takes care of all the rest. TextEditor automatically applies the definition and its constraints to any part of the text before it appears on screen.

All the ingredients are green now! Interestingly, TextEditor has disabled its color control, despite foreground color being in my formatting definition’s scope. This makes sense considering the IngredientsAreGreen constraint I added. The foreground color now solely depends on whether text is marked with the ingredient attribute. TextEditor automatically probes AttributedTextValueConstraints to determine if a potential change is valid for the current selection. For example, I could try to set the foreground color of “milk” to white again. Running my IngredientsAreGreen constraint afterwards would change the foreground color back to green, so TextEditor knows this is not a valid change and disables the control. My value constraint will also be applied to text I paste into my editor. When I copy an ingredient using Command+C and paste it again using Command+V, my custom ingredient attribute is preserved. With CodableAttributedStringKeys, this can even work across TextEditors in different apps as long as both apps list the attribute in their AttributedTextFormattingDefinition.

This is pretty great, but there are still some things to improve: with my cursor at the end of the ingredient "milk," I can delete characters or continue typing and it just behaves like regular text. This makes it feel like it is just green text, and not an ingredient with a certain name. To make this feel right, I want the ingredient attribute not to expand as I type at the end of its run. And I would like the foreground color to reset for the entire word at once if I modify it.

Jeremy, if I promise I’ll give you an extra croissant later, will you help me get that implemented? Hmm… I’m not sure one's gonna be enough, but make it a few and you’ve got a deal, Max! While you go get those croissants in the oven, I’ll explain what APIs might help with this problem. With the formatting definition constraints that Max demonstrated, you can constrain which attributes and which specific values each text editor can display. To help with this new issue with the recipe editor, the AttributedStringKey protocol provides additional APIs to constrain how attribute values are mutated across changes to any AttributedString. When attributes declare constraints, AttributedString always keeps the attributes consistent with each other and the text content to avoid unexpected state with simpler and more performant code. I’ll dive into a few examples to explain when you might use these APIs for your attributes. First, I’ll discuss attributes whose values are coupled with other content in the AttributedString, such as a spell checking attribute.

Here, I have a spell checking attribute that indicates the word “ready” is misspelled via a dashed, red underline. After performing spell checking on my text, I need to make sure that the spell checking attribute remains only applied to the text that I have already validated. However, if I continue typing in my text editor, by default all attributes of the existing text are inherited by inserted text. This isn’t what I want for a spell checking attribute, so I’ll add a new property to my AttributedStringKey to correct this. By declaring an inheritedByAddedText property on my AttributedStringKey type with a value of "false," any added text will not inherit this attribute value.

Now, when adding new text to my string, the new text will not contain the spell checking attribute since I have not yet checked the spelling of those words. Unfortunately, I found another issue with this attribute. Now, when I add text to the middle of a word that was marked as misspelled, the attribute shows an awkward break in the red line underneath the added text. Since my app hasn’t checked if this word is misspelled yet, what I really want is for the attribute to be removed from this word to avoid stale information in my UI. To fix this problem, I’ll add another property to my AttributedStringKey type: the invalidationConditions property. This property declares situations when a run of this attribute should be removed from the text. AttributedString provides conditions for when the text changes and when specific attributes change, and attribute keys can invalidate upon any number of conditions. In this case, I need to remove this attribute whenever the text of the attribute run changes so I’ll use the “textChanged” value. Now, inserting text into the middle of an attribute run will invalidate the attribute across the entire run, ensuring that I avoid this inconsistent state in my UI. I think both of those APIs might help keep the ingredient attribute valid in Max’s app! While Max is finishing up with the oven, I’ll demonstrate one more category of attributes: attributes that require consistent values across sections of text. For example, a paragraph alignment attribute. I can apply different alignments to each paragraph in my text, however just a single word cannot use a different alignment than the rest of the paragraph. To enforce this requirement during AttributedString mutations, I’ll declare a runBoundaries property on my AttributedStringKey type. Foundation supports limiting run boundaries to paragraph edges or the edges of a specified character. In this case, I’ll define this attribute as constrained to paragraph boundaries to require that it has a consistent value from the start to the end of a paragraph. Now, this situation becomes impossible. If I apply a left alignment value to just one word in a paragraph, AttributedString automatically expands the attribute to the entire range of the paragraph. Additionally, when I enumerate the alignment attribute AttributedString enumerates each individual paragraph, even if two consecutive paragraphs contain the same attribute value. Other run boundaries behave the same: AttributedString expands values from one boundary to the next and ensures enumerated runs break on every run boundary.

Wow Max, those croissants smell delicious! If the croissants are all set in the oven, do you think some of these APIs might complement your formatting definition to achieve the behavior you want for your custom attribute? That sounds like just the secret ingredient I needed! The croissants are all set in the oven, so I can try this out right away! In my custom IngredientAttribute here, I will implement the optional inheritedByAddedText requirement to have the value false, that way, if I type after an ingredient, it won’t expand.

Second, let me implement invalidationConditions with textChanged, so when I delete characters in an ingredient it will no longer be recognized! Let’s give it a try! When I add a “y” at the end of “milk,” the “y” is no longer green, and when I delete a character of “milk,” the ingredient attribute gets removed from the entire word at once. Based on my AttributedTextFormattingDefinition, the foreground color attribute continues to follow my custom ingredient attribute’s behavior perfectly! Thank you Jeremy, this app really turned out great! No problem! Now, about those croissants you promised… Don’t worry, they’re almost ready. Why don’t you guard the oven though, since I’m slightly worried Luca might steal them away from us! Ah, THE Luca, I’ve heard of him, lover of all things widgets and croissants. I’m on it chef! Now, before I go join Jeremy, let me give you some final tips: You can download my app as a sample project where you can learn more about using SwiftUI’s Transferable Wrapper for lossless drag and drop or export to RTFD, and persisting AttributedString with Swift Data. AttributedString is part of Swift’s open source Foundation project. Find its implementation on GitHub to contribute to its evolution or get in touch with the community on the Swift forums! With the new TextEditor, it has also never been easier to add support for Genmoji input into your app, so consider doing that now! I just can’t wait to see how you will use this API to upgrade text editing in your apps. Just a little sprinkle can make it pop! Mmm, so delicious! No, so RICH!

Code

1:15 - TextEditor and String

import SwiftUI

struct RecipeEditor: View {
    @Binding var text: String

    var body: some View {
        TextEditor(text: $text)
    }
}

1:45 - TextEditor and AttributedString

import SwiftUI

struct RecipeEditor: View {
    @Binding var text: AttributedString

    var body: some View {
        TextEditor(text: $text)
    }
}

4:43 - AttributedString Basics

var text = AttributedString(
  "Hello 👋🏻! Who's ready to get "
)

var cooking = AttributedString("cooking")
cooking.foregroundColor = .orange
text += cooking

text += AttributedString("?")

text.font = .largeTitle

5:36 - Build custom controls: Basics (initial attempt)

import SwiftUI

struct RecipeEditor: View {
    @Binding var text: AttributedString
    @State private var selection = AttributedTextSelection()

    var body: some View {
        TextEditor(text: $text, selection: $selection)
            .preference(key: NewIngredientPreferenceKey.self, value: newIngredientSuggestion)
    }

    private var newIngredientSuggestion: IngredientSuggestion {
        let name = text[selection.indices(in: text)] // build error

        return IngredientSuggestion(
            suggestedName: AttributedString())
    }
}

8:53 - Slicing AttributedString with a Range

var text = AttributedString(
  "Hello 👋🏻! Who's ready to get cooking?"
)

guard let cookingRange = text.range(of: "cooking") else {
  fatalError("Unable to find range of cooking")
}

text[cookingRange].foregroundColor = .orange

10:50 - Slicing AttributedString with a RangeSet

var text = AttributedString(
  "Hello 👋🏻! Who's ready to get cooking?"
)

let uppercaseRanges = text.characters
  .indices(where: \.isUppercase)

text[uppercaseRanges].foregroundColor = .blue

11:40 - Build custom controls: Basics (fixed)

import SwiftUI

struct RecipeEditor: View {
    @Binding var text: AttributedString
    @State private var selection = AttributedTextSelection()

    var body: some View {
        TextEditor(text: $text, selection: $selection)
            .preference(key: NewIngredientPreferenceKey.self, value: newIngredientSuggestion)
    }

    private var newIngredientSuggestion: IngredientSuggestion {
        let name = text[selection]

        return IngredientSuggestion(
            suggestedName: AttributedString(name))
    }
}

12:32 - Build custom controls: Recipe attribute

import SwiftUI

struct IngredientAttribute: CodableAttributedStringKey {
    typealias Value = Ingredient.ID

    static let name = "SampleRecipeEditor.IngredientAttribute"
}

extension AttributeScopes {
    /// An attribute scope for custom attributes defined by this app.
    struct CustomAttributes: AttributeScope {
        /// An attribute for marking text as a reference to a recipe's ingredient.
        let ingredient: IngredientAttribute
    }
}

extension AttributeDynamicLookup {
    /// The subscript for pulling custom attributes into the dynamic attribute lookup.
    ///
    /// This makes them available throughout the code using the name they have in the
    /// `AttributeScopes.CustomAttributes` scope.
    subscript<T: AttributedStringKey>(
        dynamicMember keyPath: KeyPath<AttributeScopes.CustomAttributes, T>
    ) -> T {
        self[T.self]
    }
}

12:56 - Build custom controls: Modifying text (initial attempt)

import SwiftUI

struct RecipeEditor: View {
    @Binding var text: AttributedString
    @State private var selection = AttributedTextSelection()

    var body: some View {
        TextEditor(text: $text, selection: $selection)
            .preference(key: NewIngredientPreferenceKey.self, value: newIngredientSuggestion)
    }

    private var newIngredientSuggestion: IngredientSuggestion {
        let name = text[selection]

        return IngredientSuggestion(
            suggestedName: AttributedString(name),
            onApply: { ingredientId in
                let ranges = text.characters.ranges(of: name.characters)

                for range in ranges {
                    // modifying `text` without updating `selection` is invalid and resets the cursor 
                    text[range].ingredient = ingredientId
                }
            })
    }
}

17:40 - AttributedString Character View

text.characters[index] // "👋🏻"

17:44 - AttributedString Unicode Scalar View

text.unicodeScalars[index] // "👋"

17:49 - AttributedString Runs View

text.runs[index] // "Hello 👋🏻! ..."

18:13 - AttributedString UTF-8 View

text.utf8[index] // "240"

18:17 - AttributedString UTF-16 View

text.utf16[index] // "55357"

18:59 - Updating Indices during AttributedString Mutations

var text = AttributedString(
  "Hello 👋🏻! Who's ready to get cooking?"
)

guard var cookingRange = text.range(of: "cooking") else {
  fatalError("Unable to find range of cooking")
}

let originalRange = cookingRange
text.transform(updating: &cookingRange) { text in
  text[originalRange].foregroundColor = .orange
  
  let insertionPoint = text
    .index(text.startIndex, offsetByCharacters: 6)
  
  text.characters
    .insert(contentsOf: "chef ", at: insertionPoint)
}

print(text[cookingRange])

20:22 - Build custom controls: Modifying text (fixed)

import SwiftUI

struct RecipeEditor: View {
    @Binding var text: AttributedString
    @State private var selection = AttributedTextSelection()

    var body: some View {
        TextEditor(text: $text, selection: $selection)
            .preference(key: NewIngredientPreferenceKey.self, value: newIngredientSuggestion)
    }

    private var newIngredientSuggestion: IngredientSuggestion {
        let name = text[selection]

        return IngredientSuggestion(
            suggestedName: AttributedString(name),
            onApply: { ingredientId in
                let ranges = RangeSet(text.characters.ranges(of: name.characters))

                text.transform(updating: &selection) { text in
                    text[ranges].ingredient = ingredientId
                }
            })
    }
}

22:03 - Define your text format: RecipeFormattingDefinition Scope

struct RecipeFormattingDefinition: AttributedTextFormattingDefinition {
    struct Scope: AttributeScope {
        let foregroundColor: AttributeScopes.SwiftUIAttributes.ForegroundColorAttribute
        let adaptiveImageGlyph: AttributeScopes.SwiftUIAttributes.AdaptiveImageGlyphAttribute
        let ingredient: IngredientAttribute
    }

    var body: some AttributedTextFormattingDefinition<Scope> {

    }
}

// pass the custom formatting definition to the TextEditor in the updated `RecipeEditor.body`:

        TextEditor(text: $text, selection: $selection)
            .preference(key: NewIngredientPreferenceKey.self, value: newIngredientSuggestion)
            .attributedTextFormattingDefinition(RecipeFormattingDefinition())

23:50 - Define your text format: AttributedTextValueConstraints

struct IngredientsAreGreen: AttributedTextValueConstraint {
    typealias Scope = RecipeFormattingDefinition.Scope
    typealias AttributeKey = AttributeScopes.SwiftUIAttributes.ForegroundColorAttribute

    func constrain(_ container: inout Attributes) {
        if container.ingredient != nil {
            container.foregroundColor = .green
        } else {
            container.foregroundColor = nil
        }
    }
}

// list the value constraint in the recipe formatting definition's body:
    var body: some AttributedTextFormattingDefinition<Scope> {
        IngredientsAreGreen()
    }

29:28 - AttributedStringKey Constraint: Inherited by Added Text

static let inheritedByAddedText = false

30:12 - AttributedStringKey Constraint: Invalidation Conditions

static let invalidationConditions:
  Set<AttributedString.AttributeInvalidationCondition>? =
  [.textChanged]

31:25 - AttributedStringKey Constraint: Run Boundaries

static let runBoundaries:
  AttributedString.AttributeRunBoundaries? =
  .paragraph

32:46 - Define your text format: AttributedStringKey Constraints

struct IngredientAttribute: CodableAttributedStringKey {
    typealias Value = Ingredient.ID

    static let name = "SampleRecipeEditor.IngredientAttribute"

    static let inheritedByAddedText: Bool = false

    static let invalidationConditions: Set<AttributedString.AttributeInvalidationCondition>? = [.textChanged]
}

Code-along: Elevate an app with Swift concurrency

Learn how to optimize your app's user experience with Swift concurrency as we update an existing sample app. We'll start with a main-actor app, then gradually introduce asynchronous code as we need to. We'll use tasks to optimize code running on the main actor, and discover how to parallelize code by offloading work to the background. We'll explore what data-race safety provides, and work through interpreting and fixing data-race safety errors. Finally, we'll show how you can make the most out of structured concurrency in the context of an app.

Chapters

Resources

Related Videos

WWDC25

WWDC23

Transcript

Hi! I’m Sima, and I work on Swift and SwiftUI. In this video, you will learn how to elevate your app with Swift concurrency. As app developers, most of the code we write is on the main thread.

Single-threaded code is easy to understand and maintain. But at the same time, a modern app often needs to perform time-consuming tasks, like a network request, or an expensive computation. In such cases, it is a great practice to move work off the main thread to keep the app responsive. Swift gives you all the tools you need to write concurrent code with confidence. In this session, I will show you how by building an app with you. We will start with a single-threaded app, and gradually introduce asynchronous code as we need to. Then, we will improve the performance of the app by offloading some of the expensive tasks and running them in parallel. Next, we’ll discuss some common data-race safety scenarios you might encounter and explore ways to approach them. And finally, I will touch on structured concurrency and show you how to use tools such as a TaskGroup for more control over your concurrent code. I can’t wait to get started! I love journaling, and decorating my entries with stickers, so I will walk you through building an app for composing sticker packs out of any set of photos. Our app will have two main views. The first view will feature a carousel with all stickers with a gradient reflecting the colors from the original photo, and the second one will show a grid preview of the entire sticker pack, ready to be exported. Feel free to download the sample app below to follow along! When I created the project, Xcode enabled a few features that provide a more approachable path to introducing concurrency, including main actor by default mode and a few upcoming features. These features are enabled by default in new app projects in Xcode 26.

In the approachable concurrency configuration, the Swift 6 language mode will provide data-race safety without introducing concurrency until you are ready. If you want to enable these features in existing projects, you can learn how in the Swift migration guide. In code, the app will have two main views—StickerCarousel and StickerGrid. These views will use the stickers that the PhotoProcessor struct is responsible for extracting. The PhotoProcessor gets the raw image from the photo library before it returns the sticker. The StickerGrid view has a ShareLink which it can use for sharing the stickers. The PhotoProcessor type performs two expensive operations: the sticker extraction and the dominant colors computation. Let’s see how Swift concurrency features can help us optimize for smooth user experience, while still letting the device perform these expensive tasks! I’m going to start with the StickerCarousel view. This view displays the stickers in a horizontal scroll view. Inside of the scroll view, it has a ForEach which iterates over the array of selected photos from the photo library stored in the view model. It checks the processedPhotos dictionary in the viewModel to get the processed photo corresponding to the selection in the photo library. Currently, we don’t have any processed photos, since I haven’t actually written any code to get an image from the photo picker. If I run the app now, all we will see in the scroll view, is the StickerPlaceholder view. I’ll navigate to StickerViewModel using command-click. The StickerViewModel stores an array of currently selected photos from the photo library, represented as a SelectedPhoto type. I’ll open Quick Help with Option-click to learn more about this type. SelectedPhoto is an Identifiable type that stores a PhotosPickerItem from the PhotosUI framework and its associated ID. The model also has the dictionary called processedPhotos that maps the ID of the selected photo to the SwiftUI Image it represents. I have already started working on the loadPhoto function that takes the selected photo. But currently it does not load any data from the photo picker item that it stores. The PhotosPickerItem conforms to the Transferable protocol from the SDK, which allows me to load the representation I request with the asynchronous loadTransferable function. I will request the Data representation.
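For orientation, here is a rough sketch of what the two model pieces described here might look like; the exact property names, the ID type, and the use of @Observable are assumptions, not the sample project's actual code.

import SwiftUI
import PhotosUI
import Observation

// Hypothetical reconstruction for illustration only.
struct SelectedPhoto: Identifiable {
    let id: String                 // assumed identifier type
    let item: PhotosPickerItem     // the picker item to load image data from
}

@Observable
class StickerViewModel {
    // Photos currently selected in the photo library.
    var selection: [SelectedPhoto] = []
    // Maps a selected photo's ID to the image created from its data.
    var processedPhotos: [SelectedPhoto.ID: Image] = [:]
}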

Now, we have a compiler error.

It’s because the call to `loadTransferable` is asynchronous, and my `loadPhoto` function where I call it is not set up to handle asynchronous calls, so Swift helps me by suggesting to mark `loadPhoto` with the async keyword. I’m going to apply this suggestion.

Our function is capable of handling asynchronous code. But, there’s still one more error. While `loadPhoto` can handle asynchronous calls, we need to tell it what to wait for. To do this, I need to mark the call to `loadTransferable` with the `await` keyword. I’ll apply the suggested fix.

I’ll call this function in the StickerCarousel view. With command-shift-O, I can use Xcode’s Open Quickly to navigate back to the StickerCarousel.

I would like to call the loadPhoto function when the StickerPlaceholder view appears. Because this function is asynchronous, I will use the SwiftUI task modifier to kick off photo processing when this view appears.

Let’s check this out on my device! Great, it’s up and running. Let’s try selecting a few photos to test it out.

Great! Looks like the images are getting loaded from my photo library. The task allows me to keep the app’s UI responsive while the image is being loaded from the data. And because I'm using LazyHStack for displaying the images, I'm only kicking off photo loading tasks for views that need to be rendered on screen, so the app is not performing more work than necessary. Let’s talk about why async/await improves responsiveness of our app.

We added the `await` keyword when calling the `loadTransferable` method, and annotated the `loadPhoto` function with `async`. The `await` keyword marks a possible suspension point. It means that initially, the loadPhoto function starts on the main thread, and when it calls loadTransferable at the await, it suspends while it’s waiting for loadTransferable to complete. While loadPhoto is suspended, the Transferable framework will run loadTransferable on the background thread. When loadTransferable is done, loadPhoto will resume its execution on the main thread and update the image. The main thread is free to respond to UI events and run other tasks while the loadPhoto is suspended. The await keyword indicates a point in your code where other work can happen while your function is suspended. And just like that, we are done with loading the images from the photo library! Along the way, we learned what asynchronous code means, how to write and think about it. Now, let’s add some code to our app that would extract the sticker from the photo, and its primary colors that we can use for the background gradient when displayed in a carousel view. I’m going to use command-click to navigate back to loadPhoto where I can apply these effects.

The project already includes a PhotoProcessor, which takes the Data, extracts the colors and the sticker, and returns the processed photo. Rather than providing the basic image from the data, I’m going to use the PhotoProcessor instead.

The PhotoProcessor returns a processed photo, so I’ll update the dictionary’s type.

This ProcessedPhoto will provide us the sticker extracted from the photo and the array of colors to compose the gradient from.

I’ve already included a GradientSticker view in the project that takes a processedPhoto. I’m going to use Open Quickly to navigate to it.

This view shows a sticker stored in a processed photo on top of a linear gradient in a ZStack. I’m going to add this GradientSticker in the carousel.

Currently, in the StickerCarousel we are just resizing the photo, but now that we have a processed photo, we can use the GradientSticker here instead.

Let’s build and run the app to see our stickers! It’s working! Oh no! While the stickers are being extracted, scrolling through the carousel isn’t that smooth.

I suspect the image processing is very expensive. I have profiled the app using Instruments to confirm that. The trace shows that our app has Severe Hangs. If I zoom in on it and look at the heaviest stack trace, I can see the photo processor blocking the main thread with the expensive processing tasks for more than 10 seconds! If you want to learn more about analyzing hangs in your app, check out our session “Analyze hangs with Instruments”. Now, let’s talk more about the work our app is doing on the main thread.

The implementation of `loadTransferable` handled offloading the loading work to the background so it wouldn’t run on the main thread. Now that we’ve added the image processing code, which runs on the main thread and takes a long time to complete, the main thread is unable to receive any UI updates, like responding to scrolling gestures, leading to a poor user experience in my app.

Previously, we adopted an asynchronous API from the SDK, which offloaded the work on our behalf. Now, we need to run our own code in parallel to fix the hang. We can move some of the image transformations into the background. Transforming the image is composed of these three operations. Getting the raw image and updating the image have to interact with the UI, so we can't move this work to the background, but we can offload the image processing. This will ensure the main thread is free to respond to other events while the expensive image processing work is happening. Let’s look at the PhotoProcessor struct to understand how we can do this! Because my app is in the main actor by default mode, the PhotoProcessor is tied to the @MainActor, meaning all of its methods must run on the main actor. The `process` method calls the extractSticker and extractColors methods, so I need to mark all methods of this type as capable of running off the main actor. To do this, I can mark the whole PhotoProcessor type with nonisolated. This is a new feature introduced in Swift 6.1. When the type is marked with nonisolated, all of its properties and methods are automatically nonisolated.

Now that the PhotoProcessor is not tied to the MainActor, we can apply the new `@concurrent` attribute to the process function and mark it with `async`. This will tell Swift to always switch to a background thread when running this method. I’ll use Open Quickly to navigate back to the PhotoProcessor.

First, I’m going to apply nonisolated on the type to decouple the PhotoProcessor from the main actor and allow its methods to be called from concurrent code.

Now that PhotoProcessor is nonisolated, to make sure that the process method will get called from the background thread, I will apply @concurrent and async.

Now, I’ll navigate back to the StickerViewModel with Open Quickly.

Here, in the loadPhoto method I need to get off the main thread by calling the process method with the `await` keyword, which Swift already suggests. I’m going to apply this suggestion.

Let’s build and run our app to see if moving this work off the main actor helped with the hangs! Looks like there are no more hangs on scroll! But even though I can interact with the UI, the image is taking a while to appear in the UI while I'm scrolling. Keeping an app responsive isn't the only factor for improving user experience. If we move work off the main thread but it takes a long time to get results to the user, that can still lead to a frustrating experience using the app.

We moved the image processing operation to a background thread, but it still takes a lot of time to complete. Let’s see how we can optimize this operation with concurrency to have it complete faster. Processing the image requires extraction of stickers and the dominant colors, but these operations are independent of each other. So we can run these tasks in parallel with each other using async let. Now, the concurrent thread pool, which manages all of the background threads, will schedule these two tasks to start on two different background threads at once. This allows me to take advantage of multiple cores on my phone.

I’ll command-click on the process method to adopt async let.

By holding down control + shift and down arrow key, I can use multiline cursor to add async in front of sticker and colors variables.

Now that we’ve made these two calls run in parallel, we need to await on their results to resume our process function. Let’s fix all of these issues using the Editor menu.

But, there’s still one more error. This time it’s about a data race! Let’s take some time to understand this error.

This error means that my PhotoProcessor type is not safe to share between concurrent tasks. To understand why, let’s look at its stored properties. The only property the PhotoProcessor stores is an instance of ColorExtractor, needed to extract the colors from the photo. The ColorExtractor class computes the dominant colors that appear in the image. This computation operates on low-level, mutable image data including pixel buffers, so the color extractor type is not safe to access concurrently.
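To make the hazard concrete, here is a purely hypothetical sketch of such a class (not the sample's actual implementation): it keeps mutable working state between calls, so two tasks calling it at the same time would write to the same memory.

import Foundation
import SwiftUI

// Illustration only: a class whose internal buffers make it unsafe to share.
final class ColorExtractor {
    // Scratch state reused on every call; concurrent calls would race on it.
    private var histogram = [Int](repeating: 0, count: 256)

    func extractColors(from data: Data) -> [Color]? {
        histogram = [Int](repeating: 0, count: 256)
        // ... decode the pixel data, accumulate counts into `histogram`,
        // and pick the dominant colors (the sample returns its own PhotoColorScheme type) ...
        return nil // placeholder result for this sketch
    }
}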

Right now, all color extraction operations share the same instance of the ColorExtractor. This leads to concurrent access to the same memory. This is called a “data race”, which can lead to runtime bugs like crashes and unpredictable behavior. The Swift 6 language mode will identify these at compile time, which defines away this set of bugs when you’re writing code that runs in parallel. This moves what would’ve been a tricky runtime bug into a compiler error that you can address right away. If you click the “help” button on the error message, you can learn more about this error on the Swift website. There are multiple options you can consider when trying to solve a data race. Choosing one depends on how your code uses the shared data. First, ask yourself: Does this mutable state need to be shared between concurrent code? In many cases, you can simply avoid sharing it. However, there are cases where the state needs to be shared by such code. If that is the case, consider extracting what you need to share to a value type that’s safe to send. Only if any of these solutions aren’t applicable to your situation, consider isolating this state to an actor such as the MainActor. Let’s see if the first solution would work for our case! While we could refactor this type to work differently to handle multiple concurrent operations, instead we can move the color extractor to a local variable in the extractColors function, so that each photo being processed has its own instance of the color extractor. This is the correct code change, because the color extractor is intended to work on one photo at a time. So we want a separate instance of it for each color extraction task. With this change, nothing outside of the extractColors function can access the color extractor, which prevents the data race! To make this change, let’s move the color extractor property to the extractColors function.

Great! With the compiler’s help, we’ve been able to detect and eliminate a data race in our app. Now, let’s run it! I can feel the app running faster! If I collect a profiler trace in Instruments and open it, I no longer see the hangs. Let’s quickly recap the optimizations we made with Swift concurrency! By adopting the `@concurrent` attribute, we have successfully moved our image processing code off the main thread. We have also parallelized its operations, sticker and color extraction with each other using `async let`, making our app much more performant! The optimizations you make with Swift concurrency should always be based on data from analysis tools, such as the time profiler instrument. If you can make your code more efficient without introducing concurrency, you should always do that first. The app feels snappy now! Let’s take a break from image processing and add something fun! Let’s add a visual effect for our processed stickers that will make the sticker scrolled past fade away and blur. Let’s switch to Xcode to write that! I’ll go back to the StickerCarousel using the Xcode project navigator.

Now, I’m going to apply the visual effect on each image in the scroll view using the visualEffect modifier.

Here, I’m applying some effects to the view. I want to change the offset, the blur, and opacity only for the last sticker in the scroll view, so I need to access the viewModel’s selection property to check if the visual effect is applied to the last sticker.

Looks like there’s a compiler error because I’m trying to access main-actor protected view state from the visualEffect closure. Because computing a visual effect is an expensive computation, SwiftUI offloads it off the main thread for maximizing performance of my app. If you feel adventurous and want to learn more, check out our session Explore concurrency in SwiftUI. That’s what this error is telling me: this closure will be evaluated later on the background. Let’s confirm this by looking at the definition of the `visualEffect`, which I’ll command-click on.

In the definition, this closure is @Sendable, which is an indication from SwiftUI that this closure will be evaluated on the background.

In this case, SwiftUI calls visual effect again whenever selection changes, so I can make a copy of it using the closure's capture list.

Now, when SwiftUI calls this closure, it will operate on a copy of selection value, making this operation data-race free.

Let’s check out our visual effect! It’s looking great, and I can see how the previous image blurs and fades out as I’m scrolling.

In both of these data-race scenarios we’ve encountered, the solution was to not share data that can be mutated from concurrent code. The key difference was that in the first example, I introduced a data-race myself by running some of the code in parallel. In the second example though, I used a SwiftUI API that offloads work to the background thread on my behalf.

If you must share data with concurrent code, there are other ways to keep it safe. Sendable value types can be shared with concurrent code safely, because each task works with its own copy instead of shared mutable state. For example, the extractSticker and extractColors methods are running in parallel and both take the same image’s data. But there’s no data race in this case because Data is a Sendable value type. Data also implements copy-on-write, so it’s only copied if it’s mutated. If you can’t use a value type, you can consider isolating your state to the main actor. Luckily, the main actor by default mode already does that for you. For example, our model is a class, and we can access it from a concurrent task. Because the model is implicitly marked with the MainActor, it is safe to reference from concurrent code. The code will have to switch to the main actor to access the state. In this case, the class is protected by the main actor, but the same applies to other actors that you might have in your code. Our app is looking great so far! But it still doesn’t feel complete. To be able to export the stickers, let’s add a sticker grid view that kicks off a photo processing task for each photo that hasn't been processed yet, and displays all of the stickers at once. It will also have a share button that allows these stickers to be exported. Let’s jump back to the code! First, I’ll use command-click to navigate to the StickerViewModel.

I’m going to add another method to our model, `processAllPhotos()`.

Here, I want to iterate over all processed photos saved so far in my model, and if there are still unprocessed photos, I want to start multiple parallel tasks to kick off processing for them at once.

We’ve used async let before, but that only worked because we knew there were only two tasks to kick off: the sticker and color extraction. Now, we need to create a new task for each raw photo in the array, and there can be any number of these processing tasks.

APIs like TaskGroup allow you to take more control over the asynchronous work your app needs to perform.

Task groups provide fine-grained control over child tasks and their results. A task group lets you kick off any number of child tasks, which can run in parallel. Each child task can take an arbitrary amount of time to finish, so they might complete in a different order. In our case, the processed photos will be saved into a dictionary, so the order doesn't matter.

TaskGroup conforms to AsyncSequence, so we can iterate over the results as they’re done to store them into the dictionary. And finally, we can await on the whole group to finish the child tasks. Let's go back to the code to adopt a task group! To adopt the task group, I’ll start by declaring it.

Here, inside the closure I have a reference to the group which I can add image processing tasks to. I’m going to iterate over the selection saved in the model.

If this photo has been processed, then I don’t need to create a task for it.

I’ll start a new task of loading the data and processing the photo.

Because the group is an async sequence I can iterate over it to save the processed photo into the processedPhotos dictionary once it’s ready.
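Each child task hands back a small value pairing the photo's ID with its processed photo, so this loop can store it in the dictionary. A minimal sketch of such a Sendable result type might look like this (the sample project defines its own version, and this assumes ProcessedPhoto is itself Sendable):

// Sketch of the per-task result value collected from the task group.
// SelectedPhoto and ProcessedPhoto are the sample project's types.
struct ProcessedPhotoResult: Sendable {
    let id: SelectedPhoto.ID
    let processedPhoto: ProcessedPhoto
}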

That’s it! Now we are ready to display our stickers in the StickerGrid. I’ll use Open Quickly to navigate to the StickerGrid.

Here, I have a state property finishedLoading which indicates if all photos have finished processing.

If the photos haven’t been processed yet, a progress view will be displayed. I’m going to call the processAllPhotos() method we just implemented.

After all photos are processed, we can set the state variable. And finally, I will add the share link in the toolbar to share the stickers! I’m populating the share link items with a sticker for each selected photo. Let’s run the app! I will tap on the StickerGrid button. Thanks to the TaskGroup, the preview grid starts processing all photos at once. And when they are ready, I can instantly see all of the stickers! Finally, using the Share button in the toolbar, I can export all of the stickers as files that I can save.

In our app, the stickers will be collected in the order they’re done processing. But you can also keep track of the order, and the task group has many more capabilities. To learn more, check out the session “Beyond the basics of structured concurrency”. Congrats! The app is done and now I can save my stickers! We’ve added new features to an app, discovered when they had an impact on the UI, and used concurrency as much as we needed to improve responsiveness and performance. We also learned about structured concurrency and how to prevent data races. If you didn’t follow along, you can still download the final version of the app and make some stickers out of your own photos! To familiarize yourself with new Swift concurrency features and techniques mentioned in this talk, try to optimize or tweak the app further. Finally, see if you could bring these techniques to your app —remember to profile it first! To dive deeper into understanding the concepts in Swift's concurrency model, check out our session “Embracing Swift concurrency”. For migrating your existing project to adopt new approachable concurrency features, check out the "Swift Migration Guide"! And my favorite part, I got some stickers for my notebook! Thanks for watching!

Code

6:29 - Asynchronously loading the selected photo from the photo library

func loadPhoto(_ item: SelectedPhoto) async {
    var data: Data? = try? await item.loadTransferable(type: Data.self)

    if let cachedData = getCachedData(for: item.id) { data = cachedData }

    guard let data else { return }
    processedPhotos[item.id] = Image(data: data)

    cacheData(item.id, data)
}

6:59 - Calling an asynchronous function when the SwiftUI View appears

StickerPlaceholder()
    .task {
        await viewModel.loadPhoto(selectedPhoto)
    }

9:45 - Synchronously extracting the sticker and the colors from a photo

func loadPhoto(_ item: SelectedPhoto) async {
    var data: Data? = try? await item.loadTransferable(type: Data.self)

    if let cachedData = getCachedData(for: item.id) { data = cachedData }

    guard let data else { return }
    processedPhotos[item.id] = PhotoProcessor().process(data: data)

    cacheData(item.id, data)
}

9:56 - Storing the processed photo in the dictionary

var processedPhotos = [SelectedPhoto.ID: ProcessedPhoto]()

10:45 - Displaying the sticker with a gradient background in the carousel

import SwiftUI
import PhotosUI

struct StickerCarousel: View {
    @State var viewModel: StickerViewModel
    @State private var sheetPresented: Bool = false

    var body: some View {
        ScrollView(.horizontal) {
            LazyHStack(spacing: 16) {
                ForEach(viewModel.selection) { selectedPhoto in
                    VStack {
                        if let processedPhoto = viewModel.processedPhotos[selectedPhoto.id] {
                            GradientSticker(processedPhoto: processedPhoto)
                        } else if viewModel.invalidPhotos.contains(selectedPhoto.id) {
                            InvalidStickerPlaceholder()
                        } else {
                            StickerPlaceholder()
                                .task {
                                    await viewModel.loadPhoto(selectedPhoto)
                                }
                        }
                    }
                    .containerRelativeFrame(.horizontal)
                }
            }
        }
        .configureCarousel(
            viewModel,
            sheetPresented: $sheetPresented
        )
        .sheet(isPresented: $sheetPresented) {
            StickerGrid(viewModel: viewModel)
        }
    }
}

14:13 - Allowing photo processing to run on the background thread

nonisolated struct PhotoProcessor {
 
    let colorExtractor = ColorExtractor()

    @concurrent
    func process(data: Data) async -> ProcessedPhoto? {
        let sticker = extractSticker(from: data)
        let colors = extractColors(from: data)

        guard let sticker = sticker, let colors = colors else { return nil }

        return ProcessedPhoto(sticker: sticker, colorScheme: colors)
    }

    private func extractColors(from data: Data) -> PhotoColorScheme? {
        // ...
    }

    private func extractSticker(from data: Data) -> Image? {
        // ...
    }
}

15:31 - Running the photo processing operations off the main thread

func loadPhoto(_ item: SelectedPhoto) async {
    var data: Data? = try? await item.loadTransferable(type: Data.self)

    if let cachedData = getCachedData(for: item.id) { data = cachedData }

    guard let data else { return }
    processedPhotos[item.id] = await PhotoProcessor().process(data: data)

    cacheData(item.id, data)
}

20:55 - Running sticker and color extraction in parallel.

nonisolated struct PhotoProcessor {

    @concurrent
    func process(data: Data) async -> ProcessedPhoto? {
        async let sticker = extractSticker(from: data)
        async let colors = extractColors(from: data)

        guard let sticker = await sticker, let colors = await colors else { return nil }

        return ProcessedPhoto(sticker: sticker, colorScheme: colors)
    }

    private func extractColors(from data: Data) -> PhotoColorScheme? {
        let colorExtractor = ColorExtractor()
        return colorExtractor.extractColors(from: data)
    }

    private func extractSticker(from data: Data) -> Image? {
        // ...
    }
}

24:20 - Applying the visual effect on each sticker in the carousel

import SwiftUI
import PhotosUI

struct StickerCarousel: View {
    @State var viewModel: StickerViewModel
    @State private var sheetPresented: Bool = false

    var body: some View {
        ScrollView(.horizontal) {
            LazyHStack(spacing: 16) {
                ForEach(viewModel.selection) { selectedPhoto in
                    VStack {
                        if let processedPhoto = viewModel.processedPhotos[selectedPhoto.id] {
                            GradientSticker(processedPhoto: processedPhoto)
                        } else if viewModel.invalidPhotos.contains(selectedPhoto.id) {
                            InvalidStickerPlaceholder()
                        } else {
                            StickerPlaceholder()
                                .task {
                                    await viewModel.loadPhoto(selectedPhoto)
                                }
                        }
                    }
                    .containerRelativeFrame(.horizontal)
                    .visualEffect { [selection = viewModel.selection] content, proxy in
                        let frame = proxy.frame(in: .scrollView(axis: .horizontal))
                        let distance = min(0, frame.minX)
                        let isLast = selectedPhoto.id == selection.last?.id
                        
                        return content
                            .hueRotation(.degrees(frame.origin.x / 10))
                            .scaleEffect(1 + distance / 700)
                            .offset(x: isLast ? 0 : -distance / 1.25)
                            .brightness(-distance / 400)
                            .blur(radius: isLast ? 0 : -distance / 50)
                            .opacity(isLast ? 1.0 : min(1.0, 1.0 - (-distance / 400)))
                    }
                }
            }
        }
        .configureCarousel(
            viewModel,
            sheetPresented: $sheetPresented
        )
        .sheet(isPresented: $sheetPresented) {
            StickerGrid(viewModel: viewModel)
        }
    }
}

26:15 - Accessing a reference type from a concurrent task

Task { @concurrent in
    await viewModel.loadPhoto(selectedPhoto)      
}

29:00 - Processing all photos at once with a task group

func processAllPhotos() async {
    await withTaskGroup { group in
        for item in selection {
            guard processedPhotos[item.id] == nil else { continue }
            group.addTask {
                let data = await self.getData(for: item)
                let photo = await PhotoProcessor().process(data: data)
                return photo.map { ProcessedPhotoResult(id: item.id, processedPhoto: $0) }
            }
        }

        for await result in group {
            if let result {
                processedPhotos[result.id] = result.processedPhoto
            }
        }
    }
}

30:00 - Kicking off photo processing and configuring the share link in a sticker grid view.

import SwiftUI

struct StickerGrid: View {
    let viewModel: StickerViewModel
    @State private var finishedLoading: Bool = false

    var body: some View {
        NavigationStack {
            VStack {
                if finishedLoading {
                    GridContent(viewModel: viewModel)
                } else {
                    ProgressView()
                        .frame(maxWidth: .infinity, maxHeight: .infinity)
                        .padding()
                }
            }
            .task {
                await viewModel.processAllPhotos()
                finishedLoading = true
            }
            .toolbar {
                ToolbarItem(placement: .topBarTrailing) {
                    if finishedLoading {
                        ShareLink("Share", items: viewModel.selection.compactMap {
                            viewModel.processedPhotos[$0.id]?.sticker
                        }) { sticker in
                            SharePreview(
                                "Sticker Preview",
                                image: sticker,
                                icon: Image(systemName: "photo")
                            )
                        }
                    }
                }
            }
            .configureStickerGrid()
        }
    }
}

Code-along: Explore localization with Xcode

Learn how to localize your app into additional languages using Xcode. We'll walk step-by-step through the process of creating a String Catalog, translating text, and exchanging files with external translators. You'll learn best practices for providing necessary context to translators and how Xcode can help to provide this information automatically. For larger projects, we'll also dive into techniques to manage complexity and streamline string management using type-safe Swift code.

Chapters

Resources

Related Videos

WWDC23

Transcript

Hello! I'm Andreas from the localization team. In this session we are going to explore localization with Xcode.

No prior knowledge is required for this session. We are going to explore together how to set up an app for localization. Then, we'll talk about providing the right context to people working on translating your app. And finally, we'll dive into some of the complexity you might run into as your project grows, and talk about new features to help you manage it! Let's get started! This session is a code-along. That means that you can apply the steps in this video to the sample project linked in the description. Download the Landmarks project and start localizing it with me! I have the project open in Xcode. This version of Landmarks works great in English, but doesn't have any translations yet. To get started, let's add a String Catalog using the File menu.

We can use the default name “Localizable”, but I want it to go into the Resources group. That’s where the Asset Catalog is too. Now that the String Catalog has been added, let's build the project. When a String Catalog is present, Xcode discovers localizable strings after each build, and adds them to the Catalog automatically. We don't have to do anything special to keep our strings in the Catalog in sync with the code. But how does Xcode know which strings we want to localize? Well, most SwiftUI API makes strings localizable by default. That includes views like Text and Button. In the rest of our code, String(localized: ) makes strings available for localization as well.

Now, I'll use the Assistant Editor to understand where strings in the String Catalog have been extracted from.

This one comes from a confirmation dialog.

Here we have a string used as the title of a LabeledContent. This string is used as a navigation title. And this one comes from Text using an interpolated variable. As you can see, most SwiftUI API is localizable out of the box. You might have already noticed that this string represents some number of items. The placeholder %lld will be replaced with the number of landmarks at runtime. And we want this string to be different, depending on that number. For example, we want it to say "1 item" and "2 items".

To do that, let's open the context menu and select "Vary by Plural".

Now we can write the phrases for one item, and multiple items.

At runtime, the system will pick the right string to match the number. That was an easy fix.
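For reference, the code that produces a string like this is just an interpolated count; the view and array names below are assumptions for illustration. The singular and plural phrases themselves live in the String Catalog, not in code.

import SwiftUI

// Hypothetical view: the interpolated count is extracted as the key "%lld items",
// which can then be varied by plural in the String Catalog.
struct CollectionSummary: View {
    let landmarks: [String] // stand-in element type

    var body: some View {
        Text("\(landmarks.count) items")
    }
}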

Now, localization is all about other languages. I happen to speak German, so I can write some of the translations myself, right in the String Catalog. Let’s close the Assistant. To add a language to the project, I click the plus button in the bottom bar, and select German. That's all it takes, and I'll start adding a few translations.

Notice how the state changes from NEW to TRANSLATED as I make progress, indicated by the green checkmark on the right! At the same time, the overall translation percentage in the sidebar has increased to 8%. I'm a better developer than I am a translator though, so I teamed up with a language expert. I'd rather have them finish the remaining German translations.

To send them what I have so far, I go to the Product menu in Xcode, and select "Export Localizations". Because I’m translating my app into German, I'd like to export only German.

This will produce a localization catalog file with all German translations I have so far, as well as the English strings that have not yet been translated. This package file contains an industry-standard XLIFF file, which translation services can easily work with. Once they're done, they will send back the fully translated localization catalog.

To import it back into the project, I go to the Product menu again, and select "Import Localizations". Xcode takes a moment to build the project, but then all strings are marked as translated, and German is at 100%. You can do that too! The sample project contains the fully translated de.xcloc file that you can import into Xcode just like I did. Now it's time to test this in action, and I'd like to run the app in German. I'll open the scheme editor, edit the scheme, select "Run" and navigate to "Options".

Here we can change the app's language to German for the next debug run.

Now I'll build and run the app on my Mac.

It's fully localized to German and I love the new look! That is how easy it is to set up a new app for localization. Now let's take a deeper look at how we can provide additional context to our translators to ensure high quality translations. The Assistant Editor in Xcode is fantastic to see the code right next to the String Catalog. But translators often don't see our code, or the running app when translating strings. We need to provide additional context to help them write the best translation.

This context is added in the form of a comment, either directly in code or in the String Catalog's column for comments. Without a comment, it can be difficult to understand how a string is used. For example, "Landmarks" is just a single word. Are we referring to the app's name, or landmarks on a map? This string key says "%@ is contained in %@". How can your translators tell what the %@ placeholders represent? This will impact the way the string gets translated! A good comment explains which interface element a string is used in, like a tab bar, a button, or a subtitle.

It’s also helpful to describe surrounding user interface elements. For example, adding that the first string is an entry in the sidebar helps a lot! The second string is the subtitle of a landmark within a list, so let’s add that too. Finally, a comment should explain what kind of content can appear in each placeholder. Here, the first placeholder is the name of a landmark, and the second one is the name of a collection it is a part of. Without a comment, this would be impossible to translate correctly. And this is why it's critical to provide a good comment! Last year, we gave String Catalogs the ability to track where in code a string is extracted from. This year, we use that information to help you. Introducing automatic comment generation in Xcode 26! Xcode uses an on-device model to analyze your code and can now write comments for you! Let's see it in action! So far, we have only provided comments for some of the strings in code. Having learned how important comments are, let’s improve the context we provide to translators! Here we have a string without a comment. It looks like it’s used in a Button. I’ll open the context menu for it, and select “Generate Comment”. After analyzing where the string is used, Xcode generated “The text label on a button to cancel the deletion of a collection”. Very on point! This string also has no comment yet, so I’ll let Xcode generate one for us.

It created “A label displayed above the location of a landmark”. That works. I want to highlight that we can still make edits here. Your input always overrides a generated comment. I like to work together with the model to provide some extra context, so I’ll add that this string is shown in the Inspector.

I think this is a very useful feature, and I want Xcode to generate a comment for all new strings it extracts from code! To enable this, I open Settings, and navigate to Editing.

Here, I enable the setting “automatically generate string catalog comments”. From now on, when Xcode detects that new localizable strings were added in code, it generates a comment automatically. This makes it really easy to provide translators with the context they need.

To help developers of translation tools indicate when a comment was generated by Xcode, the XLIFF file it exports annotates them as “auto-generated”.

To learn more about interoperability with other tools, and everything else a String Catalog can do for you, check out "Discover String Catalogs".

As your project grows and becomes more complex, there are additional Xcode features and localization APIs that can help you stay organized. For example, as a project increases in size, and maybe multiple developers start working on it, we sometimes split our codebase into extensions, frameworks, and Swift Packages. And each of them can contain one or more String Catalogs. In these cases, we now have to use another parameter on the localization API: bundle. This will tell the system where the string can be found at runtime. Now, `Bundle.main` always refers to the main app. If we don't include the bundle parameter, .main is used by default. New this year is the #bundle macro. You can use it to refer to the bundle that contains resources for the current target. That is the main app, if the code runs in the main app, and it automatically finds resources of your framework or Swift Package otherwise. It also works on older versions of the OS, and does the right thing for you! Another way to organize your strings is to group related ones together, for example, grouping all strings related to a specific screen, a feature, or user flow. We call groups of strings a "Table", and each String Catalog represents one table. By default, all strings are extracted into a table called "Localizable". This matches the default file name when you create a String Catalog. Of course, we can change the name! The parameter tableName lets us put strings into the String Catalog of our choice. For example, using the table name "Discover" automatically puts them into "Discover.xcstrings". While the Landmarks app works great when creating private collections of Landmarks, I want to develop a feature where it's possible to discover more content. This content either comes from friends who I follow, or from a curated feed. Let's start developing that feature in a new framework.

To get started, I’ll open the File menu, and add a new Target, I'll search for “framework”.

Because this framework is about discovering new landmarks, I'll call it "DiscoverKit".

I'm starting a new screen from scratch, and I want to put all of its strings in a separate table. Let’s add a new file to DiscoverKit, select “String Catalog” and call it “Discover”.

For convenience, I'll open the code in the Editor on the right side, by holding down shift and option while I click my new Swift file.

And then I’ll make some more space by closing the Navigator.

I'll start developing this feature in the model layer with an enum. My new enum defines whether content comes from a friend, or is curated. It has a property to expose a localized title for itself, let's implement it. I start by using String(localized: ) to expose the string for localization. Then, I'll use the table argument for better organization. Because we're in a framework, I need to use the bundle argument too. And let's do the same for the other case.
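Here is a minimal sketch of what that enum could look like; the type and case names are assumptions, and the string values and comments simply mirror ones that appear elsewhere in this session.

import Foundation

// Hypothetical model type for illustration; the DiscoverKit code built in the session may differ.
enum DiscoverSource {
    case friend
    case curated

    var title: String {
        switch self {
        case .friend:
            return String(
                localized: "Shared by Friends",
                table: "Discover",
                bundle: #bundle,
                comment: "Subtitle of a post that was shared by friends."
            )
        case .curated:
            return String(
                localized: "Curated Collection",
                table: "Discover",
                bundle: #bundle,
                comment: "Subtitle of a post that comes from the curated feed."
            )
        }
    }
}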

Our enum is done. Now, I'll import SwiftUI and add a view to get some content on the screen.

I don't have any business logic in place yet, so for now I'll show a placeholder that says there will be 42 new posts.

That's enough typing, it's time to let Xcode do some work! I'll change the scheme to the new framework and build.

As soon as my build finishes, the new strings appear in my Catalog.

And they already have a comment! Great! For the remaining UI work I want to introduce you to a new workflow in Xcode 26. Since the introduction of String Catalogs, they have supported extracting strings from code. This year, they can help you write your code by generating symbols for manually added strings. Let's continue building our view with this new workflow. My goal is to add a navigation title, and a navigation subtitle. Now, this entire view is very much in development. By separating the string keys from their values I can iterate on the exact wording without needing to update my code.

To get started, I click the + button in the String Catalog to add a new string. Many projects prefer using uppercase keys to indicate the semantic meaning of their strings. I'll do that here too. The key should be "TITLE", and the value "Discover Landmarks". Because I added this string manually, I'll write the comment myself.

The Attribute Inspector tells me how to use the string in code. That’s really helpful! I will do exactly that! To show a title in the navigation bar, I'll use the view modifier `.navigationTitle`.

For its value, I’ll type a leading dot, and start typing the name of the table. Xcode can auto-complete the table name for me, and all manual strings contained in this table are also suggested.

That was easy! Notice that I didn't have to manually type out a bundle and table name this time. Let's do this again, but for the navigation subtitle! I’ll add a new string to the Catalog, and call it “SUBTITLE”.

I want it to summarize how many posts are from friends, and how many were curated. For that I need a placeholder, and format specifiers do just that! I’ll start by typing %, and now Xcode suggests a few to use. Here, I want a number, so I choose a placeholder for an integer.

The placeholder represents the number of posts from friends, so I call it “friendsPosts”.

I continue by adding another placeholder for curated posts.

Then it's time for a comment.

The string is now ready to be used in code. I’ll use the modifier “Navigation Subtitle” this time.

Again, I start typing `.Discover` to find the right table, and auto-completion takes it from here.

That was much less typing! Notice how Xcode suggested the correct types for me! This new feature makes it really easy to work with manual strings, and I can now rely on autocompletion and the compiler for loading these localized resources! Now, when I want to change the values of my strings later, I can simply update them in the String Catalog without having to modify my code. How many different ways have you spelled "OK" in the past, and wanted to correct them all with one simple action? To generate a symbol name that feels just right in Swift, Xcode uses the key and value of the string. Strings with no placeholder can be accessed like any other static property. If a string contains a placeholder, Xcode generates a function instead, and uses the placeholder name as the argument label. The generated symbols are static variables or functions on the type LocalizedStringResource. That's really powerful, because you can use them anywhere a LocalizedStringResource is used! This includes SwiftUI views such as Text or Button, or view modifiers like `.navigationSubtitle`! If you don't use SwiftUI, Foundation's String(localized: ) works with the type LocalizedStringResource as well.

Custom views and other declarations using LocalizedStringResource can now also be called using a generated symbol. Your symbols are directly accessible on LocalizedStringResource if you use the default table name "Localizable". When using a non-default table name, generated symbols are nested in the namespace of that table. This means you can access them in code starting with your table name. New projects created by Xcode 26 have symbol generation enabled by default. To use it in an existing project, turn on the build setting "Generate String Catalog Symbols".

As we've seen, Xcode now fully supports two different workflows for managing your strings: extracting them from code, and referencing them with a type-safe API. This brings up the question: which workflow should I use? We recommend you start by relying on string extraction. You write the strings where you develop your UI, allowing you to read and understand the code more quickly. Using this workflow, you can make use of Xcode's comment generation, which saves you some typing, while still providing meaningful context to your translators. As your project grows, you might find yourself wanting more control over the organization of your strings. In this case we recommend using generated symbols. This allows you to separate keys from their values, so you can iterate on your text without changing your code. In addition, Xcode's autocompletion makes it easy to reference strings across all of your tables. Finally, generated symbols help you avoid boilerplate code in frameworks and packages.

Both approaches have their strengths, and we believe you should be free to decide what works best for your project. That's why we've added a powerful refactoring feature so you can easily switch between the two. Let's try it out in the DiscoverKit framework! I think the placeholder text inside the Navigation Stack is a great candidate to be referenced by a symbol. I open the context menu for it, and select “Refactor > Convert Strings to Symbols”.

A preview UI opens up, showing me the exact locations where the symbol will be used instead of the string. Clicking the highlighted section allows me to compare the symbol with the original code.

Let’s change the name of the key to "feedTitle" to make it more semantic. I can also add a nice name for Argument 1! I’ll call it "newPosts". That looks good! I'll confirm the refactoring.

After thinking about both approaches a little, I’ve decided to use generated symbols for all strings in this table. Let’s select the remaining two and choose Refactor > Convert Strings to Symbols.

Because I’m happy with the symbol names, I’ll click on “Convert”.

That's how easy it is to refactor an entire table at once! I encourage you to explore these localization features in Xcode on your own. Start by relying on string extraction to localize your project. Provide meaningful comments to your translators, either by writing them yourself, or using Xcode's comment generation. As the complexity of your project grows, consider using generated symbols to maintain precise control over your strings. Finally, for more details on String Catalogs, check out our previous video "Discover String Catalogs". Thank you for watching, and I hope these new features help you streamline your localization workflow.

Code

1:34 - Localizable strings

// import SwiftUI
Text("Featured Landmark", comment: "Big headline in the hero image of featured landmarks.")

Button("Keep") { }

// import Foundation
String(localized: "New Collection", comment: "Default name for a new user-created collection.")

6:00 - Adding a comment

Text("Delete",
comment: "Delete button shown in an alert asking for confirmation to delete the collection.")

String(localized: "Shared by Friends", comment: "Subtitle of post that was shared by friends.")

9:13 - XLIFF file

// Field for automatically generated comments in the XLIFF

<trans-unit id="Grand Canyon" xml:space="preserve">
<source>Grand Canyon</source>
<target state="new">Grand Canyon</target>
<note from="auto-generated">Suggestion for searching landmarks</note>
</trans-unit>

9:58 - Localized String in the main app and a Swift Package or Framework

// Localized String in the main app:
Text("My Collections", 
comment: "Section title above user-created collections.")

// Localized String in a Swift Package or Framework
Text("My Collections", 
bundle: #bundle, 
comment: "Section title above user-created collections.")

10:56 - Localized String with a tableName parameter

// Localized String in the main app:
Text("My Collections",
tableName: "Discover",
comment: "Section title above user-created collections.")

// Localized String in a Swift Package or Framework
Text("My Collections",
tableName: "Discover",
bundle: #bundle, 
comment: "Section title above user-created collections.")

17:31 - Symbol usage

// Symbol usage in SwiftUI
Text(.introductionTitle)

.navigationSubtitle(.subtitle(friendsPosts: 42))


// Symbol usage in Foundation
String(localized: .curatedCollection)


// Working with generated symbols in your own types
struct CollectionDetailEditingView: View {
    let title: LocalizedStringResource
    
    init(title: LocalizedStringResource) {
        self.title = title
    }
}
CollectionDetailEditingView(title: .editingTitle)

Combine Metal 4 machine learning and graphics

Learn how to seamlessly integrate machine learning into your graphics applications using Metal 4. We'll introduce the tensor resource and ML encoder for running models on the GPU timeline alongside your rendering and compute work. Discover how Shader ML lets you embed neural networks directly within your shaders for advanced effects and performance gains. We'll also show new debugging tools for Metal 4 ML workloads in action using an example app.

Chapters

Resources

Related Videos

WWDC25

WWDC24

Transcript

Hi! My name is Preston Provins and I am an engineer on the Metal Framework team at Apple, and I'll be joined later by my colleague Scott. I'll share the additions coming to Metal that combine machine learning and games, and Scott will introduce the GPU tools additions designed to enhance your debugging experience for machine learning in Metal 4. I'm excited to share how to combine Metal 4 Machine learning and graphics in this session. If you are interested in what all Metal 4 has to offer, check out the Metal 4 foundations talk to learn about what else is new to Metal 4. Machine learning is transforming games and graphics with techniques like upscaling, asset compression, animation blending, and neural shading. These techniques push the frontier of creativity and immersion. They simulate complex phenomena, enhance visual fidelity, and streamline the exploration of new styles and effects. CoreML is fantastic for a wide range of machine learning tasks, such as segmentation, classification, generative AI and more. It makes it easy to author machine learning models. If your application of machine learning requires tight integration with the GPU timeline, Metal 4 has you covered. In a typical frame, a game may perform vertex skinning in a compute pass, rasterize the scene in a render pass, and apply antialiasing in another compute pass. Antialiasing is typically done using image-processing techniques such as temporal anti-aliasing. Cutting edge techniques replace these traditional methods with a machine learning network.

This network upscales the image, allowing the rest of the rendering to happen at lower resolution, improving performance.

It’s also becoming more common to execute tiny neural networks inside a shader. For example, a traditional fragment shader would sample material textures, but groundbreaking techniques use small neural networks to decompress textures on the fly and achieve higher compression ratios.

This neural rendering technique compresses material sets to 50% of the block compressed footprint. In this session, we’ll meet MTLTensors, Metal 4’s new resource for machine learning workflows. We’ll dive into the new MTL4MachineLearningCommandEncoder, which runs entire networks on the GPU timeline alongside your other draws and dispatches. We’ll introduce Shader ML, which lets you embed machine learning operations inside your own shaders. Finally, we’ll show how you can seamlessly integrate ML into your application with the Metal Debugger. You’re already familiar with MTLBuffers and MTLTextures. This year Metal 4 introduces MTLTensor, a new resource that lets you apply machine learning to data with unprecedented ease. The MTLTensor is a fundamental machine learning data type that can be used for compute, graphics and machine learning contexts. Machine learning workloads will extensively utilize tensors. MTL4MachineLearningCommandEncoder uses MTLTensors to represent inputs and outputs, and Shader ML uses MTLTensors to represent weights as well as inputs and outputs.

MTLTensors are multi-dimensional containers for data, described by a rank and an extent for each dimension. MTLTensors can extend well beyond two dimensions, providing the flexibility to describe any data layout you need for practical machine learning usage. MTLTextures, for example, are limited to four channels at most and have strict limits on their extents that depend on the texture format. For machine learning, it's common to use data that has more than two dimensions, such as the inputs to convolution operations. Using a flat representation of data like a MTLBuffer would require complicated indexing schemes for data with multiple dimensions. Indexing multidimensional data in a MTLTensor is much simpler because the stride and dimension of each rank are baked into the MTLTensor object and automatically used in the indexing calculations. Let's work through the process of creating a MTLTensor. The MTLTensor’s rank describes how many axes it has. This MTLTensor has a rank of two: it contains rows and columns of data. The extent of each dimension describes how many data points are along that axis. The dataType property defines what format of data the MTLTensor is wrapping.

Usage properties indicate how the MTLTensor will be utilized: MTLTensorUsageMachineLearning for the MTL4MachineLearningCommandEncoder, and MTLTensorUsageCompute or MTLTensorUsageRender for use inside your shader programs. It's also possible to combine usages, just like the usage property for textures. Those are the important MTLTensor properties that should be populated on a MTLTensorDescriptor object, so now let's make a MTLTensor in code. With the descriptor properties filled in, create a new MTLTensor by calling newTensorWithDescriptor:offset:error: on a MTLDevice object. MTLTensors are created from either a MTLDevice or a MTLBuffer object; however, MTLTensors created from a device offer the best performance. Similar to how MTLTextures can be swizzled, creating a MTLTensor from a MTLDevice object results in an opaque layout that is optimized for reading and writing. Now, let's focus on creating MTLTensors from a pre-existing MTLBuffer. Unlike MTLTensors created from a MTLDevice, MTLTensors created from a MTLBuffer aren't assumed to be tightly packed, so you need to specify their strides. The innermost stride should always be one.

The second stride indicates how many elements are jumped over when the row index is incremented.

It's possible the source MTLBuffer contains padding, such as unused columns at the end of the row. Padding needs to be accounted for so the MTLTensor wraps the appropriate elements.

To create a MTLTensor from an underlying buffer, set the dataType and usage properties just as you would for a device-allocated tensor. Then fill out the strides property of the MTLTensorDescriptor so the resulting MTLTensor appropriately wraps the contents of the MTLBuffer. Finally, call newTensorWithDescriptor:offset:error: on the source MTLBuffer. A rough sketch of both creation paths follows below.
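As a rough illustration of both creation paths only: this session's sample code is Objective-C, so the Swift spellings below, including makeTensor(descriptor:), the enum cases, and the error-handling style, are assumptions rather than confirmed API.

import Metal

// Describe the tensor: element format, intended usage, and rank-2 extents.
// device and buffer are assumed to be an existing MTLDevice and MTLBuffer.
let descriptor = MTLTensorDescriptor()
descriptor.dataType = .float16          // assumed enum spelling for the element format
descriptor.usage = .machineLearning     // usages can also be combined, like texture usage
/* set the rank-2 dimensions (rows x columns) on the descriptor */

// 1) Device-backed tensor: opaque, optimized layout with the best performance.
//    The selector named above in Objective-C is newTensorWithDescriptor:offset:error:.
let deviceTensor = try device.makeTensor(descriptor: descriptor)

// 2) Tensor wrapping an existing MTLBuffer: strides must be specified because
//    the buffer may contain row padding; the innermost stride is always 1.
/* set descriptor.strides so the second stride skips any padding at the end of a row */
let bufferTensor = try buffer.makeTensor(descriptor: descriptor)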

Now that we know how to allocate and create MTLTensors, let's dive into the new machine learning encoder to add ML work to the GPU timeline. Metal 4 enables compute and render commands to be easily added to the GPU timeline with the MTL4ComputeCommandEncoder and MTL4RenderCommandEncoder, respectively. This year, we're taking unification even further by adding machine learning work to the GPU timeline.

The MTL4MachineLearningCommandEncoder enables full models to run alongside, and synchronize with, other Metal commands on the GPU, and ensures seamless integration with other Metal commands. It's a new encoder for encoding machine learning commands, with an interface similar to the compute and render encoders. The Metal 4 synchronization primitives also operate with machine learning commands, just like with compute and render. Synchronization gives you control over work orchestration and facilitates parallelization to maintain high performance.

The MTL4MachineLearningCommandEncoder creation workflow can be separated into two parts: offline and runtime. The offline portion of the workflow takes place prior to application launch, and the runtime portion happens during the application's lifetime, such as in the middle of a frame. Let's start with the offline portion of the workflow, creating a MTLPackage.

A MTLPackage is a container for one or more functions that each represent an ML network that you can use in Metal to execute machine learning work. This format is optimized for loading and execution with Metal.

To create a MTLPackage, you first need to have a CoreML package. Here we use the CoreML converter to convert the network from the ML framework it was authored in, such as PyTorch or TensorFlow, into a CoreML package.

Here is an example of exporting a PyTorch model using the CoreML Tools library in Python. Simply import the tools and run convert on the model to generate an export. Finally, save that export as an ML package. There is one thing I would like to highlight here: not every CoreML package is an ML program, and only ML programs are supported. If the CoreML package was exported on an older OS, check out this article to learn more about exporting those CoreML model files as an ML package. With the CoreML package created, it is as simple as running the metal-package-builder command-line tool on the saved model to produce a MTLPackage. This converts the CoreML package into a format that can be efficiently loaded at runtime. So, that's it for creating a MTLPackage. The offline portion of the workflow is complete and the rest is carried out at runtime.

To compile the network, first open the MTLPackage as an MTLLibrary. Create a function descriptor using the name of the function that represents the network in the package. In this case, the main function.

To compile the network, create a MTL4MachineLearningPipelineState. This is done using a MTL4MachineLearningPipelineStateDescriptor with the function descriptor. If the network has dynamic inputs, specify the size of each input on the MTL4MachineLearningPipelineStateDescriptor.

Compile the network for the specific device by creating the MTL4MachineLearningPipelineState with the MTL4MachineLearningPipelineStateDescriptor.

That is how an MTL4MachineLearningPipelineState object is created. Now the next step is creating the MTL4MachineLearningCommandEncoder and encoding work.

Let's take a deeper look at using the MTL4MachineLearningCommandEncoder object to dispatch work on the GPU timeline.

Simply create the MTL4MachineLearningCommandEncoder object, just like creating an encoder for compute or render. Set the MTL4MachineLearningPipelineState object you created, and bind the inputs and outputs being used. Finally, dispatch the work using the dispatchNetworkWithIntermediatesHeap method.

The machine learning encoder uses the heap to store intermediate data between operations; instead of creating and releasing buffers, it enables the reuse of resources across different dispatches.

To create this MTLHeap, create a MTLHeapDescriptor and set the type property to MTLHeapTypePlacement. You can get the minimum heap size for the network by querying the pipeline's intermediatesHeapSize, and set the size property of the heap to be greater than or equal to that.

After encoding your network dispatches, end encoding and submit your commands to run them on the GPU timeline.

As previously mentioned, Metal 4 synchronization primitives also operate with machine learning commands, just like with compute and render.

Work that doesn't depend on machine learning output can happen in tandem if synchronized correctly.

Only the work consuming the network output needs to wait for the scheduled machine learning work to conclude.

To synchronize MTL4MachineLearningCommandEncoder dispatches, you can use standard Metal 4 synchronization primitives such as MTLBarriers and MTLFences. The new MTLStageMachineLearning is used to identify ML workloads in barriers. For example, to make your rendering work wait on outputs produced by a network, you could use a barrier between the appropriate render stage and the machine learning stage. Let's look at MTL4MachineLearningCommandEncoder in action - in this example, MTL4MachineLearningCommandEncoder is used to dispatch a fully convolutional network to predict occlusion values per pixel. Evaluating this requires careful synchronization. The depth buffer and view space normals are populated prior to launching the ML workload. While the network is processing the data, the renderer dispatches other render-related tasks in parallel and waits for the neural results before compositing the final frame. MTL4MachineLearningCommandEncoder isn't limited to just processing full-frame information for games; you can use it for any network that fits into a real-time budget, and leverage Metal 4 synchronization primitives however best suits your integration needs. That's how Metal 4’s MTL4MachineLearningCommandEncoder makes it easy to run large machine learning workloads on the GPU timeline. To summarize: Machine learning is joining compute and render in Metal 4 through the MTL4MachineLearningCommandEncoder. MTL4MachineLearningCommandEncoder enables entire networks to run on the GPU timeline. Resources are shareable with other GPU commands, and the robust set of Metal 4 synchronization primitives enables high-performance machine learning capabilities. Metal 4 also introduces Shader ML for embedding smaller machine learning operations inside your existing kernels and shaders. Cutting-edge games are adopting machine learning to replace traditional rendering algorithms. ML-based techniques can offer solutions for global illumination, material shading, geometry compression, material compression, and more. These techniques can often improve performance or decrease memory footprint. As a motivating example, let’s consider neural material compression - a technique that enables up to 50% compression when compared to block-compressed formats.

With traditional materials, you sample material textures, such as albedo and normal maps. Then you use the sampled values to perform shading.

With neural material compression, you’ll sample latent texture data, perform inference using the sampled values, and use the network’s output to perform shading.

Splitting each step into its own pipeline is inefficient, since each step needs to sync tensors to device memory, operate, and sync outputs back for later operations.

To get the best performance, your app should combine these steps into a single shader dispatch. With Shader ML, Metal enables you to run your ML network directly within your fragment shader, without having to go through device memory between steps. You can initialize input tensors, run your network, and shade only the necessary pixels each frame. That improves your runtime memory footprint and reduces your game’s disk space.

Let's take a look at neural material evaluation in greater detail.

Initializing input MTLTensors can be split into two parts, loading the networks weights and building the input feature MTLTensor. The input feature MTLTensor is made by sampling bound textures with a UV coordinate for the fragment.

Inference is where the input feature MTLTensor is transformed by learned weight matrices to extract features, compute activations, and propagate information through layers. This evaluation is repeated for multiple layers and the result is a decompressed material. Finally, the decompressed materials are used for the shading calculations of the fragment.

Let's see how to initialize our input MTLTensors with Shader ML. First, let’s declare a fragment shader that will utilize Shader ML and pass in network weights. Start by including the new metal_tensor header. We’ll use the MTLTensor type to access the network weights. MTLTensors are bound to the shader using buffer binding slots. It's also possible to pass in MTLTensors using argument buffers as well. The MTLTensor type is templated. The first template argument is the MTLTensor’s dataType. These MTLTensors were created in device memory so we use the device address space qualifier. The second argument represents the MTLTensor's dimensions and the type to be used for the indexing into the MTLTensor. Here, we’re using dextents to define a rank two tensor with dynamic extents. With that, our fragment shader is set up. Let’s implement the neural material compression algorithm.

With the weights of the network passed in, we can create the input MTLTensor by sampling four latent textures. MTLTensors are not just a resource that can be bound: you can also create inline MTLTensors directly within your shaders! Create a MTLTensor wrapping the sampled values, and use it for the evaluation of the network. Inline MTLTensors are assumed to be tightly packed, so there is no need to pass strides at creation.

With that, initializing the input MTLTensors is complete and we are all set up to infer values from the neural network. Evaluation transforms inputs using learned parameters, which are then activated. The activations are passed to subsequent layers, and the final layer's activations form the decompressed material.

This year, Metal introduces Metal Performance Primitives to make MTLTensor operations accessible in the shading language. This library is a set of high performance APIs that enable performance portable solutions on MTLTensors. It provides matrix multiplication and convolution operations.

Matrix multiplication is at the heart of neural network evaluation. We'll use the matmul2d implementation provided by Metal Performance Primitives to implement a performance-portable network evaluation routine. To get started, include the new MetalPerformancePrimitives header inside your Metal shader. The parameters of your matrix multiplication are configured using the matmul2d_descriptor object. The first set of template parameters specify the problem size of the matrix multiplication. The next set of template parameters control whether the inputs to the matrix multiplication need to be transposed when performing the operation. And the last template parameter controls your precision requirements.

In addition to the descriptor, the matmul2d operation must be specialized with the number of threads that will be participating in the operation. Here, since we are within a fragment shader, we’ll use execution_thread to indicate that the full matrix multiplication will be performed by this thread. Then, run the matrix multiplication with that configuration.

Finally, activate each element of the result of our matrix multiplication using the ReLU activation function. This process is repeated for the second layer to fully evaluate our network right in our shader. After evaluation is complete, the decompressed materials are available to be used for shading.

The output MTLTensor holds channel information which can then be used like any other value sampled from a texture. Here’s a realtime demo of neural material compression compared to traditional materials. There’s no perceived quality loss from using neural materials, especially when shaded. Here’s the base color in isolation. It’s still very difficult to notice any differences between neural materials and traditional ones, and yet neural materials use half the memory and take up half the disk space.

MTLTensor operations aren’t exclusive to just fragment shaders. They can be used inside of all functions and all shader stages. If an entire simdgroup or threadgroup will be doing the same operations on the same data, you can leverage the hardware to your advantage by choosing a larger execution group. But if your MTLTensor operations are divergent with respect to data or exhibit non-uniform control flow at the MTLTensor operation call site, you must use a single thread execution group. Other execution schemes assume there is no divergence and uniform control flow for the execution group.

To summarize, you can now perform ML operations like matrix multiplication and convolution in your own shaders. Shader ML makes it easy to perform multiple ML operations in a single shader. This is cache-friendly, requires fewer dispatches, and uses less memory bandwidth, especially when using smaller networks. And Shader ML gives you the fine-grained control you need to create custom operations. It’s never been easier to implement cutting-edge ML techniques in your Metal apps. And that's how you can use Shader ML to embed neural networks into your shader program. Now, I'll turn things over to my colleague Scott to show how Metal 4's new debug tools make debugging machine learning workloads a breeze. Hi everyone, I'm Scott Moyers and I'm a software engineer on the GPU Tools team.

Earlier, Preston showed you an application that uses machine learning to calculate ambient occlusion. The app encodes a machine learning network directly into its Metal rendering pipeline.

While helping develop this app, I hit an issue where the output had some severe artifacts. Let me enable just the ambient occlusion pass to highlight the problem I had.

There should be shadows in the corners of objects, but instead there's lots of noise and the structure of the scene is barely visible.

I’ll show you how I used the new tools to find and fix this issue. First let’s capture a GPU trace of the app in Xcode. To do that I’ll click on the Metal icon at the bottom of the screen, then the capture button.

Once the capture completes I can find my captured frame available in the summary.

The debug navigator on the left provides the list of commands that the application used to construct the frame.

For instance, the offscreen command buffer contains many encoders including the G-buffer pass. And the next command buffer contains my MTL4MachineLearningCommandEncoder. Using Metal 4 allowed me to have fine grained control over synchronization, and whilst I was careful about setting up barriers and events between dependent passes, I wondered if a synchronization problem could be causing these issues. To check this, I turned to the Dependency Viewer, which is a useful tool to get an overview of the structure of your Metal application. I’ll click on the Dependencies icon at the top left.

With this interface, I can see all of the application’s commands along with any synchronization primitives such as barriers and events. Zooming into a command encoder reveals even more detail. There’s the completion of my first command buffer.

The command below it copies the normals into a MTLTensor. Then there’s a barrier followed by a MTL4MachineLearningCommandEncoder. I’ll zoom back out so I can review the overall structure. My new ambient occlusion pass is in the command buffer on the right. Before I added this pass the application was working fine, so I can assume the dependencies within the top and bottom command buffer are correct.

I’ll inspect the new command buffer containing the MTL4MachineLearningCommandEncoder.

Before the command buffer can start there’s a wait for a shared event signal.

Then at the end of the command buffer there’s a signal to unblock the next one. So there can’t be any other commands running in parallel with this command buffer. And within the command buffer there are barriers between each encoder, ensuring that each command executes one after the other. I was fairly confident at this stage that there weren’t any synchronization issues, at least within this frame. With that ruled out, I decided to check the MTL4MachineLearningCommandEncoder directly. Clicking on the dispatch call for the ambient occlusion network takes me to its bound resources.

On the right the assistant editor is displaying the output MTLTensor. I can see it has the same artifacts as the running application, so clearly it’s not correct. I’ll double click the input MTLTensor to display it next to the output. The input has what I would expect for view space normals; the objects facing a different direction do have different component intensities. So the problem must be inside my machine learning network. Let’s go back to our bound resources view and this time I’ll double click Network to open it in the new ML Network Debugger. This tool is essential for understanding what's happening inside the model.

This graph represents the structure of my ambient occlusion network. I wrote it in PyTorch, and in my target's build phases I do what Preston suggested earlier: I export it as a CoreML package, then convert it to an MTLPackage. The boxes are the operations and the connections show the data flow through the model from left to right. I wanted to find out which operation was responsible for introducing the artifacts. I knew the final output was bad and that the input was good, so I decided to bisect the graph to narrow it down. Let’s pick an operation roughly in the middle.

Selecting an operation shows its description on the right, along with its attributes, inputs, and outputs. What’s more is that I am able to inspect the intermediate MTLTensor data that any operation outputs. I’ll click on the preview to open it in the MTLTensor viewer.

I can see the artifacts are already present here, so I’ll check an earlier operation.

This operation also has artifacts in the output. Let’s inspect its input.

This MTLTensor however appears to be highlighting edges in the scene, which is expected, the input to our network is the edges extracted from our depth buffer. So something must be going wrong within this region of the network.

This stitched region can be expanded by clicking on the arrows in the top left of the operation.

From the order and types of these operations, I recognize this as my SignedSmoothstep function. It first takes the absolute value of the input, then clamps the value between 0 and 1. But then it’s raising the result to the power of itself, which doesn’t seem right to me; I don’t remember there being a power operation in the SignedSmoothstep function. Let’s jump into the Python code to find out what's going on. I’ll stop the debug session and go back to the source code.

The model I'm running is in this class called LightUNet. I’ll navigate to its forward propagation function to check it's doing what I expect.

The first custom operation it's performing is SignedSmoothstep, which is the stitched region I saw in the ML network debugger. I’ll jump to its forward propagation function.

This should be a straightforward smoothstep operation where I maintain the sign of the input. But on this line I can see the bug: I typed too many asterisks, making my multiply a power operator. Let's delete that extra one and try running it again.

And there you have it, a working implementation of neural ambient occlusion using Metal 4’s built in MTL4MachineLearningCommandEncoder.

In this demo I showed you how I used the Metal Debugger to debug a Metal 4 machine learning application. First, the Dependency Viewer helped me validate synchronization. After that, I inspected the inputs and outputs of the network using the MTLTensor viewer, which verified that the problem was inside the network.

Finally I used the ML network debugger to step through the operations in the network and pinpoint the issue.

These tools are part of a larger family of tools available for debugging and optimizing Metal apps. Now let’s recap what we covered today. Metal 4 introduces MTLTensor, a new multi-dimensional resource designed specifically for machine learning data. MTLTensors provide flexibility for complex data layouts beyond two dimensions, and with baked-in stride and dimension information, greatly simplify indexing. New features in Metal 4 make it possible to combine machine learning workloads into your Metal pipelines. The MTL4MachineLearningCommandEncoder enables entire machine learning networks to run directly on the GPU timeline. This allows seamless integration and synchronization with your compute and render work. For smaller networks, Shader ML and the Metal Performance Primitives library allow you to embed machine learning operations directly into your shaders. Lastly, the Metal Debugger gives you incredible visibility into what’s happening in your Metal 4 application. The new ML network debugger makes it easy to understand your network and how it executes on device. This kind of insight is essential for ensuring correctness and optimizing performance. For some next steps, try out Metal 4’s MTL4MachineLearningCommandEncoder and Shader ML for yourself by installing the latest OS and Xcode. To find out more about how the Metal developer tools can help you, head over to the Apple Developer website. And to get the most out of your Metal 4 application, make sure you check out other Metal 4 talks. We're truly excited to see what you'll build with these new capabilities. Thank you.

Code

8:13 - Exporting a Core ML package with PyTorch

import coremltools as ct

# define model in PyTorch
# export model to an mlpackage

model_from_export = ct.convert(
    custom_traced_model,
    inputs=[...],
    outputs=[...],
    convert_to='mlprogram',
    minimum_deployment_target=ct.target.macOS16,
)

model_from_export.save('model.mlpackage')

9:10 - Identifying a network in a Metal package

library = [device newLibraryWithURL:@"myNetwork.mtlpackage"];

functionDescriptor = [MTL4LibraryFunctionDescriptor new];
functionDescriptor.name = @"main";
functionDescriptor.library = library;

9:21 - Creating a pipeline state

descriptor = [MTL4MachineLearningPipelineDescriptor new];
descriptor.machineLearningFunctionDescriptor = functionDescriptor;

[descriptor setInputDimensions:dimensions
                 atBufferIndex:1];

pipeline = [compiler newMachineLearningPipelineStateWithDescriptor:descriptor
                                                             error:&error];

9:58 - Dispatching a network

commands = [device newCommandBuffer];
[commands beginCommandBufferWithAllocator:cmdAllocator];
[commands useResidencySet:residencySet];

/* Create intermediate heap */
/* Configure argument table */

encoder = [commands machineLearningCommandEncoder];
[encoder setPipelineState:pipeline];
[encoder setArgumentTable:argTable];
[encoder dispatchNetworkWithIntermediatesHeap:heap];

10:30 - Creating a heap for intermediate storage

heapDescriptor = [MTLHeapDescriptor new];
heapDescriptor.type = MTLHeapTypePlacement;
heapDescriptor.size = pipeline.intermediatesHeapSize;
        
heap = [device newHeapWithDescriptor:heapDescriptor];

10:46 - Submitting commands to the GPU timeline

commands = [device newCommandBuffer];
[commands beginCommandBufferWithAllocator:cmdAllocator];
[commands useResidencySet:residencySet];

/* Create intermediate heap */
/* Configure argument table */

encoder = [commands machineLearningCommandEncoder];
[encoder setPipelineState:pipeline];
[encoder setArgumentTable:argTable];
[encoder dispatchNetworkWithIntermediatesHeap:heap];

[commands endCommandBuffer];
[queue commit:&commands count:1];

11:18 - Synchronization

[encoder barrierAfterStages:MTLStageMachineLearning
          beforeQueueStages:MTLStageVertex
          visibilityOptions:MTL4VisibilityOptionDevice];

15:17 - Declaring a fragment shader with tensor inputs

// Metal Shading Language 4

#include <metal_tensor>

using namespace metal;
 
[[fragment]]
float4 shade_frag(tensor<device half, dextents<int, 2>> layer0Weights [[ buffer(0) ]],
                  tensor<device half, dextents<int, 2>> layer1Weights [[ buffer(1) ]],
                  /* other bindings */)
{
    // Creating input tensor
    half inputs[INPUT_WIDTH] = { /* four latent texture samples + UV data */ };

    auto inputTensor = tensor(inputs, extents<int, INPUT_WIDTH, 1>());
    ...
}

17:12 - Operating on tensors in shaders

// Metal Shading Language 4

#include <MetalPerformancePrimitives/MetalPerformancePrimitives.h>

using namespace mpp;

constexpr tensor_ops::matmul2d_descriptor desc(
              /* M, N, K */ 1, HIDDEN_WIDTH, INPUT_WIDTH,
       /* left transpose */ false,
      /* right transpose */ true,
    /* reduced precision */ true);

tensor_ops::matmul2d<desc, execution_thread> op;
op.run(inputTensor, layerN, intermediateN);

for (auto intermediateIndex = 0; intermediateIndex < intermediateN(0); ++intermediateIndex)
{
    intermediateN[intermediateIndex, 0] = max(0.0f, intermediateN[intermediateIndex, 0]);
}

18:38 - Render using network evaluation

half3 baseColor          = half3(outputTensor[0,0], outputTensor[1,0], outputTensor[2,0]);
half3 tangentSpaceNormal = half3(outputTensor[3,0], outputTensor[4,0], outputTensor[5,0]);

half3 worldSpaceNormal = worldSpaceTBN * tangentSpaceNormal;

return baseColor * saturate(dot(worldSpaceNormal, worldSpaceLightDir));

Create a seamless multiview playback experience

Learn how to build advanced multiview playback experiences in your app. We'll cover how you can synchronize playback between multiple players, enhance multiview playback with seamless AirPlay integration, and optimize playback quality to deliver engaging multiview playback experiences.

Chapters

Resources

Related Videos

WWDC21

Transcript

Hi everyone, I’m Julia, an AVFoundation engineer.

In this video, I’ll discuss how to create an engaging user experience in your app across multiple players. People love to get multiple perspectives from live events like sporting competitions or watch multiple channels simultaneously. A multiview playback experience consists of playing multiple streams of audio and video at once. One use case is playing multiple different streams of the same event. For example, a soccer game with an audio stream for the announcer and two video streams with different perspectives of the field. In this case, it’s important to synchronize playback between the streams. This way, important moments line up. Other examples of synchronized streams might include a music concert which has multiple camera angles, or a keynote speech that has both a main content stream along with a corresponding sign language stream.

Another multiview use case is playing multiple different streams of completely different events. For instance, showing streams of different events like Track and Field and Swimming during the Olympics, with some background music. In these cases, the audio and video streams do not have to be synchronized with each other.

AVFoundation and AVRouting have APIs that make it easier to build rich, multiview playback experiences. I’ll go over these APIs in this video. I’ll start with how to synchronize playback across multiple streams, then discuss how to handle routing across multiple views for AirPlay.

Finally, I’ll share how to optimize playback quality across multiple players.

When showing multiple streams that need to be coordinated, such as for a sports game, it’s critical to synchronize playback across all of the players so that important moments line up.

This means that all playback behaviors like play, pause, and seek need to be coordinated.

However, this can be a complicated process. In addition to rate changes and seeks, complex behaviors also need to be managed.

The AVPlaybackCoordinationMedium, from the AVFoundation framework, makes it easier to tightly synchronize playback across multiple players. It handles the coordination of rate changes and time jumps, as well as other complex behaviors like stalling, interruptions, and startup synchronization.

I’ll demonstrate how to use the “AV Playback Coordination Medium” to coordinate between multiple players in your app.

In the demos I’ll show throughout this video, I’ll use the example of different camera angles of a train on tracks moving through different scenes such as plants and other objects and landmarks.

This train example helps to illustrate what multiview content could look like with multiple camera angles. In the coordination demo, I’ll be adding in multiple camera angles that were filmed from around the train track just as if I were watching a sports game and wanted to add in different camera angles from a game.

This is an iPad app with several different video streams of a toy train moving around the track.

I start with a birds-eye view of the train track playing. I want to see more angles of the train, so I add in a side view of the track.

The second stream matches up with the currently playing stream. I’ll also add in two more camera angles recorded from around the track. Each additional stream will join in sync. If I pause, all players will pause in sync. Taking a closer look, I can see the train from multiple angles in all four players. In the top-left bird's-eye view, I notice the train is positioned near the top edge of the table by the plants. From the other camera angles, I can see that the train is beginning to enter the straight part of the tracks and is approaching the monkey from behind. Looking at the timestamp on each video, I can confirm that all are at the same time.

Now, I’ll play and all the players will begin playback in sync. I can watch the train in perfect coordination from the various angles.

Next, I’ll issue a seek forward by 10 seconds.

With each of the actions, the players remain in sync.

Even if I leave my iPad app and switch to a Picture-in-Picture view, the streams remain synchronized.

If I return to the app, all of the videos are still playing in sync.

This also works great across system interfaces, like with the Now Playing interface. Playback behaviors are also coordinated. I can pause and play and the players remain in sync.

Coordinating across all players creates a great user experience. In the demo, I showed an example with a train moving through landmarks. In a real-world scenario, this could be sports events, sign language streams, or other multiview use cases where you want to coordinate playback. Now that I’ve gone over what it looks like in action, I’ll show how you can build this experience in your app.

The “AV Playback Coordination Medium” API builds on the existing playback coordination architecture used for SharePlay. Each AVPlayer has an AVPlaybackCoordinator that negotiates between the playback state of the player and all other connected players.

To learn more details about the playback coordinator and how it works, check out the video “Coordinate media experiences with Group Activities”.

If there are multiple video players, the playback coordinator needs to handle remote state management and make sure that each player is in sync with the others. The “AV Playback Coordination Medium” communicates state changes across all playback coordinators. The coordination medium passes states from one coordinator to the other playback coordinators, and keeps them all in sync. This is achieved through messaging. The coordination medium passes messages between players for important state changes like playback rate, time, and other state changes. For instance, if one player pauses, it sends that message to the coordination medium. Then, the coordination medium will send this to all other connected playback coordinators.

The playback coordinators will handle and apply the playback state. This way, all players are able to stay in sync when playing coordinated multiview content. Implementing this only takes a few lines of code.

I start by setting up my AVPlayers, each with a different asset. Here, I’m using two videos. One for a close-up shot, and one for a bird’s eye shot. I’m configuring these separately with different assets. Next, I create the coordination medium. Then, I connect each player to the coordination medium by using the coordinate method. This method can throw errors, so it’s important to handle them.

Finally, I’ll do the same for my second player, the bird's-eye shot. Now, both playback coordinators are connected to the coordination medium, and the actions on each player will be synchronized. All I have to do is call an action on one player, and all of the other connected players will do the same. In this example, I only used two players, but you can connect more, as the sketch below shows.
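For instance, building on the snippet shown in the code for this section (the third player here, sideViewVideo, is hypothetical), any additional player joins the same medium in the same way, and an action issued on one player is applied to every connected player:

// A hypothetical third camera angle joins the same coordination medium.
let sideViewVideo = AVPlayer()

do {
  try sideViewVideo.playbackCoordinator.coordinate(using: coordinationMedium)
} catch {
  // Handle error
}

// Issuing an action on any one connected player keeps all of them in sync.
closeUpVideo.play()     // all connected players begin playback together
closeUpVideo.pause()    // and all pause together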

The AVPlaybackCoordinationMedium is great for coordinating multiview playback.

Next, I’ll talk about tools that can apply to any type of multiview playback, both coordinated and non-coordinated. AirPlay enables an awesome external playback experience. People can route video streams to a larger screen in their home, or route an audio stream to a HomePod. It’s important to route the right view to the right device. I’ll go over how to support AirPlay in your app with multiview experiences. Let’s see that in action! In the example I’ll show, I’m watching a bird's-eye view video along with a track close-up video.

Both videos are playing on my iPad. However, I want to AirPlay my video from my iPad to the Apple TV. The AirPlay receiver can only support a single stream, so if I route to it, I prefer the bird's-eye view video to play on the big screen so I can see all the details more clearly.

I begin playback of two videos on my iPad and route to an Apple TV. When I do so, the close-up view will continue to play on my iPad, and the bird's-eye view video will be played on the big screen since it’s my preferred player. If I want to change which video is playing on the TV, I can switch between streams by updating the preferred player to be the close-up stream, and the two videos will switch places. On this iPad app, this is done through pressing the star button, which will set the close-up video to be the preferred player. Now, I select it, and the close-up stream will play on the TV, while the bird's-eye view plays on the iPad. Additionally, I can pause and play, and if the streams are coordinated, they’ll remain in sync.

This is an example of a coordinated playback use case. However, uncoordinated multiview streams also work seamlessly with AirPlay.

The AVRoutingPlaybackArbiter, which is part of the AVRouting framework, enables you to easily integrate AirPlay support for multiview experiences. The playback arbiter ensures that multiview works smoothly with AirPlay or other external playback experiences that only support a single video or audio stream.

It manages the complexities of switching to the correct video or audio stream. The AVRoutingPlaybackArbiter is responsible for managing and applying preferences on non-mixable audio routes. These are audio routes where only a single audio stream can be played and concurrent audio playback on the receiver is not possible. The playback arbiter also handles constrained external playback video routes. These are routes where only a single video stream can be played on the receiver, such as with AirPlay video and Lightning Digital AV Adapters.

In a multiview playback case, such as with the train multiview videos, I might have a bird's-eye view video and multiple close-up shots. I want the bird's-eye view to take priority whenever I AirPlay video. First, I obtain the playback arbiter singleton. Next, set the bird's-eye view as the “preferredParticipantForExternalPlayback”, a property on the playback arbiter. Now, if I route to an Apple TV from my iPad while playing multiview content, the bird's-eye view routes its video to the Apple TV while the other videos continue to play locally on my iPad.

Similarly, if there are multiple players and the bird's-eye view should take audio priority, then first obtain the playback arbiter singleton and set the bird's-eye player as the “preferredParticipantForNonMixableAudioRoutes”. If multiview content is playing and I route audio to a HomePod from my iPad, the audio of the bird's-eye view will be played.

Next, I’ll show an example of how to use this API.

First, I set up two AVPlayers, one of the close-up shot and one of a bird's-eye view.

Then, I obtain the playback arbiter singleton from an instance of the AVRoutingPlaybackArbiter.

I want to see the bird's-eye video on the big screen whenever I route to AirPlay, so I set it as the preferred participant for external playback. And I want to hear its audio if I route to a HomePod, so I also set it as the preferred participant for non-mixable audio routes.

In this example, I’ve chosen the same player for both properties, but this can be set to be different players.

Through the AVRoutingPlaybackArbiter, ensure a seamless integration of AirPlay and other external audio and video playback experiences in your multiview app.

Next, I’ll tell you how to manage the quality of these streams.

When watching multiview content, some streams may be more important than other streams. For example, when watching a sports game in multiview one stream might be a bird's-eye view of the field. Two other streams could be of different perspectives of the field and another stream could be close-up views of the crowd.

In this example, I care more about the bird's-eye view of the field. I want to see it more clearly and have it play at a higher quality. I care less about the close-up views of the crowd, so I don’t need to see it in detail and don’t mind if it plays at a lower quality.

In a multiview scenario, different players may have different quality needs. Indicate this through setting the AVPlayer’s networkResourcePriority. I’ll discuss in detail how this works. When streaming content, each player consumes network bandwidth.

If these players were equal sized, you may want each to consume an equal amount of network bandwidth and play at the same quality. However, each player may have different network bandwidth and quality needs. To support this, set the networkResourcePriority of the AVPlayer. Each player starts with a default priority level. You can set the priority level to high or low.

A priority level of high means that the player requires a high level of network resources and streaming in a high-quality resolution is crucial.

A priority level of low means that the player requires minimal network bandwidth and streaming in high-quality resolution is not as crucial.

I’ll walk through an example of how you might achieve this with the networkResourcePriority. First, create an AVPlayer, and then set the player’s networkResourcePriority.

In the sports game example, the field bird's-eye view is most important, so I set that priority to high. The crowd close-up view is less important, so I set it to low. As a result, the field bird's-eye view will see a higher network priority while the crowd close-up view will see a lower one.

These network priorities are there to help indicate the priority of the player when the system allocates network bandwidth resources. The exact network bandwidth distribution takes a variety of other factors into consideration such as number of other players, video layer size, hardware constraints, and more.
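As a minimal sketch under the sports-game assumption above (the player names and stream URLs here are hypothetical placeholders):

// The field bird's-eye view matters most, so it gets high network priority;
// the crowd close-up can drop to a lower quality first when bandwidth is limited.
let fieldBirdsEyePlayer = AVPlayer(url: fieldStreamURL)   // fieldStreamURL: hypothetical HLS URL
let crowdCloseUpPlayer = AVPlayer(url: crowdStreamURL)    // crowdStreamURL: hypothetical HLS URL

fieldBirdsEyePlayer.networkResourcePriority = .high
crowdCloseUpPlayer.networkResourcePriority = .low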

Next, I’ll show a demo of network priorities in action.

In this example, I’ll show the train multiview example, and this can extend to the sports game example and other multiview scenarios, where different playback qualities are required.

I’m watching two different streams - a bird's-eye view of the train track and a close-up view of the train. It’s important for me to watch the train without missing a moment, so I really want to see the bird's-eye view clearly. I set the network resource priority of the bird's-eye view to high.

Both videos are currently playing in high resolution. The resolution tags are at the bottom of the videos. If I encounter poor network conditions and network bandwidth is limited, the close-up view on the right will switch down to a lower resolution first. I can see that happen now. The more important bird’s-eye view on the left will maintain a high-definition resolution.

Through setting the network resource priority of a player, you have greater control over the quality at which a stream plays.

The AVFoundation and AVRouting API that I’ve discussed all work together to enable you to build seamless multiview experiences.

Now that you’ve seen these advanced multiview features involving playback coordination, AirPlay integration, and quality optimization, build and enhance your own app with multiview.

Use the AVPlaybackCoordinationMedium to create compelling synchronized multiview experiences. Synchronize multiple camera angles from your favorite sporting event.

Explore the AVRoutingPlaybackArbiter to enhance a multiview app with AirPlay integration. Take multi-stream playback, such as ASL streams, to the big screen.

Fine-tune and optimize playback quality through network bandwidth allocation. Ensure important streams are playing in high quality.

We look forward to all the exciting multiview playback experiences you create. Thank you for watching!

Code

7:55 - Coordinate playback

import AVFoundation

var closeUpVideo = AVPlayer()
var birdsEyeVideo = AVPlayer()

let coordinationMedium = AVPlaybackCoordinationMedium()

do {
  try closeUpVideo.playbackCoordinator.coordinate(using: coordinationMedium)
} catch let error {
  // Handle error
}

do {
  try birdsEyeVideo.playbackCoordinator.coordinate(using: coordinationMedium)
} catch let error {
  // Handle error
}

13:17 - Set preferred participant

import AVFoundation
import AVRouting

var closeUpVideo = AVPlayer()
var birdsEyeVideo = AVPlayer()

let routingPlaybackArbiter = AVRoutingPlaybackArbiter.shared()

routingPlaybackArbiter.preferredParticipantForExternalPlayback = birdsEyeVideo

routingPlaybackArbiter.preferredParticipantForNonMixableAudioRoutes = birdsEyeVideo

16:15 - Set network resource priority

birdsEyeVideo.networkResourcePriority = .high
closeUpVideo.networkResourcePriority = .low

Create icons with Icon Composer

Learn how to use Icon Composer to make updated app icons for iOS, iPadOS, macOS, and watchOS. Find out how to export assets from your design tool of choice, add them to Icon Composer, apply real-time glass properties and other effects, and preview and adjust for different platforms and appearance modes.

Chapters

Resources

Related Videos

WWDC25

Transcript

Hey, welcome to Create icons with Icon Composer. I’m Lyam, a designer here on the Design Team, and in this session, I’m going to show you how you can use our new tool, Icon Composer, to help you prepare your own app icons to look and feel at home across iPhone, iPad, Mac, and Watch.

First off, if you haven’t already watched our other session, “Say hello to the new look of app icons”, it covers all our icon design language updates across Apple platforms, as well as some pretty cool additional modes on iOS and macOS that give people even more customizability at their fingertips. I highly recommend you start there, then dive into this session afterwards, to find out how you can achieve all this for yourself. All right, before I jump straight in, I'd love to talk a little about how icons have changed over time. You might remember, for many years, Mac app icons were created at all sorts of sizes so that each artwork could be optimized for every place it showed up. This was before retina screens, when it was important that elements snap to the pixel grid to maximize contrast and legibility.

Then came along iOS and watchOS, and 2x displays with double the pixel density, and then 3x, and that made for a lot of icons to create. So a few years back, with all the advancements in screen resolution and auto scaling, we added the option to deliver just one image per platform and let our system handle the rest, which really simplified things.

Roll on, 2025.

And with dark and tinted modes expanding even more this year on iOS, plus macOS also adopting these appearances and Watch getting the new look, we found ourselves at a similar turning point. So we figured, since we’re bringing consistency to our app icon language, we’d use the opportunity to make the process a lot simpler again.

Now all this can be achieved in one single file using Icon Composer.

Now, if your icon is very complex or illustrative, you might still prefer to upload individual images to Xcode. And without doing anything more on your part, they'll still get the new beautiful edge treatment on device. The technical term for this being a specular highlight. However, if your artwork is a little more translatable to the design language, like this more graphic version, take it into Icon Composer, and you'll be able to bring it to life with all the new possibilities that come with Liquid Glass.

Icon Composer pairs with your existing design tools, giving you full control over our materials. And it'll streamline making app icons across our different platforms and appearances too. It's the same tool we used this year to update all our own icons and it definitely saved us some time. From one artwork, starting today, you can produce designs for iPhone, iPad, Mac, and Watch, giving your app a consistent identity wherever it shows up. Take full advantage of all the exclusive dynamic glass properties, previewing how it’ll look in real time, and testing all six of our new appearances: Default, Dark, Clear light, Clear dark, Tinted light, and Tinted dark.

Once you’re happy with your artwork, you can even export out the images for any marketing or other needs you might have. Oh, and there’s no need to worry about creating all those different sizes anymore. We’ve designed the materials to adapt and scale to your icon.

So let’s take a look at what the new workflow looks like with Icon Composer. Start in your preferred design tool, then export out layers, bring those into Icon Composer where you can begin tuning for glass, appearance modes, and platforms. And you’re ready to save the file out and deliver to Xcode. Let’s dig into each of these a bit more. OK, first, the design process. If you’re working with flat graphics it’s best to use a tool that can draw in vectors since having the ability to export SVGs will give you the most scalability later down the line. Once you have that open, we want to set up the right canvas size. The simplest way to do this I find is by using one of our app icon templates. We’ve made these available for Figma, Sketch, Photoshop, and Illustrator, all found on the Apple Design Resources website. If designing for iPhone, iPad or Mac, these now use the same 1024px canvas, which makes things a lot simpler, and have a new grid, and rounded-rectangle shape.

And Watch is now 1088px so it overshoots our rounded rectangle and uses the same grid, which makes designs more easily translate between platforms.

Next, you’re ready to start designing your icon with layers. If you’re familiar with making app icons for tvOS or visionOS, you already have a good understanding of layering. Essentially, each layer represents a step in Z depth, where the bottom is the background and the others stack on top.

For a lot of cases, this is as simple as one foreground and one background – say Messages. In other instances, your artwork might look a little more layered – like Home.

Alongside layering by Z, splitting different colors out by layers will give you the most control later in Icon Composer. Take this Translate example. The speech bubbles use two separate layers, which is a pretty good start. If I also separate the type from the bubbles, this is going to give me even more control, so that when I go to make my dark mode variant in Icon Composer, all I have to do now is change one fill, and I’m done.

One other thing to think about when designing is what creative decisions can be made once you’re in Icon Composer. So back to translate. I actually think it could be quite cool to give it a bit of blur in the overlap, maybe even a subtle shadow to give it some lift.

But because I’m gonna make these layers out of liquid glass, these are all dynamic properties that could be added directly in Icon Composer, along with specular, opacity, and translucency.

So instead of trying to bake these into my file, it’s best for me to distill my source art down to its graphic essence, so that it’s flat, opaque, and easy to control later.

Once the artwork is in a good place, next we want to export the layers as SVGs. For every tool, this can look a bit different. For those using Illustrator, we've created a layer to SVG script that will automate this for you, which you can download. Exporting out the canvas size ensures everything drops right into position in Icon Composer.

Number them in order of Z, and it’ll automatically follow that same order. Otherwise, you can always reorder them later.

And simple background colors and gradients get added directly in Icon Composer, so they don’t need to be exported.

If I was to use any text too, since the SVG format doesn’t preserve fonts, it will need converting to outlines before exporting.

And whenever using custom gradients, raster images, or any elements or software that can’t be expressed through SVG, we export these layers as PNG, since this is a lossless format that can retain a transparent background. And one final tip to remember is that we never include the rounded rectangle or circle mask in our exports. So for this Siri example, we don’t want to be exporting it like this, since this mask is automatically applied later, ensuring the perfect crop. Ah, that’s better.

Once you’ve exported your layers, you’re ready to open up Icon Composer. Let’s take a look around an existing project.

On the left, we have our sidebar with the canvas, groups and layers. Centered, the preview panel with all the different artworks and preview controls. And right, the inspector, where we’ll find the appearance properties and document options.

When you first open Icon Composer though, it will look a little more like this, with just the canvas on show.

But you’re probably not going to want to use this exact shade of blue. So to set the background color, all you need to do is head over to the sidebar and select the canvas, head to the inspector and pick a color or gradient. I’m going to use one of these system presets, they’re ready-made light and dark backgrounds that we’ve optimized for our new materials, and watch it update in the preview panel like so.

You can start to see how each of these sections come together. Let’s take a closer look at each area, starting with the sidebar. Drag and drop your layers in and they’ll alphabetically organize here into a group.

In Icon Composer, groups control how elements stack and receive glass properties. By default, it’ll always be one, but you can go all the way up to four. We found this number provides the right bounds for how much visual complexity an icon should have. For home I’ve used all four, so that I can make each layer its own unique piece of glass.

At the bottom, we have our platforms and appearance modes, the same three appearance annotations you may already be familiar with for designing, for light, dark and tinted.

This year we renamed these to default, dark and mono, with the artwork producing all the appearances for clear and for tinted.

And all these previewable with a click of the thumbnail.

In the inspector, here we have all the controls for appearance properties and our document settings. Helpful for choosing which platforms to design for.

Let’s take a look back at the appearance controls.

When you drag a layer into Icon Composer you’ll see it will automatically get our liquid glass.

On the layer level, you can choose whether to toggle this off or on. Plus, there are a number of other useful controls you'll recognize from design tools.

Color controls are especially helpful when creating variants for dark and mono mode, and composition controls, great for reworking artwork for different platforms.

Then, if we now go to the group level, we’ll see the controls look a little different. Here you’ll find all the options for Liquid Glass. Some are set automatically, but we recommend you continue to dial these in to get the look you want.

And as you start tailoring these properties, it's important to note that some are pre-configured to apply per appearance, like opacity, blend mode, and fill, and others to all appearances, since those attributes are more commonly consistent across modes. If you're looking for even more control, click the plus that appears on hover and you can create an individual variant of a property.

Here are some tips when using these properties, some things I usually like to keep an eye out for.

Take the day on the Calendar icon. It gets a little too complex and pillowy in the narrow areas for my liking. We can either solve this by switching off the specular on the group or switching off Liquid Glass entirely on the layer.

Shadows. Neutral shadows are the preset shadows. They're nuanced, versatile, and look great on any background. When using color against white though, this is a great opportunity to test out chromatic shadows. The color from the artwork spills onto the background, creating a nice look that emphasizes the lighting and physicality of the material.

And we can still keep our neutral shadows for dark and mono by creating a variant.

Speaking of dark appearance, there’s also some common considerations here.

For example, fills. I always use these to help optimize artworks. So take Dictionary. If I was to do nothing for dark, it would look like this, which is a great example because it highlights two things. One, the maroon bookmark gets lost against black. And two, I'm kind of missing that distinctive coral red now.

So we should change the fill. This logic can also apply to other color related properties like opacity and blend mode.

Say I’ve imported a PNG though, and can’t use a fill. Well, the same principle can be achieved by creating another image in our design software and importing it as a variant.

Legibility is also key for mono appearances. Setting at least one element of your icon to be white, usually the most prominent or recognizable part, makes sure it shows up strong, and the other colors can be mapped to tones of gray.

Icon Composer will do an automatic conversion for this, but it’s important to tune it to get the best contrast. And when designing between rounded rectangle and circle platforms, for a lot of cases, you won’t have to do anything since the new watch canvas is optically larger and built on the same grid. One thing to look out for though is the composition.

Consider optical adjustments for the circle shape. And if you have any elements touching the edge of the canvas, scale them up so they touch the edges again for Watch.

Alternatively, if you’re familiar with the concept of designing with bleed, you can integrate this into your source art. With all that said, let’s take a final look at the preview panel.

The controls in the top right allow you to change the background behind your icon. Great for seeing it in a different context, and for trying out some wallpapers or images behind the new modes to test legibility.

You can also overlay our icon grids, see how light moves, as well as look at it up close and at small sizes.

Once you’re done in Icon Composer, to deliver, all you have to do is save the .icon file out, drag it into Xcode, and choose your icon in the Project Editor. When you build and run your app, you’ll be able to see how it adjusts to platforms and appearances.

To wrap up, it used to be that the design process started and finished in traditional design tools. With Icon Composer, we’re providing an extra touchpoint where people can further bring designs to life in new, dynamic ways across our products, spend less time generating assets, and less time in Photoshop trying to recreate a load of glass effects. We've all been there. Icon Design is moving from a past of simply static images to a future of expressive, multi-layered artworks that respond to user input and adapt between appearances. They've become a much richer and more integrated experience on device.

If you want to take advantage of this, Icon Composer is available in beta and will continue to introduce new features and improve based on your feedback. We encourage you to download, use Feedback Assistant to request enhancements, and explore the resources we’ve made available for this new tool. Thank you for watching.

Customize your app for Assistive Access

Assistive Access is a distinctive, focused iOS experience that makes it easier for people with cognitive disabilities to use iPhone and iPad independently. In iOS and iPadOS 26, you can customize your app when it's running in Assistive Access to give people greater ease and independence. Learn how to tailor your app using the AssistiveAccess SwiftUI scene type, and explore the key design principles that can help you create a high-quality Assistive Access experience for everyone.

Chapters

Resources

Related Videos

WWDC25

WWDC23

Transcript

Hi everyone. My name is Anne, and I’m a member of the Accessibility team at Apple. Today, I’m excited to talk all about Assistive Access, Apple’s streamlined iOS and iPadOS experience designed for people with cognitive disabilities. Assistive Access reimagines how people interact with their devices by distilling apps and controls to their essence. This clarified experience ensures everyone can navigate their devices with ease and independence. In iOS and iPadOS 26, your app has the opportunity to seamlessly integrate with this experience through the Assistive Access scene type.

Today you’ll learn how to build a great experience for your app in Assistive Access.

I’ll share how to create an Assistive Access scene with SwiftUI as well as the principles to keep in mind as you tailor your app further.

First, I’ll get started with a refresher on Assistive Access.

Apple introduced Assistive Access in iOS and iPadOS 17.

This streamlined system experience is designed specifically for people with cognitive disabilities. It seeks to reduce cognitive load by providing apps and interfaces with streamlined interactions, clear pathways to success, and consistent design practices.

If you’re new to Assistive Access, check out the session Meet Assistive Access to learn more.

Assistive Access supports several distilled built-in apps, Camera and Messages, shown here.

These apps share a common style: they have large controls, streamlined interfaces, and visual alternatives to text.

In establishing a clear and focused design language in Assistive Access, these apps reduce the cognitive strain that is required when interacting with new and varied UI. The design is consistent, so the expectations are consistent.

By default, apps not optimized for Assistive Access are displayed in a reduced frame. This is to make room for the back button, which is always shown along the bottom of the device. The back button navigates from the app to the Home Screen.

For those of you with apps already designed for people with cognitive disabilities, you may want to bring your app as is to Assistive Access. A great example of this is if you support an Augmentative and Alternative Communication app, also known as an AAC app. If, like an AAC app, your app is designed for cognitive disabilities and you want to bring the same layout and the same experience to Assistive Access, then your app is ready to take advantage of full screen. To display your app in full screen, set the UISupportsFullScreenInAssistiveAccess key to true in your app’s Info.plist. Your app’s appearance will be the same as when Assistive Access is turned off, with the exception that it will fill the full dimensions of the screen rather than displaying in a reduced frame. If you need to make runtime variations, use the SwiftUI and UIKit support to detect if an Assistive Access session is active.

Apps that adopt full screen in Assistive Access are displayed under the Optimized Apps list in Accessibility Settings.

If your app isn’t designed for cognitive disabilities, get started with support in iOS and iPadOS 26 to create an Assistive Access scene. With this scene type, you’ll provide a tailored experience, where controls are automatically displayed in the familiar and prominent style of built-in apps like Camera and Messages. This differs from full screen support, where your app is unchanged aside from the screen dimensions.

If you’re unsure which path to take, opt for the scene to take advantage of all that iOS and iPadOS 26 have to offer for Assistive Access. I’ll demonstrate how this scene type elevates your app in Assistive Access. Assistive Access apps share a design that focuses on clarity. And this clarity starts with the unique way native controls are displayed when Assistive Access is turned on.

With the scene type, your app is shown in the larger, clearer control style that matches the existing experience. This familiar design helps people with cognitive disabilities get the most out of your app. To set up your app for Assistive Access, first set the UISupportsAssistiveAccess Info.plist key to true in your app bundle.

This ensures your app is listed under Optimized Apps in Assistive Access Settings.

Your app will also launch in full screen, instead of the default reduced frame.

Next, adopt the Assistive Access scene and create your streamlined app experience. I’ll put this into practice.

I’ll update an app that I’ve been working on. This drawing app lets me sketch on a canvas, sort my drawings into folders and favorites, and use a range of editing tools to create fun and engaging illustrations.

My app is using the SwiftUI lifecycle. I have 1 scene, which is a window group that declares my main content view.

After setting UISupportsAssistiveAccess to true in my app’s Info.plist, I’ll add the AssistiveAccess scene to my app. Within the Assistive Access scene, I create a new content view with the custom hierarchy I’m designing for Assistive Access. This new content view will host my streamlined, lighter-weight app experience.

When the app is launched in Assistive Access, this scene is created and attached.

When a scene is active, an app’s native SwiftUI controls are displayed in the distinctive Assistive Access design. Buttons, lists, and navigation titles are all shown in a more prominent style with no additional work. Controls also automatically adhere to the grid or row screen layout, which is configured in Assistive Access settings.

To test how your app is laid out when the scene is active, pass the assistiveAccess trait to the SwiftUI Preview macro.

For UIKit apps, you can achieve something similar with support in iOS and iPadOS 26 to define and activate SwiftUI scenes in UIKit-lifecycle apps. Declare the AssistiveAccess SwiftUI scene in the static rootScene property of your scene delegate class. To activate the scene, return a scene configuration from your app delegate whose delegate class is the scene delegate class configured to host the Assistive Access scene.

For more information on scene bridging, check out this year’s session on What’s New in SwiftUI. Assistive Access is built around simplified and easy-to-use interactions. Now that your app is set up for Assistive Access, bring this focus to your app content.

I’ll go over a few guiding principles to help focus your app’s content for Assistive Access. When thinking about this experience, distill your app to the essentials. Consider how to build understandable, supportive pathways that are safe from disruptive changes and convey information in multiple ways.

Start with identifying the core functionality of your app. Ask yourself what one or two features are most important to support in your app, and bring these to Assistive Access. It may seem counterintuitive to remove functionality, but fewer options reduce distractions and lighten the overall cognitive load.

When in doubt, keep the Assistive Access experience streamlined and focused. For my app, I’ll bring two features to Assistive Access: the ability to sketch on a canvas, and the ability to view those drawings. These will be the ONLY UI elements in my app’s root view.

I’ll save the other functionality that my app provides, like marking favorites or editing sketches, for outside of Assistive Access to keep the experience focused.

Here is the redesigned root view of my app. I’ve implemented the two features I identified: Draw and Gallery. These two features are represented as a list of navigation links. Since the Assistive Access scene is active, the links automatically conform to the preferred grid or row layout style.
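
As a rough sketch of what that root view might look like, the two pathways can be plain navigation links in a list. The view and symbol names here (AssistiveAccessRootView, GalleryView, photo.on.rectangle) are my placeholders rather than the session’s actual code.

import SwiftUI

// A minimal sketch of the streamlined Assistive Access root view described above.
struct AssistiveAccessRootView: View {
  var body: some View {
    NavigationStack {
      List {
        NavigationLink(destination: ColorSelectionView()) {
          Label("Draw", systemImage: "hand.draw.fill")
        }
        NavigationLink(destination: GalleryView()) {
          Label("Gallery", systemImage: "photo.on.rectangle")
        }
      }
    }
  }
}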

When I navigate into a view, the Assistive Access back button traverses back up the app’s navigation stack with no additional work.

Already, my app is more focused. The only UI elements onscreen are the app’s essential features, and the UI is displayed in a familiar Assistive Access style. I also present two clear pathways: draw, or gallery. Fewer options correspond to streamlined choices, and streamlined choices increase the likelihood of success when someone is exploring the app. After deciding how best to distill your app experience, you may still want to add some level of customization. I’ll go through how to add to your features in Assistive Access. When approaching Assistive Access, remember the aim is to reduce cognitive strain.

While rich features and customization add to your app’s level of completeness, a large amount of content on the screen may prove challenging for people with cognitive disabilities. Reduce the number of options shown at any given time to focus decisions. This applies both to the number of onscreen elements, as well as the purpose of the view.

Reducing options removes distractions when navigating in your app. Too many options may prove overwhelming and add to cognitive load. Next, make sure presented controls are clearly visible. Avoid hidden gestures or nested UI. These types of interactions may be less discoverable to a person with cognitive disabilities.

Instead, use prominent controls that are clearly visible.

People will navigate your app at different speeds. To make sure everyone can complete tasks at their own pace, avoid timed interactions.

Redesign experiences where UI or views disappear or change state automatically after a timeout. Give people the time they need to be successful in your app.

In Assistive Access, design experiences that guide a person through a selection process. Build an incremental, step by step flow, rather than presenting multiple options at once. Lead your audience in a reasonable number of steps, avoiding lengthy setup processes that detract from the experience.

Note that to add customization that builds slowly and deliberately, you may need to reorder where certain decisions are made. The aim is to ensure decisions are processed separately, which in turn reduces cognitive strain and leads to a more pleasant app experience. There are some important actions that may be difficult to recover from, like deleting a photo. Consider removing this functionality entirely, or if you do plan to implement more permanent actions like deletion, ask twice for confirmation where appropriate. The goal is to make sure that people don’t end up in a situation that they didn’t intend.
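
If you do keep a destructive action, a simple confirmation step provides that safety net. Here is a minimal sketch with hypothetical names; it is not part of the session’s sample app.

import SwiftUI

// Ask for confirmation before performing a destructive action, so accidental taps are recoverable.
struct DeleteDrawingButton: View {
  let onDelete: () -> Void
  @State private var isConfirming = false

  var body: some View {
    Button("Delete Drawing", role: .destructive) {
      isConfirming = true
    }
    .confirmationDialog("Delete this drawing?", isPresented: $isConfirming) {
      Button("Delete", role: .destructive, action: onDelete)
      Button("Keep Drawing", role: .cancel) { }
    }
  }
}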

You’ve learned best practices to build great interactions in Assistive Access, such as reducing the presented options, supporting prominent UI, and avoiding timed interactions. You’ve also learned to build incremental, guided flows and to reconsider how and where destructive actions are implemented. I’ll apply these same design guidelines to my app.

I’m adding support to draw with color in my app’s Assistive Access experience. Outside of Assistive Access, a stroke color is selected with the color picker.

Some Assistive Access apps may benefit from color picker options like opacity, but for my app, I only need color. This is one experience I’ll tailor.

In the spirit of reducing options, I’ve designed a dedicated color selection view for my app in Assistive Access. It distills the color picker to a single decision: the stroke color. It also reduces the number of colors presented to a select few. To draw with a color, simply tap it.

I’ve added this view in between the option to Draw and entering the Canvas.

This provides a step by step approach that guarantees everyone using the app arrives at the Canvas with a color selected. Note this ordering differs from the color picker implementation, where color selection happens as an option within the canvas. Instead, I isolate the decision to pick a color to a single view that is always presented on the way to the canvas. In my own app, I’ve also removed functionality that could be confusing or difficult to undo. In the canvas view, my app doesn’t support the option to undo a stroke.

In the gallery, there is no option to delete a drawing.

These decisions were made to create a safe and supportive environment that removes the risk of someone performing an action that they didn’t intend to do in the app. A key aspect of Assistive Access is intuitive and understandable design. For some people with cognitive disabilities, images and icons are more understandable than text alone. This means information should be presented in multiple ways. Rather than relying on text alone, provide visual alternatives to convey meaning when your app is running in Assistive Access.

When implementing controls like buttons and navigation links, include both an icon and a label.

In Assistive Access, visual design applies to navigation titles too as icons are supported in the navigation bar.

Implement this in your own app with the assistiveAccessNavigationIcon modifiers. Pass an image or the name of a system image to display alongside a navigation title when an Assistive Access scene is active. Make sure all of your navigation titles have an accompanying icon.

Now that you’ve learned how to design for people with cognitive disabilities, adopt the SwiftUI scene and bring your app to Assistive Access. Go through the design exercises discussed in this session with your own app, keeping in mind the goal to refine and focus the experience you support in Assistive Access. And remember, the best source of feedback is from your audience: find opportunities to test within the Assistive Access community.

Thanks for watching, and thanks for making sure your app is designed for everyone.

Code

5:21 - Create a scene for Assistive Access

// Create a scene for Assistive Access

import SwiftUI
import SwiftData

@main
struct WWDCDrawApp: App {
  var body: some Scene {
    WindowGroup {
      ContentView()
        .modelContainer(for: [DrawingModel.self])
    }
    AssistiveAccess {
      AssistiveAccessContentView()
          .modelContainer(for: [DrawingModel.self])
    }
  }
}

6:25 - Display an Assistive Access preview

// Display an Assistive Access preview

import SwiftUI

struct AssistiveAccessContentView: View {
  @Environment(\.modelContext) var context
  var body: some View {
    VStack {
      Image(systemName: "globe")
        .imageScale(.large)
        .foregroundStyle(.tint)
      Text("Hello, world!")
    }
    .padding()
  }
}

#Preview(traits: .assistiveAccess) {
    AssistiveAccessContentView()
}

6:35 - Declare a SwiftUI scene with UIKit

// Declare a SwiftUI scene with UIKit

import UIKit
import SwiftUI

class AssistiveAccessSceneDelegate: UIHostingSceneDelegate {

  static var rootScene: some Scene {
    AssistiveAccess {
      AssistiveAccessContentView()
    }
  }
    
    /* ... */
}

6:55 - Activate a SwiftUI scene with UIKit

// Activate a SwiftUI scene with UIKit

import UIKit

@main
class AppDelegate: UIApplicationDelegate {
  func application(_ application: UIApplication, configurationForConnecting connectingSceneSession: UISceneSession, options: UIScene.ConnectionOptions) -> UISceneConfiguration {
    let role = connectingSceneSession.role
    let sceneConfiguration = UISceneConfiguration(name: nil, sessionRole: role)
    if role == .windowAssistiveAccessApplication {
      sceneConfiguration.delegateClass = AssistiveAccessSceneDelegate.self
    }
    return sceneConfiguration
  }
}

14:36 - Display an icon alongside a navigation title

// Display an icon alongside a navigation title

import SwiftUI

struct ColorSelectionView: View {
  var body: some View {
    Group {
      List {
        ForEach(ColorMode.allCases) { color in
          NavigationLink(destination: DrawingView(color: color)) {
            ColorThumbnail(color: color)
          }
        }
      }
      .navigationTitle("Draw")
      .assistiveAccessNavigationIcon(systemImage: "hand.draw.fill")
    }
  }
}

Deep dive into the Foundation Models framework

Level up with the Foundation Models framework. Learn how guided generation works under the hood, and use guides, regexes, and generation schemas to get custom structured responses. We'll show you how to use tool calling to let the model autonomously access external information and perform actions, for a personalized experience. To get the most out of this video, we recommend first watching “Meet the Foundation Models framework”.

Chapters

Resources

Related Videos

WWDC25

Transcript

Hi, I’m Louis. Today we’ll look at getting the most out of the Foundation Models framework.

As you may know, the Foundation Models framework gives you direct access to an on-device Large Language Model, with a convenient Swift API. It’s available on macOS, iPadOS, iOS, and visionOS. And because it runs on-device, using it in your project is just a simple import away. In this video, we will look at how sessions work with Foundation Models. How to use Generable to get structured output. How to get structured output with dynamic schemas defined at runtime, and using tool calling to let the model call into your custom functions.

Let’s start simple, by generating text with a session.

Now, I’ve been working on this pixel art game about a coffee shop, and I think it could be really fun to use Foundation Models to generate game dialog and other content to make it feel more alive! We can prompt the model to respond to a player’s question, so our barista gives a unique dialog.

To do this, we’ll create a LanguageModelSession with custom instructions. This lets us tell the model what its purpose is for this session, and for the prompt we’ll take the user’s input. And that’s really all it takes for a pretty fun new game element. Let’s ask the Barista “How long have you worked here?”, and let it respond to our question.

That was generated entirely on-device. Pretty amazing. But how does this actually work? Let’s get a better sense of how Foundation Models generates text, and what to look out for. When you call respond(to:) on a session, it first takes your session’s instructions, and the prompt, in this case the user’s input, and it turns that text into tokens. Tokens are small substrings, sometimes a word but typically just a few characters. A large language model takes a sequence of tokens as input, and it then generates a new sequence of tokens as output. You don’t have to worry about the exact tokens that Foundation Models operates with, the API nicely abstracts that away for you. But it is important to understand that tokens are not free. Each token in your instructions and prompt adds extra latency. Before the model can start producing response tokens, it first needs to process all the input tokens. And generating tokens also has a computational cost, which is why longer outputs take longer to generate.

A LanguageModelSession is stateful. Each respond(to:) call is recorded in the transcript.

The transcript includes all prompts and responses for a given session.

This can be useful for debugging, or even showing it in your UI.

But a session has a limit for how large it can grow.

If you’re making a lot of requests, or if you’re giving a large prompt or getting large outputs, you can hit the context limit.

If your session exceeds the available context size, it will throw an error, which you should be prepared to catch.

Back in our game, when we’re talking with a character and hit an error, the conversation just ends, which is unfortunate, I was just getting to know this character! Luckily there are ways to recover from this error.

You can catch the exceededContextWindowSize error.

And when you do, you can start a brand new session, without any history. But in my game that would mean the character suddenly forgets the whole conversation.

You can also choose some of the transcript from your current session to carry over into the new session.

You can take the entries from a session’s transcript, and condense it into a new array of entries.

So for our game dialog, we could take the first entry of the session’s transcript, which is the instructions. As well as the last entry, which is the last successful response.

And when we pass that into a new session, our character is good to chat with for another while.

But keep in mind, the session’s transcript includes the initial instructions as the first entry. When carrying over a transcript for our game character, we definitely want to include those instructions.

Including just a few relevant pieces from the transcript can be a simple, and effective, solution. But sometimes it’s not that simple.

Let’s imagine a transcript with more entries.

You definitely always want to start by carrying over the instructions. But a lot of entries in the transcript might be relevant, so for this use case you could consider summarizing the transcript.

You could do this with some external library, or perhaps even summarize parts of the transcript with Foundation Models itself.
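
As a rough sketch of that second idea, you could run the older entries through a separate session and seed a fresh one with the summary. The helper name and instructions text here are illustrative assumptions, not code from the session.

import FoundationModels

// Condense a long transcript by summarizing it with another on-device session.
func condensedSession(from previous: LanguageModelSession) async throws -> LanguageModelSession {
  let summarizer = LanguageModelSession(
    instructions: "Summarize the following conversation in a few sentences."
  )
  // Flatten the transcript entries into plain text for the summary prompt.
  let history = previous.transcript.entries
    .map { String(describing: $0) }
    .joined(separator: "\n")
  let summary = try await summarizer.respond(to: history).content

  return LanguageModelSession(instructions: """
    You are a friendly barista in a world full of pixels.
    Summary of the conversation so far: \(summary)
    """)
}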

So that’s what you can do with the transcript of a session.

Now let’s take a brief look at how the responses are actually generated.

In our game, when you walk up to the barista, the player can ask any question.

But if you start two new games, and ask the exact same question in each, you will probably get different output. So how does that work? Well that’s where sampling comes in.

When the model is generating its output, it does so one token at a time. And it does this by creating a distribution for the likelihood of any given token. By default, Foundation Models will pick tokens within some probability range. Sometimes it might start by saying “Ah”, and other times it might pick “Well” for the first token. This happens for every token that’s generated. Picking a token is what we call sampling. And the default behavior is random sampling. Getting varied output is great for use cases like a game. But sometimes you might want deterministic output, like when you’re writing a demo that should be repeatable. The GenerationOptions API lets you control the sampling method. You can set it to greedy to get deterministic output. And when that’s set, you will get the same output for the same prompt, assuming your session is also in the same state. Although note, this only holds true for a given version of the on-device model. When the model is updated as part of an OS update, your prompt can definitely give different output, even when using greedy sampling.

You can also play with the temperature for the random sampling. For example, setting the temperature to 0.5 to get output that only varies a little. Or setting it to a higher value to get wildly different output for the same prompt.

Also, keep in mind, when taking user input in your prompt, the language might not be supported.

There is the dedicated unsupportedLanguageOrLocale error that you can catch for this case.

This can be a good way to show a custom message in your UI.

And there’s also an API to check whether the model supports a certain language. For example, to check if the user’s current language is supported, and to show a disclaimer when it’s not. So that’s an overview of sessions. You can prompt it, which will store the history in the transcript. And you can optionally set the sampling parameter, to control the randomness of the session’s output. But let’s get fancier! When the player walks around, we can generate NPCs, Non-Playable Characters, again using Foundation Models. However, this time, we want more complicated output. Instead of just plain text, we’d like a name and a coffee order from the NPC. Generable can help us here.

It can be a challenge to get structured output from a Large Language Model. You could prompt it with the specific fields you expect, and have some parsing code to extract that. But this is hard to maintain, and very fragile, it might not always give the valid keys, which would make the whole method fail.

Luckily, Foundation Models has a much better API, called Generable.

On your struct, you can apply the @Generable macro. So, what is Generable and is that even a word? Well, yes, it is.

Generable is an easy way to let the model generate structured data, using Swift types. The macro generates a schema at compile time, which the model can use to produce the expected structure.

The macro also generates an initializer, which is automatically called for you when making a request to a session.

So then we can generate instances of our struct. Like before, we’ll call the respond method on our session. But this time pass the generating argument telling the model which type to generate.

Foundation Models will even automatically include details about your Generable type in the prompt, in a specific format that the model has been trained on. You don’t have to tell it about what fields your Generable type has. In our game, we’ll now get some great generated NPC encounters! Generable is actually more powerful than it might seem. At a low level, this uses constrained decoding, which is a technique to let the model generate text that follows a specific schema.

Remember, that schema that the macro generates.

As we saw before, an LLM generates tokens, which are later transformed into text. And with Generable, that text is even automatically parsed for you in a type-safe way. The tokens are generated in a loop, often referred to as the decoding loop.

Without constrained decoding, the model might hallucinate some invalid field name.

Like `firstName` instead of `name`, which would then fail to be parsed into the NPC type.

But with constrained decoding, the model is prevented from making structural mistakes like this. For every token that’s generated, there’s a distribution of all the tokens in the model’s vocabulary.

And constrained decoding works by masking out the tokens that are not valid. So instead of just picking any token, the model is only allowed to pick valid tokens according to the schema.

And that’s all without needing to worry about manually parsing the model’s output. Which means you can spend your time on what truly matters, like talking to virtual guests in your coffee shop! Generable is truly the best way to get output from the on-device LLM. And it can do so much more. Not only can you use it on structs, but also on enums! So let’s use that to make our encounters more dynamic! Here, I’ve added an Encounter enum, with two cases. The enum can even contain associated values in its cases, so let’s use that to either generate a coffee order, or, to have someone that wants to speak to the manager.

Let’s check out what we encounter in our game now! Wow, someone really needs a coffee.

Clearly, not every guest is as easy to deal with, so let’s level this up by adding levels to our NPCs.

Generable supports most common Swift types out of the box, including Int. So let’s add a level property. But we don’t want to generate any integer. If we want the level to be in a specific range, we can specify this using a Guide. We can use the Guide macro on our property, and pass a range.

Again, the model will use constrained decoding, to guarantee a value in this range.

While we’re at it, let’s also add an array of attributes to our NPC.

We can again use a guide, this time to specify we want exactly three attributes for this array in our NPC. Keep in mind, the properties of your Generable type are generated in the order they are declared in the source code. Here, name will be generated first, followed by the level, then the attributes, and encounter last.

This order can be important, if you’re expecting the value of a property to be influenced by another property.

And you can even stream property-by-property, if you don’t want to wait until the full output is generated. The game is pretty fun now! Almost ready to share with my friends. But I notice the names of the NPCs aren’t exactly what I had in mind. I would prefer to have a first and last name.
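
There isn’t a streaming snippet in the code section below, so here is a minimal, hedged sketch of what that might look like. It assumes the session’s streamResponse(generating:) API with the same trailing-closure prompt as respond(generating:); treat the exact element type you receive on each iteration as an assumption.

// Streaming sketch (assumed API shape): handle each partially generated NPC as it arrives.
let stream = session.streamResponse(generating: NPC.self) {
  "Generate a character that orders a coffee."
}
for try await partialNPC in stream {
  // Earlier properties may already be filled in while later ones are still pending.
  print(partialNPC)
}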

We can use a guide for this, but this time just provide a natural language description.

We can say our name should be a “full name”.

And this is effectively another way of prompting. Instead of having to describe different properties in your prompt, you can do it directly in your Generable type. And it gives the model a stronger relation for what these descriptions are tied to.

If we walk around in our game now, we’ll see these new names in action.

Here’s an overview of all the guides you can apply to different types.

With common numerical types, like int, you can specify the minimum, maximum or a range. And with array, you can control the count, or specify guides on the array’s element type.

For String, you can let the model pick from an array with anyOf, or even constrain to a regex pattern.
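
For example, a string property can be limited to a fixed set of values with anyOf. The MenuItem type below is a hypothetical illustration, not from the session.

@Generable
struct MenuItem {
  // Constrain the drink to a fixed menu.
  @Guide(.anyOf(["Latte", "Espresso", "Mocha"]))
  let drink: String
  // Keep the quantity in a sensible range.
  @Guide(.range(1...3))
  let quantity: Int
}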

A regex pattern guide is especially powerful. You may be familiar with using a regex for matching against text. But with Foundation Models, you can use a regex pattern to define the structure of a string to generate. For example, you can constrain the name to a set of prefixes.

And you can even use the regex builder syntax! If this renews your excitement in regex, make sure to watch the timeless classic “Meet Swift Regex” from a few years ago.

To recap, Generable is a macro that you can apply to structs and enums, and it gives you a reliable way to get structured output from the model. You don’t need to worry about any of the parsing, and to get even more specific output, you can apply guides to your properties.

So Generable is great when you know the structure at compile time.

The macro generates the schema for you, and you get an instance of your type as output. But sometimes you only know about a structure at runtime. That’s where dynamic schemas can help.

I’m adding a level creator to my game, where players can dynamically define entities to encounter while walking around in the game. For example, a player could create a riddle structure. Where a riddle has a question, and multiple choice answers. If we knew this structure at compile time, we could simply define a Generable struct for it. But our level creator allows for creating any structure the player can think of.

We can use DynamicGenerationSchema to create a schema at runtime.

Just like a compile-time defined struct, a dynamic schema has a list of properties. We can add a level creator, that can take a player’s input.

Each property has a name and its own schema, which defines its type. You can use the schema for any Generable type, including built-in types, such as String.

A dynamic schema can contain an array, where you then specify a schema for the element of the array. And importantly, a dynamic schema can have references to other dynamic schemas.

So here, our array can reference a custom schema that is also defined at runtime.

From the user’s input, we can create a riddle schema, with two properties.

The first is the question, which is a string property. And secondly, an array property, of a custom type called Answer.

And we'll then create the answer. This has a string and boolean property.

Note that the riddle’s answers property refers to the answer schema by its name.

Then we can create the DynamicGenerationSchema instances. Each dynamic schema is independent. Meaning the riddle dynamic schema doesn't actually contain the answer’s dynamic schema. Before we can do inference, we first have to convert our dynamic schemas into a validated schema. This can throw errors if there are inconsistencies in the dynamic schemas, such as type references that don’t exist.

And once we have a validated schema, we can prompt a session as usual. But this time, the output type is a GeneratedContent instance. Which holds the dynamic values.

You can query this with the property names from your dynamic schemas. Again, Foundation Models will use guided generation to make sure the output matches your schema. It will never make up an unexpected field! So even though it’s dynamic, you still don’t have to worry about manually parsing the output.

So now when the player encounters an NPC, the model can generate this dynamic content, which we’ll show in a dynamic UI. Let’s see what we run into. I’m dark or light, bitter or sweet, I wake you up and bring the heat, what am I? Coffee or hot chocolate. I think the answer is coffee.

That's correct! I think my players will have a lot of fun creating all sorts of fun levels.

To recap, with the Generable macro, we can easily generate structured output from a Swift type that’s defined at compile time.

And under the hood, Foundation Models takes care of the schema, and of converting the GeneratedContent into an instance of your own type. Dynamic schemas work very similarly, but give you much more control. You control the schema entirely at runtime, and get direct access to the GeneratedContent. Next, let’s take a look at tool calling, which can let the model call your own functions. I’m thinking of creating a DLC, downloadable content, to make my game more personal. Using tool calling, I can let the model autonomously fetch information. I’m thinking that integrating the player’s contacts and calendar could be really fun.

I wouldn’t normally do that with a server-based model, my players wouldn’t appreciate it if the game uploaded such personal data. But since it’s all on-device with Foundation Models, we can do this while preserving privacy.

Defining a tool is very easy, with the Tool protocol. You start by giving it a name, and a description. This is what will be put in the prompt, automatically by the API, to let the model decide when and how often to call your tool.

It’s best to make your tool name short, but still readable as English text. Avoid abbreviations, and don’t make your description too long, or explain any of the implementation. Because remember, these strings are put verbatim in your prompt. So longer strings mean more tokens, which can increase the latency. Instead, consider using a verb in the name, such as findContact. And your description should be about one sentence. As always, it’s important to try different variations to see what works best for your specific tool.

Next, we can define the input for our tool. I want the tool to get contacts from a certain age generation, like millennials. The model will be able to pick a funny case based on the game state, and I can add the Arguments struct, and make it Generable.

When the model decides to call this tool, it will generate the input arguments. By using Generable, this guarantees your tool always gets valid input arguments. So it won’t make up a different generation, like gen alpha, which we don’t support in our game.

Then I can implement the call function. The model will call this function when it decides to invoke the tool.

In this example, we’ll then call out to the Contacts API. And return a contact’s name for that query.

To use our tool, we’ll pass it in the session initializer. The model will then call our tool when it wants that extra piece of information.

This is more powerful than just getting the contact ourselves, because the model will only call the tool when it needs it for a certain NPC, and it can pick fun input arguments based on the game state. Like the age generation for the NPC.

Keep in mind, this is using the regular Contacts API, which you might be familiar with. When our tool is first invoked, it will ask the player for the usual permission. Even if the player doesn’t want to give access to their contacts, Foundation Models can still generate content like before, but if they do give access, we make it more personal.
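
If you want to handle that permission flow explicitly, a small check up front can decide whether to personalize at all. This is a hedged sketch using the standard Contacts authorization APIs; the helper name is mine, not from the session.

import Contacts

// Decide whether the game can personalize NPCs with real contacts.
func canPersonalizeNPCs() async -> Bool {
  switch CNContactStore.authorizationStatus(for: .contacts) {
  case .authorized:
    return true
  case .notDetermined:
    // Triggers the system permission prompt the first time.
    return (try? await CNContactStore().requestAccess(for: .contacts)) ?? false
  default:
    // Denied or restricted: fall back to fully generated NPCs.
    return false
  }
}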

Let’s walk around a bit in our game until we encounter another NPC. And this time, I’ll get a name from my contacts! Oh hi there Naomy! Let’s check out what she has to say. I didn’t know you liked coffee.

Note that LanguageModelSession takes an instance of a tool. This means you control the lifecycle of the tool. The instance of this tool stays the same for the whole session.

Now, in this example, because we’re just getting a random character with our FindContactTool, it’s possible we’ll get the same contact sometimes. In our game, there are multiple Naomys now. And that’s not right, there can only be the one.

To fix this, we can keep track of the contacts the game has already used. We can add state to our FindContactTool. To do this, we will first convert our FindContactTool to be a class. So it can mutate its state from the call method.

Then we can keep track of the picked contacts, and in our call method we don’t pick the same one again.

The NPC names are now based on my contacts! But talking to them doesn’t feel right yet. Let’s round this off with another tool, this time for accessing my calendar.

For this tool, we’ll pass in the contact name from a dialog that’s going on in our game. And when the model calls this tool, we’ll let it generate a day, month and a year for which to fetch events with this contact. And we’ll pass this tool in the session for the NPC dialog.
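
The call body for this tool is elided in the code section below. As a rough idea of what it might do (my assumption, not Apple’s implementation), it could build a one-day window from the generated date and return matching event titles with EventKit.

import EventKit

// Hypothetical helper: fetch event titles on a given day (assumes calendar access was already granted).
func eventTitles(day: Int, month: Int, year: Int, store: EKEventStore) -> [String] {
  let calendar = Calendar.current
  guard let start = calendar.date(from: DateComponents(year: year, month: month, day: day)),
        let end = calendar.date(byAdding: .day, value: 1, to: start) else {
    return []
  }
  let predicate = store.predicateForEvents(withStart: start, end: end, calendars: nil)
  return store.events(matching: predicate).compactMap { $0.title }
}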

So now, if we ask my friend Naomy’s NPC "What’s going on?", she can reply with real events we have planned together.

Wow, it's like talking to the real Naomy now.

Let’s take a closer look at how tool calling works.

We start by passing the tool at the start of the session, along with instructions. And for this example, we include information like today’s date.

Then, when the user prompts the session, the model can analyze the text. In this example, the model understands that the prompt is asking for events, so calling the calendar tool makes sense.

To call the tool, the model first generates the input arguments. In this case the model needs to generate the date to get events for. The model can relate information from the instructions and prompt, and understand how to fill in the tool arguments based on that.

So in this example it can infer what tomorrow means based on today’s date in the instructions. Once the input for your tool is generated, your call method is invoked.

This is your time to shine, your tool can do anything it wants. But note, the session waits for your tool to return, before it can generate any further output.

The output of your tool is then put in the transcript, just like output from the model. And based on your tool’s output, the model can generate a response to the prompt.

Note that a tool can be called multiple times for a single request.

And when that happens, your tool gets called in parallel. So keep that in mind when accessing data from your tool’s call method.

Alright, that was pretty fun! Our game now randomly generates content, based on my personal contacts and calendar. All without my data ever leaving my device. To recap, tool calling can let the model call your code to access external data during a request. This can be private information, like Contacts, or even external data from sources on the web. Keep in mind that a tool can be invoked multiple times, within a given request. The model determines this based on its context.

Tools can also be called in parallel, and they can store state.
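
For the web-backed case, a tool looks the same as the contact and calendar ones; only the call body changes. Here is a hedged sketch with a hypothetical tool name and placeholder endpoint, just to show the shape.

import Foundation
import FoundationModels

// Hypothetical web-backed tool; the endpoint is a placeholder, not a real service.
struct FetchCoffeeFactTool: Tool {
  let name = "fetchCoffeeFact"
  let description = "Fetches a short coffee fact from the web."

  @Generable
  struct Arguments {
    @Guide(description: "A single-word topic, like espresso")
    let topic: String
  }

  func call(arguments: Arguments) async throws -> ToolOutput {
    var components = URLComponents(string: "https://example.com/facts")!
    components.queryItems = [URLQueryItem(name: "topic", value: arguments.topic)]
    let (data, _) = try await URLSession.shared.data(from: components.url!)
    // Return the response body verbatim; the model weaves it into its reply.
    return ToolOutput(String(decoding: data, as: UTF8.self))
  }
}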

That was quite a lot.

Perhaps get a coffee before doing anything else.

To learn more, you can check out the dedicated video about prompt engineering, including design and safety tips. And, if you want to meet the real Naomy, check out the code-along video. I hope you will have as much fun with Foundation Models as I’ve had. Thanks for watching.

Code

1:05 - Prompting a session

import FoundationModels

func respond(userInput: String) async throws -> String {
  let session = LanguageModelSession(instructions: """
    You are a friendly barista in a world full of pixels.
    Respond to the player’s question.
    """
  )
  let response = try await session.respond(to: userInput)
  return response.content
}

3:37 - Handle context size errors

var session = LanguageModelSession()

do {
  let answer = try await session.respond(to: prompt)
  print(answer.content)
} catch LanguageModelSession.GenerationError.exceededContextWindowSize {
  // New session, without any history from the previous session.
  session = LanguageModelSession()
}

3:55 - Handling context size errors with a new session

var session = LanguageModelSession()

do {
  let answer = try await session.respond(to: prompt)
  print(answer.content)
} catch LanguageModelSession.GenerationError.exceededContextWindowSize {
  // New session, with some history from the previous session.
  session = newSession(previousSession: session)
}

private func newSession(previousSession: LanguageModelSession) -> LanguageModelSession {
  let allEntries = previousSession.transcript.entries
  var condensedEntries = [Transcript.Entry]()
  if let firstEntry = allEntries.first {
    condensedEntries.append(firstEntry)
    if allEntries.count > 1, let lastEntry = allEntries.last {
      condensedEntries.append(lastEntry)
    }
  }
  let condensedTranscript = Transcript(entries: condensedEntries)
  // Note: transcript includes instructions.
  return LanguageModelSession(transcript: condensedTranscript)
}

6:14 - Sampling

// Deterministic output
let response = try await session.respond(
  to: prompt,
  options: GenerationOptions(sampling: .greedy)
)
                
// Low-variance output
let response = try await session.respond(
  to: prompt,
  options: GenerationOptions(temperature: 0.5)
)
                
// High-variance output
let response = try await session.respond(
  to: prompt,
  options: GenerationOptions(temperature: 2.0)
)

7:06 - Handling languages

var session = LanguageModelSession()

do {
  let answer = try await session.respond(to: userInput)
  print(answer.content)
} catch LanguageModelSession.GenerationError.unsupportedLanguageOrLocale {
  // Unsupported language in prompt.
}

let supportedLanguages = SystemLanguageModel.default.supportedLanguages
guard supportedLanguages.contains(Locale.current.language) else {
  // Show message
  return
}

8:14 - Generable

@Generable
struct NPC {
  let name: String
  let coffeeOrder: String
}

func makeNPC() async throws -> NPC {
  let session = LanguageModelSession(instructions: ...)
  let response = try await session.respond(generating: NPC.self) {
    "Generate a character that orders a coffee."
  }
  return response.content
}

9:22 - NPC

@Generable
struct NPC {
  let name: String
  let coffeeOrder: String
}

10:49 - Generable with enum

@Generable
struct NPC {
  let name: String
  let encounter: Encounter

  @Generable
  enum Encounter {
    case orderCoffee(String)
    case wantToTalkToManager(complaint: String)
  }
}

11:20 - Generable with guides

@Generable
struct NPC {
  @Guide(description: "A full name")
  let name: String
  @Guide(.range(1...10))
  let level: Int
  @Guide(.count(3))
  let attributes: [Attribute]
  let encounter: Encounter

  @Generable
  enum Attribute {
    case sassy
    case tired
    case hungry
  }
  @Generable
  enum Encounter {
    case orderCoffee(String)
    case wantToTalkToManager(complaint: String)
  }
}

13:40 - Regex guide

import FoundationModels
import RegexBuilder

@Generable
struct NPC {
  @Guide(Regex {
    Capture {
      ChoiceOf {
        "Mr"
        "Mrs"
      }
    }
    ". "
    OneOrMore(.word)
  })
  let name: String
}

session.respond(to: "Generate a fun NPC", generating: NPC.self)
// > {name: "Mrs. Brewster"}

14:50 - Generable riddle

@Generable
struct Riddle {
  let question: String
  let answers: [Answer]

  @Generable
  struct Answer {
    let text: String
    let isCorrect: Bool
  }
}

15:10 - Dynamic schema

import FoundationModels

struct LevelObjectCreator {
  // The schema's type name, e.g. "Riddle" or "Answer".
  let name: String
  var properties: [DynamicGenerationSchema.Property] = []

  mutating func addStringProperty(name: String) {
    let property = DynamicGenerationSchema.Property(
      name: name,
      schema: DynamicGenerationSchema(type: String.self)
    )
    properties.append(property)
  }

  mutating func addBoolProperty(name: String) {
    let property = DynamicGenerationSchema.Property(
      name: name,
      schema: DynamicGenerationSchema(type: Bool.self)
    )
    properties.append(property)
  }

  mutating func addArrayProperty(name: String, customType: String) {
    let property = DynamicGenerationSchema.Property(
      name: name,
      schema: DynamicGenerationSchema(
        arrayOf: DynamicGenerationSchema(referenceTo: customType)
      )
    )
    properties.append(property)
  }

  var root: DynamicGenerationSchema {
    DynamicGenerationSchema(
      name: name,
      properties: properties
    )
  }
}

var riddleBuilder = LevelObjectCreator(name: "Riddle")
riddleBuilder.addStringProperty(name: "question")
riddleBuilder.addArrayProperty(name: "answers", customType: "Answer")

var answerBuilder = LevelObjectCreator(name: "Answer")
answerBuilder.addStringProperty(name: "text")
answerBuilder.addBoolProperty(name: "isCorrect")

let riddleDynamicSchema = riddleBuilder.root
let answerDynamicSchema = answerBuilder.root

let schema = try GenerationSchema(
  root: riddleDynamicSchema,
  dependencies: [answerDynamicSchema]
)

let session = LanguageModelSession()
let response = try await session.respond(
  to: "Generate a fun riddle about coffee",
  schema: schema
)
let generatedContent = response.content
let question = try generatedContent.value(String.self, forProperty: "question")
let answers = try generatedContent.value([GeneratedContent].self, forProperty: "answers")

18:47 - FindContactTool

import FoundationModels
import Contacts

struct FindContactTool: Tool {
  let name = "findContact"
  let description = "Finds a contact from a specified age generation."
    
  @Generable
  struct Arguments {
    let generation: Generation
        
    @Generable
    enum Generation {
      case babyBoomers
      case genX
      case millennial
      case genZ

      // Approximate birth-year ranges for each generation (assumed values for this sample).
      var yearRange: ClosedRange<Int> {
        switch self {
        case .babyBoomers: return 1946...1964
        case .genX: return 1965...1980
        case .millennial: return 1981...1996
        case .genZ: return 1997...2012
        }
      }
    }
  }
  
  func call(arguments: Arguments) async throws -> ToolOutput {
    let store = CNContactStore()
        
    let keysToFetch = [CNContactGivenNameKey, CNContactBirthdayKey] as [CNKeyDescriptor]
    let request = CNContactFetchRequest(keysToFetch: keysToFetch)

    var contacts: [CNContact] = []
    try store.enumerateContacts(with: request) { contact, stop in
      if let year = contact.birthday?.year {
        if arguments.generation.yearRange.contains(year) {
          contacts.append(contact)
        }
      }
    }
    guard let pickedContact = contacts.randomElement() else {
      return ToolOutput("Could not find a contact.")
    }
    return ToolOutput(pickedContact.givenName)
  }
}

20:26 - Call FindContactTool

import FoundationModels

let session = LanguageModelSession(
  tools: [FindContactTool()],
  instructions: "Generate fun NPCs"
)

21:55 - FindContactTool with state

import FoundationModels
import Contacts

class FindContactTool: Tool {
  let name = "findContact"
  let description = "Finds a contact from a specified age generation."
   
  var pickedContacts = Set<String>()
    
  ...

  func call(arguments: Arguments) async throws -> ToolOutput {
    // `contacts` comes from the same fetch logic as before (elided above).
    contacts.removeAll(where: { pickedContacts.contains($0.givenName) })
    guard let pickedContact = contacts.randomElement() else {
      return ToolOutput("Could not find a contact.")
    }
    // Remember this contact so the same NPC doesn't appear twice.
    pickedContacts.insert(pickedContact.givenName)
    return ToolOutput(pickedContact.givenName)
  }
}

22:27 - GetContactEventTool

import FoundationModels
import EventKit

struct GetContactEventTool: Tool {
  let name = "getContactEvent"
  let description = "Get an event with a contact."

  let contactName: String
    
  @Generable
  struct Arguments {
    let day: Int
    let month: Int
    let year: Int
  }
    
  func call(arguments: Arguments) async throws -> ToolOutput { ... }
}

Summary

  • 0:00 - Introduction

  • Learn about the Foundation Models framework for Apple devices, which provides an on-device large language model accessible via Swift API. It covers how to use Generable to get structured output, dynamic schemas, and tool calling for custom functions.

  • 0:49 - Sessions

  • In this example, Foundation Models enhance a pixel art coffee shop game by generating dynamic game dialog and content. Through the creation of a 'LanguageModelSession', custom instructions are provided to the model, enabling it to respond to player questions. The model processes user input and session instructions into tokens, small substrings, which it then uses to generate new sequences of tokens as output. The 'LanguageModelSession' is stateful, recording all prompts and responses in a transcript. You can use this transcript to debug and display the conversation history in the game's user interface. However, there is a limit to the session's size, known as the context limit. The generation of responses is not deterministic by default. The model uses sampling, creating a distribution of likelihoods for each token, which introduces randomness. This randomness can be controlled by using the GenerationOptions API, allowing you to adjust the sampling method, temperature, or even set it to greedy for deterministic output. Beyond simple dialog, Foundation Models can be employed to generate more complex outputs, such as names and coffee orders for Non-Playable Characters (NPCs). This adds depth and variety to the game world, making it feel more alive and interactive. You must also consider potential issues like unsupported languages and handle them gracefully to provide a smooth user experience.

  • 7:57 - Generable

  • Foundation Models' Generable API is a powerful tool that simplifies obtaining structured data from Large Language Models. By applying the @Generable macro to Swift structs or enums, a schema is generated at compile-time, guiding the model's output. Generable automatically generates an initializer and handles parsing the model's generated text into type-safe Swift objects using constrained decoding. This technique ensures that the model's output adheres to the specified schema, preventing hallucinations and structural mistakes. You can further customize the generation process using 'Guides', which provide constraints, ranges, or natural language descriptions for specific properties. This allows for more control over the generated data, such as specifying name formats, array counts, or numerical ranges. Generable enables efficient and reliable data generation, freeing developers to focus on more complex aspects of their applications.

  • 14:29 - Dynamic schemas

  • In the game's level creator, dynamic schemas enable players to define custom entities at runtime. These schemas, akin to compile-time structs, have properties with names and types, allowing for arrays and references to other dynamic schemas. From player input, a riddle schema is created with a question (string) and an array of answers (custom type with string and Boolean properties). These dynamic schemas are validated and then used to generate content by Foundation Models, ensuring the output matches the defined structure. This dynamic approach allows the game to display player-created riddles and other entities in a dynamic UI, providing a high degree of flexibility and creativity for players while maintaining structured data handling.

  • 18:10 - Tool calling

  • With Foundation Models, game developers can create personalized DLC using tool calling. This allows the model to autonomously fetch information from the player's device, such as contacts and calendar, while preserving privacy because the data never leaves the device. Defining a tool involves specifying a name, description, and input arguments. The model uses this information to decide when and how to call the tool. The tool's implementation then interacts with external APIs, like the Contacts API, to retrieve data.
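The Sessions and Generable summaries above mention controlling randomness with GenerationOptions and constraining output with the @Generable and @Guide macros. The two sketches below illustrate those ideas; they are minimal examples assuming the FoundationModels API surface described in this session, and the coffee-shop prompts, the NPC type, and its properties are made up for illustration.

// Sketch: adjusting sampling with GenerationOptions (prompt text is illustrative)

import FoundationModels

func generateGreetings() async throws {
  let session = LanguageModelSession(instructions: "You are a friendly barista NPC.")

  // Greedy sampling makes the output deterministic for the same prompt and session state.
  let deterministic = try await session.respond(
    to: "Greet a customer who just walked in.",
    options: GenerationOptions(sampling: .greedy)
  )
  print(deterministic.content)

  // A higher temperature increases randomness and variety between runs.
  let varied = try await session.respond(
    to: "Greet a customer who just walked in.",
    options: GenerationOptions(temperature: 1.5)
  )
  print(varied.content)
}

// Sketch: a @Generable type with @Guide constraints (the NPC fields are illustrative)

@Generable
struct NPC {
  @Guide(description: "A short, friendly first name")
  let name: String

  @Guide(.range(1...10))
  let friendliness: Int

  @Guide(.count(3))
  let favoriteDrinks: [String]
}

func generateNPC(session: LanguageModelSession) async throws -> NPC {
  // Constrained decoding fills the schema and returns a typed Swift value.
  let response = try await session.respond(
    to: "Create a regular customer for the coffee shop.",
    generating: NPC.self
  )
  return response.content
}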

Deliver age-appropriate experiences in your app

Learn how to deliver age-appropriate experiences in your app with the new Declared Age Range API. We'll cover how parents can allow their child to share an age range with an app to ensure a safe experience in a privacy-preserving way. We'll also explore how this framework can help you tailor your app's content and features based on a user's age, and show you how to implement age gates, understand caching, and respect user privacy while creating safer and more engaging experiences.

Chapters

Resources

Related Videos

WWDC25

Transcript

Hello, and welcome to “Deliver age-appropriate experiences in your app”. I’m Austin, and I’m an Engineering Manager for iCloud Family. In this video, I will cover three topics. First, helping protect kids online. Then, I’ll go over the Declared Age Range framework. And lastly, I will show you how to request an age range and build a great age-appropriate experience. I'll start by going over the background that led us to create the Declared Age Range framework. At Apple, our goal is to create technology that empowers people and enriches their lives while helping them stay safe online and protect their privacy. We want people of all ages to be able to have great, safe experiences with our products and services.

In February 2025, Apple released a white paper called “Helping Protect Kids Online.” Protecting kids from online threats, whether they’re young children, preteens, or teenagers, requires constant effort.

The digital world is increasingly complex, and the risks to families are ever-changing. Building on Apple's profound commitment to privacy, security, and safety, we are continuing to enhance the trusted tools that we provide to help parents protect their kids in a way that is designed around privacy. In March 2025, Apple launched a streamlined child setup flow. It enables child-appropriate default settings if parents prefer to wait until later to finish setting up a child account. And in iOS 26, Apple is providing parents with the ability to easily correct the age that’s associated with their child’s account, if it was previously set up incorrectly. Also, in iOS 26, the App Store’s global age ratings changed to provide more granularity. There are now five categories: 4+, 9+, 13+, 16+, 18+. The ratings add more useful information on product pages and give parents further insight into their kids’ experiences. And lastly is the introduction of the new Declared Age Range API, the focus of this video. Next, I’ll go over the specifics of the Declared Age Range framework.

In an age-appropriate experience, the app can ask for an age range. The prompt itself is customized based on what the app requests, and the age of the user. Users can choose to share or not to share. Based on that decision, the app can tailor its experience. Here, the app is asking if the user is 16 or older. In order to preserve privacy, the app provides a set of ages that are important to their experience. The API will then return an age range, which is a set of two numbers. This helps keep the birth date private. The user only has to reveal what’s necessary in order to get the appropriate experience. Now, I will go through how different users can declare their age ranges. In this example, the app asks for the ages 13 and 16.

Olivia is 14, so she can declare that she is in the 13 to 15 range.

Emily is 9, so she can share that she is 12 or under.

Ann is 42, so she can share that she is 16 or over. In each case, the birth date is not revealed to the app. The Declared Age Range framework is for child and teen experiences. Therefore, the API will set a regional max for any age provided. The max will always be the age of an adult in that region.

Apps have their own unique requirements. Each app can specify up to three different ages in one request, which results in four different ranges. Each range must be at least two years in duration.

Deciding which ages are right for your app is fundamentally about what experiences an app wants to present or hide from users of different ages.

Some apps may base this off of regional requirements or decide it’s required for the best experience. During child onboarding, a parent can learn more about Declared Age Range, confirm their child’s age, and choose which sharing option is appropriate for their child. A parent can also manage this by going to Family in Settings, tapping on a child, going to Apple Account & Password, and then Age Range for Apps.

You can also manage this on the child’s device by going to the Apple Account section of Settings, tapping on Personal Information, where again, Age Range for Apps is shown. There are three different settings for this API.

This user is in Always Share.

Always Share automatically returns the age range that is asked by the app.

If new information is revealed, a notification appears.

This user is in Ask First.

Ask First displays a prompt to choose whether to share or not.

This user is in Never Share. Never Share always declines to share. Nothing is shown when an app requests the user’s age.

Keeping birth dates private is essential.

So there are additional measures in the Declared Age Range framework to ensure privacy.

When in Ask First, by default, it will only prompt on the anniversary of the original response.

Similarly, in Always Share, by default, it will only reveal new age information on the anniversary of the original response.

Here's an example. If a child turns 13 and crosses into a new age range, the API will still return 12 or under until the anniversary.

Then, on the anniversary of the original age declaration, upon request, the API will then either automatically share or prompt to share.

This helps protect users from revealing their birth date.

Lastly, a user can also allow the app to reprompt in order to receive the current age. For example, on the weekend after the user’s birthday, they are excited to get access to features for older children right away. They don’t have to wait for the anniversary. In Settings, under Age Range for Apps, they can force the cache to clear by going to a specific app and tapping Share Age Range again.

This provides an app the updated age range response the next time the age is requested.

Creating an age-appropriate experience is really about modifying the features in your app to ensure they're suitable for users. To illustrate, I will go through the exercise of building out the age-appropriate experience that I showed previously.

Apps can now choose to change the experience based on whether it’s appropriate for the user’s age. In this case, the photo sharing feature will be unavailable until the Landmarks app can confirm the child's age range and determine that it's age-appropriate.

To get started, you will need to add the Declared Age Range capability to your project. Go to the Signing and Capabilities tab in your project target and click the plus button.

Then select Declared Age Range.

Now I’ll go into the sample app UI and start coding.

Here is the Landmarks app. It shows a list of landmarks. When I tap, I’m taken to a detailed description.

This is the landmarks detail view. This is where I’d like to add my new photo sharing feature.

I add a variable that will keep track of whether my feature is enabled or not.

I add an environment variable to help the API know which window to display an alert on. This is important in use cases where there are multiple windows for your app, for example, on iPad or Mac. Then I add a button that checks whether it should be enabled.

By default, it will be disabled.

Now I will dive into requestAgeRangeHelper.

This method contains all the logic for requesting and receiving a response from the API.

The API allows apps to specify ages at runtime. So for each region or use case, specify the ages that apply to your app in that context. For the sake of time, I add a comment to later implement code to check which region the user is in.

I request the age range using the requestAgeRange method, and provide the ages that I care about. In this case, I want to know if the user is 16 or older.

I’m ready to implement the code that handles the response. The response will return an enum that will equal sharing or declinedSharing.

If the enum equals sharing, the user shared, and a lower and upper bound is returned. This represents the age range returned by the API. If the enum equals declinedSharing, the user chose not to share.

I will now check if the lower bound is greater than or equal to 16.

If this is true, I can assume the upper bound is nil.

As mentioned, upper and lower bound values can be nil. This is used to represent ranges like 12 or under, or ranges that don’t have an upper limit, like 16 or over.

If they are 16 or over, I will enable the photo sharing button. If they are 15 or under, the button will remain disabled. If the age range is shared, you will also get back an ageRangeDeclaration value.

For children, the value will always be guardianDeclared. For teens, if they are in an iCloud family, it will be guardianDeclared.

If they are not in an iCloud family, it will be selfDeclared.

For adults, the value will be selfDeclared. Now, back to the code.

The API can also throw an error. I'll add some code to handle errors. The invalidRequest error is a developer-generated error. It indicates that something is wrong with the request itself, for example, an age range that is not at least two years in duration.

The notAvailable error indicates a device configuration issue that needs to be handled by the app. For example, the user hasn’t signed into the device with an Apple account.

Now, I will put it all together and demonstrate the user experience.

When I open the Landmarks detail view, an alert appears asking if I’d like to share that I’m 16 or older. I choose to share. As expected, the photo sharing experience is now available.

There are a few more details that are helpful to know. Because the API will be called often, the system caches the responses so the user doesn’t constantly have to answer prompts. Practically speaking, this means apps won’t need to worry that calling the API will prompt the user too many times.

Cached responses are synced across devices. For example, a cached age range shared on iPhone will sync to Mac.

Users can manage cached responses in Settings.

Additionally, if the upper bound of an age range is below the age of majority, the API returns a set of additional parental controls that the parent has configured for the child. Here, I have code that checks to see if Communication Limits is enabled for the child.

To learn more about it, check out “Enhance child safety with PermissionKit.” Apps leveraging this framework are able to allow parents to be in control of who a child can communicate with in third-party experiences.

So that's the new API. Let me tell you about some other features you can use to protect your users.

Sensitive Content Analysis API helps apps provide a safer experience in your app by detecting and alerting users to nudity in images and videos before displaying them on screen.

And in iOS 26, the Sensitive Content Analysis API has been expanded to detect and block nudity in live streaming video calls.

The Screen Time Framework gives apps the tools needed to help parents and guardians supervise their children's web usage. And Family Controls helps apps provide their own parental controls on a device.

Now you know about age appropriate experiences. Here's what you need to do next.

Review Apple’s child safety tools at developer.apple.com, add age-appropriate experiences where they make sense, and use the Declared Age Range framework to properly gate your age-appropriate experiences.

Thank you for adding age-appropriate experiences in your app.

Code

8:03 - Request an age range

// Request an age range

import SwiftUI
import DeclaredAgeRange

struct LandmarkDetail: View {
    // ...
    @State var photoSharingEnabled = false
    @Environment(\.requestAgeRange) var requestAgeRange
    
    var body: some View {
        ScrollView {
            // ...
            Button("Share Photos") {}
                .disabled(!photoSharingEnabled)
        }
        .task {
            await requestAgeRangeHelper()
        }
    }

    func requestAgeRangeHelper() async {
        do {
            // TODO: Check user region
            let ageRangeResponse = try await requestAgeRange(ageGates: 16)
            switch ageRangeResponse {
            case let .sharing(range):
                 // Age range shared
                if let lowerBound = range.lowerBound, lowerBound >= 16 {
                    photoSharingEnabled = true
                }
                // guardianDeclared, selfDeclared
                print(range.ageRangeDeclaration)
            case .declinedSharing:
                // Declined to share
                print("Declined to share")
            }
        } catch AgeRangeService.Error.invalidRequest {
            print("Handle invalid request error")
        } catch AgeRangeService.Error.notAvailable {
            print("Handle not available error")
        } catch {
            print("Unhandled error: \(error)")
        }
    }
}

11:49 - Communication Limits

// Request an age range

func requestAgeRangeHelper() async {
    do {
        // TODO: Check user region
        let ageRangeResponse = try await requestAgeRange(ageGates: 16)
        switch ageRangeResponse {
        case let .sharing(range):
            if range.activeParentalControls.contains(.communicationLimits) {
                print("Communication Limits enabled")
            }
            // ...
        case .declinedSharing:
            // Declined to share
            print("Declined to share")
        }
    } catch {
        // ...
    }
}

Design foundations from idea to interface

Great apps feel clear, intuitive, and effortless to use. In this session, you'll discover how app design can elevate functionality, communicate purpose, guide people through your content, and use components thoughtfully to keep the experience simple without losing impact. This session is for designers and developers of all skill levels — as well as anyone curious about design.

Chapters

Resources

Transcript

Hi, I’m Majo, from the Design Evangelism Team.

If you’re wondering what Evangelism does, basically I get to spend my time helping designers and developers create better apps for Apple platforms.

And what I’ve learned, is that we all know that feeling when an app just works, and when it doesn’t.

Spotting the difference as a user? Easy. But building that seamless experience as a designer? A bit trickier. So that’s exactly what we’re doing today — together. I’ll walk you through how I think, the questions I ask, and how I deal with the messy middle when things don't feel quite right.

We’ll start with structure: how to organize information and define what the app is and what it does.

Then, we’ll explore navigation — how to design clear ways for people to move through the app so they feel confident and in control.

Next, we’ll focus on content and how to organize it to enhance its meaning, present it clearly and guide action.

Finally, in visual design, we’ll see how the right styling can shape your app’s personality and tone — while supporting usability.

Let’s begin with structure. Every app is built on a foundation that shapes everything else: how people navigate, what stands out, and how the experience comes across. When it’s done well, everything clicks into place. If not, well, we’ve all been there.

To ground this, I’ll show you a fictional app I created. It helps music lovers keep track of their growing vinyl collections. People can scroll through their collection and get inspired on what to play next.

They can group records in crates, add new ones, track their swaps, and save records they might want later.

So what’s your first impression? Is anything confusing? What works — and what feels a little off? When I look at an app, I want to find clarity — this makes the experience inviting and helps me use it confidently.

That starts with knowing where I am. The app should make that clear right away — so I’m not left wondering where I am or how I got there.

The next question is: “What can I do?” I shouldn’t have to guess — actions should be clear and easy to understand.

And finally, I ask: “Where can I go from here?” A clear sense of next steps keeps the flow going and helps me avoid hesitation or second-guessing.

When those questions are easy to answer, the app feels inviting and fluid. That’s usually the sign of a solid foundation.

At first glance, this app looks pretty good. And sometimes, that can be misleading because I’d assume it works just as well. Let’s see how it answers those questions — starting from the top.

I expect to know where I am — but the first thing I find is a menu. That’s not ideal. Menus can be vague and unpredictable, and what I really need first is some context.

Next, there’s a title — but it feels more like branding. Looks nice, but doesn’t help much. I almost want to skip right past it… And that could make me easily miss the recommended content from the app — even though it seems useful.

As I keep scrolling, I find some albums — but there’s nothing to do besides browse, so I still don’t really know where I am or what I’m supposed to do here.

At the very bottom, I see that the tab is named Records, answering where I am, but it comes a bit too late.

The result? The screen didn’t guide me — I had to work to piece it together. We experienced what happens when the structure isn’t crystal clear: people feel it as hesitation, confusion, sometimes even giving up.

Maybe if there was less going on, the app would feel simpler right when I open it. That is the goal of information architecture. It’s the process of organizing and prioritizing information so people can easily find what they need, when they need it — without friction.

The first thing I do is write down everything the app does — features, workflows, even nice-to-haves. At this point, I’m not trying to judge or cut anything out. Then I try to imagine how someone else might use the app. When and where would they be using it? How does it fit into their routine? What actually helps them, and what feels like it gets in the way? And I add these answers to my list.

Once I get that, I start cleaning things up — removing features that aren’t essential, renaming things that aren’t clear, and grouping those that naturally belong together.

After going through this, it’s clear to me that if I don’t have clarity on what’s essential, then I won’t be able to communicate it in the app. Simplifying helps me sharpen the app’s purpose. It also gives me a clear starting point for how people will find the features, and when they’ll use them.

Let’s explore this further in navigation. This is how people move through the app — and it should be more than just tapping around. I want them to feel oriented and confident.

To do that, I’ll use what I learned from the information architecture to access the main features, in iOS that happens with the Tab bar component. It supports navigation among different sections of an app, and it’s always visible, so it’s easy to access them at any time.

Simplifying it matters because each extra tab means one more decision for people to make and might present the app as much more complex than it really is.

So I pause and ask: What’s truly essential? What deserves a tab? One example of what doesn't? Crates. It’s just a screen to group Records. So I merge them. No need for both.

Then there’s Add, it’s here in the tab bar because it’s the primary action of the app. But I start wondering — is this the best place for it? When I’m not sure when or how to properly use a component, I always go back to the Human Interface Guidelines. And sure enough: tabs are for navigation, not for taking action.

So I’ll move Add inside Records where someone is most likely to use it.

Now the Tab Bar has three very distinct sections. And since I’m working on making it more predictable, I think I can improve how I’m naming those tabs. I want the labels and icons to help people get a sense of what each tab is for — so they don’t have to interact just to find out, or end up skipping it because they’re unsure what it leads to.

I’ll rename Swaps and Saves to more direct labels. And change their icons to reinforce the meaning of each tab. I want these to be visually consistent, so instead of designing my own (which I find really hard) I’ll use SF Symbols, Apple’s library of iconography. These symbols are already familiar to people on any Apple platform, helping the tabs be recognizable. Thanks to a simplified architecture, familiar icons, and explicit labels, the full scope and purpose of the app is evident, and the tab bar feels more approachable.
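As a rough illustration of this simplified tab bar, here is a minimal SwiftUI sketch; the tab names, placeholder views, and SF Symbols are assumptions for this fictional vinyl app rather than anything prescribed by the session.

// Sketch: a three-tab structure with explicit labels and SF Symbols (names are hypothetical)

import SwiftUI

struct VinylAppRootView: View {
    var body: some View {
        TabView {
            Text("Records")
                .tabItem { Label("Records", systemImage: "square.stack") }
            Text("Trades")
                .tabItem { Label("Trades", systemImage: "arrow.2.squarepath") }
            Text("Wishlist")
                .tabItem { Label("Wishlist", systemImage: "heart") }
        }
    }
}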

Because of the work in the information architecture, some things moved around, making the content a bit confusing. So let’s start clarifying that with a toolbar. It’s a great way to orient people in your interface. Notice how it solves both problems I had at the beginning: where I am and what I can do. That’s because the toolbar includes a title with the name of the screen, instead of a menu or branding as before. It sets expectations about the content of the screen and helps people stay oriented as they navigate and scroll.

It also offers a great place for screen-specific actions that people are most likely to want, instead of using the tab bar. Since space is limited, I’d only include what’s essential using SF Symbols to make each action easy to recognize.
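A toolbar along these lines would give the screen a title and a couple of essential actions; the action names and symbols here are hypothetical placeholders.

// Sketch: a screen title plus essential toolbar actions (actions are hypothetical)

import SwiftUI

struct RecordsScreen: View {
    var body: some View {
        NavigationStack {
            ScrollView {
                // Record groups and collections go here.
            }
            .navigationTitle("Records")
            .toolbar {
                ToolbarItem(placement: .primaryAction) {
                    Button {
                        // Present the add-record flow.
                    } label: {
                        Label("Add Record", systemImage: "plus")
                    }
                }
                ToolbarItem(placement: .secondaryAction) {
                    Button {
                        // Present search.
                    } label: {
                        Label("Search", systemImage: "magnifyingglass")
                    }
                }
            }
        }
    }
}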

Now I can clearly answer where I am, what I can do, and what’s possible in the app, setting people up for a more confident experience from the start. All thanks to the work done on its structure and the intentional use of navigation components.

Looking back, I realize I didn’t get it right the first time. But that’s part of the process. Each round of iteration gets the design closer to something that feels supportive, predictable and easy to move through.

Now that the foundation is settled, let’s zoom into what’s actually on the screen.

The content of your app should be organized to guide people to what matters most and what they expect to find first.

So far, I’ve worked on getting to the right parts of the app, but what about the content? It feels messy because there are two types of content mixed together: Groups and Records. So I’ll try something simple: like splitting both sections.

I think that’s a good start, at least there’s a title clarifying what the content is, but unless I scroll, I don’t get to see what else the screen has to offer. So what if now I show only a few groups, and let people uncover more as they go? That concept is called progressive disclosure. It’s about showing only what’s necessary upfront — just enough to help people get started then revealing more as they interact. You’ll see it anywhere an interface begins simple and gradually offers more detail or advanced options.

So the rest of the content is not missing; it’s just behind a tap on the disclosure control next to the title, waiting for the moment it becomes relevant.

And when that screen opens I want the content to be arranged in the same way. It feels connected to the previous screen, like it’s expanding.

As I explained in navigation, every screen should provide orientation, so let’s not forget the toolbar this time.

It has actions related to the screen and the back button, so it’s easy to understand how people got here and how to move around. In my initial design, there were elements for decoration or without concrete purpose, making it difficult to discover features that mattered.

So now, I want to make smarter design choices by finding the clearest way to show content in the layout. Let’s unpack a few examples. Progressive disclosure was a step in the right direction, but the grid layout doesn’t feel quite right. It takes up too much space for just two items. And it doesn’t handle longer text well, making the content unclear. Let me work on that.

A List works way better, it’s a flexible, highly usable and familiar way to show structured information and facilitates quick scanning.

It also takes up less vertical space than images, which means more items can fit on the screen at once.

And so you know, I didn’t design it from scratch, I learned from components like the tab bar and toolbar that designing to prioritize function really pays off. So I’m using the list template from the Apple Design Resources, which was easy to adapt to my content.
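As a sketch of that list, something like the following could work; the Crate model and sample data are made up to stand in for real content.

// Sketch: record groups as a List (model and sample data are hypothetical)

import SwiftUI
import Foundation

struct Crate: Identifiable {
    let id = UUID()
    let name: String
    let recordCount: Int
}

struct CratesList: View {
    let crates = [
        Crate(name: "Summer Spins", recordCount: 12),
        Crate(name: "Jazz Essentials", recordCount: 8)
    ]

    var body: some View {
        List(crates) { crate in
            HStack(spacing: 12) {
                Image(systemName: "square.stack")
                VStack(alignment: .leading) {
                    Text(crate.name)
                    Text("\(crate.recordCount) records")
                        .font(.caption)
                        .foregroundStyle(.secondary)
                }
            }
        }
    }
}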

It’s starting to feel like the design is much more intentional and supports more functionalities, like a real app. Time to see how the last section is working. Once I scroll, I find all the records someone has uploaded to the app. My goal was to make everything available upfront so people could explore freely. But as the number of choices grows, so does the effort it takes to process.

I worry that instead of browsing, people might feel overwhelmed and leave the app altogether. Before figuring out how to display a large amount of content, I needed to organize it. Grouping things into themes, like in information architecture, helped me cut through the noise and focus on what matters. There are a few themes that apps use to stay organized. Grouping content by time is one of the most frequent ways: think about how helpful it is to find your Recent files, or continue watching when streaming a show. Grouping around seasonality and current events also makes the experience feel more alive and relevant.

Grouping by progress helps people pick up where they left off like draft emails, or an ongoing class. It’s a great way to make an app feel responsive to real life because people rarely finish everything in one go.

And grouping by patterns is about surfacing relationships: things that belong together, like related products. Surfacing patterns turns a quick browse into a longer exploration because it shows people connections they didn’t know to look for. These grouping ideas aren’t limited to one type of app. Even if the content isn’t very visual or doesn’t change often, they help reduce choice overload and make the app feel one step ahead, like it understands what we’ll need next.

I know I’ll definitely be using them. And for displaying a large number of images — as I need — it’s best to consider using a collection.

It’s ideal to show groups of items like photos, videos, or products that can be scrolled on and off the screen. I love how dynamic it feels. To achieve that, I have consistent spacing between items and I’m avoiding too much text on them.

I created collections using the grouping ideas from before. Here, by time: showcasing records that are ideal for summer time, by progress: featuring complete sets or discographies, and by patterns: like style or genre.
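A horizontally scrolling section like the ones described above could be sketched as follows; the section title and image asset names are placeholders.

// Sketch: a themed, horizontally scrolling collection (titles and assets are placeholders)

import SwiftUI

struct RecordCollectionSection: View {
    let title: String
    let albumImageNames: [String]

    var body: some View {
        VStack(alignment: .leading, spacing: 8) {
            Text(title)
                .font(.headline)
            ScrollView(.horizontal, showsIndicators: false) {
                LazyHStack(spacing: 12) {
                    ForEach(albumImageNames, id: \.self) { imageName in
                        Image(imageName)
                            .resizable()
                            .scaledToFill()
                            .frame(width: 140, height: 140)
                            .clipShape(RoundedRectangle(cornerRadius: 8))
                    }
                }
            }
        }
    }
}

// Usage, following the grouping ideas above:
// RecordCollectionSection(title: "Summer Spins", albumImageNames: ["cover1", "cover2", "cover3"])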

When content is thoughtfully organized and laid out using familiar platform components, it helps people find what matters, effortlessly — creating a space that feels intuitive and inviting to return to again and again. Lastly, when I open an app and it just feels right, visual design is a huge part of that. It communicates the personality of the app and shapes how people feel. It’s the thoughtful use of hierarchy, typography, images and color, all while supporting function.

To take the app’s visual design further, I need to figure out what’s working—and what can be improved. I’m paying attention to how type, color, and imagery are coming together. When I squint, my eyes go straight down to the first collection, because it’s visually heavier and more colorful, so I miss half of the content and the sense of place. What’s missing here is visual hierarchy. It’s about guiding the eye through the screen, so it notices the different design elements in order of importance.

To improve it, I’m going to turn this suggestion into a visual anchor by making what’s most important larger or higher in contrast, so it naturally draws attention first.

And it works—visually, it does the job. But will it hold up? What if the text gets longer, the language changes, or someone’s using a larger text size? I realize I need to design with more flexibility, especially when it comes to type.

That’s where system text styles come in handy. These make it easy to create clear hierarchy and strong legibility even under different screen conditions. They offer a consistent way to style everything from titles to captions, so I have range to communicate the different levels of importance of my content — without needing to eyeball text sizes or create custom styles from scratch. I’ll maintain the full-bleed design, moving the album cover to the background. That gives the text a persistent space while the three different text styles provide helpful variations of size and contrast, successfully guiding the eye.

Text styles also support Dynamic Type, allowing people to choose a text size that’s comfortable for them making the app more inclusive and easier for everyone to use.
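Here is a small sketch of that hierarchy built from system text styles, which scale automatically with Dynamic Type; the copy is placeholder text.

// Sketch: hierarchy from system text styles that scale with Dynamic Type (copy is placeholder)

import SwiftUI

struct FeaturedRecordHeader: View {
    var body: some View {
        VStack(alignment: .leading, spacing: 4) {
            Text("Featured today")
                .font(.caption)
                .foregroundStyle(.secondary)
            Text("Blue Note Classics")
                .font(.largeTitle)
                .bold()
            Text("A hand-picked set of essential jazz records")
                .font(.subheadline)
        }
    }
}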

But when text overlays an image, legibility can quickly become a problem especially with busy or high-contrast visuals. In those cases, clarity has to come first.

One simple way to fix this is by adding a subtle background behind the text like a gradient or blur. It improves readability, while adding some dimension without disrupting the design. The last thing I want to focus on is how images and color can help convey the personality of the app.

Starting on the list. I think I simplified it a little too much and it’s getting lost in between components. So I’ll add images to represent each group and make the list easier to scan.

But not every image seems to work. They are all very different in color and style. I start to see that what I really need is a cohesive visual style.

To get there, I’m choosing a color palette and setting some simple rules to use it. Hopefully, this will establish a consistent aesthetic and evoke the right feeling.

I choose four colors, a few retro shapes and then I mix and match. For those groups that show a title, I went with a bolder, expanded font, so they look different from text in the list.

I really like how this is looking. I think these choices sharpen the app’s personality and make it easier to stay consistent as it grows. And since I’m working with color, I’m curious — where else in the app can I use it? Maybe backgrounds, text, icons, but those already have color, just not the kind I pick from a palette. They’re not black or purple; they have names like label or secondarySystemBackground. These are semantic colors, and they are named after their purpose, not their appearance — because they’re dynamic.

They automatically change according to contrast settings, screen environments and modes like dark and light.

I can use an accent color here and there — on buttons, controls, maybe to show what’s selected. But I have to be careful it doesn’t get in the way of dynamic changes, overall legibility, or people’s comfort.

So I’d say that for anything dynamic, this is basically my color palette of system colors. They’re also part of the Apple Design Resources, and they give me a flexible set of options to build visual hierarchy that seamlessly adapt to people’s preferred appearance, without extra work.
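As a sketch, semantic colors and a sparing accent could be applied like this; the row content is made up, and the exact colors a design system uses will differ.

// Sketch: semantic system colors plus an accent (content is illustrative)

import SwiftUI

struct CrateRow: View {
    var isSelected = false

    var body: some View {
        HStack {
            Text("Summer Spins")
                .foregroundStyle(Color(uiColor: .label))
            Spacer()
            Image(systemName: isSelected ? "checkmark.circle.fill" : "circle")
                .foregroundStyle(isSelected ? Color.accentColor : Color(uiColor: .secondaryLabel))
        }
        .padding()
        // Semantic backgrounds adapt to light and dark mode automatically.
        .background(Color(uiColor: .secondarySystemBackground), in: RoundedRectangle(cornerRadius: 12))
    }
}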

This is a great way to practice knowing when to lean on the system, and where to add personality. It might be tempting to treat each of these design elements as its own little project, and in some ways, they are; they deserve thought and attention. But the real impact comes when they work together contributing to the overall meaning of the interface.

The design I started with is long gone. I simplified its structure and navigation, and presented its content with meaning, helping people take action, all in a space everybody can use and enjoy.

Every element of this design builds on our past decisions, from the first tap to the last scroll. Design is never really finished, and there’s no single right answer. Today we explored the foundations, and you can take your app even further with typography, UX writing, and animation. There are endless possibilities. If you’re ready to keep going, check out these sessions from previous years and get to know the new design system. There’s so much more to explore and learn—so stay curious and keep creating. I’m excited to see where your ideas take you!

Design hover interactions for visionOS

Discover how to create advanced interactions for your visionOS apps. We'll explore how you can design compelling custom hover effects and animations, avoid common mistakes, take advantage of interactions like Look to Scroll, and build intuitive media controls with persistence effects.

Chapters

Resources

Transcript

Hey I'm Nathan, a designer on the Apple Design Team. Welcome to this session on designing hover interactions for visionOS. With new ways to respond to where people look, your apps can feel more alive and make people feel like their mind is guiding the experience.

Here's the order for today: we'll review the fundamentals of eye input, explore custom effects, make content scroll just by looking, and see how to persist important controls. Let's jump in. As you may know, people navigate visionOS with their eyes and hands. They look at an element and tap their fingers together to select it. Here's a few quick reminders to make sure your apps work well with eyes and hands.

Place your most important content in front of people, so it's easy to see and use.

For interactive elements, prefer rounded shapes, like circles, pills, and rounded rectangles, they're easier to look at, drawing your eyes to the center of the shape.

For precise interactions, give each element at least 60 points of space. Elements can be visually smaller than this, as long as they each have a total space of 60 points.

And for 3D objects with a fixed scale, 60 points is roughly equal to an angular size of two and a half degrees, which is about 4.4 cm for an object one meter away.

Apply highlight effects to all interactive elements. Standard components like this menu will highlight automatically.

For your custom components, add the highlight effect and make sure its shape matches the shape of the content.

And apply these highlight effects to selectable 3D objects too.
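A custom control might adopt the standard highlight along the lines of this minimal sketch, which keeps the hover shape matched to the pill-shaped content; the control itself is hypothetical.

// Sketch: a custom pill control with a matching hover highlight (control is hypothetical)

import SwiftUI

struct PlayPillButton: View {
    var body: some View {
        Button {
            // Start playback.
        } label: {
            Label("Play", systemImage: "play.fill")
                .padding(.horizontal, 20)
                .padding(.vertical, 12)
                .background(.regularMaterial, in: Capsule())
        }
        .buttonStyle(.plain)
        // Match the highlight shape to the visible capsule.
        .contentShape(.hoverEffect, Capsule())
        .hoverEffect(.highlight)
        // Keep at least 60 points of total interactive space.
        .frame(minWidth: 60, minHeight: 60)
    }
}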

To learn more about the basics, check out the session "Design for spatial input" from 2023.

Today, let's talk about more powerful interactions for your apps, starting with custom effects.

The standard highlight effect works great in most cases, but you can extend it or replace it with your own animations. We call these custom hover effects. We use custom hover effects throughout visionOS.

Tab bars pop open to show the names of each tab.

Back buttons grow to show the title of the previous page.

Sliders show a knob to invite interaction.

Tooltips appear below buttons to describe what they do.

In Safari, the navigation bar expands to show your browser tabs. And in the Home View, environment icons reveal more of the landscape.

You can create custom effects like these in your apps too! They're perfect for giving feedback, while maintaining the personality of your app or visual style of your game.

But first, let's understand a little more how they work. To protect people's privacy, the system applies hover effects outside your app's process, so your app doesn't know where people are looking.

Instead of responding to a hover event directly, your views define two states: their standard appearance and their hovered appearance. The system animates between these states when people look at the view, or look away.

This is great for playing animations like we've seen so far but means custom effects cannot be used to perform actions within your app. Let me show you what I mean. Say I'm designing a photo-browsing app. I could add a download button to let people save their favorite photos.

I could use a custom effect to show the download size when people look at it, since this is just an animation. But I couldn't perform the download action just by looking at the button, since this would require my app to know when the effect is applied.

Instead, people will need to perform a tap gesture to save the photo.

Okay, so custom effects drive animations. What kind of animations work best? They generally fall into three buckets: Instant, Delayed, and Ramp animations. The first are "Instant" animations which start right when people look at a view In Mindfulness, the buttons show an arrow icon so people know more choices are available.

And the standard video player shows a timestamp when looking at the playhead. This extra information is small, contextual, and not interactive.

Sometimes instant animations are too quick since they cause motion as people are trying to look around so we can use "Delayed" animations instead. These usually show more content after a short delay.

Tooltips are a great example. This delay allows people to press the button quickly, and only displays the title if someone shows extended interest.

Another example is the Safari profile button, which follows the same pattern with a different layout.

Last are "Ramp" animations. They're like delayed animations but with a hint at the start and a custom curve.

Check out the environment icons in the Home View. Right as people look at them, they start scaling up slowly, then pop open to reveal more of the landscape. This scaling effect makes it clear that the icon will expand if you continue to look at it, but doesn't fully open until people show clear intent. This lets people look across the grid without icons popping open along the way.

Here's another example, where this card expands to reveal the full text. These ramp animations are a balance between "instant" and "delayed" animations. They give immediate feedback, without being distracting.

To create your own ramp animation, try a curve like this. It starts with a slow ease-in, then pops to completion with a quick spring. This animation curve tends to work best when views expand to show more content. As you're starting to think about these effects in your apps, here are some best practices to keep in mind.

First, provide anchoring elements.

The best custom effects keep part of the view the same. These static elements, like the title here, are anchoring — they give people context and help them understand what's new.

If text shifts when people look at a view, the motion can interrupt their reading, so try to keep text in a fixed position.

Or if a view is completely different in its two states, people often get confused about what they’re looking at.

And hover effects should always start from a visible element. If there's no hint about where something is, it'll be hard to find or surprising if people accidentally discover it.

Instead, if you have hidden elements, show them when looking at something visible. Here for example, the resize control appears when looking at the corner of the window. This helps people find the controls they're looking for.

Next, apply custom effects carefully.

Think about where custom effects could provide focus, like on this compass. Show additional information like the location status, or create delightful moments like an animating waterfall.

But don't change too much as people look around. Too many custom effects, like showing the name of each pin, can distract from your content, or even cause visual discomfort. Even a small scale effect might be distracting for high-usage views like toolbar buttons or table cells, since they move the things people are trying to look at. For views like these, stick with the standard highlight effect. Try to keep effects small.

The best custom effects are subtle, and work well on small views, like this download button we saw earlier.

Avoid applying custom effects to large views, like this entire photo, since it causes a lot of motion and can be difficult for people to know what's happening.

And for imagery like this, avoid effects that wash out the colors.

A good solution is a highlight effect that gives instant feedback, then fades away so people can see the true colors of the photo. And this applies to 3D objects like this plant. It highlights, then fades to show its real colors.

Last, avoid unexpected motion. This is something I run into a lot. You'll think you've got a good idea but it ends up having side effects. Let me show you. Safari's tab overview shows a grid of open tabs.

You might have a great idea: what if we hide all the close buttons by default, and only show the close button when looking at a tab? Looks cleaner, right? But when you look at a tab and the button appears, your eyes instinctively jump to it. This may cause people to accidentally close the tab when trying to select it, since they're now looking at the close button while performing the tap gesture.

Instead you could try something like this: fade the button in slowly halfway when people look at the tab, then fade it fully when people look at the button.

This still reduces the visual noise in a large grid of tabs, but doesn't make people accidentally close them.

It's critical to try these effects on a real device! It's impossible to experience them by watching a video since they react to where you look. And hey, I love the visionOS simulator too, okay, but it's no replacement for testing on a real device.

Designing custom effects is an iterative process, more than other parts of the interface. Be prepared to try lots of options and spend time tuning to get it just right.

Over time you'll build an intuition for what works and what doesn't, but if you have an idea — try it! It might just work. Next, let's talk about another powerful interaction: Look to Scroll. Look to Scroll lets people scroll just with their eyes. Here in Safari, when I look at the bottom of the page, it smoothly animates up.

Or in Music, when I look at the last playlist, it scrolls toward the center.

This makes browsing easier because your eyes follow the content, almost pulling it into view. It's a lightweight interaction that works alongside scrolling with hands.

Scrolling starts when people look near the edge of the scroll view, along the top and bottom for vertical scroll views, or along the sides for horizontal scroll views.

Your app may have many scroll views and Look to Scroll isn't enabled by default, so you'll need to pick which views should opt in. To decide, here are a few things to think about: First, if your view is primarily for reading or browsing, opt in.

Look to Scroll is great for reading an article in apps like Safari or browsing for the next show in apps like TV.

But in apps like Settings, people want to quickly look through to find the right option, and they don't read down the list one-by-one, so Look to Scroll isn't the right fit here. In general, if your view is mostly UI controls, don't enable Look to Scroll.

Next, consider your content. Generally, Look to Scroll works best on the primary content of the app. So here in Notes, the note itself scrolls when people look at the edges, but the list on the left does not, following the pattern we just saw with Settings.

Finally, think about consistency within your app.

Here in TV, there's a list of related shows, which look just like the list in the main library, so people expect Look to Scroll to work here too.

Now that you're getting a sense for which scroll views to opt in, let's see some ways to make scrolling feel natural and predictable.

Ideally, your scroll views with Look to Scroll enabled should be the full width or full height of your window. This gives people generous space to scroll and provides clear edges to look at.

If your scroll view is inset from the window, provide clear boundaries so people know where to look.

And if your app uses the scroll position to animate content at a different speed, apply parallax effects, or drive custom animations, consider adjusting your design to scroll at the normal rate. Or maybe Look to Scroll simply isn't the right choice for these views.

But when your scroll view fills the window, has clear edges, and scrolls normally, browsing will feel smooth and predictable. Finally, let's look at persistent controls, a subtle feature that makes a big difference. Here's a standard video player on visionOS. The playback controls show and hide with a tap. And if the controls are up, they'll automatically hide after a delay. This works well, but the controls might hide while you're looking at them! So now, we keep the controls visible while you look at them, and hide them when you look back at the video. You might not even notice it at first, because it just works how you'd expect.

If your app uses a standard video player, either inline, windowed, or immersive, you'll get this behavior for free. But if your app uses custom video controls, you'll need to enable this behavior. And it's great for more than just video apps! FaceTime persists the call controls and Mindfulness persists the session controls. Any time you have UI that auto-hides after a delay, whether it's for a video or an immersive experience, adopt this persistence behavior.

As we've seen, there's a lot of ways to bring your apps to life with hover interactions: animating your interfaces with custom effects, scrolling views just by looking at them, and persisting media controls.

As you design your apps with these features in mind, start by trying a few ideas and see what works! Look for opportunities to apply custom hover effects in your apps. You can create them in SwiftUI or RealityKit, and you can learn more about how to build them in the sessions here from 2024. To enable Look to Scroll and persistence behaviors, check out the documentation on developer.apple.com. Thanks, and have fun!

Design interactive snippets

Snippets are compact views invoked from App Intents that display information from your app. Now, snippets can allow your app to bring even more capability to Siri, Spotlight, and the Shortcuts app by including buttons and stateful information that offer additional interactivity as part of an intent. In this session, you'll learn best practices for designing snippets, including guidance on layout, typography, interaction, and intent types.

Chapters

Resources

Transcript

Hi my name is Ray and I’m a designer on the Apple Design Team. Today, you’ll learn about designing interactive snippets.

Interactive snippets are compact views displayed by App Intents.

They show updated information and offer quick actions directly from your app.

Snippets can appear from anywhere App Intents are supported.

This includes Spotlight, Siri, and the Shortcuts app. This integration expands your app’s reach and utility in the system.

Snippets always appear clearly at the top of the screen, overlaying other content. This makes them useful without taking people out of their context. They remain until people confirm, cancel, or swipe them away.

This makes snippets a great way to surface simple, routine tasks and information from your app.

Snippets feature rich visual layouts that can reflect your app's unique identity. Now they also support elements like buttons and display updated information, adding a new level of interactivity to App Intents.

With that, we’ll start by discussing the appearance of your snippet’s content and layout.

Then, we’ll learn how to make concise and useful interactions.

Finally, we’ll cover how to use different snippet types, including results and confirmations.

Snippets are designed with App Intents for quick, in-the-moment experiences, so it’s important for the content to be easy to read and understand. A key aspect to making the snippet glanceable is type size. Text in snippets is larger than system defaults. Larger type draws attention to the most important information in the moment. Make sure to provide enough space between elements to avoid cluttering the layout.

When arranging content in snippets, keep consistent margins for the content around the view to make the layout clear. This keeps the snippet organized, allowing people to focus on what matters. You can use the ContainerRelativeShape API to ensure these margins are responsive and adapt correctly across different platforms and screen sizes.
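For those margins and the background, a sketch along these lines uses ContainerRelativeShape so corners and insets follow the containing snippet; the water-tracking content is illustrative.

// Sketch: a snippet-style layout backed by ContainerRelativeShape (content is illustrative)

import SwiftUI

struct WaterSnippetView: View {
    var ounces: Int

    var body: some View {
        VStack(alignment: .leading, spacing: 12) {
            Text("Water")
                .font(.headline)
            Text("\(ounces) oz today")
                .font(.largeTitle)
                .bold()
        }
        .frame(maxWidth: .infinity, alignment: .leading)
        .padding(20)
        .background {
            // Inherits the container's corner shape instead of hard-coding a radius.
            ContainerRelativeShape()
                .fill(.blue.gradient)
        }
    }
}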

The use of larger text also means limited space. Avoid including content past 340 points in height, which will require scrolling and introduce unexpected friction. Instead, keep the content concise with only the most important information.

If people might need more information, your snippet can link to the relevant view in your app to show everything they need.

When designing experiences that appear above other content, vibrant backgrounds based on your app's visual identity can help your snippet stand out. However, sometimes this can make content difficult to read. It’s especially important to check for contrast when the snippet is viewed from a distance, going beyond standard contrast ratios. If the content is hard to read, try increasing the contrast between the content in the layout and the background. This will help keep the snippet clear, even when using vibrant backgrounds.

So that’s how to make your snippet easy to read and understand.

Next, let’s talk about interaction. Interaction makes snippets more stateful and actionable.

This means you can incorporate buttons, allowing people to perform simple and relevant actions directly when they use the Intent. Snippets can also show updated data, reflecting the latest information from your app.

To learn how to use interactivity, let’s look at this example of a water-tracking intent. By adding a simple button to add water, the information is more actionable while maintaining a lightweight experience.

The data is updated with a scale and blur. This provides clear visual feedback for the action.

By updating the data within the snippet itself to confirm that the action was successful, people can build trust in your App Intent for their routines. Your snippet can also include multiple buttons and updated pieces of content at the same time.

For example, an equalizer snippet can show updated audio settings while providing a few different presets to choose from in the moment. Make sure your snippet offers clear, relevant actions to supplement the main task.

Even without interactivity, snippets can animate in the latest information from your app.

Now let’s talk about snippet types. There are two types of snippets: results and confirmations. Result snippets present information that’s an outcome of a confirmation or doesn’t require further action. Because there are no follow-up tasks or decisions needed from this snippet, the only button at the bottom of the view is “Done”.

Result types are great for snippets such as checking on the status of an order. Next are confirmation snippets. Use confirmations when the intent needs an action before it can show the result. For example, let’s take a look at this coffee-ordering intent. Although there are buttons to change the amount of espresso, the snippet is a confirmation type because the coffee order cannot be placed until people take action. The action verb in the button makes it clear what’s next, such as “Order”.

The verb in the confirmation can be changed to any of these pre-written options, or you can write your own if these don’t fit the intent.

After confirming the order, such as the latte with 2 shots of espresso, the result snippet then shows the outcome of that choice. This pattern helps people understand they’ve started an intent and then see it completed.
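A result snippet can be returned from an App Intent roughly as in this sketch; it reuses the water-tracking idea from earlier, and the intent name, dialog text, and view are illustrative rather than the session's own code.

// Sketch: an App Intent returning a result snippet (intent, dialog, and view are illustrative)

import AppIntents
import SwiftUI

struct CheckWaterIntent: AppIntent {
    static var title: LocalizedStringResource = "Check Water"

    func perform() async throws -> some IntentResult & ProvidesDialog & ShowsSnippetView {
        let ounces = 32 // Placeholder; a real app would look this value up.
        let snippetView = Text("\(ounces) oz today")
            .font(.largeTitle)
            .bold()
            .padding(20)
        return .result(
            dialog: "You've logged \(ounces) ounces today.",
            view: snippetView
        )
    }
}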

Now let's talk about when to show dialog: Dialog is what Siri speaks for the App Intent. It is essential for voice-first interactions. For example, if someone is using their AirPods and not looking at their screen, they can still hear the dialog from a result or reply to a confirmation.

Although the snippet can appear with dialog, challenge yourself to make your snippet understandable on its own, even if the dialog isn't shown or heard. The snippet should clearly communicate the purpose of the intent without relying on dialog. This helps remove redundancy and makes the snippet more intuitive.

Use confirmations and results to ask for action or show an outcome. Let’s wrap with what we learned, and what’s next. Design lightweight, routine snippets using a glanceable appearance, simple interactions, and the right snippet types.

To learn more about what’s new with building app intents for Shortcuts and Spotlight, check out “Develop for shortcuts and spotlight with app intents,” and “Explore advances in app intents.” We can’t wait to see your snippets! Thank you!

Design widgets for visionOS

Learn how you can design beautiful widgets for visionOS 26 that blend effortlessly into someone's surroundings. Discover how you can add depth to your widget design and customize materials, sizes, and styles for spatial computing. We'll share how to adapt your existing widgets for visionOS, or design new widgets that feel like real objects.

Chapters

Resources

Transcript

Hi and welcome to Designing Widgets for visionOS. I’m Jonathan, and a bit later, my colleague Moritz will join me to share more on widgets as well. Widgets have always been about glanceable, timely information, helping people to stay connected to what matters, without the need to open an app. From checking the forecast, to viewing your next calendar event, or tracking a goal's progress, widgets make useful content easy to access. In this session, we’ll show you how these ideas extend to visionOS, and how you can design widgets that feel at home in people’s spaces, by taking advantage of the platform’s spatial and visual capabilities.

On visionOS, widgets take a new form. They become three-dimensional objects that feel right at home in your surroundings. Whether placed on a wall, desk, or shelf, they keep information from your app ambient, and close by.

Moritz and I will take you through the design system that brings them to life. We’ll look at the principles that guide their behavior in space, and share practical tips on working with the new materials, styles, and sizes. So your widgets don’t only look great, but truly belong in places people live, work, and unwind. If your iPad app already includes widgets, you are already off to a great start. By simply enabling compatibility mode, your existing widgets can carry over to visionOS, where they will automatically adapt a new spatial and visual treatment. This treatment gives your designs new depth and dimension, and also unlocks the ability for people to easily place them in their surroundings.
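For reference, this is roughly what a minimal existing WidgetKit widget looks like, of the kind that could carry over; the kind string, provider, and view are placeholders, and the visionOS-specific templates and styling described in this session are not shown here.

// Sketch: a minimal WidgetKit widget (names are placeholders; visionOS-specific styling not shown)

import WidgetKit
import SwiftUI

struct PosterEntry: TimelineEntry {
    let date: Date
}

struct PosterProvider: TimelineProvider {
    func placeholder(in context: Context) -> PosterEntry { PosterEntry(date: .now) }

    func getSnapshot(in context: Context, completion: @escaping (PosterEntry) -> Void) {
        completion(PosterEntry(date: .now))
    }

    func getTimeline(in context: Context, completion: @escaping (Timeline<PosterEntry>) -> Void) {
        completion(Timeline(entries: [PosterEntry(date: .now)], policy: .never))
    }
}

struct PosterWidget: Widget {
    var body: some WidgetConfiguration {
        StaticConfiguration(kind: "PosterWidget", provider: PosterProvider()) { entry in
            Text("Now Playing")
                .containerBackground(.fill.tertiary, for: .widget)
        }
        .configurationDisplayName("Poster")
        .supportedFamilies([.systemLarge, .systemExtraLarge])
    }
}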

You can also build native widgets designed specifically for visionOS. These give you access to platform-specific sizes, and enhanced visual styling that help your widget feel more integrated with the space around it. The Music Poster widget is a great example, designed to feel like a poster in a room, not just an interface on a screen. To help guide your design, let’s take a closer look at the four core principles behind widgets on Vision Pro. The first principle is persistence, a defining aspect of widgets on visionOS.

When someone places a widget in their space, it stays right where they put it. It remains anchored to that location and persists across sessions, even when they move from room to room or the device is turned off and then back on.

Building on that, the next principle is fixed size. Widgets have a consistent, real-world scale, helping them to feel well-proportioned wherever they are placed. The Music Poster widget is using the extra-large template size, giving it a familiar, printed art frame-like dimension. In addition to being persistent and fixed in size, widgets on visionOS are highly customizable, both for your apps and the people using them.

People can personalize how your widgets look in their space, while you can offer styling options that help your widget feel at home in a wide range of spaces.

And finally, there is proximity awareness. Widgets on visionOS adapt their appearance and layout based on how close someone is, always showing the right level of detail, whether viewed from across the room or up close. We will go through each one step by step, starting with persistence, and how it shapes the way widgets behave in people’s spaces. Once a widget is placed in a room, it remains there permanently. This opens up exciting opportunities to design context-aware, persistent experiences that live right alongside people in their environments.

Before we dive into the details, it’s helpful to understand how people access widgets on visionOS. They can find and place your widgets through the Widgets app on the home grid, making it easy to discover and add them to any space. When someone adds a widget from the Widgets app, it first appears in a detached state, floating next to the library window. To anchor it in a space, the widget has to be placed on a horizontal or vertical surface. This locks it into its persistent position. When placed on a horizontal surface, like a desk, table, or shelf, the widget gently tilts towards the person placing it. This subtle angle helps legibility. It also casts a shadow that makes it feel grounded in space.

When placed on a wall, widgets align flush with the surface and cast a realistic shadow, much like a picture frame. Widgets on visionOS are always presented within a frame that connects the digital content to the surrounding environment.

When thinking about what kinds of widgets to bring to visionOS, or designing one from scratch, it helps to consider them as part of the room they are in.

So always design with context in mind. They are not floating in isolation; they will live in people’s kitchens, living rooms, offices, and more. The environment shapes how a widget is seen and used, and considering that early on will lead to better experiences.

Take the Weather widget as an example. On visionOS, Weather adopts a large, window-like format. The design prioritizes a clear depiction of the current conditions, aiming to create an illusion of looking out of a window. The text size is designed to be glanceable from across the room, ensuring legibility when wall-mounted. The shift in scale and visual presence is a great example of how widgets can adapt to and enhance the spaces they live in. So once your widget is placed, it stays, but people can place more than one. visionOS supports multi-instance widgets, meaning multiple copies of the same widget can exist in a single space. When arranged on a wall, they neatly snap into a familiar grid layout. Following the cross-platform design guidelines ensures your widget fits seamlessly alongside other widgets in a grid.

For more general guidance on designing widgets, check out the “Design great widgets” session from our colleague Mac.

As widgets live in your real world while you’re using Vision Pro, they behave that way too. Widgets are displayed behind all virtual content, which reinforces their connection to the space around you. Just keep in mind, they only snap to physical surfaces; they won’t attach or persist in virtual environments. Now that we have covered how widgets persist in space, let’s talk about how they scale within it. The second principle is fixed size. When digital content lives alongside real objects, it needs to feel appropriately sized. That’s why widgets on visionOS have defined, consistent dimensions, so they look right at home, whether they are placed on a wall, a desk, or a shelf.

Just like on other platforms, visionOS offers multiple widget templates to choose from. But here, those sizes map to real-world dimensions. That means the sizes you select have a physical presence in someone’s space. So be intentional. Think about where your widget might live -- mounted on a wall or sitting next to a workspace -- and choose the size that feels right for that context. Because widgets share space with real objects, layout matters more than ever. Design with print or wayfinding principles in mind, use clear hierarchy, strong typography, and thoughtful scale to make sure your content stays clear from a range of distances. Here is an example of how size can support context. If you are designing a productivity-focused widget, you might want to offer it in a small template size so it fits easily on a desk. That works especially well for something like the to-do list. It can sit next to the Mac Virtual Display and stay glanceable while someone gets work done. On the other hand, if your goal is to let people decorate their space while using Vision Pro with something visually rich, like an artwork or photography, consider using the extra large template size. It turns your widget into a statement piece, something that feels more like a wall art than an interface.

So far, we have looked at how choosing the right template size can help your widget feel at home in different contexts. But sizing isn’t completely fixed. People can also adjust it themselves.

Each template size can be resized using the corner affordance, scaling from 75% to 125% while still preserving your layout. As people can resize your widgets, and view them from up close, make sure you always work with high resolution assets. And with that, I will hand it over to my colleague Moritz, who will take you through how to make your widgets truly personal, expressive, and adaptable to different environments.

Hi, I’m Moritz. I'm excited to show you how widgets on visionOS can be personalized, both by the people and through the options you as a developer choose to offer. These choices allow people to personalize the widget in a way that feels natural to them and fits their environment.

Let’s take a closer look at how it all comes together. When designing your widget, you can define its overall appearance by choosing between two stylistic treatments: Paper, a more grounded print-like style that feels solid and part of the environment, and Glass, a lighter, layered look that adds depth and visual separation between foreground and background.

Each creates a distinct presence in space, and choosing the right one helps reinforce the experience you want to create. Let’s start with Paper. If you’re aiming for a print-like look that feels more like a real object in the room, Paper is the right choice.

This treatment makes the entire widget respond to the ambient lighting, helping it blend naturally into its surroundings.

For example, the Music Poster widget uses the Paper style to display albums and playlists like framed artwork on a wall.

Visually, the Paper style is made up of a couple of main components.

A frame, which is provided by the system, your content, which you design and control, and a subtle reflective coating that brings it all together and helps it react to the surrounding light. While Paper focuses on blending into the space, Glass takes a different approach -- one that emphasizes clarity and contrast, especially for information-rich widgets.

The foreground elements are always shown in full color, unaffected by the ambient lighting, keeping key content sharp and legible throughout the day.

Glass also introduces visual separation between foreground and background, so you can decide which parts of your interface respond to the environment and which will remain consistent.

In this News widget, for example, editorial images sit in the background with a soft, print-like feel, while headlines stay in the foreground, always clear and easy to read.

The Glass style is composed of several layers that work together to create depth, clarity, and presence in space.

The frame, provided by the system, anchors the widget to its surface.

The background or backplate sits behind your content and can be fully defined by you.

The UI Duplicate Layer adds subtle depth. It’s a darker, shadow-like version of your interface that sits just behind the main content.

The UI Layer is where key content like text, glyphs, and graphs live -- elements that need to remain bright, crisp, and highly legible.

And finally, the Coating Layer adds a soft, reflective finish that helps the widget respond to the lighting in the room.

Together, these layers form the visual structure of the Glass style, offering clarity, dimensionality, and flexibility.

Now that we’ve seen how the visual structure of a widget comes together, let’s look at how people can personalize its appearance even further, starting with color.

visionOS includes a rich set of system-provided palettes designed to work across a variety of environments -- seven light options, and seven dark options -- giving people flexibility to match their style while helping your widget look great in any space. When designing your widget, it’s important to test your content across the full range of system color palettes.

Widgets always start out untinted, showing the full color of your design.

Once placed, people can personalize them by choosing a color through the configuration UI.

You can choose whether the background of your widget participates in tinting. If you opted out, for example to preserve a photo or illustration, make sure it still looks good alongside the selected color palette.

And keep in mind: the widget’s frame always receives tinting and can't be excluded. Color and style help your widget fit into a space. But materials also shape how it feels as lighting changes throughout the day.

Paper dims with the room to stay visually integrated, while Glass keeps foreground elements bright and legible, even in low light.

Just like materials help widgets feel natural in different lighting conditions, mounting styles shape how they relate to the space around them.

The way a widget sits on a wall plays a big role in how it’s perceived, whether it feels like an object on display or a view into something beyond. visionOS offers two mounting styles people can choose from: Elevated and Recessed, which is inset into the wall.

Recessed creates the illusion of a cutout in the wall, with the content set back into space. It's ideal for immersive or ambient content, like weather or editorial visuals, where added depth enhances the experience.

Elevated places the widget on the surface of the wall, similar to how a picture frame is mounted. Perfect for content that should stand out and feel present, like reminders, media, or glanceable data.

By default, widgets use the Elevated style, and for good reason. It works well across both vertical and horizontal surfaces, making it a versatile choice for most widget types.

In fact, when a widget is placed on a horizontal surface, like a desk or table, it will always use the Elevated style to maintain a consistent, familiar presentation.

The Recessed style creates a unique sense of depth, like in the Weather widget, where it feels like a window into another place.

It’s only available on vertical surfaces, since the effect relies on alignment that wouldn’t work on a table or desk.

You can opt out of Recessed, or make it exclusive. Just note that doing so removes horizontal placement.

Another way people can personalize the look of a widget is by adjusting the frame width.

There are five options to choose from, independent of the template size, ranging from thin to thick.

Be sure to test your layout across all widths to ensure it stays balanced. And keep in mind, when using the Recessed style, frame width is fixed and not customizable.

Frame width, mounting style, tinting -- all of these come together in the customization UI, where people personalize your widget to match their space and style.

The system handles standard options, but you can also extend this UI with custom settings tailored to your design.

For example, the Music Poster widget lets people choose between light and dark themes generated from the album art, or an automatic option that adjusts the tone based on the time of day.

You can expose parameters like these to offer meaningful, content-specific customization, making your widget feel more expressive and adaptive.

So far, we’ve looked at how widgets can be customized, styled, and placed to feel at home in any space. But there’s one more feature that makes widgets on visionOS truly stand out, and that’s proximity awareness. This gives your widget a powerful signal: how close someone is.

With that, you can adapt the information density in real time making sure your widget is always readable, relevant, and appropriately detailed, whether someone is viewing it from across the room or up close.

visionOS provides two key thresholds you can design for.

Default, when the widget is viewed up close, and Simplified, when it’s seen from a distance.

Supporting both states doesn’t require a full redesign, just thoughtful layout adjustments, like scaling back detail or changing type size, to keep content glanceable and effective at any range.
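
In code, this usually comes down to branching your widget’s SwiftUI layout on that threshold. Here is a minimal sketch, assuming the level-of-detail environment value that WidgetKit exposes for this on visionOS 26; the view and its game data are hypothetical:

```swift
import SwiftUI
import WidgetKit

struct SportsScoreWidgetView: View {
    // Assumption: the proximity threshold is exposed as a level-of-detail
    // environment value with .default and .simplified cases.
    @Environment(\.levelOfDetail) private var levelOfDetail

    // Hypothetical game data; a real widget would read this from its timeline entry.
    let homeTeam: String
    let awayTeam: String
    let homeScore: Int
    let awayScore: Int
    let inningSummary: String

    var body: some View {
        switch levelOfDetail {
        case .simplified:
            // Viewed from across the room: just the score, large and glanceable.
            Text("\(homeTeam) \(homeScore) – \(awayScore) \(awayTeam)")
                .font(.title)
        default:
            // Viewed up close: the same score plus extra detail about the game.
            VStack(alignment: .leading, spacing: 4) {
                Text("\(homeTeam) \(homeScore) – \(awayScore) \(awayTeam)")
                    .font(.title2)
                Text(inningSummary)
                    .font(.caption)
            }
        }
    }
}
```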

For example, in the Sports widget, only the most essential details are shown when viewed from afar.

As someone steps closer, more detailed information of the current game is revealed.

To create a smooth and consistent experience, try to maintain shared elements across both distance thresholds whenever possible.

This helps the layout feel continuous, while still rendering each element at a size that’s appropriate and legible at any distance. Adapting layout is one part of the story, but proximity can also influence how people interact with your widget.

Widgets support lightweight, glanceable interactions. For example, you can add a button to reveal more details of a live baseball game.

Use proximity awareness to your advantage so that the interactive areas of your widget are always easy to target, whether someone is standing close or further away.

If your widget doesn’t include interaction, a tap will simply launch your app nearby as a shortcut.
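
If you do add that kind of interaction, widget buttons are driven by App Intents. Here is a minimal sketch; the RevealGameDetailsIntent and the widget kind string are hypothetical:

```swift
import AppIntents
import SwiftUI
import WidgetKit

// Hypothetical intent that toggles extra game detail in the widget.
struct RevealGameDetailsIntent: AppIntent {
    static var title: LocalizedStringResource = "Reveal Game Details"

    func perform() async throws -> some IntentResult {
        // Update your app's shared state here, then refresh the widget so the
        // expanded layout appears. "SportsScoreWidget" is a hypothetical kind.
        WidgetCenter.shared.reloadTimelines(ofKind: "SportsScoreWidget")
        return .result()
    }
}

// A button in the widget view that runs the intent without opening the app.
struct ShowDetailsButton: View {
    var body: some View {
        Button(intent: RevealGameDetailsIntent()) {
            Label("Show details", systemImage: "chevron.down")
        }
    }
}
```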

So how should you approach proximity awareness? Not every widget needs it, but adapting to distance can really improve clarity.

Think of it like responsive design, but instead of screen size, it’s the angular size that changes as someone moves closer or farther away. The layout adjusts to show the right level of detail.

Now we've seen how widgets on visionOS can live in people’s spaces, respond to their environment, and even adapt to how close someone is.

From layout and materials to interaction and customization, there’s a lot of opportunity to create something that feels both useful and deeply personal.

Let’s wrap up with a few key things to keep in mind as you get started.

Think about how your own content could live in this new context.

If your app already includes widgets, consider how they might take on new meaning when placed in someone's environment. And if you’re building a native visionOS experience, think about what kinds of glanceable, persistent content could provide real value throughout the day.

To learn more about the technical details and on how to bring widgets into your app, check out “What’s new in WidgetKit.” visionOS now opens the door to a whole new dimension for widgets. And we’re excited to see what you will build! Thanks for watching.

Develop for Shortcuts and Spotlight with App Intents

Learn how to build App Intents that make your actions available and work best with the new features in Shortcuts and Spotlight on Mac. We'll show you how your actions combine in powerful ways with the new Apple Intelligence actions available in the Shortcuts app. We'll deep-dive into how the new “Use Model” action works, and how it interacts with your app's entities. And we'll discuss how to use the App Intents APIs to make your actions available in Spotlight.

Chapters

Resources

Related Videos

WWDC23

Transcript

Hi, I’m Ayaka and I’m a member of the Shortcuts team. Welcome to Develop for Shortcuts and Spotlight with App Intents. The App Intents framework gives your app’s features more visibility across our platforms, by letting people use core functionality from your app in places like Shortcuts and Spotlight.

The Shortcuts app lets people connect different apps and their actions together, to make the everyday a bit more fast and fluid. You can use Shortcuts to automate repetitive tasks and connect to other functionality from different apps. For example, to save a recipe from Safari to a note in the Notes app. This year, we’re bringing the power of Apple Intelligence into Shortcuts to make weaving these actions together easier, and even a bit more fun. You can also now run Shortcuts actions, including your app’s actions, right from Spotlight on Mac. Today, we’ll cover how you can adopt App Intents to make your app work great with both Shortcuts and Spotlight. We’ll start by introducing the new Use Model action, which allows people to use Apple Intelligence models in their shortcuts. We’ll then do a deep dive into how this action works, along with new ways to run Shortcuts from Spotlight and Automations on Mac. Let's get started.

This is the new Use Model action. It’s one of the many new Intelligent actions we added to Shortcuts this year. Alongside actions for Image Playground, Writing Tools, and more. With this new action, tasks that used to be tedious, like parsing text or formatting data, are now as simple as writing just a few words. You can choose a large server-based model on Private Cloud Compute to handle complex requests while protecting your privacy, or the on-device model to handle simple requests without the need for a network connection. You can also choose ChatGPT if you want to tap into its broad world knowledge and expertise. For example, you can use a model to filter calendar events for the ones that are related to a specific trip, summarize content on the web, for example, to get the Word of the Day, and even keep you up to date on San Francisco’s food scene by asking ChatGPT what’s the latest. Here’s an example shortcut that uses a model. It’s a simple one that helps me organize the notes I take at work every day. It first gets the notes that I’ve created today, loops through them, and uses the model with the request, “Is this note related to developing features for the Shortcuts app?” If the response is yes, it adds it to my Shortcuts Projects folder. Here, because I’m using the output from the model and an if action, which expects a Boolean input, the runtime will automatically generate a Boolean output type. Instead of returning text like, “Yes, this note seems to be about developing features for the Shortcuts app” which is helpful but a bit verbose and of the wrong type, the model returns a yes or no Boolean response, which I can then pass into the If action.

If you need more control, you can always explicitly choose one of these built-in output types. For example, you might want to do this when testing out the action for a flow you have in mind before you know what action you want to connect the output into. Today, I want us to take a closer look at Text, Dictionary, and Content from your apps, and go over what you as a developer should do to make sure that the output from the model connects well with what your app accepts as input. Let’s start with Text.

Text is the bread and butter of language models. At the surface, it might seem like the simplest and most humble option, but there’s actually a lot of complexity and richness to it. Literally. That’s because models often respond with Rich Text. For example, some portions of the response may be Bold or Italic. It might even contain a list or a table like this. If your app supports Rich Text content, now is the time to make sure your app intents use the attributed string type for text parameters where appropriate. An attributed string is a combination of characters, ranges, and a dictionary, that together define how the text should be rendered. By supporting Attributed string input, the output from the model can connect seamlessly and losslessly into your app. Let's see that in action. I have a shortcut here that uses ChatGPT to create a diary entry template for me to use in the Bear app. I’m asking the model to include a mood logging table for the morning, afternoon, and evening, and some space for reflecting on that day’s highlights. The shortcut then takes the output from the model and passes it to the Create Note action from the Bear app. Let me show you how this works by running it.

And there is my new diary entry for today. It includes rich text formatting, like bolding important info, and it also includes the mood logging table like I requested. Highlights for today? How about “Recorded WWDC talk!”? Anyway, I’ll finish this journaling session later.

Because the Bear app’s Create Note app intent supports Attributed string, it was able to take the Rich Text output from the model and present it losslessly in their app.
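
Adopting this in your own intents mostly means typing text parameters as AttributedString rather than String. Here is a minimal sketch of a hypothetical note-creation intent (not the actual Bear implementation):

```swift
import AppIntents
import Foundation

struct CreateNoteIntent: AppIntent {
    static var title: LocalizedStringResource = "Create Note"

    // AttributedString preserves the bold text, lists, and tables that the
    // Use Model action can produce; a plain String would flatten them.
    @Parameter(title: "Text") var text: AttributedString

    func perform() async throws -> some IntentResult {
        // Persist the note with your app's own storage here; this sketch only
        // demonstrates the parameter type.
        return .result()
    }
}
```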

If you want to learn more about supporting Attributed strings in your app, you’ll want to check out the “What’s New in Foundation” video and the Rich Text “Code-along” session. Next, let's take a look at Dictionary. The Dictionary output type is useful when you need multiple pieces of data returned from a single request in a structured format. For example, I might want to create a shortcut that looks at all the files in my invoices folder, extracts information like vendor, amount, and date from each item and adds it as a row to a spreadsheet so I can better track my finances. To do this, I can use the model to extract that info and tell exactly how I want the output Dictionary to be formatted. I can then use the values from the Dictionary in subsequent actions, like adding a row to a spreadsheet. Thanks to language models, I’m able to take unstructured data like the contents of a PDF, and transform it into exactly the structure I need to connect it to another action. Finally, let’s take a look at Content from your apps.

Content from your apps are represented as app entities, defined by you, using the App Intents Framework. For example, a Calendar app might provide entities for Calendars and Events. If App Intents are the actions or verbs from your app, App Entities are the nouns. You can pass App Entities into the model too. If you pass in a list of entities like calendar events into the request, you’ll be presented with an extra option: The app entity type that you passed in. For example, if I pass in a list of calendar events, I can ask the model to filter the calendar events to only the ones related to a specific trip. Under the hood, the action passes a JSON representation of your entity to the model, so you’ll want to make sure to expose any information you want it to be able to reason over, in the entity definition.

First, all entity properties exposed to Shortcuts will be converted to a string, and included in the JSON representation. The name provided in the type display representation will also be included, to hint to the model what this entity represents, like a Calendar Event. Lastly, the title and subtitle provided in the entity’s display representation will be included. Let's take a look at an example.

In the simplified representation of a Calendar Event, the title of the event, start date, and end date will be included, as well as the type name provided by the type display representation, and the title and subtitle provided in the display representation.
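
As a sketch, an event entity along those lines might be declared like this; the names and the minimal backing query are illustrative, not an exact recreation of the session’s example:

```swift
import AppIntents
import Foundation

struct EventEntity: AppEntity {
    // Hints to the model (and to Shortcuts) what this entity represents.
    static var typeDisplayRepresentation = TypeDisplayRepresentation(name: "Calendar Event")
    static var defaultQuery = EventEntityQuery()

    var id: UUID

    // Properties exposed to Shortcuts are stringified into the JSON
    // representation that the Use Model action passes to the model.
    @Property(title: "Title") var title: String
    @Property(title: "Start Date") var startDate: Date
    @Property(title: "End Date") var endDate: Date

    init(id: UUID, title: String, startDate: Date, endDate: Date) {
        self.id = id
        self.title = title
        self.startDate = startDate
        self.endDate = endDate
    }

    // The title and subtitle here are included in that representation as well.
    var displayRepresentation: DisplayRepresentation {
        DisplayRepresentation(
            title: "\(title)",
            subtitle: "\(startDate.formatted()) – \(endDate.formatted())"
        )
    }
}

// Minimal backing query so the entity is complete; it's extended in later sketches.
struct EventEntityQuery: EntityQuery {
    func entities(for identifiers: [EventEntity.ID]) async throws -> [EventEntity] {
        // Look up events by ID in your app's own store (app-specific).
        []
    }
}
```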

These strings defined on your entities are also displayed in the Shortcuts app when inspecting the properties of an entity that’s being passed into another action, so you’ll want to verify that they look good there too. Now that we know how to structure the entities, let’s make sure there’s a way to get these entities to pass into the model. In this case, the calendar entities to filter. In Shortcuts, the most common way to get entities is through a Find action. This type of action allows people to get entities from your app by using their properties as filters, like the start date of an event, or the calendar that it belongs to. You can create a Find action by implementing your own queries that conform to the Enumerable Entity Query and Entity Property Query protocols. Or, if you already donate your app entities to Core Spotlight by adopting the Indexed Entity protocol, you can adopt new APIs to associate your App Entity’s properties to their corresponding Spotlight attribute keys for the system to automatically generate a Find action. Let’s take a look at our Event Entity example from earlier.

Here, I’ve already conformed Event Entity to the Indexed Entity protocol.

The title, subtitle, and image from the display representation will automatically be associated with their respective Spotlight attribute keys.

In order to associate properties from your entity to the corresponding attribute key on the Spotlight Entity, you can use the new indexing key parameter. Here, the event title property is being associated with the event title Spotlight attribute key.

There are some cases where there isn’t an existing, corresponding attribute key. In those cases, you can use the custom indexing key parameter on the property to specify a custom key, like I did here with the notes property.

And this is the Find action that will be available in the Shortcuts app, based on that Indexed Entity. You can also check out the App Intents Travel Tracking App available on “developer.apple.com” for another example. That’s everything you need to know about how to structure your app entities and provide them to the model. Now, let’s take a look at another thing people can do with the Use Model action. This action provides an option to “follow up” on the request so they can go back and forth with the model to get the output just right before passing it to the next action. Let me show you how I've been using this. As someone who’s been trying to cook more, I have a shortcut that lets me quickly extract the list of ingredients from a recipe using a model, and put it in my Grocery List in the Things app. By turning on the Follow Up toggle, I get the option to follow up on my initial request and make adjustments. For example, I can ask the model to make modifications to the recipe before saving the ingredients. Now, let’s go take a look at a pizza recipe I’ve had my eyes on in Safari.

Here’s a recipe for some Neapolitan style pizza that a friend shared with me. This looks really good. So let me run my shortcut to save the ingredients in my Grocery List.

Alright, so it looks like I’ll need 400g of zero zero flour, 100g of whole wheat flour, some yeast, salt. Okay, this looks great. I’m actually having a pizza party, so I’m going to want to make some extra.

Here because I had Follow Up enabled, I’m presented with a text field to follow up. I’m going to request: “Double the recipe”.

All right, so now I need 800g of zero zero flour, 200g of whole wheat flour. Okay, this looks perfect! And here are the ingredients for my pizza party, in the Things app. And that’s your new Use Model action. Now let’s take a look at Spotlight on Mac.

Spotlight lets you search for apps and documents across the system. This year, you can now run actions from your app directly from Spotlight on Mac! Your apps can show actions in Spotlight by adopting App Intents, just like what you do for Shortcuts. The best practices for how to design app intents for Shortcuts applies directly to Spotlight, including writing a great parameter summary. A parameter summary is a short, natural language representation of what the app intent does, including the parameters it needs to run.

That same parameter summary is shown in the Shortcuts Editor when you create a shortcut. Spotlight is all about running things quickly. To do that, people need to be able to provide all the information your intent needs to run directly in Spotlight. Let me go over how to do that.

First, the parameter summary, which is what people will see in the Spotlight UI, must contain all required parameters that don’t have a default value. If you have an intent that doesn’t have any parameters like that, you don’t need to provide a parameter summary. Spotlight can fall back to showing the title of the intent instead. Secondly, you’ll want to make sure that the intent is not hidden from Shortcuts or Spotlight. For example, by setting “is discoverable” to false or setting “assistant only” to true in your intent implementation.

If you already adopted an intent for widget configuration that doesn’t have a perform method, that will also not show up in Spotlight. Let's take a look at a few examples.

I have an intent here called Create Event Intent that people can use to create a new calendar event. It currently has three parameters, a title, start date, and end date. This intent will show up in Spotlight because all of its required parameters are present in the parameter summary. However, if I add a new Notes parameter as a required parameter without a default value and don’t add it to the parameter summary, the intent will no longer show up in Spotlight. But if I update the notes parameter to be optional, the intent will again show up in Spotlight.

Alternatively, I could keep the parameter required and provide a default value, like an empty string in this case.
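
Here is a minimal sketch of that eligible shape, with the notes parameter kept optional and every remaining required parameter included in the parameter summary (the event creation itself is omitted):

```swift
import AppIntents
import Foundation

struct CreateEventIntent: AppIntent {
    static var title: LocalizedStringResource = "Create Event"

    @Parameter(title: "Title") var eventTitle: String
    @Parameter(title: "Start Date") var startDate: Date
    @Parameter(title: "End Date") var endDate: Date

    // Optional, so it can stay out of the parameter summary without hiding the
    // intent from Spotlight. (Alternatively, keep it required and give it a
    // default value, such as an empty string.)
    @Parameter(title: "Notes") var notes: String?

    // Every required parameter without a default value appears here, which is
    // what keeps the intent eligible to run from Spotlight.
    static var parameterSummary: some ParameterSummary {
        Summary("Create \(\.$eventTitle) from \(\.$startDate) to \(\.$endDate)")
    }

    func perform() async throws -> some IntentResult {
        // Create the event in your app's calendar store here (app-specific).
        return .result()
    }
}
```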

For best practices on designing parameter summaries, like choosing which parameters to make optional, be sure to check out the Design App Intents for system experiences video. Once you’ve gotten your intent to show up in Spotlight, you’ll now want to optimize the user experience.

This includes supporting suggestions, search by typing, and providing background and foreground running options. Let's take a look.

When someone searches for and selects your intent in Spotlight, they need to fill out the required parameters before running the intent. In order to make this interaction quick, you should provide suggestions for how to fill in these parameters. There are a couple of protocols you can implement to do this.

You can implement Suggested Entities as a part of the Entity Query protocol, or all entities as a part of the Enumerable Entity Query protocol.

You should use suggested entities when there is a subset of a large or otherwise unbounded list of entities you want to suggest, for example, a list of calendar events in the coming day, instead of all events, past and present. All entities is great if the list of entities is smaller and bounded, for example, a list of timezones.
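
As a sketch, a suggestedEntities() implementation for the event query from the earlier entity sketch might look like this; the upcoming-events lookup is a hypothetical helper:

```swift
import AppIntents
import Foundation

extension EventEntityQuery {
    // Suggest a small, relevant subset rather than every event ever created:
    // here, events starting within the next 24 hours.
    func suggestedEntities() async throws -> [EventEntity] {
        let windowEnd = Date.now.addingTimeInterval(24 * 60 * 60)
        // fetchUpcomingEvents() is a hypothetical helper backed by your app's store.
        let upcoming: [EventEntity] = try await fetchUpcomingEvents()
        return upcoming.filter { $0.startDate <= windowEnd }
    }
}
```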

You can also tag on-screen content by setting the App Entity Identifier property on NS user Activity to provide suggestions based on what content or entity that is currently active. For example, the detail view of a specific calendar event. For more details on that API, check out the session; “Exploring New Advances in App Intents.” Lastly, your intent can also adopt the Predictable Intent protocol so Spotlight can surface suggestions based on how that intent is used. Next, let's consider the experience when someone starts typing into a parameter field. If you already implemented suggestions, you’ll automatically get basic search and filtering functionality for the suggestions you provide. But in cases where there are more entities beyond the suggestions that someone might want to select, you should add deeper search support by implementing queries.

You can implement the Entity String Query protocol or implement Indexed Entity as we walked through earlier.
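
For the Entity String Query route, here is a minimal sketch that extends the same hypothetical event query; the store lookup is again illustrative:

```swift
import AppIntents
import Foundation

extension EventEntityQuery: EntityStringQuery {
    // Called as someone types into the parameter field in Spotlight, so they
    // can reach entities beyond the initial suggestions.
    func entities(matching string: String) async throws -> [EventEntity] {
        // fetchAllEvents() is a hypothetical helper backed by your app's store.
        let allEvents: [EventEntity] = try await fetchAllEvents()
        return allEvents.filter { $0.title.localizedCaseInsensitiveContains(string) }
    }
}
```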

You can find an example implementation of Entity String Query in the App Intents Sample Code app available on “developer.apple.com”. And for details on how to implement Indexed Entity, check out the “What’s new in App Intents” talk from 2024. Next, let’s consider the experience of running the action. In the example of creating an event, sometimes people want the action to run entirely in the background for a quick in and out, but in other cases, it’s nice to see the created event in the app itself.

To support both types of experiences, you can separate your intents where appropriate into background and foreground intents. For example, we could have a “create event” intent be a background intent, so that people can create calendar events in the background without opening the app. We can also have an “open event” intent that takes the app into the foreground by opening a specific event, which is a useful action on its own. We can then pair these two intents together by having the background intent return the foreground intent as an Opens Intent. In this case, the Create Event Intent can return an Open Event Intent as an Opens Intent.
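
Here is a rough sketch of that pairing, shown as a variant of the earlier CreateEventIntent sketch; OpenEventIntent and the event construction are illustrative:

```swift
import AppIntents
import Foundation

// Foreground intent: opens the app and shows a specific event.
struct OpenEventIntent: AppIntent {
    static var title: LocalizedStringResource = "Open Event"
    static var openAppWhenRun: Bool = true

    @Parameter(title: "Event") var event: EventEntity

    init(event: EventEntity) { self.event = event }
    init() {}

    func perform() async throws -> some IntentResult {
        // Navigate to the event in your app's UI (app-specific).
        return .result()
    }
}

// Background variant of the earlier CreateEventIntent sketch: it creates the
// event without bringing the app forward, and returns the foreground intent
// so people can jump to the result when they want to.
struct CreateEventIntent: AppIntent {
    static var title: LocalizedStringResource = "Create Event"

    @Parameter(title: "Title") var eventTitle: String
    @Parameter(title: "Start Date") var startDate: Date
    @Parameter(title: "End Date") var endDate: Date

    func perform() async throws -> some IntentResult & OpensIntent {
        // Create the event in your app's store here, then hand back its entity.
        let created = EventEntity(id: UUID(), title: eventTitle,
                                  startDate: startDate, endDate: endDate)
        return .result(opensIntent: OpenEventIntent(event: created))
    }
}
```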

For more details on this, check out the “Dive into App Intents” video. And that's Spotlight on Mac. Now let’s put the spotlight on Automations. This year, we’re bringing personal automations to the Mac, with new automation types like folder and external drive automations built specifically for Mac, along with automation types you might already be familiar with from iOS, like Time of Day and Bluetooth. For example, I can now make it so that my invoice processing shortcut from earlier runs every time I add a new invoice to a specific folder instead of having to run it manually. As long as your intents are available on macOS, they will also be available to use in Shortcuts to run as a part of Automations on Mac. This includes iOS apps that are installable on macOS.

That adds Spotlight and Automations on Mac to the many ways you can run Shortcuts, including action button, control center, and much more.

And with the addition of new intelligent actions like Use Model, the possibilities for how your app’s actions can be used across the system are endless. Let's wrap up with some next steps. First, expose content from your app as entities that work great in Shortcuts, including the new Use Model action. That means exposing find actions and making sure the entities expose key properties that you would want a model to be able to reason over.

Next, use attributed strings to allow Rich Text to be passed into your apps, like what we saw with the Bear app demo.

Last but not least, optimize your intents for Spotlight on Mac and make sure they look great in Shortcuts too. Thanks for watching.

Summary

  • 0:00 - Introduction

  • The App Intents framework enhances app visibility across Apple platforms, enabling people to integrate app features into Shortcuts and Spotlight. The Shortcuts app automates tasks by connecting apps, and Apple Intelligence is now integrated to simplify shortcut creation. You can adopt App Intents to make your apps work with Shortcuts and Spotlight; new features include running Shortcuts from Spotlight on Mac and utilizing Apple Intelligence models in shortcuts.

  • 1:16 - Use Model

  • The new Use Model action in Shortcuts streamlines complex tasks using language models. People can choose from server-based, on-device, or ChatGPT models for various requests, such as filtering calendar events, summarizing web content, or organizing notes. The action can generate different output types, including Text, Dictionary, and Content from apps. Text output can be rich, so ensure your apps support attributed strings to preserve formatting. Dictionary output is useful for structured data, enabling tasks like extracting information from invoices and adding it to spreadsheets. Content from apps allows people to work with app entities defined using the App Intents Framework, facilitating seamless integration between different apps and the language models. The Shortcuts app's 'Find' action is commonly used to retrieve entities based on their properties. You can implement 'Find' actions by conforming to specific protocols or by associating app entity properties with Core Spotlight attribute keys. The 'Use Model' action allows people to interact with the model's output. For example, someone can extract ingredients from a recipe, then use the 'Follow Up' feature to modify the request, such as doubling the recipe, before saving the ingredients to a grocery list app.

  • 11:40 - Spotlight on Mac

  • Spotlight on Mac is a powerful search feature that enables people to locate apps and documents across their system. This update introduces the ability to run actions directly from Spotlight, enhancing user efficiency. You can achieve this in your apps by adopting App Intents, which allow apps to display actions in Spotlight. To optimize the user experience, follow these best practices: Provide suggestions for filling in parameters. Implement search functionality. Support both background and foreground running options. Pair background intents with foreground intents to provide a seamless user flow.

  • 17:18 - Automations on Mac

  • This Mac update introduces personal automations, enabling people to create shortcuts triggered by specific events like folder changes or Bluetooth connections. These automations can use existing iOS shortcuts and intents from macOS apps, enhancing system-wide efficiency. Optimize your apps for Spotlight and Shortcuts, allowing for richer text integration and more intelligent actions.

Discover Apple-Hosted Background Assets

Building on Background Assets, this session will introduce the new capability to download asset packs of content for games and other applications. Learn how Apple can host these asset packs for you or how to manage self-hosting options. We'll delve into the native API integration and the corresponding App Store implementations, providing you with the tools to enhance your app's content delivery and user experience.

Chapters

Resources

Related Videos

WWDC25

WWDC23

Transcript

Hello, I'm Gabriel. I’m a software engineer on the App Store team. And my name is Jenny. I’m an engineer on the App Processing team. Today, we’re introducing a new way to distribute assets for your app on the App Store with Background Assets. I’ll cover how your app and assets work together on people’s devices. And I will share how you can use Apple hosting for your assets.

In this session, we will first review the current available asset delivery technologies, including Background Assets. Then dive into what’s new this year, including new Swift and Objective-C APIs to manage your Background Assets, as well as Apple hosting.

We will show you how to integrate the new features into your app for iOS, iPadOS, macOS, tvOS, and visionOS, and how to do local testing.

If you would like that Apple host and deliver your assets, we will walk you through how to prepare for beta testing and App Store distribution.

Now, let’s hear from Gabriel for a recap and what’s new in Background Assets.

Thanks, Jenny. When people download an app from the App Store, they expect to use it immediately. They may leave or even delete your app if they must wait for other downloads to finish after opening it.

With Background Assets, we’re making it even easier to deliver a great first launch experience. You can configure how the system downloads your assets on devices and update them without needing to update your main app. For example, you could deliver a tutorial level only to people who newly download a game to get them playing quickly while the rest of the game is downloaded in the background, offer optional downloadable content, also known as DLC, that you unlock with In-App Purchase, or update on device machine learning models with an accelerated submission process to the App Store. Suppose that you’re developing a game with several different levels, including a tutorial. You have four options for delivering the assets for each level: Keeping everything in your main app bundle, URLSession, On-Demand Resources, or Background Assets. There are pros and cons to some of them, so let me go through it.

Keeping everything in your main app bundle forces people to wait for all of the assets to download, even if you only need some of them to start the tutorial level. You could also hit the 4GB size limit on most platforms, and updating just one of your assets would require re-uploading and resubmitting your entire app. On-Demand Resources would let people jump into the tutorial more quickly, because it lets you download some parts of your app bundle separately from TestFlight or the App Store, but you would still need to update your entire app just to update a few asset files. On-Demand Resources is a legacy technology, and it will be deprecated.

Its successor is Background Assets, with which you host your app’s assets on your own server. You can update those assets at any time without updating your whole app. At the core of Background Assets is the downloader extension, which lets you write code to schedule asset downloads before people open your app. This is great when you need full control over download behavior and post-processing. For many of you, though, we know that what matters most is that your app’s assets simply be available and up to date. With the new Managed Background Assets features, the system automatically manages downloads, updates, compression, and more for your asset packs.

In fact, we’ve written a system-provided downloader extension that you can drop into your app with no custom code needed.

Plus, with the new Apple-Hosted Background Assets service for apps on TestFlight and the App Store, you no longer need to host your assets on your own server. If you choose the Apple hosting option, you get 200GB of Apple hosting capacity included in your Apple Developer Program membership.

If you’re still using On-Demand Resources, then we suggest that you start your migration to Background Assets.

With Managed Background Assets, you create multiple asset packs that each group together some of your asset files, such as the textures, sound effects, and GPU shaders for a tutorial level in a game. The system automatically downloads an asset pack on people’s devices based on its download policy. There are three download policies: Essential, Prefetch, and On-demand.

An essential download policy means that the system automatically downloads the asset pack and integrates the download into the installation process. The asset pack contributes to the overall download progress that people see in the App Store, in TestFlight, and on the home screen. And once the installation finishes and a person opens your app, the asset pack is ready to use.

A prefetch download policy means that the system starts downloading the asset pack during the installation of the app, but the download could continue in the background after the app’s installation finishes. An on-demand download policy means that the system downloads the asset pack only when you explicitly call an API method to request it.

You can either host your asset packs yourself or let Apple host them for you. Now, I’ll hand it over to Jenny to cover how Apple servers deliver asset packs and app builds. Sure thing! Each Apple hosted asset pack can be used on one app, across multiple platforms of your choice.

For the device to download the app binary and any asset pack it uses, you will first need to separately upload both to App Store Connect. You can then submit them for review for external testing in TestFlight and App Store distribution.

Once uploaded, the asset pack is assigned a version, and it is not tied to any specific app build. How the app and asset pack match up on the device is determined by the state of the asset pack version. Let me explain with some examples.

Let’s say you have three different versions for the same asset pack. Version 1 is live on the App Store, Version 2 is live for external beta on TestFlight, and Version 3 is live for internal beta on TestFlight. Only one version of the asset pack can be live for each context.

At the same time, you have some app builds on devices that are downloaded from the App Store, or through TestFlight external beta or internal beta.

The server will select the live version of your asset pack to deliver to all your app builds in that particular context.

That means app version 1.0 build 1 that is downloaded from the App Store will use asset pack version 1.

App version 2.0 build 1 that is available for external beta in TestFlight will use asset pack version 2. And app version 2.0 builds 2 and 3 in internal beta will use asset pack version 3.

Now, it is important to understand the behavior when you make updates to the Asset Pack version.

For example, if you’re happy with asset pack version 2, you can submit it for App Store distribution. It will replace the old version to be live on the App Store.

This means all versions of your app downloaded from the App Store will automatically be switched over to using asset pack version 2, including older versions that are still installed on people's devices. So, before you update the asset pack, make sure that it will work on older app builds and versions as well. Now, let’s look at an example for app build update. Let’s say you want to submit app version 2.0 build 3 for external beta testing. After approval, if the build is downloaded through external beta, it will use the older asset pack version 2. If you would like it to be paired with the newer asset pack version 3, make sure to submit the asset pack version as well. Now that you’re familiar with the concept of asset packs, Gabriel will walk you through how to create them and use them in your app. Thanks, Jenny. Now, let me show you how to use Managed Background Assets in your app.

To get started with Managed Background Assets, you’ll create asset packs, adopt the new APIs, and test your app and asset packs locally.

Let me explain how to create an asset pack.

You can use the new packaging tool for macOS, Linux, and Windows that takes files from your source repository and packages them into a compressed archive for delivery to TestFlight and the App Store. The packaging tool ships with Xcode on macOS and will soon be available to download from the download section of the Apple developer website for Linux and Windows. Let me show you the tool in action.

You can start by running the template command to generate a manifest template. On macOS, install Xcode and run xcrun ba-package template in Terminal. On Linux or Windows, make the tool available in your shell’s search path and run ba-package template.

The packaging tool will generate a manifest template. The manifest is a JSON file that you fill out to tell Apple about your asset pack. You can choose a custom ID. This is what you’ll use to identify the asset pack in your app’s code, a download policy, and the set of platforms that the asset pack supports.

Firstly, let's fill in the file selectors. A file selector selects a set of files in your source repository to include in your asset pack. There are two types of file selectors: ones that select individual files, and ones that select entire directories.

Let’s add a file selector for the game’s introductory cutscene video, using a relative path from the root of the source repository.

This asset pack’s ID is Tutorial because it contains the asset files for the game’s Tutorial level. Now, let's configure the download policy. Since the tutorial is the first thing that people experience, let’s make the tutorial asset pack available locally before someone can open the game for the first time. So, this is a great situation in which to use an essential download policy.

This tutorial level is relevant only for people who newly install the game. People who already play the game and are simply updating to a new version shouldn’t need to download the tutorial again.

So let’s restrict the essential download policy to just the first installation and exclude subsequent updates. This means that only someone who downloads the game on their device for the first time will get the tutorial’s assets.

Now that the manifest is filled out, let’s run the packaging tool again to generate a compressed archive. We’ll set the current directory to the root of the repository and pass in the path to the manifest and the path at which to save the archive. Now that the asset pack is packaged, let’s see how we can use it in the game.

With just a few lines of code, you can use the new APIs in Background Assets to read the files in your asset packs. We’ll start by adding a downloader extension in Xcode. The downloader extension is how your app schedules asset packs to be downloaded when the main app isn’t running, such as during the installation process.

Let’s add a new target and select the Background Download template.

Here you can choose whether you want to use Apple hosting or your own.

The template generates Swift code, but you can easily replace it with Objective-C code if you prefer.

New this year, the system provides a fully featured downloader extension that supports automatic downloads, background updates, and more, which you can drop into your app with no custom code. The snippet that Xcode generates with the Background Download template is already configured to use the system implementation by default. There's no other extension code to write.

In fact, you can even remove the stub shouldDownload(_:) method entirely if you don’t need to customize the download behavior. This means that you can now add a downloader extension to your app with just a few lines of Xcode-generated code.

If you do want to customize the download behavior, then you can provide a custom implementation for shouldDownload(_:). The system calls your shouldDownload(_:) implementation for every new asset pack that, based on the asset pack’s download policy, it plans to download in the background, and you can return a Boolean value to decide whether to proceed with the download. This can be useful if some of your asset packs have specific compatibility requirements. Now that you’ve implemented the downloader extension, let’s see how to use files inside downloaded asset packs in your main app.

The first step is to call the ensureLocalAvailability(of:) method on the shared AssetPackManager. This method checks whether the asset pack is currently downloaded. If it isn’t, then it starts downloading it and waits for the download to finish.

In most cases, the downloader extension will have already downloaded the asset pack, so the method will return quickly. In the rare situation in which the asset pack must be newly downloaded, it’s a good idea to provide visible progress information to people using your app.

In Swift, you can await status updates on the asynchronous sequence that the statusUpdates(forAssetPackWithID:) method returns.

In Objective-C, you can create an object that conforms to the BAManagedAssetPackDownloadDelegate protocol and attach it to the shared asset pack manager’s delegate property.

If you need to cancel a download, then you can do so by calling cancel() on any of the progress structures that you receive in the download status updates. Once ensureLocalAvailability(of:) returns without throwing in error, the requested asset pack is ready to use locally.

To read a file from it, call the contents(at:searchingInAssetPackWithID:options:) method on the shared asset pack manager. The first parameter is a relative path from the root of your source repository, that is, the directory from which you previously ran the packaging tool to create the asset pack, to the file that you want to read. The system automatically merges all of your asset packs into a shared namespace, effectively reconstructing your source repository as if it were copy-pasted from your development machine into people’s devices. This means that you don’t need to worry about keeping track of which asset pack contains a particular file when you want to read that file at runtime. By default, contents(at:searchingInAssetPackWithID:options:) returns a memory mapped data instance, which is suitable even for large asset files that take up a lot of space in memory.

If you need low-level access to the file descriptor, such as for reading a file into memory procedurally, then you can use the descriptor(for:searchingInAssetPackWithID:) method instead, in which case it’s your responsibility to close the descriptor when you’re done using it. You can also restrict the search to a particular asset pack by providing a non-nil argument for the assetPackID parameter of either method.

The system tracks which asset packs your app has downloaded and automatically keeps them up to date in the background. It won’t, however, automatically remove your asset packs while your app remains installed, so it’s a good practice to call the remove(assetPackWithID:) method on the shared asset pack manager to free up storage space when you don’t expect to use a particular asset pack anymore. For example, you could remove the Tutorial asset pack when someone finishes playing the tutorial level.

You can always call ensureLocalAvailability(of:) to re-download the asset pack again should, for example, someone reset their progress in the game and start playing the tutorial again.
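
Putting those calls together, a minimal sketch of the main-app flow for the Tutorial asset pack could look like the following. The method names come from the framework as described above, but treat the pack lookup, the exact async and throwing signatures, the omitted options argument, and the cutscene path as assumptions:

```swift
import BackgroundAssets
import Foundation

// Reads one asset from the Tutorial pack in the main app.
func loadTutorialCutscene() async throws -> Data {
    let manager = AssetPackManager.shared

    // Assumed lookup of the pack by the ID chosen in the manifest.
    let tutorialPack = try await manager.assetPack(withID: "Tutorial")

    // Downloads the pack if the downloader extension hasn't already,
    // then waits for it; in most cases this returns quickly.
    try await manager.ensureLocalAvailability(of: tutorialPack)

    // Read a file by its path relative to the source repository root.
    // Passing nil searches the merged namespace of all downloaded packs;
    // the options argument is omitted here on the assumption of a default.
    return try manager.contents(
        at: "Cutscenes/TutorialIntro.mov",
        searchingInAssetPackWithID: nil
    )
}

// When the tutorial is finished, free the storage. The pack can always be
// re-downloaded later with ensureLocalAvailability(of:).
func removeTutorialAssets() async throws {
    try await AssetPackManager.shared.remove(assetPackWithID: "Tutorial")
}
```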

Keep in mind that people can see the storage space that your downloaded asset packs take up listed for your app in the storage view in the Settings app on their device.

The next step is to add both your main app and your downloader extension to the same app group. The system uses the app group to facilitate coordination between your main app and your extension.

Then add the BAAppGroupID key to your main app’s Info.plist with your app group ID as string value.

Also, add the BAHasManagedAssetPacks key with true or yes as its Boolean value.

If you’re using Apple hosting, then add the BAUsesAppleHosting key with true or yes as its Boolean value. If you’re not using Apple hosting, then refer to the related documentation in the Resources section for this session to learn more about the other required Info.plist keys, including BAManifestURL. Now that the app and the extension are configured, it's time to test.

We created a new Background Assets mock server for macOS, Linux, and Windows to test asset pack downloads before you submit your app to TestFlight or the App Store. Like the packaging tool, the mock server ships with Xcode on macOS and will soon be available to download from the Downloads section of the Apple Developer website for Linux and Windows.

Background Assets uses HTTPS for all downloads, so you first need to issue an SSL certificate. To do that, you’ll create a root certificate authority. Then you’ll install that certificate authority on your test devices. After that, you'll use the certificate authority to issue an SSL certificate, and finally, you’ll start the mock server and point it to that SSL certificate. For more details, refer to the related documentation in the Resources section for this session. Once you’ve issued your SSL certificate, you can run ba-serve to start the mock server, pass in the respective paths to the Asset Pack archives that you want to serve, and set the host to which to bind.

The host can be an IP address, a host name, or a domain name. On macOS, the tool will prompt you to choose an identity, that is, a pair of a certificate and a private key from your keychain. On Linux or Windows, you’ll need to pass in the certificate and its private key on the command line. Pass the help flag to learn about additional options, including an option to set a custom port number, and coming soon on macOS, the ability to skip the identity prompt. On iOS, iPadOS and visionOS test devices, and coming soon on tvOS test devices, go to Development Overrides in Developer Settings to enter your mock server’s base URL, including its host and port. On macOS test devices, run xcrun ba-serve url-override to enter the same information.

When you build and run your app from Xcode, it’ll download its asset packs from your mock server. Now, I’ll hand it back to Jenny to talk about beta testing and distributing your asset packs. - Jenny? - Thanks Gabriel. Now that you have the new app binary and asset packs ready, let me take you through how to prepare them for beta testing on TestFlight and distribution on the App Store.

If you’re using Apple hosting, you will upload both your app binary and asset packs to App Store Connect separately, optionally test with TestFlight, and finally distribute on the App Store.

Let’s take a deeper look at how this process works for asset packs. There are several ways to upload an asset pack.

For a drag and drop UI experience, you can use the Transporter app for macOS.

For full control and transparency, you can use the App Store Connect REST APIs to build your own automation.

For a simplified command line interface, the cross platform iTMSTransporter provides useful commands and makes requests for you to the App Store Connect APIs. You can check out the documentation for all of these tools. Here, I will walk you through using Transporter and the App Store Connect API. With the Transporter app, simply drag and drop your Tutorial.aar archive into the Transporter window.

Add it to the app of your choice, then click Deliver. It will become a new version of the Tutorial asset pack.

You will be able to see the state of your asset pack upload right in the Transporter app.

If you would like full visibility of the upload process, you can use the App Store Connect API. You will need to follow three steps: create asset pack record, create version record for the asset pack, then upload the archive to the asset pack version.

First, to create the asset pack record, you can make a POST request to the backgroundAssets resource.

In the request body, put in the name of the asset pack as the assetPackIdentifier and add your app’s Apple ID in the relationships section. The response of this call will return a UUID for your asset pack, which you can use in later API calls.
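
As a rough illustration of that first request in Swift, not an official client: the JSON:API body shape and header usage below are assumptions, and the app Apple ID and bearer token are placeholders, so confirm the exact attribute and relationship keys against the App Store Connect API reference.

import Foundation

// Hypothetical sketch: create the asset pack record with a POST request.
func createAssetPackRecord(appAppleID: String, token: String) async throws -> Data {
	var request = URLRequest(url: URL(string: "https://api.appstoreconnect.apple.com/v1/backgroundAssets")!)
	request.httpMethod = "POST"
	request.setValue("Bearer \(token)", forHTTPHeaderField: "Authorization")
	request.setValue("application/json", forHTTPHeaderField: "Content-Type")
	request.httpBody = Data("""
	{
	  "data": {
	    "type": "backgroundAssets",
	    "attributes": { "assetPackIdentifier": "Tutorial" },
	    "relationships": { "app": { "data": { "type": "apps", "id": "\(appAppleID)" } } }
	  }
	}
	""".utf8)
	// The response body contains the asset pack's UUID for use in the later calls.
	let (data, _) = try await URLSession.shared.data(for: request)
	return data
}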

Then you will create a new version for your asset pack by making a POST request to the backgroundAssetVersions resource.

In the relationships section, use the ID of the asset pack provided to you by the previous API response. This operation automatically increases the version number based on existing versions. Here we will have version 1. You will also get an ID in the response to uniquely identify version 1 of the Tutorial asset pack.

When the asset pack version is successfully created, you can use the backgroundAssetUploadFiles resource to make an upload reservation for your Tutorial.aar archive file. This step is similar to other upload operations, like uploading App Store screenshots.

Here you would need the asset type, file name, file size, and MD5 checksum.

Also include the relationship to the background asset version ID. The response of this call includes an ID for your upload file, and details about the upload instructions.

When you successfully upload your archive, make a PATCH call to the backgroundAssetUploadFiles with your Upload File ID, and your asset pack will start processing.

To view the asset pack’s progress, you can always get the most comprehensive information about your available asset packs in App Store Connect or using the App Store Connect APIs.

On App Store Connect, you can find the asset pack processing status under the TestFlight tab. Once the upload is successfully processed, the status will become Ready for Internal Testing. This means your new version is ready to be used in your app builds that are live for internal testing in TestFlight. You will receive an email informing you that your new asset pack version is ready. You can also set up a webhook to get this notification.

Feel free to check out the WWDC25 session “Automate your development process with the App Store Connect API” to find out more about the webhook features this year.

However, during processing, if there is an issue discovered with your asset pack archive file, the version state will become Failed and you will be notified as well. You can fix the issue accordingly, and upload the asset pack again as a new version.

Using App Store Connect APIs, there are many resources you can make GET requests to, to see a list of all asset packs for your app, versions for each asset pack, and their respective states. You can follow the progress of your upload with the backgroundAssetVersions resource.

When your asset pack is processed, an Internal Beta Releases resource will be created. It will show you that the asset pack version is READY_FOR_TESTING.

After you successfully upload both your app binary and asset pack, you can start beta testing in TestFlight.

On App Store Connect, you can see that currently Tutorial version 1 is ready for internal testing. If you’d like to test your app build and asset pack version with a wider audience, you can separately submit them for external testing.

To submit an asset pack, click on the specific version and select Submit for External Testing. When the version is approved, you will see the state change to Ready for External Testing and receive a notification.

With App Store Connect APIs, you can also submit the asset pack version using the betaBackgroundAssetReviewSubmissions resource.

You can follow the state of the review with the external beta releases resource. It will show you when the asset pack version is ready for external testing.

After testing, when you’re ready to make your asset pack version available on the App Store, you can submit it to App Review for App Store distribution.

You can head to the distribution tab on App Store Connect and view asset packs. Now, you can submit one asset pack version by itself or alongside other asset packs, an app version, and perhaps other review items together. If you submit an asset pack version and an app version together, the App Review team will use your selected asset pack version to review your app.

To add an asset pack for review, click on Select Asset Pack, choose the version, and add for review.

When the submission is approved, you will see that the asset pack version is Ready for Distribution.

Using the App Store Connect API, you can also submit for review using the reviewSubmissions resource.

During the review, you can see the review states on the App Store releases resource. When it reaches Ready for Distribution, people using your app will be able to download the new assets from the App Store.

Wow, we covered quite a lot of topics today. We talked about asset packs, how to create one and use it in your app, and how to upload and submit for Apple hosting.

We hope that you find this session helpful to get started with the new Managed Background Assets.

Now it's your turn. Try using the packaging tool to create your first asset pack. Then, if your app still uses On-Demand Resources, evaluate how you can migrate to Background Assets. Take a look at the documentation and adopt the new Background Assets APIs in your app. Finally, we want your feedback. Please let us know what works for you and what doesn’t using Feedback Assistant.

You can also check out everything else that’s new this year in App Store Connect by watching the “What’s new in App Store Connect” session. Plus, learn more about existing Background Assets features, including some helpful tips for local testing, with the “What’s new in Background Assets” session from WWDC23.

- Thanks for joining us. - And we look forward to your feedback.

Code

8:26 - Fill out the manifest

{
	"assetPackID": "[Asset-Pack ID]",
	"downloadPolicy": {
		"essential": { // Possible keys: “essential”, “prefetch”, or “onDemand”
			// Essential and prefetch download policies require a list of installation event types. For an on-demand download policy, the value for the “onDemand” key must be an empty object.
			"installationEventTypes": [
				// Remove undesired elements from this array.
				"firstInstallation",
				"subsequentUpdate"
			]
		}
	},
	"fileSelectors": [
		// You can add as many file and/or directory selectors as you want.
		{
			"file": "[Path to File]"
		},
		{
			"directory": "[Path to Directory]"
		}
	],
	"platforms": [
		// Remove undesired elements from this array.
		"iOS",
		"macOS",
		"tvOS",
		"visionOS"
	]
}

10:44 - Add a downloader extension

import BackgroundAssets
import ExtensionFoundation
import StoreKit

@main
struct DownloaderExtension: StoreDownloaderExtension {
	
	func shouldDownload(_ assetPack: AssetPack) -> Bool {
		return true
	}
	
}

11:39 - Download an asset pack

let assetPack = try await AssetPackManager.shared.assetPack(withID: "Tutorial")

// Await status updates for progress information
let statusUpdates = AssetPackManager.shared.statusUpdates(forAssetPackWithID: "Tutorial")
Task {
	for await statusUpdate in statusUpdates {
		// …
  }
}

// Download the asset pack
try await AssetPackManager.shared.ensureLocalAvailability(of: assetPack)

12:22 - Receive download status updates in Objective-C

#import <BackgroundAssets/BackgroundAssets.h>

@interface ManagedAssetPackDownloadDelegate : NSObject <BAManagedAssetPackDownloadDelegate>

@end

@implementation ManagedAssetPackDownloadDelegate

- (void)downloadOfAssetPackBegan:(BAAssetPack *)assetPack { /* … */ }

- (void)downloadOfAssetPackPaused:(BAAssetPack *)assetPack { /* … */ }

- (void)downloadOfAssetPackFinished:(BAAssetPack *)assetPack { /* … */ }

- (void)downloadOfAssetPack:(BAAssetPack *)assetPack hasProgress:(NSProgress *)progress { /* … */ }

- (void)downloadOfAssetPack:(BAAssetPack *)assetPack failedWithError:(NSError *)error { /* … */ }

@end

12:29 - Attach the delegate in Objective-C

static void attachDelegate(ManagedAssetPackDownloadDelegate *delegate) {
	[[BAAssetPackManager sharedManager] setDelegate:delegate];
}

12:33 - Cancel an asset-pack download

let statusUpdates = AssetPackManager.shared.statusUpdates(forAssetPackWithID: "Tutorial")
for await statusUpdate in statusUpdates {
	if case .downloading(_, let progress) = statusUpdate {
		progress.cancel()
	}
}

12:41 - Use an asset pack

// Read a file into memory
let videoData = try AssetPackManager.shared.contents(at: "Videos/Introduction.m4v")

// Open a file descriptor
let videoDescriptor = try AssetPackManager.shared.descriptor(for: "Videos/Introduction.m4v")
defer {
	do {
		try videoDescriptor.close()
	} catch {
		// …
	}
}

13:56 - Remove an asset pack

// Remove the asset pack
try await AssetPackManager.shared.remove(assetPackWithID: "Tutorial")

// Redownload the asset pack
let assetPack = try await AssetPackManager.shared.assetPack(withID: "Tutorial")
try await AssetPackManager.shared.ensureLocalAvailability(of: assetPack)

14:53 - Info.plist

<key>BAAppGroupID</key>
<string>group.com.naturelab.thecoast</string>
<key>BAHasManagedAssetPacks</key>
<true/>
<key>BAUsesAppleHosting</key>
<true/>

Summary

  • 0:00 - Introduction

  • Learn about Background Assets — a new way to distribute app assets on the App Store. Review current asset-delivery technologies, see new Swift and Objective-C APIs, and learn how to integrate these features across iOS, iPadOS, macOS, tvOS, and visionOS. Apple hosting of assets, including preparation for beta testing and App Store distribution is also discussed.

  • 1:01 - New in Background Assets

  • Background Assets is a new feature that enhances the app launch experience by allowing you to download and update app assets separately from the main app. This approach enables people to start using the app immediately, with additional content downloading in the background. There are three main download policies for asset packs: Essential, Prefetch, and On-Demand. You can host your asset packs on your own servers or utilize the new Apple Hosted Background Assets service, which provides 200GB included in your Apple Developer Program membership. Background Assets replaces the deprecated On-Demand Resources technology. It offers greater control and flexibility, so you can optimize app performance and user engagement. The system automatically manages downloads, updates, and compression, making it easy to implement.

  • 7:32 - Sample app development

  • To use Managed Background Assets in an app, you must create asset packs using a new packaging tool available for macOS, Linux, and Windows. This tool generates a JSON manifest file where you specify the asset pack's ID, download policy, supported platforms, and the files to be included by using file selectors. You can configure the download policy to ensure essential assets are available locally before app launch and restrict downloads to first installations. Once the manifest is filled out, the packaging tool generates a compressed archive. To integrate the asset pack into the app, add a downloader extension in Xcode, which schedules asset packs to be downloaded in the background. The system provides a fully-featured downloader extension that supports automatic downloads and updates. Access the downloaded files in the main app using the 'AssetPackManager', ensuring local availability and awaiting status updates if necessary, providing a seamless user experience with optimized asset management. To use the Background Assets framework, the delegate protocol must be attached to the shared asset pack manager's delegate property. Download status updates provide progress structures that can be used to cancel downloads if necessary. Once an asset pack is ready for local use, indicated by 'ensureLocalAvailability(of:)' returning without error, files can be read using 'contents(at:searchingInAssetPackWithID:options:)' or 'descriptor(for:searchingInAssetPackWithID:)' methods. The system automatically manages asset pack updates and storage, but it is recommended to manually remove unused asset packs to free up space. To enable coordination between the main app and downloader extension, you must add them to the same app group, and configure specific Info.plist keys. For testing, a mock server is provided, which requires an SSL certificate. You must enter the mock server's base URL in Development Overrides on test devices. Once configured, the app downloads asset packs from the mock server during testing.

  • 17:24 - Beta testing and distribution

  • To prepare an app for beta testing on TestFlight and distribution on the App Store, you must upload the app binary and asset packs to App Store Connect. Upload asset packs using various methods, including the Transporter app for macOS, which provides a drag-and-drop interface, or the App Store Connect REST APIs for full control and automation. Using the APIs involves three main steps: creating an asset pack record, creating a version record, and uploading the archive. Once the asset pack is processed, you can submit for internal or external testing in TestFlight. After successful testing, you can submit the asset pack version to App Review for distribution on the App Store. Monitor the progress and status of uploads, submissions, and reviews through App Store Connect or the APIs.

Discover machine learning & AI frameworks on Apple platforms

Tour the latest updates to machine learning and AI frameworks available on Apple platforms. Whether you are an app developer ready to tap into Apple Intelligence, an ML engineer optimizing models for on-device deployment, or an AI enthusiast exploring the frontier of what is possible, we'll offer guidance to help select the right tools for your needs.

Chapters

Resources

Transcript

Hi there, I’m Jaimin Upadhyay, an engineering manager on the On-Device Machine Learning team at Apple. Today, I would like to talk about how you can make use of Apple Intelligence and machine learning in your apps and personal projects. Whether you are an app developer ready to tap into Apple Intelligence through UI components or directly in code, an ML engineer converting and optimizing models for on-device deployment, or an AI enthusiast exploring the frontier of what is possible on your Mac, we have the tools for you. I’ll walk you through a high level overview of these tools, highlight the latest additions, and point you to resources to learn more along the way. We will start with an overview of the intelligence built into the operating system and its relationship with your app. Next, we will explore how you can programmatically tap into this intelligence through our system frameworks. We will then talk about how Apple’s tools and APIs can help you optimize and deploy any machine learning model for on-device execution. And we will finish up by discussing how you can stay on top of the latest innovations in ML and AI on Apple hardware.

We’ve got a long and exciting tour to cover, so let’s get started. We start with platform intelligence. Machine Learning and Artificial Intelligence are at the core of a lot of built-in apps and features in our operating system. Whether it’s Optic ID to authenticate you on Apple Vision Pro, or understanding your handwriting to help you with math on iPad, or removing background noise to improve your voice quality on FaceTime, machine learning is at the core. ML Models powering these features have been trained and optimized for efficiency on device and last year marked the start of a new chapter, to bring generative intelligence into the core of our operating systems, with large foundation models that power Apple Intelligence. This brought Writing Tools, Genmoji, and Image Playground across the system, making it easy to integrate them into your apps. If you’re using system text controls, you’ll get Genmoji support automatically. You can even use the APIs to make them appear right in your text.

The Image Playground framework provides Swift UI extensions to bring up the imagePlaygroundSheet in your app. And, for most of you, using the standard UI frameworks to display textViews, your apps were already set up to support Writing Tools. It's that simple. You can either use standard views or add a few lines of code to your custom ones. This way, your users can easily access Apple Intelligence within your apps with a consistent and familiar UI. But, what if you want to go beyond the default UI or need more control? This brings us to the topic of ML-powered APIs that give you programmatic access to system models and capabilities.

We offer a wide variety of such APIs. While some provide access to prominent system models with essential utilities, others expose convenient APIs for specialized ML tasks. Let’s dive into these by revisiting how you can integrate image generation into your app. iOS 18.4 introduced the ImageCreator class in the Image Playground framework.

This lets you create images programmatically. Just instantiate the ImageCreator. Request images based on some ideas. Here, we use a text prompt and a selected style. Then, you can show or use them in your app as you prefer. Also in 18.4, we introduced the Smart Reply API. You can let your users choose generated smart replies for their messages and e-mails, by donating the context to a keyboard. Let’s take a quick look at how you can set it up. To donate your conversation, configure a UIMessage or UIMail ConversationContext with your data, then set it on your entry view before the keyboard is requested.

When a user selects a smart reply from the keyboard for an instant message, it will be directly inserted into the document. However, in a mail conversation, the selection will instead be delegated back to your view’s corresponding insertInputSuggestion delegate method. You can then generate and insert your own longer replies appropriate for an e-mail. To learn more, check out the “Adopting Smart Reply in your messaging or email app” documentation page. Note that this is all running on device and using Apple’s foundation models. In iOS 26, we are going even further with the introduction of the Foundation Models framework. It provides programmatic access to a highly optimized on-device language model that’s specialized for everyday tasks. Now it can power these features across all your apps. It’s great for things like summarization, extraction, classification, and more.

You can use it to enhance existing features in your apps, like providing personalized search suggestions. Or you can create entirely new features, like generating an itinerary in a travel app.

You can even use it to create dialogue on-the-fly for characters in a game. That one is my personal favorite! Prompting the model is as easy as three lines of code. Import the framework, create a session, and send your prompt to the model. Since the framework is on device, your user's data stays private and doesn't need to be sent anywhere. The AI features are readily available and work offline, eliminating the need to set up an account or obtain API keys. And all of this, at no cost to you or your users for any requests.
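
In Swift, that minimal flow looks roughly like this; the prompt here is just an example.

import FoundationModels

// Create a session with the on-device language model and send a prompt.
let session = LanguageModelSession()
let response = try await session.respond(to: "Summarize this trip in one sentence: three days of hiking in Yosemite.")
print(response.content)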

The Foundation Models framework provides much more than simple prompting for text responses. Sometimes you need an LLM to generate structured responses that you can use directly in your app.

This is easy with the Foundation Models framework. You can take existing types in your app and mark them as generable, and add some natural language guides to each property, along with optional controls over their generated values. This lets you use Guided Generation: with a simple prompt, you indicate that the response should generate your type.

The framework will customize the language model decoding loop and stop the model from making structural mistakes. Your data structure is filled with the correct information, so you don’t have to deal with JSON schemas. Just focus on the prompt and let the framework do the rest! The synergy between Swift, the framework and your custom types makes it easy for you to rapidly iterate and explore new ideas within your app.
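
Here is a minimal sketch of that pattern; the Itinerary type, its guides, and the prompt are made up for illustration.

import FoundationModels

// Mark an existing app type as generable and guide each property.
@Generable
struct Itinerary {
	@Guide(description: "A short, catchy title for the trip")
	var title: String

	@Guide(description: "Three suggested activities")
	var activities: [String]
}

// Ask the model to respond with your type instead of free-form text.
let session = LanguageModelSession()
let response = try await session.respond(
	to: "Plan a weekend trip to Yosemite.",
	generating: Itinerary.self
)
print(response.content.title)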

When developing your use case, it’s important to consider the knowledge available to the foundation model. In addition to information provided via your prompt and generable type descriptions, the model has a core set of knowledge derived from the data it was trained on. This data was fixed in time and does not contain recent events. While the model is incredibly powerful for a device-scale model, it’s not as knowledgeable as larger server-scale models. To help with use cases requiring additional knowledge from your app or over the network, the Foundation Models framework supports tool calling. Tool calling lets you go beyond text generation and perform some actions. It provides the model access to live or personal data, like weather and calendar events, not just what was trained months ago. It can even let the model cite sources of truth, which allows the users to fact-check its output. Finally, tools can take real actions, whether it’s in your app, on the system, or in the real world.
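
A sketch of what such a tool could look like follows; the tool name, its argument, and the canned reply are all hypothetical, and a real implementation would fetch live data.

import FoundationModels

// Hypothetical tool that surfaces live data to the model.
struct WeatherTool: Tool {
	let name = "getWeather"
	let description = "Looks up the current temperature for a city."

	@Generable
	struct Arguments {
		@Guide(description: "The city to look up")
		var city: String
	}

	func call(arguments: Arguments) async throws -> ToolOutput {
		// Replace with a real weather lookup in your app.
		ToolOutput("It is 21°C in \(arguments.city).")
	}
}

// Hand the tool to the session so the model can call it when needed.
let session = LanguageModelSession(tools: [WeatherTool()])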

This was just a sneak peek of the framework’s awesome capabilities, but there is so much more. For a more detailed introduction, watch the “Meet the Foundation Models framework” session. There you will also learn about streaming responses, stateful sessions, and the framework’s tight integration with Xcode. And if you prefer learning by doing, we have a code-along session for building your first intelligent app using the new APIs.

We also have a session dedicated to design considerations for your use cases. It focuses on best practices to help you write reflective prompts, AI safety considerations, understanding what is possible with a device-scale language model, and some solid strategies for evaluating and testing quality and safety. Be sure to check out “Explore prompt design and safety for on-device Foundation models” to learn more.

The new Foundation Models framework joins the suite of other Machine Learning powered APIs and tools you can use to tap into on-device intelligence for your app’s features. These frameworks each focus on a specific domain with highly optimized task-specific models.

There is Vision to understand the content of images and videos.

Natural Language to identify language, parts of speech, and named entities in natural language text.

Translation to perform text translations between multiple languages.

Sound analysis to recognize many categories of sound. And speech to identify and transcribe spoken words in audio. All with just a few lines of code.
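
For instance, a minimal Vision sketch for recognizing text in an image could look like this; the image URL is a placeholder.

import Vision

// Recognize text in an image with a single request.
let request = RecognizeTextRequest()
let observations = try await request.perform(on: URL(fileURLWithPath: "/path/to/receipt.png"))
for observation in observations {
	print(observation.topCandidates(1).first?.string ?? "")
}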

Let me highlight some interesting new additions to these frameworks this year.

Let's start with Vision.

Vision has over 30 APIs for different types of image analysis. And today, Vision is adding two new APIs.

Vision is bringing improvements to text recognition. Instead of just reading lines of text, Vision now provides document recognition.

It can group different document structures, making it easier to process and understand documents.

Vision also has a new lens smudge detection mode. It helps you identify smudges on the camera lens that can potentially ruin images. For more details on Lens Smudge Detection and the other cool new additions to Vision, check out the session “Reading documents using the Vision Framework”.

Next, let’s talk about the Speech framework. The SFSpeechRecognizer class in Speech framework gave you access to the speech-to-text model powering Siri and worked well for short-form dictation.

Now in iOS 26, we’re introducing a new API, SpeechAnalyzer, that supports many more use cases and leverages the power of Swift. The new API lets you perform speech-to-text processing with very little code entirely on device.

Along with the API, we are providing a new speech-to-text model that is both faster and more flexible than the previous one.

You pass audio buffers to the analyzer instance, which then routes them through the new speech-to-text model. The model predicts the text that matches the spoken audio and returns it to your app.

The new model is especially good for long-form and distant audio, such as lectures, meetings, and conversations.
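
As a rough sketch of that flow; treat the preset name, the module initializer, and the result fields here as assumptions and confirm them against the SpeechAnalyzer documentation.

import Speech
import AVFoundation

// Sketch: stream audio buffers into SpeechAnalyzer and read transcribed text.
let transcriber = SpeechTranscriber(locale: Locale.current, preset: .offlineTranscription) // preset name is an assumption
let analyzer = SpeechAnalyzer(modules: [transcriber])

let (inputSequence, inputBuilder) = AsyncStream<AnalyzerInput>.makeStream()
try await analyzer.start(inputSequence: inputSequence)

// Feed captured buffers as they arrive, for example from an AVAudioEngine tap:
// inputBuilder.yield(AnalyzerInput(buffer: pcmBuffer))

for try await result in transcriber.results {
	print(result.text) // text matching the spoken audio
}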

Watch the “Bring advanced speech-to-text to your app with SpeechAnalyzer” session to dive deeper.

Apple’s ML powered APIs offer tons of capabilities that your app can readily take advantage of! And many of these APIs can be extended or customized to your specific use case.

The Create ML app and framework give you the ability to fine-tune the system models with your own data.

Create your own image classifier to use with the Vision framework, or a custom word tagger to use with Natural Language. You can even extend the capabilities of Vision Pro to recognize and track specific objects with 6 degrees of freedom for spatial experiences.

So far we have talked about how you can leverage or extend the ML and AI powered capabilities built into the system. Next, let’s talk about bringing any model to device.

When choosing and integrating a model into your app, there is a lot to consider. But it is made easy with Core ML. All you need is a model in the Core ML format.

These model assets contain a description of the model’s inputs, outputs, and architecture along with its learned parameters.

You can find a wide variety of open models in the Core ML format on developer.apple.com ready for use.

They are organized by category with a description of each model’s capabilities and a list of different variants along with some high-level performance information on different devices.

Similarly, you may want to check out the Apple space on Hugging Face. In addition to models already in Core ML format, you will also find links to the source model definition.

These model definitions are often expressed in PyTorch along with training and fine-tuning pipelines.

Core ML Tools provides utilities and workflows for transforming trained models to Core ML model format.

These workflows not only directly translate the model’s representation but also apply optimizations for on-device execution.

Some of these optimizations are automatic, such as fusing operations and eliminating redundant computation.

However, coremltools also provides a suite of fine-tuning and post-training based model compression techniques.

These will help you reduce the size of your model and improve its inference performance in terms of memory, power and latency.

These techniques are opt-in and allow you to explore different trade-offs between performance and model accuracy.

Check out the “Bring your models to Apple Silicon” session from WWDC24 to learn more. Also, make sure to check out the latest release notes and examples in the user guide.

Once you have your model in the Core ML format, you can easily integrate it with Xcode. You can inspect your model’s key characteristics or explore its performance on any connected device.

You can get insights about the expected prediction latency, load times, and also, introspect where a particular operation is supported and executed, right in Xcode.

New this year, you can visualize the structure of the full model architecture and dive into details of any op.

This brand new view helps you build a deeper understanding of the model you are working with, making debugging and performance opportunities incredibly visible.

When it's time to get coding, Xcode generates a type safe interface in Swift specific to your model. And integration is just a few lines of code.
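
As an example, with a hypothetical ImageClassifier.mlmodel added to the project, using the generated Swift interface could look like this; the input and output property names depend on your model’s description.

import CoreML
import CoreVideo

// ImageClassifier is the class Xcode generates from the hypothetical ImageClassifier.mlmodel.
func classify(_ pixelBuffer: CVPixelBuffer) throws -> String {
	let model = try ImageClassifier(configuration: MLModelConfiguration())
	let output = try model.prediction(image: pixelBuffer) // input name comes from the model
	return output.classLabel                              // as does the output name
}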

At runtime, Core ML makes use of all available compute, optimizing execution across the CPU, GPU, and Neural Engine.

While Core ML is the go-to framework for deploying models on-device, there may be scenarios where you need finer-grained control.

For instance, if you need to sequence or integrate ML with graphics workloads, you can use Core ML models with both MPS Graph and Metal.

Alternatively, when running real-time signal processing on the CPU, Accelerate’s BNNS Graph API provides strict latency and memory management control for your ML task.

These frameworks form part of Core ML’s foundation and are also directly accessible to you.

This year, there are some new capabilities in BNNSGraph, including a new Graph Builder that lets developers create graphs of operations. This means you can write pre- and post-processing routines or even small machine-learning models to run in real time on CPU. Check out “What’s new in BNNS Graph” for all the details.

Finally, let’s talk about how you can keep up with the fast-paced development happening in machine learning and how the Apple platform can assist you. ML research is moving at a rapid pace, with advancements made every single day. New models and techniques are being explored and built at an unprecedented rate. There is a lot to try and keep up with. It can be challenging without the right tools and resources.

To keep up with the current frontier of exploration, one needs the ability to run large models, tinker with unique architectures, and learn from an open community. We have sophisticated tools and resources to help on your endeavors exploring the frontier. One such powerful tool is MLX.

It’s an array framework for numerical computing and machine learning. It’s designed by Apple’s machine learning researchers and developed fully open source.

MLX provides access to state-of-the-art models and the ability to perform efficient fine-tuning, training, and distributed learning on Apple Silicon machines.

MLX can run state-of-the-art ML inference on large language models like Mistral with a single command line call.

For example, here it’s generating code for quick sort with a maximum token length of 1024.

This allows you to stay in step with state-of-the-art research, thanks to the open source community working to make these models work with MLX.

The MLX community on Hugging Face has hundreds of frontier models readily available to you through one line of code. Check out “Explore large language models on Apple silicon with MLX” session to learn about how you can run DeepSeek-R1 on your Apple Silicon machine.

MLX is designed to take advantage of the best of Apple Silicon. This includes a new programming model specific to unified memory.

Most systems commonly used for machine learning have a discrete GPU with separate memory.

Data is often resident and tied to a specific device.

Operations run where the data is.

You cannot efficiently run operations that use data from multiple pools of memory. They would require a copy in memory.

Apple Silicon, on the other hand, has a unified memory architecture.

This means that the CPU and the GPU share the same physical memory.

Arrays in MLX aren’t tied to a device, but operations are, allowing you to even run different operations on CPU and GPU in parallel on the same buffer.

Check out “Get started with MLX for Apple silicon” session to learn about this unique programming model and other features of MLX. You can even fine-tune your model with a single line of code and scale up as needed for distributed training easily.

It’s available in Python, Swift, C++ or C, and other languages of your choice through the multiple bindings created by the open source community.

In addition to MLX, if you are using one of the popular training frameworks like PyTorch and Jax, we’ve got you covered with Metal, so you can explore the frontier without deviating from the standard tools that have been embraced by the ML community over the years.

Lastly, developer.apple.com is a great resource for AI enthusiasts and researchers to get a peek at the latest machine learning resources from Apple.

With that, we've covered our agenda. Let’s step back a little and take a look at everything we talked about today.

Based on your needs and experience with models, you can choose the frameworks and tools that best support your project’s Machine Learning and AI capabilities.

Whether you want to fine-tune an LLM on your Mac, optimize a computer vision model to deploy on Apple Vision Pro, or use one of our ML-powered APIs to quickly add magical features to your apps, we have you covered. And all of this is optimized for Apple Silicon, providing efficient and powerful execution for your machine learning and AI workloads.

We are sure you will find the resources we went over here helpful and can’t wait to see the new experiences you create by tapping into Apple Intelligence. There has never been a better time to experiment and explore what you can do with machine learning and AI on Apple platforms.

Here we covered just the surface.

I highly encourage you to check out the machine learning and AI category in the Developer app and on our developer forums to learn more.

Ask questions and have discussions with the broader developer community there.

I hope this has been as fun for you as it has been for me. Thanks for watching!

Summary

  • 0:00 - Introduction

  • Apple's On-Device Machine Learning team offers tools for developers and enthusiasts to integrate Apple Intelligence and machine learning into apps and personal projects. Learn more about platform intelligence, system frameworks, model optimization and deployment, and staying updated on the latest ML and AI innovations on Apple hardware.

  • 1:18 - Platform intelligence

  • Machine Learning and artificial intelligence are integral to modern operating systems, powering various built-in apps and features. These technologies enable seamless user experiences, such as secure authentication, handwriting recognition, and noise reduction during calls. The latest advancements bring generative intelligence into the core of these operating systems. This includes writing tools, generating custom emojis, and creating images based on text prompts. These features are designed to be easily integrated into existing apps, allowing you to enhance your user interfaces with minimal effort. A wide range of ML-powered APIs is available. These APIs provide programmatic access to system models and capabilities, enabling tasks like image generation and smart reply suggestions. The introduction of the Foundation Models framework in iOS 26 further simplifies this process. This framework allows you to access a highly optimized on-device language model specialized for everyday tasks. It can be used for summarization, extraction, classification, and more, all while ensuring user data privacy as the model operates entirely offline. You can easily prompt the model, generate structured responses, and even integrate it with live or personal data using tool calling, enabling the model to perform actions and cite sources of truth.

  • 8:20 - ML-powered APIs

  • Updated Machine Learning-powered APIs provide you with a comprehensive suite of tools for enhancing app intelligence. The frameworks include Vision for image and video analysis, Natural Language for text processing, Translation for languages, Sound Analysis for recognizing sounds, and Speech for recognition and transcription. Notable new additions include document recognition and lens-smudge detection in Vision, and the SpeechAnalyzer API in Speech, which enables faster and more flexible speech-to-text processing, particularly for long-form and distant audio. Developers can also customize these models using the CreateML app and framework.

  • 11:15 - ML models

  • Core ML simplifies the process of integrating machine learning models into apps for Apple devices. You can utilize models already in CoreML format, available on developer.apple.com and the Apple space on Hugging Face, or convert trained models from other formats using CoreML Tools. CoreML Tools optimizes these models for on-device execution, reducing size and improving performance through automatic and manual techniques. You can then easily integrate these models into Xcode, where they can inspect performance, visualize the model architecture, and generate type-safe Swift interfaces. At runtime, CoreML leverages the CPU, GPU, and Neural Engine for efficient execution. For more advanced control, combine CoreML models with MPSGraph, Metal compute, or Accelerate’s BNNS Graph API, which has new capabilities this year, including a BNNSGraphBuilder for real-time CPU-based ML tasks.

  • 14:54 - Exploration

  • The rapid pace of machine learning research demands sophisticated tools and resources to keep up. Apple's MLX, an open-source array framework for numerical computing and machine learning, is designed to leverage the power of Apple Silicon. MLX enables efficient fine-tuning, training, and distributed learning of state-of-the-art models on Apple devices. It can run large language models with a single command line call and takes advantage of Apple Silicon's unified memory architecture, allowing parallel CPU and GPU operations on the same buffer. You can access MLX in Python, Swift, C++, and other languages. Additionally, Apple supports popular training frameworks like PyTorch and Jax through Metal. The developer.apple.com website and Apple Github repositories are valuable resources for AI enthusiasts and researchers, providing access to the latest machine learning resources from Apple.

Discover Metal 4

Learn how to get started leveraging the powerful new features of Metal 4 in your existing Metal apps. We'll cover how Metal enables you to get the most out of Apple silicon and program the hardware more efficiently. You'll also learn how Metal 4 provides you with new capabilities to integrate machine learning into your Metal code.

Chapters

Resources

Related Videos

WWDC25

Transcript

Hello and welcome. I’m Aaron and I’m excited to share the details of a big update to the Metal API.

Metal is Apple’s low-level graphics and compute API. It has powered multiple generations of complex applications, including the latest games like Cyberpunk 2077, as well as powerful pro apps.

Metal has become central to the way developers think about rendering and compute on Apple platforms.

Building on over a decade of experience Metal 4 takes the API to the next level, enabling developers to deliver the most demanding games and pro apps. Metal 4 is built with the next generations of games, graphics and compute apps in mind. It unlocks the full performance potential of Apple silicon, while also ensuring you’ll find it familiar and approachable if you’re coming from other graphics and compute APIs like DirectX.

Metal 4 is part of the same Metal framework that you may already have in your app and it’s supported by devices equipped with the Apple M1 and later, as well as the A14 Bionic and later. Metal 4 starts with an entirely new command structure with explicit memory management, and it changes the way resources are managed to enable richer and more complex visuals.

Shader compilation is quicker with more options so your app can reduce redundant compilation, and machine learning can now be integrated seamlessly with the rest of your Metal app.

New built-in solutions are also available via MetalFX to boost your app’s performance.

I’ll also show you how to get started with your Metal 4 adoption.

I'll start with how your app encodes and submits commands.

Metal is represented in the system by a Metal device, which the OS provides to your app.

Once you have a Metal device, you can create command queues to send work to the hardware, and command buffers that contain the work to be sent. Command encoders allow you to encode commands into those command buffers. You can take advantage of Metal 4 using the same Metal device that your app uses today.
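
For reference, here is that familiar flow in Swift with the existing objects; error handling is omitted and the blit is just a stand-in for real work.

import Metal

// The same Metal device your app already uses is the entry point for Metal 4.
let device = MTLCreateSystemDefaultDevice()!
let queue = device.makeCommandQueue()!
let commandBuffer = queue.makeCommandBuffer()!

// Encode some work, a blit in this case, then submit it to the hardware.
let blit = commandBuffer.makeBlitCommandEncoder()!
// blit.copy(from: sourceBuffer, sourceOffset: 0, to: destinationBuffer, destinationOffset: 0, size: byteCount)
blit.endEncoding()
commandBuffer.commit()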

Metal 4 provides a new but familiar model for encoding commands, and introduces new versions of the rest of these objects.

Those changes start with the new MTL4CommandQueue, which your app can obtain using the Metal device. Metal 4 decouples command buffers from the queue that will use them, so your app requests a MTL4CommandBuffer from the device as well. Command buffers are independent of one another, making it easy for your app to encode them in parallel. Apps take advantage of encoders for different types of commands including draws, dispatches, blits, and building acceleration structures.

Metal 4 consolidates existing command encoders. With the new unified compute encoder, your app can also manage blit and acceleration structure command encoding. This reduces the total number of encoders required. There’s a new MTL4RenderCommandEncoder as well. It features an attachment map that your app can use to map logical shader outputs to physical color attachments. Your app can configure the attachment map of a single render encoder with all the necessary color attachments, and can swap between them on the fly. This saves you the need to allocate additional render encoders, saving your app memory and unnecessary code. Your app can use any combination of the available encoder types to encode commands into command buffers. Command buffers are backed by memory. As your app encodes commands, data is written into this memory.

In Metal 4, this memory is managed by a MTL4CommandAllocator.

Use the device to create a command allocator to take direct control of your app’s command buffer memory use.

Memory management is essential to maximizing what modern apps can fit into their available system resources.

And those apps are using more resources than ever.

Your app manages resources differently in Metal 4.

Metal uses two fundamental types of resources. Metal buffers store generic data formatted by your app and Metal textures store image formatted data.

In the past, applications used fewer resources, sometimes even a single buffer and texture per object. And then applications added more textures to improve the level of surface details. And then, to introduce more variety in their rendering, they also added more geometry.

Modern applications continued the trend, and they pulled in more and more resources to support complex new use cases.

But while the number of resources increased, APIs only exposed a fixed set of resource bind points, to set the resources for each draw or dispatch.

Historically each draw would typically only use a few. For example, this bunny has just a texture and a buffer for its geometry. And then for each draw, each object can change buffers for geometry and textures to alter surface appearance.

However, as draw call counts increased and shaders became more complex, the impact of managing bind points per draw added to the CPU time.

So, applications changed to bindless, where the bound resources are moved to another buffer to store the bindings. That way, the app just needs to bind a single argument buffer for each object, or even the entire scene. And this means you can greatly reduce the number of bind points you need.

To help your app avoid paying for extra bind points, Metal 4 provides a new MTL4ArgumentTable type to store the binding points your app needs. Argument tables are specified for each stage on an encoder, but can be shared across stages. You create tables with a size based on the bind points that your application needs. For example, in the bindless case, the argument table just needs one buffer binding.

The GPU also needs to be able to access all of these resources. This is where residency comes in. Apple silicon provides you with a large, unified memory space. You can use it to store all your app’s resources, but Metal still needs to know what resources to make resident. In Metal 4, your app uses residency sets to specify the resources that Metal should make resident. This is to ensure they are accessible to the hardware.

Residency sets are easy to integrate into per frame encoding and committing of command buffers. You just need to make sure that the residency set contains all the required resources for the command buffer you commit. But since the residency set contents rarely change, populating the residency set can be done at app start up.

And once the set is created, you can add it to a Metal 4 command queue. All command buffers committed to that queue will then include those resources. If you need to make any updates, this should be a much lower cost at runtime. For applications which stream resources in and out on a separate thread, you can move the cost of updating the residency set to that thread and update the residency set in parallel with encoding.
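
A minimal sketch of that setup, shown here with the existing MTLCommandQueue API; the resources being added are placeholders, and attaching the set to a MTL4CommandQueue follows the same idea.

import Metal

// Build the residency set once, typically at app start-up.
let device = MTLCreateSystemDefaultDevice()!
let residencySet = try device.makeResidencySet(descriptor: MTLResidencySetDescriptor())

// Add every resource your committed command buffers will need, then commit the set.
// residencySet.addAllocation(vertexBuffer)
// residencySet.addAllocation(albedoTexture)
residencySet.commit()

// Attach the set to the queue so committed command buffers include those resources.
let queue = device.makeCommandQueue()!
queue.addResidencySet(residencySet)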

A great example of a game that benefited from residency sets is Control Ultimate Edition. The Technical Publishing Director of Remedy Entertainment, Tuukka Taipalvesi, had this to say: “Control Ultimate Edition found residency sets easy to integrate. While separating resources into different residency sets based on use and managing resource residency on a background thread, we saw significant reductions in the overheads of managing residency and lower memory usage when ray-tracing is disabled.”

Delivering beautiful games like Control requires more resources than ever. You may even require more memory than is available, especially when targeting a range of devices. To get the most out of the available memory, your app can dynamically control how resources use that memory.

This requires fine grained control of how memory is allocated to resources, since not all the resources are required at once.

Apps can adjust quality by controlling the resident level of detail to support the same content across a wider variety of devices. Your app can do this using placement sparse resources.

Metal 4 supports buffers and textures allocated as placement sparse resources.

These resources are allocated without any pages to store their data.

With placement sparse resources, pages come from a placement heap.

Your app allocates pages from the placement heap to provide storage for the resource contents.

With Metal 4's focus on concurrency by default, you need to ensure that updates to resources are synchronized. To simplify synchronization, Metal 4 introduces a Barrier API that provides low overhead, stage-to-stage synchronization that maps well to barriers in other APIs.

You can see barriers in action in the Metal 4 sample, “processing a texture in a compute function”. The app starts with a colored image, applies a compute shader that converts it to greyscale, and then renders that converted texture to the screen.

These steps have a dependency on the shared resource: the output of the texture processing. The sample ensures that resource writes and reads occur in the correct order, using Metal 4 barriers.

Without synchronization, these steps could run in any order, and that will either mean using the wrong texture contents or even worse, if the two steps overlap, that will result in a corrupted output.

To get the order right, the app uses a barrier.

Barriers work stage to stage. So you'll need to consider which stage each operation runs in, based on the stages in the encoder.

Processing the texture will execute on a compute command encoder as a dispatch stage operation.

And the render will be part of a render command encoder, which reads the texture in a fragment operation. So the barrier you need is a dispatch to fragment barrier, that waits for dispatch stage work to complete before letting any fragment work start. Using barriers effectively is important to achieve the best performance in your app, especially with a large number of resources.

In addition to resources, modern apps also manage a massive number of shaders. Those shaders need to be compiled before they can be sent to the hardware, to render and compute for your app. Shaders are written in the Metal Shading Language, and lowered to Metal IR. The IR is then compiled to GPU binaries that can be natively executed by the hardware. As the developer, you have control over when shader compilation occurs. The Metal device provides an interface to send shaders to the OS for compilation by the CPU.

Metal 4 manages compilation with dedicated compilation contexts. The new MTL4Compiler interface is now separate from the device. Your app uses the device to allocate the compiler interface. The interface provides clear, explicit control of when your app performs compilation on the CPU.

You can also leverage the MTL4Compiler to take advantage of scheduling improvements in the shader compilation stack. The MTL4Compiler inherits the quality of service class assigned to the thread requesting compilation.

When multiple threads compile at the same time, the OS will prioritize requests from higher priority threads to ensure your app’s most important shaders are processed first before moving on to other compilations.

Explicit control of shader compilation is important, since modern apps have more shaders than ever.

During pipeline state generation, your app must compile each shader the first time before GPU work can proceed.

Sometimes, pipelines share common Metal IR. For example, your app may apply different color states to render with differing transparency.

And that same situation may apply to other sets of pipelines as well. With Metal 4, you can optimize for this case, so you can reduce the time spent in shader compilation.

Render pipelines can now use flexible render pipeline states to build common Metal IR once.

This creates an unspecialized pipeline. Your app then specializes the pipeline with the intended color state. Metal will re-use the compiled Metal IR to efficiently create a specialized pipeline to execute.

Flexible render pipeline states save compilation time when re-using Metal IR across shader pipelines.

Your app creates an unspecialized pipeline once before specializing it for each of the necessary color states.

You can repeat this process for other pipelines that share Metal IR, and share the compilation results from each to reduce the time your app spends compiling shaders.

On-device compilation still costs CPU time.

The most performant path is still to eliminate on-device compilation entirely. Metal 4 streamlines the harvesting of pipeline configurations. For more information on how to take advantage of Metal 4’s compilation workflow, as well as more details on command encoding, check out “Explore Metal 4 games.” Metal 4 makes it easier than ever to integrate machine learning, which opens up entirely new possibilities for your app.

Rendering techniques like upscaling, asset compression, animation blending, and neural shading, each benefit from the targeted application of machine learning. To apply these techniques efficiently, apps need to operate on complicated data sets and structures.

Buffers are flexible, but leave much of the heavy lifting to the app and textures aren’t a great fit.

Metal 4 adds support for tensors, a fundamental machine learning resource type, supported across all contexts.

Metal tensors are multi-dimensional data containers.

They are extendable well beyond two dimensions, providing the flexibility to describe the data layout necessary for practical machine learning usage.

Metal 4 integrates tensors directly into the API… as well as into the Metal shading language. Tensors offload the complex indexing into multidimensional data, so your Metal 4 app can focus on using them to apply machine learning in novel ways. To make that even easier, Metal 4 expands the set of supported command encoders.

The new machine learning command encoder enables you to execute large-scale networks directly within your Metal app.

Machine learning encoders function in a similar way to existing Metal encoder types.

Tensors are consumed as resources mapped into an argument table. Encoding is done to the same Metal 4 command buffers, and it supports barriers for synchronization with your other Metal 4 encoders.

Metal 4 machine learning encoders are compatible with networks expressed in the existing CoreML package format. Use the Metal toolchain to convert these into a Metal package, and then feed the network directly to your encoder. The new machine learning encoder is perfect for large networks that need command level interleaving with the rest of your Metal app. If you have smaller networks, Metal 4 also gives you the flexibility to integrate them directly into your existing shader pipelines. For example, neural material evaluation replaces traditional material textures with latent texture data. Your app samples those latent textures to create input tensors, performs inference using the sampled values, and uses the final output to perform shading.

Splitting each step into its own pipeline is inefficient, since each step needs to sync tensors from device memory, operate on them, and sync outputs back for later operations.

To get the best performance, your app should combine these steps into a single shader dispatch, so that each operation can share common thread memory. Implementing inference is a complex task, but Metal 4 has you covered with Metal performance primitives. Metal performance primitives are shader primitives designed to execute complex calculations, and they can natively operate on tensors. Each one is optimized to run blazingly fast on Apple silicon.

Tensor ops are perfect for embedding small networks into your shaders. Your app can take advantage of them as part of the Metal Shading Language, and when you do, the OS shader compiler inlines shader code, optimized for the device being used, directly into your shader.

To learn more about how to use Metal’s new machine learning capabilities, you can watch Combine Metal 4 machine learning and graphics.

Metal 4 provides you with all the tools you need to integrate cutting edge Machine Learning techniques. You can also take advantage of production-ready solutions built into MetalFX. Apple devices have incredible, high resolution screens that are perfect for showcasing your amazing games. MetalFX allows your app to deliver high resolutions at even greater refresh rates when rendering complex scenes with realistic reflections.

Rendering high resolution images can consume the GPU for a significant period of time. Instead, your app can render low resolution images and use MetalFX to upscale them. The combined time to render your final image is reduced and that means your app can save time for each frame it renders. You can use the time saved to render the next frame sooner. And if you want to hit the highest refresh rates without sacrificing quality, MetalFX has a great solution built right in.
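
As a rough sketch of that upscaling path using the existing MetalFX spatial scaler; the resolutions and pixel formats are placeholders, and the per-frame texture setup is shown as comments.

import Metal
import MetalFX

// Describe the low-resolution input and high-resolution output once.
let device = MTLCreateSystemDefaultDevice()!
let descriptor = MTLFXSpatialScalerDescriptor()
descriptor.inputWidth = 1280
descriptor.inputHeight = 720
descriptor.outputWidth = 2560
descriptor.outputHeight = 1440
descriptor.colorTextureFormat = .rgba16Float
descriptor.outputTextureFormat = .rgba16Float
let scaler = descriptor.makeSpatialScaler(device: device)!

// Each frame: point the scaler at the rendered texture and encode the upscale.
// scaler.colorTexture = lowResColorTexture
// scaler.outputTexture = highResOutputTexture
// scaler.encode(commandBuffer: commandBuffer)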

This year, MetalFX adds support for frame interpolation.

Your app can use it to generate intermediate frames in much less time than it would take to render each frame from scratch. You can use those intermediate frames to achieve even higher frame rates.

Ray tracing is another technique that apps use to achieve realistic rendering results, by tracing rays from the camera to a light source. However if too few rays are cast, the resulting image will be too noisy to use.

MetalFX also now supports denoising during the upscale process, so your app can remove the noise from images rendered with fewer rays, and deliver full-size results. MetalFX helps you produce high quality results your players will love, at increased refresh rates. Your game can combine it with Metal 4 ray tracing to achieve incredible results. You can learn more about how to use these features in “Go further with Metal 4 games." Your app can combine all of these features to do incredible things. And Metal 4 is designed to make porting both approachable and modular. An app is composed of several key categories of functionality, including: how it compiles shaders, how it encodes and submits commands to the hardware, and finally, its management of resources. Each of these can be approached separately in their own distinct phases. Compilation is perhaps the easiest first step to take. You can allocate a Metal 4 compiler and insert it into your app's compilation flows to improve quality-of-service.

Once you’ve adopted the new compilation interface, your app can integrate flexible render pipelines to speed up render pipeline compilation or take advantage of harvesting improvements to improve ahead of time compilation.

Whether you adopt the new compiler or not, your app can also benefit from generating commands with Metal 4.

With Metal 4’s command encoding and generation model, you can take greater control of your memory allocations. You can also leverage native parallel encoding across encoder types to get encoding done quicker and Metal 4’s completely new set of machine learning capabilities unlock new possibilities for your rendering pipelines.

Your app can adopt the machine learning encoder, or shader ML, based on the network you want to integrate.

You can then take it a step further with Metal 4’s resource management.

Residency sets are an easy win. Integrate them to simplify the process of residency management. Barriers allow your app to synchronize resource access across different passes efficiently. And placement sparse resources enable you to build resource streaming into your Metal app. As the developer, you are in the best position to judge how to make your app better. And Metal 4 gives you the flexibility to adopt the new functionality where you need it most. Placement sparse is a great example of a feature that enables a specific use case. Here’s how you can integrate it into your existing Metal app. Your Metal app already commits work using the existing Metal command queue while placement sparse mapping operations require an MTL4CommandQueue.

You can use a Metal event to synchronize work between your app’s Metal and MTL4CommandQueue.

The first signal event call unblocks the MTL4CommandQueue to update placement sparse resource mappings. A second signal event call notifies your app to continue the render work using the placement sparse resources. You can use the same Metal event from before.

Your app should submit work that doesn’t depend on those resources before waiting on the event to ensure the hardware remains fully utilized.
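
As a rough sketch of that pattern, here is what the existing-queue side could look like using a Metal event. The MTL4CommandQueue mapping step is left as a placeholder comment because its exact API isn't covered in this transcript, and the function and value names are illustrative.

import Metal

// Sketch: fencing render work around placement sparse mapping updates.
// Only the existing MTLCommandQueue side is shown.
func submitRenderAroundSparseMapping(queue: MTLCommandQueue, event: MTLEvent) {
    // 1. Encode work that doesn't depend on the sparse resources, then signal
    //    the event so the MTL4CommandQueue can start updating mappings.
    if let independentWork = queue.makeCommandBuffer() {
        // ... encode passes that don't touch the placement sparse resources ...
        independentWork.encodeSignalEvent(event, value: 1)
        independentWork.commit()
    }

    // 2. (On the MTL4CommandQueue) wait for value 1, update the placement
    //    sparse mappings, then signal value 2. Not shown here.

    // 3. Wait for the second signal before rendering with the sparse resources.
    if let dependentWork = queue.makeCommandBuffer() {
        dependentWork.encodeWaitForEvent(event, value: 2)
        // ... encode render passes that use the placement sparse resources ...
        dependentWork.commit()
    }
}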

Metal comes with an advanced set of developer tools to help you debug and optimize your Metal apps. And this year, these same tools come with support for Metal 4.

API and Shader Validation save you valuable time by identifying common problems, and the comprehensive Metal Debugger helps you deep dive into your Metal 4 usage. The Metal performance HUD provides a real-time overlay to monitor your app’s performance, and Metal System Trace lets you dig into performance traces from your app.

You can learn about all of these tools and find their documentation on the Apple Developer website. Also, check out "Level up your games" for great techniques to debug and further optimize your games. Xcode 26 also comes with a new Metal 4 template built in. Start a new project and select Game templates before choosing Metal 4 as your game technology. With the built in Xcode template, it’s easy to get a basic render going and start your Metal 4 journey.

Now that you’ve discovered what Metal 4 can do, you’re ready to learn how to apply it to the needs of your app. If you’re developing a game, you can dive in and explore how to use Metal 4’s new command encoding and compilation features.

After that, you can learn how to go further and optimize your game with MetalFX, and discover how to take advantage of Metal 4 ray tracing. Or, you can learn how to use Metal 4 to combine Machine Learning and graphics in your Metal app. Metal 4 enables a new generation of apps and games with an incredible feature set. You’ve only just scratched the surface with the foundation you'll need to dive in. You can start using Metal 4 in your new or existing apps with the upcoming developer beta. The sample code is a great example of how to start your Metal 4 adoption.

It’s available now.

Thanks for watching!

Dive deeper into Writing Tools

With Writing Tools, people can proofread, rewrite, and transform text directly within your app. Learn advanced techniques to customize Writing Tools for your app. Explore formatting options and how they work with rich text editing. If you have a custom text engine, learn how to seamlessly integrate the complete Writing Tools experience, allowing edits directly within the text view.

Chapters

Resources

Related Videos

WWDC24

Transcript

Hi and welcome to “Dive deeper into Writing Tools.” I’m Dongyuan. I work on text input and internationalization. Last year, we introduced Writing Tools and showed how to integrate it in your app. Today, we’ll explore some more advanced topics. I’ll show you what’s new in Writing Tools, ways to customize the Writing Tools experience in native text views, how to have Writing Tools work seamlessly with rich text, and finally, how to integrate a full Writing Tools experience if your app has a custom text engine. Let’s get started! With Writing Tools, people can refine their words by rewriting, proofreading, and summarizing text right inside their text views. This year, we’ve added many new features to Writing Tools. With ChatGPT integration, you can also generate content for anything you want to write about, or create images with a simple prompt.

Writing Tools are now available on visionOS. They work nearly everywhere, including Mail, Notes, and in apps that you create.

New in iOS, iPadOS, and macOS 26, after describing changes to rewrite text, you can enter a follow-up request.

For example, you can ask to make the text warmer, more conversational, or more encouraging.

In addition, Writing Tools are available as Shortcuts so you can take your workflows to the next level with Apple Intelligence. Tools like proofread, rewrite, and summarize can now be used in an automated fashion.

We’ve also added a variety of APIs to help your app support Writing Tools. You can now get the stock toolbar button and standard menu items, ask Writing Tools to return presentation intents for rich text, or integrate the powerful Writing Tools coordinator into your custom text engine. We’ll dive deeper into the new result options and the coordinator API later.

Next up, let’s talk about how to customize Writing Tools in native text views provided by the system.

Last year’s Writing Tools video covers the basics. For native text views, you can get Writing Tools support for free. You can take advantage of lifecycle methods to react to Writing Tools operations like pausing syncing, customize your text view to use limited or full Writing Tools behavior, specify ranges you don’t want Writing Tools to rewrite, or use result options to manage support for rich text, lists, or tables. Keep in mind that in last year’s session, result options were named “Allowed Input Options.” They have been renamed to “Writing Tools Result Options” for clarity.

Although Writing Tools is available when you select text, if your app is text-heavy, consider adding a toolbar button, like we did for Notes and Mail.

To do that, use UIBarButtonItem in UIKit, or NSToolbarItem in AppKit.

In the context menu, Writing Tools items are inserted automatically. If you have a custom menu implementation, or if you want to move the Writing Tools items around, you can set automaticallyInsertsWritingToolsItems to false, and use the writingToolsItems API to get the standard items. This year we updated Proofread, Rewrite, and Summary options in the menu. Using the API ensures that you get all updates for free.

Now, let’s talk about formatting.

Apps have many kinds of text views. Some text views don’t support rich text, like the search field in Finder. Some text views support Rich Text, like TextEdit.

Some text views support semantic styles, like Notes. Here you can specify paragraphs with different semantic styles, such as heading, subheading, or block quote. For example, in Notes the result from Writing Tools can utilize native Notes headings, tables, and lists. To tell Writing Tools what kind of text your text view supports, use Writing Tools result options. For plain text views, use the plainText result option. Writing Tools will still use NSAttributedString to communicate with your text view, but you can safely ignore all the attributes inside. For rich text views, like the one in TextEdit, use the richText result option, with or without the list and table options, depending on if your text view supports lists and tables. Writing Tools may add display attributes like bold or italic to the attributed string.

Apps that have specialized understanding of semantic formats like Notes can use the richText option with the new presentationIntent option. Writing Tools will use NSAttributedString with presentation intents to communicate with your text view.
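
For reference, opting a native text view into these behaviors looks roughly like the sketch below. The writingToolsBehavior and allowedWritingToolsResultOptions property names, and the .presentationIntent option, are assumptions based on how this session describes the API, so treat the exact spellings as a sketch rather than a definitive reference.

import UIKit

// Sketch: declare what kind of text a native text view supports.
// Property and option names are assumptions based on this session.
let textView = UITextView()
textView.writingToolsBehavior = .complete
textView.allowedWritingToolsResultOptions = [.richText, .list, .table, .presentationIntent]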

You may wonder what's the difference between display attributes and presentation intents.

In the TextEdit example, Writing Tools produce rich text by using display attributes like bold or italic, and send the attributed string to the text view. These attributes only carry stylistic information like concrete font sizes, but not semantic style information. In contrast, Notes tries to fully utilize native semantic styles like headers. Although Writing Tools generated the same text underneath, we instead add presentation intents to the attributed string. Notes can then convert presentation intents to its internal semantic styles. From this example, you can see that the headers part is just a header intent without concrete font attributes.

Note that even if .presentationIntent is specified in the result options, Writing Tools may still add display attributes to the attributed string, because some styles can’t be represented by presentation intents. In this example, Writing Tools use both an emphasized presentation intent, and a strikethrough display attribute to represent the text “crucial and deleted” in the screenshot.

To summarize, in presentation intent mode we’ll provide styles in the presentation intent form whenever possible. This includes elements like lists, tables, and code blocks.

Display attributes may still be used for attributes like underlines, subscript, and superscript. And lastly, presentation intents don’t have a default style. Your app is responsible for converting presentation intents into display attributes, or its own internal styles.
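
As a simplified illustration of that conversion step (this is not the session's code), the sketch below walks the runs of an AttributedString and maps header intents to a concrete bold font. A real app would map the intents to its own internal styles instead.

import UIKit

// Sketch: convert header presentation intents into concrete display attributes.
func applyHeaderStyles(to text: inout AttributedString) {
    var headerRanges: [Range<AttributedString.Index>] = []
    for (intent, range) in text.runs[\.presentationIntent] {
        guard let intent else { continue }
        // A run can carry several nested intent components (list item, header, and so on).
        let isHeader = intent.components.contains { component in
            if case .header = component.kind { return true } else { return false }
        }
        if isHeader { headerRanges.append(range) }
    }
    for range in headerRanges {
        text[range].font = UIFont.boldSystemFont(ofSize: 22)
    }
}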

To have Writing Tools understand the semantics of your text better, once you’ve adopted the presentation intent option, you can override the requestContexts method for your text view, and supply a context with presentation intents whenever possible.

Finally, even if you have a completely custom text engine, we’ve got you covered! You can get the basic Writing Tools experience for free, as long as your text engine has adopted common text editing protocols. On iOS that’s UITextInteraction, or UITextSelectionDisplayInteraction with UIEditMenuInteraction. On macOS, your view needs to adopt NSServicesMenuRequestor. This gives your text view Writing Tools, as well as support for features in the Services menu. For details on basic adoption, check out the WWDC24 session.

If you want to go one step further, you can implement the full Writing Tools experience. This gives Writing Tools the ability to rewrite text in place, provide animation, and show proofreading changes right in line. This year we’ve added Writing Tools coordinator APIs for custom text engines.

The Writing Tools coordinator manages interactions between your view and Writing Tools.

You attach a coordinator to your view, and create a delegate to implement the WritingToolsCoordinatorDelegate methods. The delegate prepares the context for Writing Tools to work on, incorporates changes, provides preview objects to use during animations, provides coordinates for Writing Tools to draw proofreading marks, and responds to state changes. To bring it all to life, let me walk you through a quick demo.

Here I have a custom text engine built using TextKit 2. As you can see, we’ve already implemented common text editing protocols like NSTextInputClient and NSServicesMenuRequestor.

If I build and run the app, I get the basic Writing Tools support for free. All the results will be shown in a panel. I can do a Rewrite, for example.

And Replace, or Copy the result.

Now let’s implement full Writing Tools support.

In the DocumentView, I’m going to add some instance properties that are needed during a Writing Tools session, initialize an NSWritingToolsCoordinator object, set the delegate to self, and assign it to an NSView. I also need to call this `configureWritingTools` in `init`.

Of course, it complains that DocumentView does not conform to `NSWritingToolsCoordinator.Delegate`.

Let’s drag in a file that implements all the delegate methods, in a DocumentView extension. You can see that it prepares the context, does text replacement and selection, returns bounding boxes for proofreading, generates previews for animations, etc.

Let’s build and run the app.

If I rewrite the text, you can see the animation, and the text change happens directly in the text view.

And if I proofread, you can see the underlines added by Writing Tools.

I can also click on the individual suggestions to show what’s changed.

Now let’s talk about each of those steps in more detail.

The first thing to do is to create a coordinator, and attach it to a view. Writing Tools coordinator is a UIInteraction in UIKit, and you attach it to your UIView just like other UIInteractions. In AppKit, it’s an instance property on NSView. Once you have a coordinator, you can set your preferred Writing Tools behavior and result options.

Now, let’s get to the delegate methods. First and foremost, Writing Tools need a context for the current text. A context consists of a piece of text as an NSAttributedString, and a selection range. At minimum, you should include the current text selection in the attributed string. Optionally, you can also include the paragraph before and after the selection— this gives Writing Tools a better understanding of the context around the text. Set the range to what’s currently selected, based on context.attributedString. If nothing is selected, return the whole document as the context and set the range to the cursor location. This gives Writing Tools an opportunity to work on the whole document, when people haven’t specifically selected anything. This is how you provide context. Here I’m showing AppKit, but unless I specifically point it out, you can assume that UIWritingToolsCoordinator and NSWritingToolsCoordinator behave the same way. The delegate methods are async, because large text views may take time to process the underlying text storage. In the body of the function, prepare the text and range depending on the “scope” parameter. Most of the time, you only need to return one context. For sophisticated text views in which text in multiple text storages can be selected at the same time, the coordinator also supports multiple contexts.

After evaluating your view’s text, Writing Tools delivers suggested changes to your delegate object. In this replace text delegate method, incorporate the change into your view’s text storage. Writing Tools call this method for each distinct change, and may call the method multiple times with different range values for the same context object. Once finished processing, Writing Tools may ask the delegate to update the selected text range.

To animate the text while Writing Tools are processing, the coordinator asks for preview images for certain ranges of text.

Text view should return previews by rendering the text on a clear background. During animations, Writing Tools apply the visual effects to the provided preview images instead of to the text itself. On macOS, this is done via two delegate methods. The first delegate method takes an array. You should return at least one preview for the whole range. Optionally, return one preview per line for smoother animation.

On iOS, UIKit uses UITargetedPreview instead of NSTextPreview, and only one delegate method is used.

Before and after the actual animation, Writing Tools calls the prepareFor and finish methods. To prepare for the text animation, hide the specific range of text from the text view. Once animation finishes, show the specific range of text again.

For proofreading, Writing Tools shows an underline for text ranges that were changed. Writing Tools also responds to click events in ranges of text to show the inline proofreading popup.

To show proofreading marks, the coordinator asks the delegate to return the underline bezier paths for individual ranges. Writing Tools also need the bounding bezier path to respond to click or tap events.

Lastly, you can implement the optional writingToolsCoordinator:willChangeToState:completion: method to respond to state changes. You may want to perform undo coalescing, stop syncing, or prevent editing, depending on your text view implementation. Conversely, you should inform the coordinator about external changes via the updateRange:withText to make sure Writing Tools operations are in sync with the latest text. Use updateForReflowedText to notify Writing Tools about layout changes in your view. When you call this method, Writing Tools requests new previews, proofreading marks, and other layout-dependent information.

We’ve seen how to integrate the full Writing Tools experience with your powerful custom text engine. We’ve also released the project I showed earlier as sample code. Check out the sample code and the complete documentation about Writing Tools coordinator to learn more.

That wraps up the session. What’s next? Try out the new Writing Tools features. Like follow-up adjustments after you describe your changes. Or Writing Tools in Vision Pro, and the Shortcuts app.

Add a toolbar button, if your app is text-heavy.

Play with the formatting options to allow Writing Tools to read and write semantic styles like headings, subheadings, and code blocks.

If you already have a powerful text engine, empower it further with the full Writing Tools experience.

Also check out Get started with Writing Tools from last year, and the Writing Tools sample code linked below if you want to see how Writing Tools coordinator works in action. Thanks for watching!

Code

11:46 - Attach a coordinator to the view (UIKit)

// Attach a coordinator to the view
// UIKit

func configureWritingTools() {
    guard UIWritingToolsCoordinator.isWritingToolsAvailable else { return }

    let coordinator = UIWritingToolsCoordinator(delegate: self)
    addInteraction(coordinator)
}

12:02 - Attach a coordinator to the view (AppKit)

// Attach a coordinator to the view
// AppKit

func configureWritingTools() {
    guard NSWritingToolsCoordinator.isWritingToolsAvailable else { return }
       
    let coordinator = NSWritingToolsCoordinator(delegate: self)

    coordinator.preferredBehavior = .complete
    coordinator.preferredResultOptions = [.richText, .list]
    writingToolsCoordinator = coordinator
}

13:06 - Prepare the context

// Prepare the context

func writingToolsCoordinator(_ writingToolsCoordinator: NSWritingToolsCoordinator,
        requestsContextsFor scope: NSWritingToolsCoordinator.ContextScope,
        completion: @escaping ([NSWritingToolsCoordinator.Context]) -> Void) {

    var contexts = [NSWritingToolsCoordinator.Context]()
                
    switch scope {
    case .userSelection:
        let context = getContextObjectForSelection()
        contexts.append(context)
        break
        // other cases…
    }
        
    // Save references to the contexts for later delegate calls.
    storeContexts(contexts)
    completion(contexts)
}

13:48 - Respond to text changes from Writing Tools and update selected range

// Respond to text changes from Writing Tools

func writingToolsCoordinator(_ writingToolsCoordinator: NSWritingToolsCoordinator,
        replace range: NSRange,
        in context: NSWritingToolsCoordinator.Context,
        proposedText replacementText: NSAttributedString,
        reason: NSWritingToolsCoordinator.TextReplacementReason,
        animationParameters: NSWritingToolsCoordinator.AnimationParameters?,
        completion: @escaping (NSAttributedString?) -> Void) {
    // Incorporate replacementText into your text storage for the given range, then call
    // completion with the attributed string you applied (or nil to reject the change).
}

// Update selected range

func writingToolsCoordinator(_ writingToolsCoordinator: NSWritingToolsCoordinator,
        select ranges: [NSValue],
        in context: NSWritingToolsCoordinator.Context,
        completion: @escaping () -> Void) {
    // Update the view's text selection to match ranges, then call completion().
}

14:41 - Generate preview for animation (AppKit)

// Generate preview for animation (macOS)

func writingToolsCoordinator(_ writingToolsCoordinator: NSWritingToolsCoordinator,
        requestsPreviewFor textAnimation: NSWritingToolsCoordinator.TextAnimation,
        of range: NSRange,
        in context: NSWritingToolsCoordinator.Context,
        completion: @escaping ([NSTextPreview]?) -> Void) {
    // Return at least one NSTextPreview covering the whole range (optionally one per line),
    // rendered on a clear background, then call completion.
}
    
func writingToolsCoordinator(_ writingToolsCoordinator: NSWritingToolsCoordinator,
        requestsPreviewFor rect: NSRect,
        in context: NSWritingToolsCoordinator.Context,
        completion: @escaping (NSTextPreview?) -> Void) {
    // Return a single NSTextPreview for the given rect, rendered on a clear background,
    // then call completion.
}

14:58 - Generate preview for animation (UIKit)

// Generate preview for animation (iOS)

func writingToolsCoordinator(_ writingToolsCoordinator: UIWritingToolsCoordinator,
        requestsPreviewFor textAnimation: UIWritingToolsCoordinator.TextAnimation,
        of range: NSRange,
        in context: UIWritingToolsCoordinator.Context,
        completion: @escaping (UITargetedPreview?) -> Void) {
    // Return a UITargetedPreview for the text in range, rendered on a clear background,
    // then call completion.
}

15:08 - Delegate callbacks before and after animation

// Delegate callbacks before and after animation

func writingToolsCoordinator(
    _ writingToolsCoordinator: NSWritingToolsCoordinator,
    prepareFor textAnimation: NSWritingToolsCoordinator.TextAnimation,
    for range: NSRange,
    in context: NSWritingToolsCoordinator.Context,
    completion: @escaping () -> Void) {

    // Hide the specific range of text from the text view
}

func writingToolsCoordinator(
    _ writingToolsCoordinator: NSWritingToolsCoordinator,
    finish textAnimation: NSWritingToolsCoordinator.TextAnimation,
    for range: NSRange,
    in context: NSWritingToolsCoordinator.Context,
    completion: @escaping () -> Void) {

    // Show the specific range of text again
}

15:39 - Delegate callbacks to show proofreading marks

// Create proofreading marks

func writingToolsCoordinator(_ writingToolsCoordinator: NSWritingToolsCoordinator,
        requestsUnderlinePathsFor range: NSRange,
        in context: NSWritingToolsCoordinator.Context,
        completion: @escaping ([NSBezierPath]) -> Void) {
    // Return the underline bezier paths for the changed text in range, then call completion.
}

func writingToolsCoordinator(_ writingToolsCoordinator: NSWritingToolsCoordinator,
        requestsBoundingBezierPathsFor range: NSRange,
        in context: NSWritingToolsCoordinator.Context,
        completion: @escaping ([NSBezierPath]) -> Void) {
    // Return the bounding bezier paths Writing Tools uses to respond to clicks or taps,
    // then call completion.
}

Summary

  • 0:00 - Introduction

  • In this video we'll review what's new in Writing Tools, how to customize your app's experience, support rich text, and integrate them into custom text engines.

  • 0:46 - What’s new

  • Writing Tools now support integration with ChatGPT, visionOS, follow-up requests for making tone adjustments, and automation using Shortcuts. Writing Tools also provide new APIs to help you integrate them into your app.

  • 2:21 - Customize native text views

  • If your app uses native text views, you get Writing Tools support for free. You can further customize the experience by adopting lifecycle methods (for example, to pause syncing), specifying ranges within the text, providing a toolbar button, or customizing context menus.

  • 4:00 - Rich text formatting

  • Writing Tools now support rich text with semantic styles. If your app supports presentation intents such as headings, subheadings, quotes, tables, and lists, you can communicate that information to Writing Tools. Writing Tools will provide results using your app's supported presentation intents whenever possible.

  • 7:41 - Custom text engines

  • If your app uses a custom text engine, you can now enable a fully integrated experience with Writing Tools. The basic Writing Tools experience works automatically, provided you adopt common text editing protocols. The full Writing Tools experience allows Writing Tools to rewrite text directly in place, provide animations, and show proofreading changes inline. For the full experience, use the new Writing Tools coordinator API to integrate them into your custom text engine.

  • 16:58 - Next steps

  • Explore new Writing Tools features and take advantage of customization and rich text support in your app. If you have a custom text engine, enable a full Writing Tools experience by adopting the coordinator API.

Dive into App Store server APIs for In-App Purchase

Discover the latest updates for the App Store Server API, App Store Server Notifications, and App Store Server Library to help manage customer purchase data directly on your server and deliver great In-App Purchase experiences. We'll cover updates to appAccountToken and signature signing, new fields in signed transaction and renewal info, and new APIs. Then, we'll show how to generate a promotional offer signature on your server, and how to use the Send Consumption Information endpoint.

Chapters

Resources

Related Videos

WWDC25

WWDC24

Transcript

Hello, I’m Riyaz, an engineer on the App Store server team. In this session, we're diving deep into the App Store server APIs for In-App Purchase. I'll show how our latest updates are designed to streamline and enhance your app server responsibilities. Let's start by exploring some of the key responsibilities performed by your app server.

In this session, I'll focus on three critical responsibilities.

First, manage In-App Purchases. This task involves associating transaction data with your customer accounts so your app can deliver content and services seamlessly.

Next, sign requests. This requires generating a signature to authorize your server’s requests to the App Store.

Finally, participate in the refund decision process. By sharing consumption data related to purchases, your server can help the App Store make informed refund decisions. These are some of the many crucial responsibilities managed by your app server. Next, I'll explore how the new updates to the App Store server API will enhance these responsibilities. There's a lot to cover, so let's jump right in... First up, I'll introduce updates to transaction identifiers that will help you better manage In-App Purchases. Then, I'll look at improvements for generating signatures to simplify signing requests on your server. And finally, I'll review enhancements that simplify your participation in refund processes.

Let's dive into the details, beginning with Manage In-App Purchases. Managing In-App Purchases starts with effectively handling customer accounts on your system.

Typically, you would assign a unique account ID on your system to each customer, establishing a clear link between their account and the App Store transactions. This association is crucial for delivering the right content or personalizing the user experience. The App Store provides In-App Purchase data through three key data structures: AppTransaction, JWSTransaction, and JWSRenewalInfo. First, I'll focus on the JWSTransaction; I'll return to the other types later.

When a customer makes an In-App Purchase, the App Store provides a signed transaction object. On the server, the JWSTransaction represents this signed transaction, which you can effectively verify and decode using the App Store Server Library.

Here's a sample decoded signedTransactionInfo.

The first few fields contain important information related to the app, and the in-app product type.

Following that, is metadata about the purchase including quantity, price, and currency. If a customer redeems an offer, the JWSTransaction includes fields for offerType, offerIdentifier, and offerDiscountType. We recently added a new field offerPeriod, which indicates the duration of the redeemed offer using the ISO 8601 duration format.

This field is also included in JWSRenewalInfo, informing you of the offer's duration that applies at the time of next subscription renewal.

Then, there are transaction identifiers. The transactionId is a unique identifier for a transaction, such as an In-App Purchase, restore, or subscription renewal. In this example, it represents the transaction Id

Elevate the design of your iPad app

Make your app look and feel great on iPadOS. Learn best practices for designing a responsive layout for resizable app windows. Get familiar with window controls and explore the best ways to accommodate them. Discover the building blocks of a great menu bar. And meet the new pointer and its updated effects.

Chapters

Resources

Transcript

Hi, my name is Rene Lee and I’m a designer on the Apple Design team.

Today, I’m so excited to share how the new updates coming to iPad can elevate the design of your app. At its core, iPad is about simplicity. And iPadOS 26 is built on that solid foundation.

Multitasking is easy with a new gesture, that fluidly resizes any window.

In the toolbar, you’ll find the new Window Controls that become larger to reveal their functionality.

The new pointer is more precise and responsive.

And finally, there is the new menu bar, that you can populate with everything that makes your app unique and powerful.

With these new features, you now have all the building blocks that you’ll ever need to finally Elevate your iPad app. In this session, I’ll take you step by step through this journey, starting with your app’s navigation.

Then, we will look at how windows work when multitasking on iPad. Next, I’ll introduce iPad’s new pointer and updates to its hover effect.

Last, we’ll wrap up this session with the new menu bar on iPad.

Let’s get started with: Navigation. Elevating your iPad app starts with choosing its navigation pattern. The first option you have is a sidebar.

A sidebar is ideal for apps that have numerous sub-views or deeply nested content, like Mail.

Let’s take a look inside the sidebar in Mail: Mail uses the sidebar to list all the mailboxes and multiple accounts, exposing its content hierarchy to the top level. Mail on iPad is faster because its sidebar flattens its navigation. Let’s look at another example: Music. Instead of mailboxes, Music app has a set of tabs, library, and playlists. So in Music, these destinations are placed inside the sidebar which makes navigating very fast.

The sidebar in Music does have one more button that looks like a tab bar. When you tap it, the sidebar fluidly morphs... into a tab bar! A tab bar is a more compact and flexible navigation pattern.

Because it’s much smaller than a sidebar, your app can show much more content than before, making it feel more immersive.

Morphing a tab bar back into a sidebar is also seamless. Tab bar’s fluidity lets people choose the navigation they prefer.

If you’re still not sure which navigation is best for your app, start with a tab bar.

A tab bar can easily morph into sidebar to help your app scale over time. Next, let’s see how each navigation adapts to smaller sizes.

When the iPad is rotated to portrait, a sidebar can respond by morphing back into a tab bar. But from a layout standpoint, the orientation change simply means that your app must adapt to a change in width.

When the navigation is adaptive, your app can reflow to any width.

You can extend this adaptivity to floating windows too, since they also result in a smaller width. If you use a tab bar, you’ll be able to adapt gracefully to any dimension.

Apps that don’t use a tab bar can also adapt their navigation by collapsing their columns down.

When adapting to size changes, make sure that any change in layout is non-destructive.

Resizing an app should not permanently alter its layout. Be opportunistic about reverting back to the starting state whenever possible.

Next, extend your content around navigation.

People come to your app for its content. So use as much of the display as possible to create the most immersive experience. You can start doing this by drawing content below the toolbar using the new 'scroll edge effect'.

Extending content is especially beneficial for floating windows that are smaller than a fullscreen app.

You can also extend content below the sidebar.

When content extends beyond its boundary like this, you can even tell which row has been scrolled, and by how much. To summarize: Choose the navigation pattern that best fits the content.

Consider starting with a tab bar. Adapt your navigation to smaller sizes.

And extend your content around the navigation.

Next, let’s get into windows. In iPadOS 26, there’s a new windowing system that’s somehow both simpler and more powerful! Every app that supports multitasking now shows a handle in the bottom right corner.

Dragging this will start resizing the app into a window that floats above your wallpaper.

On the top left of every window, you’ll find the new window controls. They become larger on tap, revealing their functionality. If you press and hold, they expand even further, showing a set of shortcuts that quickly create various window layouts. Window controls pack a lot of functionality and are central to how you multitask on iPad. Let’s dive a bit deeper into how they appear on every window. Window controls will appear on the leading edge of your app toolbar. Any existing controls will shift to the right to make room and avoid occlusion.

For apps that haven’t been updated to iPadOS 26, the system will increase the safe area above the toolbar and place the window controls on its leading edge.

Please note that this placement above the toolbar only exists for compatibility.

Because it creates a permanent safe area directly above your toolbar, using this placement is suboptimal for maximizing content in your app. You should instead wrap your toolbar around window controls so they appear in line. When you wrap your toolbar around where window controls appear, you can avoid having to reserve a safe area.

The extra room you reclaim can then be spent on showing more content without increasing the window size.

Next, there’s an important change in how your app should handle opening new documents. If you open a document, its default app will launch above and open it in its window. It’s safe to assume that the app’s current state has some value until the app is hidden or deliberately closed. But what if you wanted to open a second document? Prior to multitasking-oriented workflows, if an app is then asked to open a different document, it would proceed to open the file in its one and only window, clearing any prior context in the process. This type of ‘Open in Place’ behavior is no longer recommended for apps that participate in multitasking. Going forward, your app should create a new window for each document. When multitasking is enabled, your app should open each document in their own window. This additive behavior is more helpful because windows persist until closed. But wouldn’t this behavior cause lots of windows to accumulate, and make it difficult to find my window? With the new additive windowing behavior, it’s very easy to start accumulating lots of windows over time.

To help you find your window, there’s a new section in the app menu with a list of all your open windows. But this list is not very helpful if window names are not descriptive. Therefore, it’s important for you to provide descriptive names for your app’s windows.

Name each window with a unique string, like the title of the document.

If your app provides a descriptive name for each window, it will be much easier to find a specific window.
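
A minimal sketch of both of those recommendations might look like this. The activity type, userInfo key, and scene delegate wiring are assumptions for illustration.

import UIKit

// Sketch: open each document in its own window (scene).
func openDocumentInNewWindow(_ documentURL: URL) {
    let activity = NSUserActivity(activityType: "com.example.openDocument") // hypothetical type
    activity.userInfo = ["documentURL": documentURL]
    UIApplication.shared.requestSceneSessionActivation(nil, userActivity: activity,
                                                       options: nil, errorHandler: nil)
}

// In the scene delegate, give the new window a descriptive name based on its document.
func scene(_ scene: UIScene, willConnectTo session: UISceneSession,
           options connectionOptions: UIScene.ConnectionOptions) {
    if let url = connectionOptions.userActivities.first?.userInfo?["documentURL"] as? URL {
        scene.title = url.deletingPathExtension().lastPathComponent
    }
}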

To conclude this section: Wrap your toolbar around window controls.

Create a window for each document. And finally, provide descriptive names for your app’s windows.

Next, let’s see what changes are coming to the pointer on iPad.

Everything on iPad was designed for touch. So the original pointer was circular in shape, to best approximate your finger in both size and accuracy.

But under the hood, the pointer is actually capable of being much more precise than your finger.

So in iPadOS 26, the pointer is getting a new shape, unlocking its true potential.

The new pointer somehow feels more precise and responsive because it always tracks your input directly 1 to 1. In addition to its new shape, there are also changes coming to how the pointer behaves when hovering over buttons and controls. The new highlight effect is a liquid glass platter that materializes directly on top of the buttons you’re hovering over with your pointer.

Floating above the liquid glass controls, the highlight bends and refracts its underlying elements to indicate which button is currently selected.

When you hover over a liquid glass control that has more than one button, the highlight effect will appear to indicate which button you’re about to press with your pointer.

As you move across the cluster, the highlight will quickly catch up with your pointer, whenever its target changes.

The new highlight effect replaces the original hover effect where the pointer would morph into the highlight.

The new pointer will no longer magnetize or rubber band to any target and always track your input directly.

Make sure to test your app with the new pointer to identify unexpected results. Feel free to come back to this section for reference and guidance as you refine and polish your pointer interaction. Lastly, let’s take a look at the menu bar on iPad. The menu bar has always been a part of macOS, and shares many of its core principles with the menu bar on iPad. So in this section, we will focus on what makes the menu bar on iPad distinct. On iPad, you can reveal the menu bar by moving your pointer to the top edge. You can also swipe down with your finger.

In every menu bar, you’ll find the app menu, followed by default menus that the system provides, and custom menus provided by the app.

Going forward, every app on iPad is getting its own menu bar. So let’s get started with some tips for designing a great menu bar. First, organize your menu items.

Let me walk you through the process of organizing a custom menu. As an example, we'll use the Message menu here in Mail.

The first thing you should do is populate the menu with every action that relates to its name. You should order these actions by frequency of use, and not alphabetically. You should also group related actions into their own sections, to create further separation. If the menu starts becoming too long, you can move secondary actions into a submenu. Next, assign each item a symbol, ideally one that matches how they appear in your app. Finally, assign keyboard shortcuts to the most commonly performed actions. The result is a well-organized custom menu that not only contains individual commands, but also provides a mental map that people can internalize to better understand your app. Next, populate the View menu.

One of the menus that the system provides is the View menu. But it’s up to your app to populate it with useful actions. If your app is organized by tabs, they are great to include in the View menu.

Add your tabs as items in the View menu, and assign keyboard shortcuts to speed up tab switching in your app.

And now, you have a fully functional View menu! Another useful item is a navigation toggle. Any button that shows or hides the sidebar is a great addition to the View menu. Adding a sidebar toggle lets people quickly show or hide their navigation.
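
A sketch of that setup using UIMenuBuilder might look like the following. The tab names, selectors, and keyboard shortcuts are assumptions, not the session's code.

import UIKit

// Sketch: add tab commands and a sidebar toggle to the system View menu.
class AppDelegate: UIResponder, UIApplicationDelegate {
    override func buildMenu(with builder: UIMenuBuilder) {
        super.buildMenu(with: builder)
        guard builder.system == .main else { return }

        let tabCommands = UIMenu(options: .displayInline, children: [
            UIKeyCommand(title: "Inbox", action: #selector(showInbox), input: "1", modifierFlags: .command),
            UIKeyCommand(title: "Sent", action: #selector(showSent), input: "2", modifierFlags: .command)
        ])
        let sidebarToggle = UIMenu(options: .displayInline, children: [
            UIKeyCommand(title: "Show Sidebar", action: #selector(toggleSidebar(_:)),
                         input: "s", modifierFlags: [.command, .control])
        ])
        builder.insertChild(tabCommands, atStartOfMenu: .view)
        builder.insertChild(sidebarToggle, atEndOfMenu: .view)
    }

    @objc func showInbox() { /* switch to the Inbox tab */ }
    @objc func showSent() { /* switch to the Sent tab */ }
    @objc func toggleSidebar(_ sender: Any?) { /* show or hide the sidebar */ }
}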

Next thing to keep in mind is to never hide any menus or items depending on the context of your app.

As people use your app, its context is always shifting. So when they open a menu, some items may appear dimmed, as they are not actionable. This is very much by design. Menu items should always remain in the same place, even when inactive. They become fully opaque when they're actionable and dimmed when they're not. You might be wondering why inactive actions aren't simply hidden to save space.

Hiding menu items is not recommended because people will find this disorienting.

When a menu's content is always changing, people need to re-scan the entire menu each time they open it, as they can no longer fall back on their spatial memory. Make sure to keep the contents of your menu static, so that the menu bar is predictable and also aids in the discovery of new features each time people visit. Lastly, depending on the context, nothing inside a menu may be actionable.

Even when this is the case, avoid hiding menus in their entirety. The menu bar stops being predictable when entire menus disappear intermittently.

Make sure to keep the menus as well as their contents persistent so that people always get the experience they expect.

Let’s review what we learned about the menu bar: Organize your menu items.

Populate the View menu with tabs and sidebar toggle.

Avoid hiding menus and menu items.

Keep these guidelines in mind as you design your app's menu bar for the best results! And with that we’ve covered everything I had for you today.

I reviewed how fluid navigation and additive multitasking make your iPad apps more powerful than ever. I also shared how the new pointer provides precision input and how the menu bar helps organize your app’s actions, making them easier to discover. If you’re interested in learning more about the new design system, I recommend checking out these other sessions.

As you start bringing these building blocks into your apps, I’m sure that they will add up to be more than the sum of their parts. I’m confident that you will take what you learned today and use them to truly transform and elevate the experience of your iPad app. Thanks for watching!

Embracing Swift concurrency

Join us to learn the core Swift concurrency concepts. Concurrency helps you improve app responsiveness and performance, and Swift is designed to make asynchronous and concurrent code easier to write correctly. We'll cover the steps you need to take an app through from single-threaded to concurrent. We'll also help you determine how and when to make the best use of Swift concurrency features – whether it's making your code more asynchronous, moving it to the background, or sharing data across concurrent tasks.

Chapters

Resources

Related Videos

WWDC25

WWDC23

Transcript

Hello! I’m Doug from the Swift team, and I’m excited to talk to you about how to make the best use of Swift concurrency in your app.

Concurrency allows code to perform multiple tasks at the same time. You can use concurrency in your app to improve responsiveness when waiting on data, like when reading files from disk or performing a network request. It can also be used to offload expensive computation to the background, like processing large images.

Swift’s concurrency model is designed to make concurrent code easier to write correctly. It makes the introduction of concurrency explicit and identifies what data is shared across concurrent tasks.

It leverages this information to identify potential data races at compile time, so you can introduce concurrency as you need it without fear of creating hard-to-fix data races.

Many apps only need to use concurrency sparingly, and some don't need concurrency at all. Concurrent code is more complex than single-threaded code, and you should only introduce concurrency as you need it.

Your apps should start by running all of their code on the main thread, and you can get really far with single-threaded code. The main thread is where your app receives UI-related events and can update the UI in response. If you aren’t doing a lot of computation in your app, it’s fine to keep everything on the main thread! Eventually, you are likely to need to introduce asynchronous code, perhaps to fetch some content over the network.

Your code can wait for the content to come across the network without causing the UI to hang.

If those tasks take too long to run, we can move them off to a background thread that runs concurrently with the main thread.

As we develop our app further, we may find that keeping all our data within the main thread is causing the app to perform poorly. Here, we can introduce data types for specific purposes that always run in the background.

Swift concurrency provides tools like actors and tasks for expressing these kinds of concurrent operations. A large app is likely to have an architecture that looks a bit like this. But you don’t start there, and not every app needs to end up here. In this session, we’re going to talk through the steps to take an app through this journey from single-threaded to concurrent. For each step along the way, we’ll help you determine when to take that step, what Swift language features that you’ll use, how to use them effectively, and why they work the way they do. First, we’ll describe how single-threaded code works with Swift concurrency. Then, we’ll introduce asynchronous tasks to help with high-latency operations, like network access.

Next, we’ll introduce concurrency to move work to a background thread and learn about sharing data across threads without introducing data races.

Finally, we’ll move data off the main thread with actors.

Let’s start with single-threaded code. When you run a program, code starts running on the main thread.

Any code that you add stays on the main thread, until you explicitly introduce concurrency to run code somewhere else. Single-threaded code is easier to write and maintain, because the code is only doing one thing at a time. If you start to introduce concurrency later on, Swift will protect your main thread code.

The main thread and all of its data are represented by the main actor. There is no concurrency on the main actor, because there is only one main thread that can run it. We can specify that data or code is on the main actor using the @MainActor notation.

Swift will ensure that main-actor code only ever runs on the main thread, and main-actor data is only ever accessed from there. We say that such code is isolated to the main actor.
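
For example, a main-actor type could look like this minimal sketch (the names are illustrative, not from the session):

import UIKit

// Everything in this class is isolated to the main actor: its methods run on
// the main thread, and its stored properties can only be touched from there.
@MainActor
final class ImageModel {
    var images: [UIImage] = []          // main-actor-protected state

    func display(_ image: UIImage) {
        images.append(image)
        // Safe to update the UI here; this always runs on the main thread.
    }
}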

Swift protects your main thread code using the main actor by default.

This is like the Swift compiler writing @MainActor for you on everything in that module. It lets us freely access shared state like static variables from anywhere in our code. In main actor mode, we don't have to worry about concurrent access until we start to introduce concurrency.

Protecting code with the main actor by default is driven by a build setting. Use this primarily for your main app module and any modules that are focused on UI interactions. This mode is enabled by default for new app projects created with Xcode 26. In this talk, we'll assume that main actor mode is enabled throughout the code examples.

Let's add a method on our image model to fetch and display an image from a URL. We want to load an image from a local file. Then decode it, and display it in our UI.

Our app has no concurrency in it at all. There is just a single, main thread doing all of the work.

This whole function runs on the main thread in one piece. So long as every operation in here is fast enough, that’s fine.

Right now, we’re only able to read files locally. If we want to allow our app to fetch an image over the network, we need to use a different API.

This URLSession API lets us fetch data over the network given a URL. However, running this method on the main thread would freeze the UI until the data has been downloaded from the network.

As a developer, it’s important to keep your app responsive. That means taking care not to tie up the main thread for so long that the UI will glitch or hang. Swift concurrency provides tools to help: asynchronous tasks can be used when waiting on data, such as a network request, without tying up the main thread.

To prevent hangs like this, network access is asynchronous.

We can change fetchAndDisplayImage so that it's capable of handling asynchronous calls by making the function 'async', and calling the URL session API with 'await'.
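
The change described here looks roughly like this sketch, where UIImage stands in for the session's image type and display(_:) is assumed to be the app's existing main-actor method:

import UIKit

// Sketch: the async version suspends at the await instead of blocking the
// main thread while the download is in flight.
@MainActor
func fetchAndDisplayImage(url: URL) async throws {
    let (data, _) = try await URLSession.shared.data(from: url)
    guard let image = UIImage(data: data) else { return }
    display(image)   // still on the main actor after the await
}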

The await indicates where the function might suspend, meaning that it stops running on the current thread until the event it’s waiting for happens. Then, it can resume execution.

We can think of this as breaking the function into two pieces: the piece that runs up until we start to fetch the image, and the piece that runs after the image has been fetched. By breaking up the function this way, we allow other work to run in between the two pieces, keeping our UI responsive.

In practice, many library APIs, like URLSession, will offload work to the background for you. We still have not introduced concurrency into our own code, because we didn't need to! We improved the responsiveness of our application by making parts of it asynchronous, and calling library APIs that offload work on our behalf. All we needed to do in our code was adopt async/await.

So far, our code is only running one async function. An async function runs in a task. A task executes independently of other code, and should be created to perform a specific operation end-to-end. It's common to create a task in response to an event, such as a button press. Here, the task performs the full fetch-and-display image operation.
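
In UI code, that might look like the following sketch (SwiftUI shown; imageModel and imageURL are placeholders for your own state):

// Sketch: create a task in response to a button press.
Button("Load Image") {
    Task {   // the task starts on the main actor; awaits inside it don't block the UI
        try? await imageModel.fetchAndDisplayImage(url: imageURL)
    }
}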

There can be many asynchronous tasks in a given app. In addition to the fetch-and-display image task that we’ve been talking about, here I’ve added a second task that fetches the news, displays it, and then waits for a refresh. Each task will complete its operations in order from start to finish. Fetching happens in the background, but the other operations in each task will all run on the main thread, and only one operation can run at a time. The tasks are independent from each other, so each task can take turns on the main thread. The main thread will run the pieces of each task as they become ready to run.

A single thread alternating between multiple tasks is called 'interleaving'. This improves overall performance by making the most efficient use of system resources. A thread can start making progress on any of the tasks as soon as possible, rather than leaving the thread idle while waiting for a single operation.

If fetching the image completes before fetching the news, the main thread will start decoding and displaying the image before displaying the news.

But if fetching the news finishes first, the main thread can start displaying the news before decoding the image.

Multiple asynchronous tasks are great when your app needs to perform many independent operations at the same time. When you need to perform work in a specific order, you should run that work in a single task.

To make your app responsive when there are high-latency operations like a network request, use an asynchronous task to hide that latency. Libraries can help you here, by providing asynchronous APIs that might do some concurrency on your behalf, while your own code stays on the main thread. The URLSession API has already introduced some concurrency for us, because it’s handling the network access on a background thread. Our own fetch-and-display image operation is running on the main thread. We might find that the decode operation is taking too long. This could show up as UI hangs when decoding a large image.

Asynchronous, single-threaded is often enough for an app. But if you start to notice that your app isn’t responsive, it’s an indication that too much is happening on the main thread. A profiling tool such as Instruments can help you determine where you are spending too much time. If it’s work that can be made faster without concurrency, do that first.

If it can’t be made faster, you might need to introduce concurrency. Concurrency is what lets parts of your code run on a background thread in parallel with the main thread, so it doesn’t block the UI. It can also be used to get work done faster by using more of the CPU cores in your system.

Our goal is to get the decoding off the main thread, so that work can happen on the background thread. Because we're in the main actor by default mode, fetchAndDisplayImage and decodeImage are both isolated to the main actor. Main actor code can freely access all data and code that is accessible only to the main thread, which is safe because there's no concurrency on the main thread.

We want to offload the call to decodeImage, which we can do by applying the @concurrent attribute to the decodeImage function.

@concurrent tells Swift to run the function in the background.

Changing where decodeImage runs also changes our assumptions about what state decodeImage can access. Let's take a look at the implementation.

The implementation is checking a dictionary of cached image data that's stored on the main actor, which is only safe to do on the main thread.

The Swift compiler shows us where the function is trying to access data on the main actor. This is exactly what we need to know to make sure we're not introducing bugs as we add concurrency. There are a few strategies you can use when breaking ties to the main actor so you can introduce concurrency safely.

In some cases, you can move the main actor code into a caller that always runs on the main actor. This is a good strategy if you want to make sure that work happens synchronously.

Or, you can use await to access the main actor from concurrent code asynchronously.

If the code doesn’t need to be on the main actor at all, you can add the nonisolated keyword to separate it from any actor.

We’re going to explore the first strategy now, and will talk about the others later on.

I'm going to move the image caching into fetchAndDisplayImage, which runs on the main actor. Checking the cache before making any async calls is good for eliminating latency. If the image is in the cache, fetchAndDisplayImage will complete synchronously without suspending at all. This means the results will be delivered to the UI immediately, and it will only suspend if the image is not already available.

And we can remove the url parameter from decodeImage because we don't need it anymore. Now, all we have to do is await the result of decodeImage.
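
Putting those pieces together, the refactor described here looks roughly like the sketch below. UIImage stands in for the session's image type, imageCache and display(_:) are assumed members of the model, and the main-actor-by-default build setting is assumed, so everything except decodeImage stays on the main actor. Depending on your compiler settings, the @concurrent method may also need an explicit nonisolated.

// Sketch: the cache check stays on the main actor; only the decode is
// offloaded to the concurrent pool with @concurrent.
func fetchAndDisplayImage(url: URL) async throws {
    if let cached = imageCache[url] {                  // synchronous fast path, no suspension
        display(cached)
        return
    }
    let (data, _) = try await URLSession.shared.data(from: url)
    let image = try await decodeImage(data: data)      // runs on a background thread
    imageCache[url] = image
    display(image)
}

@concurrent
func decodeImage(data: Data) async throws -> UIImage {
    guard let image = UIImage(data: data) else {
        throw URLError(.cannotDecodeContentData)
    }
    return image
}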

An @concurrent function will always switch off of an actor to run. If you want the function to stay on whatever actor it was called on, you can use the nonisolated keyword.

Swift has additional ways to introduce more concurrency. For more information, see “Beyond the basics of structured concurrency”.

If we were providing decoding APIs as part of a library for many clients to use, using @concurrent isn't always the best API choice.

How long it takes to decode data depends on how large the data is, and decoding small amounts of data is okay to do on the main thread. For libraries, it's best to provide a nonisolated API and let clients decide whether to offload work.

Nonisolated code is very flexible, because you can call it from anywhere: if you call it from the main actor, it will stay on the main actor. If you call it from a background thread, it will stay on a background thread. This makes it a great default for general-purpose libraries.

When you offload work to the background, the system handles scheduling the work to run on a background thread. The concurrent thread pool contains all of the system's background threads, which can involve any number of threads. For smaller devices like a watch, there might only be one or two threads in the pool. Large systems with more cores will have more background threads in the pool. It doesn't matter which background thread a task runs on, and you can rely on the system to make the best use of resources. For example, when a task suspends, the original thread will start running other tasks that are ready. When the task resumes, it can start running on any available thread in the concurrent pool, which might be different from the background thread it started on.

Now that we have concurrency, we will be sharing data among different threads. Sharing mutable state in concurrent code is notoriously prone to mistakes that lead to hard-to-fix runtime bugs. Swift helps you catch these mistakes at compile time so you can write concurrent code with confidence. Each time we go between the main actor and the concurrent pool, we share data between different threads. When we get the URL from the UI, it’s passed from the main actor out the background thread to fetch the image.

Fetching the image returns data, which is passed along to image decoding.

Then, after we’ve decoded the image, the image is passed back into the main actor, along with self.

Swift ensures that all of these values are accessed safely in concurrent code. Let's see what happens if the UI update ends up creating additional tasks that involve the URL.

Fortunately, URL is a value type. That means that when we copy the URL into the background thread, the background thread has a separate copy from the one that’s on the main thread. If the user enters a new URL through the UI, code on the main thread is free to use or modify its copy, and the changes have no effect on the value used on the background thread.

This means that it is safe to share value types like URL because it isn’t really sharing after all: each copy is independent of the others.
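Here's that copy behavior in miniature (ImageRequest here is an illustrative value type, not part of the sample project):

import Foundation

struct ImageRequest {
  var url: URL
  var maxPixelSize: Int
}

var original = ImageRequest(url: URL(string: "https://swift.org")!, maxPixelSize: 100)
let copy = original           // an independent copy, not a shared reference
original.maxPixelSize = 400   // the copy still sees maxPixelSize == 100

// The same thing happens when a value is sent to a background thread:
// that thread works with its own copy, so neither side can race the other.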

Value types have been a big part of Swift from the beginning. All of the basic types like strings, integers, and dates, are value types.

Collections of value types, such as dictionaries and arrays, are also value types. And so are structs and enums that store value types in them, like this Post struct.

We refer to types that are always safe to share concurrently as Sendable types. Sendable is a protocol, and any type that conforms to Sendable is safe to share. Collections like Array define conditional conformances to Sendable, so they are Sendable when their elements are.

Structs and enums are allowed to be marked Sendable when all of their instance data is Sendable.

And main actor types are implicitly Sendable, so you don’t have to say so explicitly.

Actors like the main actor protect non-Sendable state by making sure it’s only ever accessed by one task at a time. An actor might store values passed into its methods, and it might return a reference to its protected state from its methods.

Whenever a value is sent into or out of an actor, the Swift compiler checks that the value is safe to send to concurrent code. Let's focus on the async call to decodeImage.

decodeImage is an instance method, so we're passing an implicit self argument.

Here, we see two values being sent outside the main actor, and a result value being sent back into the main actor.

'self' is my image model class, which is main actor isolated. The main actor protects the mutable state, so it's safe to pass a reference to the class to the background thread. And Data is a value type, so it's Sendable.

That leaves the image type. It could be a value type, like Data, in which case it would be Sendable. Instead let’s talk about types that are not Sendable, such as classes. Classes are reference types, meaning that when you assign one variable to another, they point at the same object in memory. If you change something about the object through one variable, such as scaling the image, then those changes are immediately visible through the other variables that point at the same object.

fetchAndDisplayImage does not use the image value concurrently. decodeImage runs in the background, so it can't access any state protected by an actor. It creates a new instance of an image from the given data. This image can't be referenced by any concurrent code, so it's safe to send over to the main actor and display it in the UI.

Let’s see what happens when we introduce some concurrency. First, this scaleAndDisplay method loads a new image on the main thread. The image variable points to this image object, which contains the cat picture.

Then, the function creates a task running on the concurrent pool, and that gets a copy of the image.

Finally, the main thread moves on to display the image.

Now, we have a problem. The background thread is changing the image: making the width and height different, and replacing the pixels with those of a scaled version. At the same time, the main thread is iterating over the pixels based on the old width and height.

That’s a data race. You might end up with a UI glitch, or more likely you’ll end up with a crash when your program tries to access outside of the pixel array’s bounds. Swift concurrency prevents data races with compiler errors if your code tries to share a non-Sendable type. Here, the compiler is indicating that the concurrent task is capturing the image, which is also used by the main actor to display the image. To correct this, we need to make sure that we avoid sharing the same object concurrently. If we want the image effect to be shown in the UI, the right solution is to wait for the scaling to complete before displaying the image. We can move all three of these operations into the task to make sure they happen in order.

displayImage has to run on the main actor, so we use await to call it from a concurrent task.

If we can make scaleAndDisplay async directly, we can simplify the code to not create a new task, and perform the three operations in order in the task that calls scaleAndDisplay.

Once we send the image to the main actor to display in the UI, the main actor is free to store a reference to the image, for example by caching the image object.

If we try to change the image after it's displayed in the UI, we'll get a compiler error about unsafe concurrent access.

We can address the issue by making any changes to the image before we send it over to the main actor. If you are using classes for your data model, your model classes will likely start on the main actor, so you can present parts of them in the UI.

If you eventually decide that you need to work with them on a background thread, make them nonisolated. But they should probably not be Sendable. You don’t want to be in a position where some of the model is being updated on the main thread and other parts of the model are being updated on the background thread. Keeping model classes non-Sendable prevents this kind of concurrent modification from occurring. It's also easier, because making a class Sendable usually requires using a low-level synchronization mechanism like a lock. Like classes, closures can create shared state.
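A sketch of that guidance (PostModel is an illustrative model class, not part of the sample project):

// A model class that needs to be usable from a background thread:
// nonisolated, so it isn't tied to the main actor, but deliberately
// NOT Sendable, so the compiler prevents the same instance from being
// shared across concurrent tasks. No lock is needed.
nonisolated final class PostModel {
  var title: String = ""
  var comments: [String] = []

  func addComment(_ comment: String) {
    comments.append(comment)
  }
}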

Here is a function similar to one we had earlier that scales and displays an image. It creates an image object. Then, it calls perform(afterDelay:), providing it with a closure that scales the image object. This closure contains another reference to the same image. We call this a capture of the image variable. Just like non-Sendable classes, a closure with shared state is still safe as long as it isn't called concurrently.

Only make a function type Sendable if you need to share it concurrently.
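In code, the distinction looks like this (the first function is the perform(afterDelay:) helper from the 22:06 listing; performInBackground is an illustrative addition):

// A plain closure parameter: the closure is only called from the current
// task, so it may freely capture non-Sendable state such as MyImage.
nonisolated
func perform(afterDelay delay: Double, body: () -> Void) async throws {
  try await Task.sleep(for: .seconds(delay))
  body()
}

// A @Sendable closure parameter: the closure is handed to a concurrent task,
// so everything it captures must be safe to share across threads.
func performInBackground(body: @escaping @Sendable () -> Void) {
  Task { @concurrent in
    body()
  }
}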

Sendable checking occurs whenever some data passes between actors and tasks. It’s there to ensure that there are no data races that could cause bugs in your app.

Many common types are Sendable, and these can be freely shared across concurrent tasks.

Classes and closures can involve mutable state that is not safe to share concurrently, so use them from one task at a time.

You can still send an object from one task to another, but be sure to make all modifications to the object before sending it.

Moving asynchronous tasks to background threads can free up the main thread to keep your app responsive. If you find that you have a lot of data on the main actor that is causing those asynchronous tasks to “check in” with the main thread too often, you might want to introduce actors.

As your app grows over time, you may find that the amount of state on the main actor also grows.

You’ll introduce new subsystems to handle things like managing access to the network. This can lead to a lot of state living on the main actor, for example the set of open connections handled by the network manager, which we would access whenever we need to fetch data over the network.

When we start using these extra subsystems, the fetch-and-display image task from earlier has gotten more complicated: it’s trying to run on the background thread, but it has to hop over to the main thread because that’s where the network manager’s data is. This can lead to contention, where many tasks are trying to run code on the main actor at the same time. The individual operations might be quick, but if you have a lot of tasks doing this, it can add up to UI glitches.

Earlier, we moved code off the main thread by putting it into an @concurrent function.

Here, all of the work is in accessing the network manager’s data. To move that out, we can introduce our own network manager actor.

Like the main actor, actors isolate their data, so you can only access that data when running on that actor.

Along with the main actor, you can define your own actor types. An actor type is similar to a main-actor class. Like a main-actor class, it will isolate its data so that only one thread can touch the data at a time.

An actor type is also Sendable, so you can freely share actor objects. Unlike the main actor, there can be many actor objects in a program, each of which is independent.

In addition, actor objects aren’t tied to a single thread like the main actor is.

So moving some state from the main actor over to an actor object will allow more code to execute on a background thread, leaving the main thread open to keep the UI responsive.
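For example (a minimal sketch; DownloadCache is an illustrative actor, not part of the sample project):

import Foundation

actor DownloadCache {
  private var storage: [URL: Data] = [:]

  func data(for url: URL) -> Data? {
    storage[url]
  }

  func store(_ data: Data, for url: URL) {
    storage[url] = data
  }
}

// Two independent actor objects, each protecting its own storage, and
// neither tied to the main thread. Callers on any thread use await to
// enter an actor, and only one task at a time runs inside each instance.
let thumbnailCache = DownloadCache()
let fullImageCache = DownloadCache()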

Use actors when you find that storing data on the main actor is causing too much code to run on the main thread. At that point, separate out the data for one non-UI part of your code, such as the network management code, into a new actor.

Be aware that most of the classes in your app probably are not meant to be actors: UI-facing classes should stay on the main actor so they can interact directly with UI state. Model classes should generally be on the main actor with the UI, or kept non-Sendable, so that you don’t encourage lots of concurrent accesses to your model.

In this talk, we started with single-threaded code.

As our needs grew, we introduced asynchronous tasks to hide latency, concurrent code to run on a background thread, and actors to move data access off the main thread.

Over time, many apps will follow this same course.

Use profiling tools to identify when and what code to move off the main thread.

Swift concurrency will help you separate that code from the main thread correctly, improving the performance and responsiveness of your app.

We have some recommended build settings for your app to help with the introduction of concurrency. The Approachable Concurrency setting enables a suite of upcoming features that make it easier to work with concurrency. We recommend that all projects adopt this setting.

For Swift modules that are primarily interacting with the UI, such as your main app module, we also recommend setting the default actor isolation to 'main actor'. This puts code on the main actor unless you’ve said otherwise. These settings work together to make single-threaded apps easier to write, and provide a more approachable path to introducing concurrency when you need it.
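For a SwiftPM target, these recommendations map roughly onto settings like the following (a sketch, assuming Swift 6.2 tools; the setting and feature names here are my understanding of the Xcode build settings described above and may differ, so check the migration guide):

// In Package.swift (sketch)
.target(
    name: "MyAppFeature",
    swiftSettings: [
        // Default actor isolation: code is main-actor isolated unless stated otherwise.
        .defaultIsolation(MainActor.self),
        // One of the upcoming features grouped under Approachable Concurrency:
        // nonisolated async functions run on the caller's actor by default.
        .enableUpcomingFeature("NonisolatedNonsendingByDefault")
    ]
)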

Swift concurrency is a tool designed to help improve your app.

Use it to introduce asynchronous or concurrent code when you find performance problems with your app. The Swift 6 migration guide can help answer more questions about concurrency and the road to data-race safety. And to see how the concepts in this talk apply in an example app, please watch our code-along companion talk. Thank you.

Code

3:20 - Single-threaded program

var greeting = "Hello, World!"

func readArguments() { }

func greet() {
  print(greeting)
}

readArguments()
greet()

4:13 - Data types in the app

struct Image {
}

final class ImageModel {
  var imageCache: [URL: Image] = [:]
}

final class Library {
  static let shared: Library = Library()
}

4:57 - Load and display a local image

import Foundation

class Image {
}

final class View {
  func displayImage(_ image: Image) {
  }
}

final class ImageModel {
  var imageCache: [URL: Image] = [:]
  let view = View()

  func fetchAndDisplayImage(url: URL) throws {
    let data = try Data(contentsOf: url)
    let image = decodeImage(data)
    view.displayImage(image)
  }

  func decodeImage(_ data: Data) -> Image {
    Image()
  }
}

final class Library {
  static let shared: Library = Library()
}

5:36 - Fetch and display an image over the network

import Foundation

struct Image {
}

final class View {
  func displayImage(_ image: Image) {
  }
}

final class ImageModel {
  var imageCache: [URL: Image] = [:]
  let view = View()

  func fetchAndDisplayImage(url: URL) throws {
    let (data, _) = try URLSession.shared.data(from: url)
    let image = decodeImage(data)
    view.displayImage(image)
  }

  func decodeImage(_ data: Data) -> Image {
    Image()
  }
}

final class Library {
  static let shared: Library = Library()
}

6:10 - Fetch and display image over the network asynchronously

import Foundation

class Image {
}

final class View {
  func displayImage(_ image: Image) {
  }
}

final class ImageModel {
  var imageCache: [URL: Image] = [:]
  let view = View()

  func fetchAndDisplayImage(url: URL) async throws {
    let (data, _) = try await URLSession.shared.data(from: url)
    let image = decodeImage(data)
    view.displayImage(image)
  }

  func decodeImage(_ data: Data) -> Image {
    Image()
  }
}

final class Library {
  static let shared: Library = Library()
}

7:31 - Creating a task to perform asynchronous work

import Foundation

class Image {
}

final class View {
  func displayImage(_ image: Image) {
  }
}

final class ImageModel {
  var imageCache: [URL: Image] = [:]
  let view = View()
  var url: URL = URL(string: "https://swift.org")!

  func onTapEvent() {
    Task {
      do {
	try await fetchAndDisplayImage(url: url)
      } catch let error {
        displayError(error)
      }
    }
  }

  func displayError(_ error: any Error) {
  }

  func fetchAndDisplayImage(url: URL) async throws {
  }
}

final class Library {
  static let shared: Library = Library()
}

9:15 - Ordered operations in a task

import Foundation

class Image {
  func applyImageEffect() async { }
}

final class ImageModel {
  func displayImage(_ image: Image) {
  }

  func loadImage() async -> Image {
    Image()
  }
  
  func onButtonTap() {
    Task {
      let image = await loadImage()
      await image.applyImageEffect()
      displayImage(image)
    }
  }
}

9:38 - Fetch and display image over the network asynchronously

import Foundation

class Image {
}

final class View {
  func displayImage(_ image: Image) {
  }
}

final class ImageModel {
  var imageCache: [URL: Image] = [:]
  let view = View()

  func fetchAndDisplayImage(url: URL) async throws {
    let (data, _) = try await URLSession.shared.data(from: url)
    let image = decodeImage(data)
    view.displayImage(image)
  }

  func decodeImage(_ data: Data) -> Image {
    Image()
  }
}

10:40 - Fetch and display image over the network asynchronously

import Foundation

class Image {
}

final class View {
  func displayImage(_ image: Image) {
  }
}

final class ImageModel {
  var imageCache: [URL: Image] = [:]
  let view = View()

  func fetchAndDisplayImage(url: URL) async throws {
    let (data, _) = try await URLSession.shared.data(from: url)
    let image = decodeImage(data, at: url)
    view.displayImage(image)
  }

  func decodeImage(_ data: Data, at url: URL) -> Image {
    Image()
  }
}

11:11 - Fetch over network asynchronously and decode concurrently

import Foundation

class Image {
}

final class View {
  func displayImage(_ image: Image) {
  }
}

final class ImageModel {
  var imageCache: [URL: Image] = [:]
  let view = View()

  func fetchAndDisplayImage(url: URL) async throws {
    let (data, _) = try await URLSession.shared.data(from: url)
    let image = await decodeImage(data, at: url)
    view.displayImage(image)
  }

  @concurrent
  func decodeImage(_ data: Data, at url: URL) async -> Image {
    Image()
  }
}

11:30 - Implementation of decodeImage

import Foundation

class Image {
}

final class View {
  func displayImage(_ image: Image) {
  }
}

final class ImageModel {
  var cachedImage: [URL: Image] = [:]
  let view = View()

  func fetchAndDisplayImage(url: URL) async throws {
    let (data, _) = try await URLSession.shared.data(from: url)
    let image = await decodeImage(data, at: url)
    view.displayImage(image)
  }

  @concurrent
  func decodeImage(_ data: Data, at url: URL) async -> Image {
    if let image = cachedImage[url] {
      return image
    }

    // decode image
    let image = Image()
    cachedImage[url] = image
    return image
  }
}

12:37 - Correct implementation of fetchAndDisplayImage with caching and concurrency

import Foundation

class Image {
}

final class View {
  func displayImage(_ image: Image) {
  }
}

final class ImageModel {
  var cachedImage: [URL: Image] = [:]
  let view = View()

  func fetchAndDisplayImage(url: URL) async throws {
    if let image = cachedImage[url] {
      view.displayImage(image)
      return
    }

    let (data, _) = try await URLSession.shared.data(from: url)
    let image = await decodeImage(data)
    view.displayImage(image)
  }

  @concurrent
  func decodeImage(_ data: Data) async -> Image {
    // decode image
    Image()
  }
}

13:30 - JSONDecoder API should be non isolated

// Foundation
import Foundation

nonisolated
public class JSONDecoder {
  public func decode<T: Decodable>(_ type: T.Type, from data: Data) -> T {
    fatalError("not implemented")
  }
}

15:18 - Fetch over network asynchronously and decode concurrently

import Foundation

class Image {
}

final class View {
  func displayImage(_ image: Image) {
  }
}

final class ImageModel {
  var imageCache: [URL: Image] = [:]
  let view = View()

  func fetchAndDisplayImage(url: URL) async throws {
    let (data, _) = try await URLSession.shared.data(from: url)
    let image = await decodeImage(data, at: url)
    view.displayImage(image)
  }

  @concurrent
  func decodeImage(_ data: Data, at url: URL) async -> Image {
    Image()
  }
}

16:30 - Example of value types

// Value types are common in Swift
import Foundation

struct Post {
  var author: String
  var title: String
  var date: Date
  var categories: [String]
}

16:56 - Sendable value types

import Foundation

// Value types are Sendable
extension URL: Sendable {}

// Collections of Sendable elements
extension Array: Sendable where Element: Sendable {}

// Structs and enums with Sendable storage
struct ImageRequest: Sendable {
  var url: URL
}

// Main-actor types are implicitly Sendable
@MainActor class ImageModel {}

17:25 - Fetch over network asynchronously and decode concurrently

import Foundation

class Image {
}

final class View {
  func displayImage(_ image: Image) {
  }
}

final class ImageModel {
  var imageCache: [URL: Image] = [:]
  let view = View()

  func fetchAndDisplayImage(url: URL) async throws {
    let (data, _) = try await URLSession.shared.data(from: url)
    let image = await self.decodeImage(data, at: url)
    view.displayImage(image)
  }

  @concurrent
  func decodeImage(_ data: Data, at url: URL) async -> Image {
    Image()
  }
}

18:34 - MyImage class with reference semantics

import Foundation

struct Color { }

nonisolated class MyImage {
  var width: Int
  var height: Int
  var pixels: [Color]
  var url: URL

  init() {
    width = 100
    height = 100
    pixels = []
    url = URL(string: "https://swift.org")!
  }

  func scale(by factor: Double) {
  }
}

let image = MyImage()
let otherImage = image // refers to the same object as 'image'
image.scale(by: 0.5)   // also changes otherImage!

19:19 - Concurrently scaling while displaying an image is a data race

import Foundation

struct Color { }

nonisolated class MyImage {
  var width: Int
  var height: Int
  var pixels: [Color]
  var url: URL

  init() {
    width = 100
    height = 100
    pixels = []
    url = URL(string: "https://swift.org")!
  }

  func scaleImage(by factor: Double) {
  }
}

final class View {
  func displayImage(_ image: MyImage) {
  }
}

final class ImageModel {
  var cachedImage: [URL: MyImage] = [:]
  let view = View()

  // Slide content start
  func scaleAndDisplay(imageName: String) {
    let image = loadImage(imageName)
    Task { @concurrent in
      image.scaleImage(by: 0.5)
    }

    view.displayImage(image)
  }
  // Slide content end

  func loadImage(_ imageName: String) -> MyImage {
    // decode image
    return MyImage()
  }
}

20:38 - Scaling and then displaying an image eliminates the data race

import Foundation

struct Color { }

nonisolated class MyImage {
  var width: Int
  var height: Int
  var pixels: [Color]
  var url: URL

  init() {
    width = 100
    height = 100
    pixels = []
    url = URL(string: "https://swift.org")!
  }

  func scaleImage(by factor: Double) {
  }
}

final class View {
  func displayImage(_ image: MyImage) {
  }
}

final class ImageModel {
  var cachedImage: [URL: MyImage] = [:]
  let view = View()

  func scaleAndDisplay(imageName: String) {
    Task { @concurrent in
      let image = loadImage(imageName)
      image.scaleImage(by: 0.5)
      await view.displayImage(image)
    }
  }

  nonisolated
  func loadImage(_ imageName: String) -> MyImage {
    // decode image
    return MyImage()
  }
}

20:54 - Scaling and then displaying an image within a concurrent asynchronous function

import Foundation

struct Color { }

nonisolated class MyImage {
  var width: Int
  var height: Int
  var pixels: [Color]
  var url: URL

  init() {
    width = 100
    height = 100
    pixels = []
    url = URL(string: "https://swift.org")!
  }

  func scaleImage(by factor: Double) {
  }
}

final class View {
  func displayImage(_ image: MyImage) {
  }
}

final class ImageModel {
  var cachedImage: [URL: MyImage] = [:]
  let view = View()

  @concurrent
  func scaleAndDisplay(imageName: String) async {
    let image = loadImage(imageName)
    image.scaleImage(by: 0.5)
    await view.displayImage(image)
  }

  nonisolated
  func loadImage(_ imageName: String) -> MyImage {
    // decode image
    return MyImage()
  }
}

21:11 - Scaling, then displaying and concurrently modifying an image is a data race

import Foundation

struct Color { }

nonisolated class MyImage {
  var width: Int
  var height: Int
  var pixels: [Color]
  var url: URL

  init() {
    width = 100
    height = 100
    pixels = []
    url = URL(string: "https://swift.org")!
  }

  func scaleImage(by factor: Double) {
  }

  func applyAnotherEffect() {
  }
}

final class View {
  func displayImage(_ image: MyImage) {
  }
}

final class ImageModel {
  var cachedImage: [URL: MyImage] = [:]
  let view = View()

  // Slide content start
  @concurrent
  func scaleAndDisplay(imageName: String) async {
    let image = loadImage(imageName)
    image.scaleImage(by: 0.5)
    await view.displayImage(image)
    image.applyAnotherEffect()
  }
  // Slide content end

  nonisolated
  func loadImage(_ imageName: String) -> MyImage {
    // decode image
    return MyImage()
  }
}

21:20 - Applying image transforms before sending to the main actor

import Foundation

struct Color { }

nonisolated class MyImage {
  var width: Int
  var height: Int
  var pixels: [Color]
  var url: URL

  init() {
    width = 100
    height = 100
    pixels = []
    url = URL(string: "https://swift.org")!
  }

  func scaleImage(by factor: Double) {
  }

  func applyAnotherEffect() {
  }
}

final class View {
  func displayImage(_ image: MyImage) {
  }
}

final class ImageModel {
  var cachedImage: [URL: MyImage] = [:]
  let view = View()

  // Slide content start
  @concurrent
  func scaleAndDisplay(imageName: String) async {
    let image = loadImage(imageName)
    image.scaleImage(by: 0.5)
    image.applyAnotherEffect()
    await view.displayImage(image)
  }
  // Slide content end

  nonisolated
  func loadImage(_ imageName: String) -> MyImage {
    // decode image
    return MyImage()
  }
}

22:06 - Closures create shared state

import Foundation

struct Color { }

nonisolated class MyImage {
  var width: Int
  var height: Int
  var pixels: [Color]
  var url: URL

  init() {
    width = 100
    height = 100
    pixels = []
    url = URL(string: "https://swift.org")!
  }

  func scale(by factor: Double) {
  }

  func applyAnotherEffect() {
  }
}

final class View {
  func displayImage(_ image: MyImage) {
  }
}

final class ImageModel {
  var cachedImage: [URL: MyImage] = [:]
  let view = View()

  // Slide content start
  @concurrent
  func scaleAndDisplay(imageName: String) async throws {
    let image = loadImage(imageName)
    try await perform(afterDelay: 0.1) {
      image.scale(by: 0.5)
    }
    await view.displayImage(image)
  }

  nonisolated
  func perform(afterDelay delay: Double, body: () -> Void) async throws {
    try await Task.sleep(for: .seconds(delay))
    body()
  }
  // Slide content end
  
  nonisolated
  func loadImage(_ imageName: String) -> MyImage {
    // decode image
    return MyImage()
  }
}

23:47 - Network manager class

import Foundation

nonisolated class MyImage { }

struct Connection {
  func data(from url: URL) async throws -> Data { Data() }
}

final class NetworkManager {
  var openConnections: [URL: Connection] = [:]

  func openConnection(for url: URL) async -> Connection {
    if let connection = openConnections[url] {
      return connection
    }

    let connection = Connection()
    openConnections[url] = connection
    return connection
  }

  func closeConnection(_ connection: Connection, for url: URL) async {
    openConnections.removeValue(forKey: url)
  }

}

final class View {
  func displayImage(_ image: MyImage) {
  }
}

final class ImageModel {
  var cachedImage: [URL: MyImage] = [:]
  let view = View()
  let networkManager: NetworkManager = NetworkManager()

  func fetchAndDisplayImage(url: URL) async throws {
    if let image = cachedImage[url] {
      view.displayImage(image)
      return
    }

    let connection = await networkManager.openConnection(for: url)
    let data = try await connection.data(from: url)
    await networkManager.closeConnection(connection, for: url)

    let image = await decodeImage(data)
    view.displayImage(image)
  }

  @concurrent
  func decodeImage(_ data: Data) async -> MyImage {
    // decode image
    return MyImage()
  }
}

25:10 - Network manager as an actor

import Foundation

nonisolated class MyImage { }

struct Connection {
  func data(from url: URL) async throws -> Data { Data() }
}

actor NetworkManager {
  var openConnections: [URL: Connection] = [:]

  func openConnection(for url: URL) async -> Connection {
    if let connection = openConnections[url] {
      return connection
    }

    let connection = Connection()
    openConnections[url] = connection
    return connection
  }

  func closeConnection(_ connection: Connection, for url: URL) async {
    openConnections.removeValue(forKey: url)
  }

}

final class View {
  func displayImage(_ image: MyImage) {
  }
}

final class ImageModel {
  var cachedImage: [URL: MyImage] = [:]
  let view = View()
  let networkManager: NetworkManager = NetworkManager()

  func fetchAndDisplayImage(url: URL) async throws {
    if let image = cachedImage[url] {
      view.displayImage(image)
      return
    }

    let connection = await networkManager.openConnection(for: url)
    let data = try await connection.data(from: url)
    await networkManager.closeConnection(connection, for: url)

    let image = await decodeImage(data)
    view.displayImage(image)
  }

  @concurrent
  func decodeImage(_ data: Data) async -> MyImage {
    // decode image
    return MyImage()
  }
}

Engage players with the Apple Games app

Meet the Games app – a new destination for players to keep up with what's happening in their games, discover new ones, and play with friends, all in one place. Learn how to set up your game for optimal visibility in the Games app, integrate Game Center to enable social play, and keep players coming back with In-App Events. To get the most out of this session, we also recommend watching “Get started with Game Center”.

Chapters

Resources

Related Videos

WWDC25

WWDC23

WWDC21

Transcript

Hi, I’m Logan Pratt, and I'm an engineer on the App Store Connect team.

In this video, I’ll guide you through the new Apple Games app, and all the great features you can use to bring players into your game, and get them excited to keep playing it. Today, players can go on the App Store to find new games and discover what features they offer.

And now, the Games app gives players a brand-new, all-in-one destination for their games. They can launch their games, discover new ones, and have even more fun playing with their friends.

The Games app comes pre-installed on iPhone, iPad, and Mac. Its features are integrated throughout the operating system, and can appear in widgets, notifications, and the App Store. Let’s take a look.

When a player first opens the app, they’re taken to the Home tab.

This will highlight events happening in their games, what their friends have been playing, updates on their friends’ game center activity, curated game collections, and more. Players can also tap on a recently played game to instantly start the game back up, right where they left off. Your games will each have a dedicated page in the app. New players can see all the details and get excited about your game, and existing players can check their progress on your game’s achievements, leaderboards, challenges, and multiplayer activities, as well as download updates. Tapping Invite to play makes it super easy to jump right into playing with friends.

In the Play Together tab, players can find out what their friends are playing, compare high scores, and invite friends to compete or join multiplayer experiences.

The Library is where players can find the games they've already installed in one place, or redownload a game they've already purchased. Then they can launch their games by tapping Play.

And the Search tab is a great starting point for players to discover their new favorite games. For many players, the Games app will be a daily hub whenever they're ready to have some fun.

To tap in to all of the benefits this app will provide your games, I’ll show you how to set up your game information to display in the app.

I’ll also show you how to leverage Game Center features, to maximize the places your game will appear.

And, I'll show you some other techniques that can help you keep players engaged in your game more than ever before.

Let’s get started! Setting your Primary or Secondary Category to Games is what makes your game eligible to display in the Games app, with its own customized game page.

For example, here's the game page for the new Arcade game, What the Clash, and it looks great! It shows the game's icon, name, and subtitle. These show up in lots of places throughout the Games app. They're your first opportunity to spark a player's imagination.

The Games app will automatically customize the background to complement the colors in your game's icon.

Since What the Clash has integrated with Game Center, I can see Memoji for my friends who most recently played the game. These appear around the icon. The game page also displays the age rating and category.

The subcategory you've selected for your game will also appear here. What the Clash is set as an Action game.

Subcategories help players find games in their favorite genres. Each one has a unique icon that appears right next to it on the game page.

I’m developing my own game with some friends. It's called The Coast, and I want it to look amazing in the Games app too! In App Store Connect, I've already set the game's Name, Subtitle, Category, and Age Rating in the App Information tab.

I want The Coast to appear in the Games app, so I set its Primary category to Games and its subcategory to Action.

Back in the Games app, I’m interested to see how players will learn more about “What the Clash”. Next to the category, I see a Game Controller icon, which tells me this game supports controllers.

I love playing games with a controller! If I want to find a few more, I can go to the Library and Filter my installed games by controller support.

If your game supports controllers, make sure you add the Game Controller Capability to your game in Xcode. That way, your game will also have the game controller badge in the Games app and the App Store. To learn more, check out “Tap into virtual and physical game controllers”.

Alright, my friends play this game and it supports controllers?! Now I gotta check out its previews to see what I’m missing out on! This game looks fun! Underneath the previews, I can read the game's description and see a more prominent view of the game details.

Your game's name, category, and previews also appear in search results, and are a great way to attract new players.

To set these up for The Coast, I’ll go back to App Store Connect and click on my new iOS version. I uploaded a high-quality video preview and some screenshots, then set the description and accurate keywords for search.

Anyone viewing The Coast in the Games app will now get a great preview into what it’s all about.

Let's rewind a bit. Even before your game is launched, you can generate excitement and anticipation with a Pre-Order.

The developer, Triband, set up a pre-order before What the Clash was released. When you set up a Pre-Order, your game appears in the Games app, listed as an Upcoming Release. Players can opt in to have it automatically downloaded to their device on release day. And they can even share it with friends! To learn more about configuring a Pre-Order, check out “What’s new in App Store pre-orders.” Those are the basics for making your game really shine on the Games app. But there's so much more you can do with the help of Game Center. Game Center is a suite of features that helps players engage deeper in their games, and discover new games through their friends.

Game Center also makes it easier for players to connect and compete with their existing contacts, in keeping with Apple’s commitment to privacy.

Your game can add many features from Game Center, including Achievements, Leaderboards, Challenges and Activities. These features let players keep track of their progress, compete for high scores, and invite their friends to play along.

Now that I’ve started playing What The Clash, its game center features are prioritized to display at the top of the game page. I can see the multiplayer activities I can join, which of my friends are also playing this game, active challenges I have with those friends, and all of the game's leaderboards and achievements.

I can tap "Play" to jump right back in where I left off, or “Invite To Play” if I want to start a new challenge or activity with my friends.

And that's just part of the story. Once you implement Game Center features, your game will show up in many other places in the Games app, like Home tab highlights, and the Play Together tab. You can configure your Game Center features using App Store Connect or the App Store Connect API. And new this year, directly in Xcode.

My friends, Josh and Varokas, have also been working on The Coast with me and created some new game center features for it using Xcode. I'm leaving most of the setup to them, but I’ll highlight some important pieces in App Store Connect too.

You can find out more about each feature, and how they're set up, in "Get started with Game Center." To add game center features to The Coast, I first had to click the Game Center checkbox for my iOS version to enable it.

The more Game Center features you add to your game, the more chances your game has to appear across the Games app.

Let’s talk about Achievements! They track gameplay progress, encourage exploration to complete them all, reward players for reaching special milestones, and allow them to share that progress with friends.

I’ve already completed a few achievements in What the Clash, but I wanna see what other goals to aim for. I’ll tap on Achievements to see them all.

Now, I can see all the Achievements I've completed, progress on ones I'm still working on, and which ones my friends have earned.

When players have completed an achievement, they’ll see its localized image, display name, and earned description. From here, they can tap the Play button to keep earning more.

Another feature that lets players track their progress in your game is Leaderboards.

They’re a great way to let players measure their skill, compare their talents with friends, and encourage competition and continued attempts at a high score.

In the Games app, Home and Play Together highlight score updates on leaderboards that friends are playing on, which will encourage players to attempt to beat the high score.

Leaderboards display their localized display name and description, at the top of the screen.

Description is a new field, so be sure you set it for all of your new and existing leaderboards.

For a recurring leaderboard, players will also see how much time is left until it resets.

This leaderboard has “pts" set as the suffix for the score, in the English localization. I’m 280 points behind first place. I gotta keep playing to catch up! To see all the other leaderboards in What the Clash, I can tap the Leaderboards section on its Game page.

Here, I can see which leaderboards I’ve ranked on, which ones I haven’t played yet, and which ones my friends are playing on. Josh and Varokas told me they've set up a new leaderboard for the Cape Cod level in The Coast. Since leaderboard descriptions are a new field, and the images show up in the Games app, I want to make sure they’re added. So I’ll check it out in App Store Connect. I can go to the Game Center tab on the left side, to see all of The Coast’s leaderboards. Nice! I see one for Cape Cod, that level is super fun.

Everything they've set up looks good so far. But I want to check its description and image. To do this, I’ll click into a localization. Perfect! They set up a detailed description, and uploaded a high quality image. Our players better avoid the sneaky lobsters! Now that I’ve made sure our leaderboards are set up properly, I want to see what else What The Clash uses leaderboards for. Tapping “play” will bring me directly into the level associated with the leaderboard, using a deep link. But for now, let’s see what happens when I tap ‘Challenge’.

Challenges are a great new way to help players bring friends into your game. They’re built on top of leaderboards and turn single-player games into social experiences with friends. Challenges give players a way to compete in score-based rounds with any of their friends, see scores in real time, crown a winner, and have a rematch.

And they’re time-limited, to motivate players to jump in before the time runs out.

The Home tab of the Games app highlights the results of Challenges, and tracks progress and invitations. This gives extra visibility to your leaderboards and your game.

Nikki won this challenge, so I can tap “Rematch” to restart it.

But, I know she’s too good at this level, so instead, I’ll go to the Play Together tab to start a different Challenge.

In Play Together, players can see ongoing activity from their game center friends such as invites, multiplayer progress, and score updates.

I want to start a new Challenge, so I’ll tap “Pick a game”. This shows all installed games that support challenges and multiplayer activities, along with recommendations on which ones to start playing.

I selected What The Clash, and want to start the Elefriends Rally challenge.

Its localized image, display name, and description, give a preview into what kind of challenge this will be.

I can choose how many tries we’ll each get to submit our best score and decide how long the challenge will last.

I want this one to last 1 day. So we’ll all have 3 tries in the next day to compete for the win.

Then, I can invite friends to join me in the challenge, and tap start to send them a notification to join. You can also show this invite flow inside your game, so players can invite each other to challenges without leaving it.

Looks like my friends accepted my challenge and started playing! Nikki is in the lead and used 2 of her 3 tries so far.

I’ve only used one and there’s 23 hours left, so I’m not worried, there’s plenty of time for a comeback! Tapping play will take me directly to the Elefriends Rally in the game, where I can start the challenge using the same deep link as its leaderboard. Challenges can be a great way to boost player engagement! We definitely need to set some up for The Coast! Varokas created one for our Cape Cod leaderboard, so I’ll check it out in App Store Connect. I’ll head back to the Game Center page, scroll down to the Challenges section, and click on the Cape Cod challenge. Awesome! He already uploaded a high-quality default image to show the experience of the challenge and get our players excited to play it. It’s attached to the Cape Cod leaderboard so players will be taken straight to the Cape Cod level when they start the challenge.

He also set its Display Name and Description for each localization, thanks Varokas! We haven’t had a chance to create images for different localizations yet, but we can upload them here or in Xcode, to override the default image when we do.

When setting up challenges in your game, you can specify the minimum platform versions required for participation. Only live versions with Game Center enabled can be selected here.

Now that we’ve set up our Cape Cod Challenge and Leaderboard, we need to deep link players directly to the Cape Cod level in our game.

Our other new feature, Activities, can do this. I’ll show you how.

Activities allow you to define a destination in your game, and use a deep link to bring players directly to it. They can be associated with Game Center features to deep link players directly into scoring on a leaderboard, starting a challenge, or completing an achievement. And they’re especially powerful for helping players discover and join multiplayer experiences. The Games app promotes your Multiplayer Activities to players on the Home tab and in the Play Together tab, same as with Challenges. On the Game Page, Multiplayer Activities are displayed right at the top. And tapping “Invite to Play” brings me to the Play Together popup to view Challenges and Multiplayer Activities. This activity looks cool, I’m gonna check it out.

When tapping into an activity, players will see its localized image, display name and description.

The activity’s participant range will let players know how many friends can join them. And the generated party code will give players the option to share it with friends. They can share this invitation as a link outside of the Games app, even to friends playing across platforms.

Or they can invite friends in the Games app. I want to play this activity with Nikki, so I’ll select her name.

Inviting a friend to an activity within the Games app will send them a push notification to join.

Then I can tap “invite and join” to start it.

And look at that! I’m in the What the Clash activity lobby, where Nikki can join by tapping the notification or entering the party code in the game. To promote and give more visibility to our multiplayer mode, where players can race around our levels in The Coast, Josh and Varokas created a Multiplayer Activity called Race at Sea.

I want to check out what they’ve set up in App Store Connect. They uploaded a high-quality default image, set the participants range, and selected “yes” for supports party code.

This will display '2-32 players' on the activity card in the Games app, so players will know how many people they can play with, and give them a party code to share with friends.

The activity deep link is what brings players directly to this activity in the game, and has customizable properties to influence how it affects your gameplay.

Attaching the Cape Cod leaderboard to this activity will use the configured deep link to bring players straight to the Cape Cod level in the Coast, when tapping play on its Leaderboard or Challenge.

And just like Challenges, its Display name, description, and image can be localized.

So far, I’ve covered setting up your game in App Store Connect, and all the Game Center features that can boost your game's visibility across the Games app, and make it easier for players to invite their friends into your game. Now, let’s talk about something super important for long-term success: using time-limited content to keep players engaged after they’ve downloaded your game. In the Games app, In-App Events highlight timely events within games, such as limited or new content, seasonal activities, or special promotions. This creates a sense of urgency and excitement! Using events will expand your reach by promoting them all over the Games app, keeping your current players informed, and reconnecting with players who haven’t played in a while. The Home tab highlights your ongoing events with their Event card artwork or video, an Event badge that tells players what type of event it is, and the Event's localized name and short description, giving players a quick peek into the event. There’s also a dedicated Events & Updates section within the Library, which will show all current and upcoming events.

And when existing players search for your game, they’ll see its ongoing event as the search result card, in place of the regular app preview card.

The Game Page will also highlight your game’s events, with high priority events taking precedence. I wanna see what other events What The Clash has. Tapping into the events section will show all current and upcoming events. Current events show as “Happening Now”, and Upcoming events show their configured start date and time.

This event looks cool, so I’ll definitely come back to play when it starts. Tapping on an event brings up the full-screen event details page with its high-quality artwork or video, badge, name, and long description, giving an even more detailed preview into the event. I can share it with friends and tap play to go straight to the event in the game.

Existing players can tap “play” from any of the places events show in the Games app, making it super easy to jump in. This uses your configured event deep link to bring them directly to the event in your game.

And for players who don't have your game installed, the View button will take them to the Game Page so they can download it and learn more.

To promote our new Race at Sea activity to our player base, I created two in-app events for it. Let’s check them out! In App Store Connect, I’ll go to the In-App Events tab to view them. I created an In-App Event to promote the launch of our Race at Sea Activity, so players see it front and center in the Games app. I also created an event for National Lobster Day, called Lobster Bash, to bring players back to the game, and straight into the Race at Sea activity. We have all sorts of exciting surprises, special content and even more sneaky lobsters in store for this event! I uploaded a short looping video for the event card, and high-quality artwork for the event details page. This gives players a preview of the event and grabs their attention when it pops up in the Games app. To make Lobster Bash even more visible, I set the Publish Start Date to be two weeks before the actual start date. This way, it’ll appear as an “upcoming event” in the Games app. Players can then sign up to get a notification when it goes live and share it with friends. I set the Event deep link to point to our Race at Sea activity. This will bring players directly to the activity in The Coast when they tap “play” on the event in the Games app. I also added the “event=lobsterbash” parameter to the Event deep link, so our in-game handler will know which event linked the player to this activity. I want this to be the first event players see on The Coast’s game page, so I set this to be a high-priority event. I can’t wait for our players to see this! To learn more about In-App Events and how to set them up in App Store Connect, go check out our previous videos, "Meet in-app events on the App Store", and "Get started with in-app events".
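On the game side, handling that deep link parameter might look something like this in SwiftUI (a sketch; the GameRootView and routing code are illustrative, not The Coast's actual implementation):

import Foundation
import SwiftUI

struct GameRootView: View {
    var body: some View {
        Text("The Coast")   // stand-in for the game's real root view
            .onOpenURL { url in
                // Read the "event" parameter added to the Event deep link.
                let components = URLComponents(url: url, resolvingAgainstBaseURL: false)
                let event = components?.queryItems?.first { $0.name == "event" }?.value
                if event == "lobsterbash" {
                    // Route the player straight into the Race at Sea activity
                    // with the Lobster Bash content enabled.
                }
            }
    }
}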

I've covered a lot of ground, from setting up your game’s information, game center features, in-app events, and more. You should now be all set to bring your game to the Games app! If you have an existing game, take a look at your game page in the Games app, and see if you need to update any information. Make sure you have Game Center enabled, and integrate any features you haven't added yet, to appear in more places across the new app. And consider adding some excitement by planning a new in-app event.

Thanks for watching, I can’t wait to play all of your fun games soon!

Enhance child safety with PermissionKit

Discover how PermissionKit helps you enhance communication safety for children in your app. We'll show you how to use this new framework to create age-appropriate communication experiences and leverage Family Sharing for parental approvals. You'll learn how to build permission requests that seamlessly integrate with Messages, handle parental responses, and adapt your UI for child users. To get the most out of this session, we recommend first watching “Deliver age-appropriate experiences in your app” from WWDC25.

Chapters

Resources

Related Videos

WWDC25

Transcript

Hi, I’m Andrew, an engineer on the family team. Welcome to “Enhance child safety with PermissionKit.” A parent’s most important responsibility is to keep their children safe. In a digital environment, keeping children safe begins with a conversation between a child and their parents and guardians.

That’s especially true when it comes to who the child communicates with online. Starting that conversation from an app can be difficult and involves a significant amount of technical overhead. The best place for parents to communicate with their child is somewhere they're already communicating: Messages. In iOS, iPadOS, and macOS 26, you can help start a conversation in Messages between a child and their parents using PermissionKit.

With PermissionKit, children can ask their parents to communicate with someone new. In this video, I’ll start by introducing you to the new PermissionKit framework.

Then, I’ll show you how you can adapt your app’s UI with child safety in mind by using this new API.

Next, I’ll explain how to create a communication permission question that a child can ask their parents and guardians right in Messages.

Finally, I’ll show you how to listen for and respond to answers from parents and guardians. This video assumes that your app has communication functionality, and that you already have a way to determine the age or age range of each of your users. The PermissionKit API I’m about to show should only be used when you know the current user is a child.

If you don’t already have your own account systems to determine if the current user is a child, use the new declared age range API.

Parents can allow their kids to share the age range associated with their child accounts, which can then be requested through the declared age range API.

Watch “Deliver age appropriate experiences in your app” from WWDC25 for more details.

Here are a few other things to consider when deciding if PermissionKit is right for your app.

PermissionKit leverages users’ Family Sharing group to connect children with their parents and guardians.

This means that users must be part of a Family Sharing group to realize the full potential of PermissionKit. If the child is not part of a Family Sharing group when a request is made, the API will return a default response. Additionally, the parents or guardians must enable Communication Limits for the child. The API will also return a default response if Communication Limits is not enabled.

Let me start by introducing you to PermissionKit.

PermissionKit is a new framework that provides a quick and easy way to create consistent, first-class permission experiences between a requester and permission authorities. In iOS, iPadOS, and macOS 26, you can help keep children safe online by adopting PermissionKit. PermissionKit can be used to start a conversation in Messages between a child and the parents and guardians in their Family Sharing group. In your app, children can request to communicate with someone new over Messages. Parents have the opportunity to approve or decline their child’s request to communicate directly within the resulting Messages conversation.

Now I’ll talk about how to adopt PermissionKit, starting with how to make your UI age appropriate. Landmarks is an app for viewing and learning all about landmarks. I’ll go over an example of adding a new chat feature to the app, so users can talk with each other about their favorites. Landmarks is an app made for all ages, so it’s a perfect fit for PermissionKit.

For children, your app should hide content from unknown senders. Some examples include a preview for a message, profile pictures, and any other potentially sensitive content not suitable for children.

One way this can be accomplished is by awaiting the knownHandles(in:) method of the CommunicationLimits singleton. Given a set of handles, this API will perform an optimized lookup and return a subset of those handles that are known by the system. This could be supplemented with information from your own databases. There’s no need to start fresh if this data is already available to you in your own systems. Then, determine if the handle or group of handles that the child is trying to interact with, contains only known handles. Only show the underlying content if the handles are known.
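For example, building on the knownHandles(in:) call shown in the 4:03 code listing below, and supplementing it with your own data (the contactStore helper and the conversation value here are illustrative stand-ins):

import PermissionKit

// Handles the system already knows about for this child.
let systemKnown = await CommunicationLimits.current.knownHandles(in: conversation.participants)

// Illustrative: handles your own backend already treats as known contacts.
let appKnown = contactStore.knownHandles(for: conversation.participants)

let known = systemKnown.union(appKnown)
if known.isSuperset(of: conversation.participants) {
    // Show message previews, profile pictures, and other content.
} else {
    // Hide content from unknown senders.
}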

Next, I’ll show you how to create and send a question so that the child can ask permission from their parents. PermissionKit provides a question abstraction, which contains all the details of a child’s permission request. Each question has a topic, which further describes the question being asked. For example, a communication topic packages all the information about a person, or group of people, a child is trying to communicate with in some way. At a minimum, this information includes a handle for the person, specifically the person’s phone number, email address, or some other identifier, like a username.

Now diving into the code. Create a permission question, passing in the unknown handles.

You can also add metadata about specific handles. Prefer adding the most amount of metadata possible. Any metadata provided will be shown to parents and guardians by the system and can be used to help inform their decision to approve or deny their child’s request.

To do this, create a CommunicationTopic.

CommunicationTopic uses PersonInformation.

PersonInformation has affordances for the handle, name, and an image. You can also optionally set actions on the topic.

The chosen actions should correspond to how the child is attempting to communicate with the specified people.

Depending on which actions you choose, the system will use specific verbiage when showing the request to the parent for review.

In this example, I’m using message, but you can also provide call, video, and others.

Finally, initialize your PermissionQuestion with your custom CommunicationTopic.

Now, using the PermissionQuestion you created, initialize a CommunicationLimitsButton inside of your SwiftUI view.

When the child taps your button, they’ll receive a system prompt, which allows them to choose to send the question to their parents and guardians in Messages.

For UIKit apps, use the CommunicationLimits singleton to start the Ask Flow on the child’s device. You’ll need to pass in a UIViewController, which the system will use to present UI.

You can also use the CommunicationLimits singleton to start the Ask Flow on the child’s device for AppKit apps. You'll just need to provide an NSWindow.

And that's it for the asking experience. Now all that’s left is to handle the answers chosen by parents and guardians. When a parent responds inside Messages, your app will be launched in the background on the child’s devices. You’ll need to handle these responses as they’re received.

Shortly after your app launches, obtain an AsyncSequence from the CommunicationLimits singleton and iterate over it on a background task so that your app is always informed of permission updates. When a response is received, this is an opportunity to update your UI or post new data to your own databases.
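A sketch of that flow (the updates property and the handlePermissionResponse function are assumptions on my part, standing in for the actual PermissionKit API and your own handling code):

import PermissionKit

// Shortly after launch, start listening for parent and guardian responses.
Task {
    // Assumed name: an AsyncSequence of responses vended by the singleton.
    for await response in CommunicationLimits.current.updates {
        // Update your UI, local caches, or your servers with the decision.
        handlePermissionResponse(response)
    }
}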

And that’s it. Really. Now I’ll put it all together and go over how it works in Landmarks. The device on the left is the child’s device, and the device on the right is a parent’s device. Here in Landmarks on the child’s device, we’re on the new chats view that I added.

The message previews for all of the unknown handles are hidden from the child by default.

Tapping into one of the conversations shows that the message content is hidden from the child and that they’re unable to respond without first asking their parents and guardians for permission.

When the child taps the communication limits button, an alert is shown. This alert allows them to confirm they’d like to send their parents and guardians a message. Alternatively, they can get approval from their parent or guardian in-person using the Screen Time passcode. If the child taps the Ask option in the alert, a Messages Compose window is presented, with the parents and guardians from the child’s Family Sharing group prefilled in the To field, and the question staged in the text view.

The child can optionally add a name for this person by tapping the Add or Edit Name button. If you provide metadata for the person via the previously discussed PermissionQuestion API, that metadata will appear here.

This metadata will be shown to parents and guardians when they review the request on their devices.

Once the child taps the send button, the question will send to their parents and guardians just like any other iMessage.

Now directing attention to the parent device, the parent received the question from their child in Messages.

Parents can either decline directly from the bubble, or they can choose to review this ask.

If they review, they’ll see additional context about this ask, and they can either choose to approve or decline.

Regardless of what the parent chooses, the child will automatically receive a notification informing them of their parent's choice. At the time the notification is delivered to the child, the parent’s choice will also be delivered directly to your app in the background. Respond to this choice by, for example, updating your UI and local caches, or by posting information to your servers.

Now that the parent has approved, the child can see and respond to messages from this person in Landmarks.

Now that you know how to adopt PermissionKit, here are some other things to consider for enhancing child safety in your app. Use PermissionKit as a launchpad to add similar experiences outside of Apple’s platforms, like your app’s website, by persisting information obtained from PermissionKit to your own servers. And don’t stop at PermissionKit. Determine if your app is a good fit for our other family and child safety API offerings.

For example, here are more ways to keep kids safe using our APIs. The new Sensitive Content Analysis API expands communication safety to protect kids by detecting and blocking nudity in live streaming video calls. The new Declared Age Range API allows you to build safe, age-appropriate experiences for kids.

The Screen Time framework gives you the tools you need to help parents and guardians supervise their children’s web usage. And the Family Controls framework helps apps provide their own parental controls.

Now that you know how to improve child safety in your app using PermissionKit, here’s what to do next. Start by determining the age range of each of your users, either by using data from your own servers or the new Declared Age Range API. Then, adapt your app’s UI with children in mind. From there, adopt PermissionKit. Create the questions to ask and respond to parent and guardian answers.

Thanks for giving me permission to introduce you to PermissionKit.

Code

4:03 - Tailor your UI for children

import PermissionKit

let knownHandles = await CommunicationLimits.current.knownHandles(in: conversation.participants)

if knownHandles.isSuperset(of: conversation.participants) {
    // Show content
} else {
    // Hide content
}

5:15 - Create a question

import PermissionKit

var question = PermissionQuestion(handles: [
    CommunicationHandle(value: "dragonslayer42", kind: .custom),
    CommunicationHandle(value: "progamer67", kind: .custom)
])

5:38 - Create a question - additional metadata

import PermissionKit

let people = [
    PersonInformation(
        handle: CommunicationHandle(value: "dragonslayer42", kind: .custom),
        nameComponents: nameComponents,
        avatarImage: profilePic
    ),
    PersonInformation(
        handle: CommunicationHandle(value: "progamer67", kind: .custom)
    )
]

var topic = CommunicationTopic(personInformation: people)
topic.actions = [.message]

var question = PermissionQuestion(communicationTopic: topic)

6:25 - Ask a question - SwiftUI

import PermissionKit
import SwiftUI

struct ContentView: View {
    let question: PermissionQuestion<CommunicationTopic>

    var body: some View {
        // ...
        CommunicationLimitsButton(question: question) {
            Label("Ask Permission", systemImage: "paperplane")
        }
    }
}

6:43 - Ask a question - UIKit

import PermissionKit
import UIKit

try await CommunicationLimits.current.ask(question, in: viewController)

6:54 - Ask a question - AppKit

import PermissionKit
import AppKit

try await CommunicationLimits.current.ask(question, in: window)

7:19 - Parent/guardian responses

import PermissionKit
import SwiftUI

struct ChatsView: View {
    @State var isShowingResponseAlert = false

    var body: some View {
        List {
           // ...
        }
        .task {
            let updates = CommunicationLimits.current.updates
            for await update in updates {
                // Received a response!
                self.isShowingResponseAlert = true
            }
        }
    }
}

Enhance your app with machine-learning-based video effects

Discover how to add effects like frame rate conversion, super resolution, and noise filtering to improve video editing and live streaming experiences. We'll explore the ML-based video processing algorithms optimized for Apple Silicon available in the Video Toolbox framework. Learn how to integrate these effects to enhance the capabilities of your app for real-world use cases.

Chapters

Resources

Transcript

Hello, I’m Makhlouf, an engineer on the Video processing team. Video Toolbox is one of the most used frameworks in video applications. It has a rich collection of features intended for many video needs. Starting in macOS 15.4, Video Toolbox was enhanced with the VTFrameProcessor API, a new set of ML-based video processing algorithms optimized for Apple Silicon. I’m happy to share that this API is now also available in iOS 26.

In this video, I’ll start by showing the effects currently available with VTFrameProcessor API. Then, I’ll go over the basic steps to integrate effects into your app. And finally, I’ll demonstrate how to implement a few examples for major use cases. VTFrameProcessor API offers a variety of effects that cover many usages.

Frame rate conversion, super resolution and motion blur are designed for high quality video editing. Low latency frame interpolation and super resolution effects are for applications with real-time processing needs. Temporal noise filter can be used in both cases.

I’ll show you the results that you can achieve with these effects.

Frame rate conversion adjusts the number of frames per second in a clip to match the target FPS. It can also be used to create a slow-motion effect. On the left side is a video of soccer players celebrating a goal. On the right side is the same video with the slo-mo effect. By slowing down the action, the video captures the intensity of the celebration and highlights the emotions of the players.

Super Resolution scaler enhances video resolution and restores fine details in old videos, making it ideal for applications like photo enhancement and media restoration. In this example, applying the super resolution effect enhances the sharpness and clarity of the video, making the boats on the right side appear more detailed and less blurry. Two ML models are available for super resolution, one for images and the other for videos.

Motion blur, a popular feature in filmmaking, is essential for creating the natural movements that an audience expects. It can also be used to create various effects. In this example, the motion blur effect is applied to the video on the right, making the biker appear to be traveling at a significantly faster pace. Additionally, the effect helps to smooth out the jarring motion in the video on the left, making it more enjoyable to watch.

Another very helpful effect is temporal noise filtering. It’s based on motion estimation utilizing past and future reference frames. It can help smooth out temporal noise and remove artifacts in a video, making it easier to compress and improving its quality. On the left, there is a video of a tree with a lot of color noise. After applying the effect, the video on the right is much cleaner, with less noise, especially on the traffic sign.

The low latency frame interpolation effect upsamples the frame rate with real-time performance. Optionally, the resolution can also be upsampled. On the left side, there is a choppy, low frame rate video of a woman walking and talking to the camera. The enhanced video on the right is smoother and more pleasant to watch.

Low latency video super resolution is a lightweight super scaler. It is specifically optimized to enhance the video conferencing experience even when the network condition is bad. It reduces coding artifacts and sharpens edges to produce enhanced video frames. To demonstrate this effect, on the left is a low-resolution, low-quality video of a man talking. The processed video on the right has higher resolution, reduced compression artifacts, sharpened edges, and more detailed facial features.

In the next topic, I will cover how an application can integrate these effects. I will explain how data is exchanged between the application and the framework. I will also outline the main steps to process a clip. To access VTFrameProcessor API, applications need to import the Video Toolbox framework. After that, there are two main steps to process a clip. Step one is to select the effect. During this stage, the application starts a processing session by providing a configuration of settings, to describe how to use the effect for the whole session. Once the session is created, the application gets a VTFrameProcessor object, that is ready to process frames. VTFrameProcessor is a frame-based API, meaning the application must send the input video frames with their parameters one by one. After processing is done, the framework returns the output frame.

There are many use cases that can benefit from the VTFrameProcessor API. I’ll go over some examples and show how to implement them.

But before that, I want to mention there is a fully functional sample code attached to this presentation to use as a reference. The demo application has test clips to try out these effects.

I’m going to focus on two use cases, video editing and live video enhancement.

Frame rate conversion, super resolution, motion blur and temporal noise filter are all well suited for video editing applications, where video quality is the primary concern. I will start with frame rate conversion.

Frame rate conversion is the process of increasing the number of frames in a video clip. This is achieved by synthesizing new frames and inserting them between the existing ones. It is usually done to improve playback smoothness, especially in case of judder caused by a frame rate mismatch between the source and the target display. It can help fill the gaps created by missing frames. It can also be used to create a slow-motion effect, which is often used in filmmaking to slow down action scenes, making them visually more impressive. In sports, slow motion is used to highlight and analyze key moments in a game.

On the left side, there is a video of a man practicing a dance routine. On the right side is the same video with the slow-motion effect. Applying the effect gives the viewers more time to appreciate the complexity of the dance movements, making it even more captivating.

Now I will demonstrate how to implement frame rate conversion using the VTFrameProcessor API.

The first step is to create the session for the effect. To do this, I will create a VTFrameProcessor object. Next, I create the configuration object. For frame rate conversion, use the VTFrameRateConversionConfiguration type. To initialize the configuration, I need to provide a few settings, like input frame width and height, whether optical flow was pre-computed or not, quality level, and algorithm revision. I can now invoke the start session method to initialize the frame processing engine.

Step two is to use the parameters object to process frames. In this case the object should be of type VTFrameRateConversionParameters.

But before using the parameters class, I need to allocate all the needed buffers. In general, all input and output frame buffers are allocated by the caller. Source and destination pixel buffer attributes of the configuration class can be used to set up CVPixelBuffer pools. First, I create the current source and next frame objects. Then, the interpolationPhase array is created to indicate where to insert the interpolated frames. The array size indicates how many frames to interpolate. Finally, the destination array is created with buffers to receive the output. It is of the same size as the interpolationPhase array. Now that I have my buffers ready, I can set the remaining parameters. I set the optical flow to nil, allowing the processor to calculate the flow for me. I also set the submission mode to indicate whether frames are sent in sequence or in a random order. Once VTFrameRateConversionParameters has been created, I can now call the process function for the actual processing.

In summary, each effect is defined by two types of classes: the VTFrameProcessorConfiguration class, which describes how to set up the processing session for the effect, and the VTFrameProcessorParameters class, which describes the input and output frames and all associated parameters. An important decision to make when developing your app is whether or not to pre-compute optical flow. Flow calculation can be expensive, and some apps choose to do it beforehand to improve performance at the rendering phase. To compute optical flow beforehand, use the VTOpticalFlowConfiguration and VTOpticalFlowParameters classes. When the usePrecomputedFlow parameter is set to false in a configuration, the framework computes the flow on the fly.
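
As a rough illustration, here is a minimal sketch of pre-computing flow for a pair of frames. It mirrors the frame rate conversion setup above; the initializer labels, the caller-allocated destination flow object, and the extra Fault cases are assumptions, so verify them against the VTOpticalFlowConfiguration and VTOpticalFlowParameters documentation and the sample code.

let flowProcessor = VTFrameProcessor()

// Assumed to take the same kind of settings as the other configurations shown above
guard let flowConfiguration = VTOpticalFlowConfiguration(frameWidth: width,
                                                        frameHeight: height,
                                              qualityPrioritization: .normal,
                                                           revision: .revision1)
else {
    throw Fault.failedToCreateFlowConfiguration // hypothetical error case
}

try flowProcessor.startSession(configuration: flowConfiguration)

// destinationFlow is a caller-allocated flow object that receives the result;
// pass it later to VTFrameRateConversionParameters instead of nil
guard let flowParameters = VTOpticalFlowParameters(sourceFrame: sourceFrame,
                                                     nextFrame: nextFrame,
                                                submissionMode: .sequential,
                                        destinationOpticalFlow: destinationFlow)
else {
    throw Fault.failedToCreateFlowParameters // hypothetical error case
}

try await flowProcessor.process(parameters: flowParameters)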

Now I will go over another important video editing feature, Motion Blur.

Motion Blur simulates a slow shutter speed to create a blurred effect on moving objects. The blur strength can be adjusted through the API to control the intensity of the blur. It is used in many situations, such as making motion look natural, or for artistic reasons, like adding a sense of speed to fast moving objects.

On the left, there is a time-lapse video of a freeway. After applying the motion blur effect, the video on the right is now more fluid. Applying motion blur to time-lapse videos replicates natural movement, making them look more realistic and less like a series of still images.

To create a motion blur processing session, use the VTMotionBlurConfiguration class. You can refer to the frame rate conversion example as a reference, since the two share similar settings. Next, I will show how to initialize the VTMotionBlurParameters object to process frames.

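Before moving on to the parameters, here is a minimal sketch of that configuration step. It assumes VTMotionBlurConfiguration takes the same settings as the VTFrameRateConversionConfiguration shown earlier, and the error case is hypothetical; check the documentation for the exact initializer.

let processor = VTFrameProcessor()

guard let configuration = VTMotionBlurConfiguration(frameWidth: width,
                                                   frameHeight: height,
                                            usePrecomputedFlow: false,
                                         qualityPrioritization: .normal,
                                                      revision: .revision1)
else {
    throw Fault.failedToCreateMotionBlurConfiguration // hypothetical error case
}

try processor.startSession(configuration: configuration)
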
Two reference frames are needed for motion blur: next and previous. For the first frame in a clip, the previous frame should be set to nil. Also, at the end of a clip, the next frame should be set to nil. Now that I have the frame buffers, I can create the VTMotionBlurParameters. I let the processor compute optical flow by setting the flow parameters to nil. Then, I choose the blur strength, which ranges from 1 to 100. Once I have created the motion blur parameters, I call the process function to initiate the actual processing.

Now, I will discuss a couple of features intended for real-time usage scenarios, like video conferencing and live streaming.

Temporal noise filtering, Low latency frame interpolation and super resolution effects are all designed for real-time usages. They provide enhancements with performance in mind. For the low latency effects, the enhancement is usually performed on the receiving devices.

Now I will go over a couple of these effects. I’ll begin with low latency super resolution. To implement this effect, I need to use the LowLatencySuperResolutionScalerConfiguration and parameters classes. Both classes are straightforward to use. The configuration class only needs the frame width and height, as well as the scaling ratio, while the parameters class only requires source and destination frame buffers. This example demonstrates how low latency super resolution can enhance a video conferencing session. The video on the left shows a bearded man in a video call, talking, smiling and making hand gestures. It appears blurry due to its low resolution. After applying the super-resolution effect, as the video on the right shows, the man’s face becomes significantly sharper, especially the texture of his facial hair.
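
As a rough sketch of what that description implies, the following sets up and runs a low latency super resolution session. The VT-prefixed class spellings, initializer labels, the 2x scale factor, and the error cases are assumptions based on the description above (width, height, scaling ratio; source and destination frames), so verify them against the documentation and the sample code.

guard let configuration = VTLowLatencySuperResolutionScalerConfiguration(frameWidth: width,
                                                                        frameHeight: height,
                                                                        scaleFactor: 2.0)
else {
    throw Fault.failedToCreateLowLatencyScalerConfiguration // hypothetical error case
}

try processor.startSession(configuration: configuration)

// Caller-allocated source and destination buffers, as with the other effects
let sourceFrame = VTFrameProcessorFrame(buffer: inputPixelBuffer, presentationTimeStamp: pts)
let destinationFrame = VTFrameProcessorFrame(buffer: outputPixelBuffer, presentationTimeStamp: pts)

guard let parameters = VTLowLatencySuperResolutionScalerParameters(sourceFrame: sourceFrame,
                                                              destinationFrame: destinationFrame)
else {
    throw Fault.failedToCreateLowLatencyScalerParameters // hypothetical error case
}

try await processor.process(parameters: parameters)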

Low latency frame interpolation is also intended for enhancing real-time video conferencing. This is highly beneficial for video call applications, especially when the connection is slow. To apply this effect, use the appropriate version of the LowLatencyFrameInterpolationConfiguration and Parameters classes.

Previously, I showed how low latency frame interpolation can smooth out juddering to provide a pleasant streaming experience. There is an additional utility in this API that combines frame rate doubling and resolution upscaling capabilities into one filter.

After processing, the woman’s facial features and the background are sharper and the stream is smoother.

The Video Toolbox framework already provides many features, like direct access to hardware video encoding and decoding capabilities. It allows for services like video compression, decompression, as well as pixel format conversions. With the addition of the VTFrameProcessor API, developers can now create even more compelling applications. Now that you have learned about the VTFrameProcessor API, it’s time to integrate these effects into your app. Enhance your video editing features with new effects like slo-mo or motion blur. Improve your live streaming experiences with low latency super resolution and frame interpolation.

These are just a few examples of what you can accomplish with the VTFrameProcessor API. Thank you for watching my video and I’m looking forward to seeing your videos.

Code

8:06 - Frame rate conversion configuration

// Frame rate conversion configuration


let processor = VTFrameProcessor()

guard let configuration = VTFrameRateConversionConfiguration(frameWidth: width,
                                                            frameHeight: height,
                                                     usePrecomputedFlow: false,
                                                  qualityPrioritization: .normal,
                                                               revision: .revision1)
else {
     throw Fault.failedToCreateFRCConfiguration
}

try processor.startSession(configuration: configuration)

8:56 - Frame rate conversion buffer allocation

// Frame rate conversion buffer allocation

// Use the sourcePixelBufferAttributes and destinationPixelBufferAttributes properties of VTFrameRateConversionConfiguration to create source and destination CVPixelBuffer pools

sourceFrame = VTFrameProcessorFrame(buffer: curPixelBuffer, presentationTimeStamp: sourcePTS)
nextFrame = VTFrameProcessorFrame(buffer: nextPixelBuffer, presentationTimeStamp: nextPTS)

// Interpolate 3 frames between reference frames for 4x slow-mo
var interpolationPhase: [Float] = [0.25, 0.5, 0.75]

// Create destinationFrames, one output buffer per interpolated frame
let destinationFrames = try framesBetween(firstPTS: sourcePTS,
                                           lastPTS: nextPTS,
                            interpolationIntervals: interpolationPhase)

9:48 - Frame rate conversion parameters

// Frame rate conversion parameters

guard let parameters = VTFrameRateConversionParameters(sourceFrame: sourceFrame,
                                                         nextFrame: nextFrame,
                                                       opticalFlow: nil,
                                                interpolationPhase: interpolationPhase,
                                                    submissionMode: .sequential,
                                                 destinationFrames: destinationFrames)
else {
     throw Fault.failedToCreateFRCParameters
}

try await processor.process(parameters: parameters)

12:35 - Motion blur process parameters

// Motion blur process parameters

sourceFrame = VTFrameProcessorFrame(buffer: curPixelBuffer, presentationTimeStamp: sourcePTS)
nextFrame = VTFrameProcessorFrame(buffer: nextPixelBuffer, presentationTimeStamp: nextPTS)
previousFrame = VTFrameProcessorFrame(buffer: prevPixelBuffer, presentationTimeStamp: prevPTS)
destinationFrame = VTFrameProcessorFrame(buffer: destPixelBuffer, presentationTimeStamp: sourcePTS)

guard let parameters = VTMotionBlurParameters(sourceFrame: sourceFrame,
                                                nextFrame: nextFrame,
                                            previousFrame: previousFrame,
                                          nextOpticalFlow: nil,
                                      previousOpticalFlow: nil,
                                       motionBlurStrength: strength,
                                           submissionMode: .sequential,
                                         destinationFrame: destinationFrame) 
else {
    throw Fault.failedToCreateMotionBlurParameters
}

try await processor.process(parameters: parameters)

Enhance your app’s audio recording capabilities

Learn how to improve your app's audio recording functionality. Explore the flexibility of audio device selection using the input picker interaction on iOS and iPadOS 26. Discover APIs available for high-quality voice recording using AirPods. We'll also introduce spatial audio recording and editing capabilities that allow you to isolate speech and ambient background sounds — all using the the AudioToolbox, AVFoundation, and Cinematic frameworks.

Chapters

Resources

Related Videos

WWDC25

Transcript

Hello! My name is Steve Nimick. I’m an audio software engineer working on spatial audio technologies. In this video, I’ll talk about how to enhance the audio capabilities of your app.

I’ll present API updates for input device selection, audio capture, and playback.

The first step to capture audio is to select an input device. There are many different types of microphones, and new API allows changing the active audio source from within your app.

Additional enhancements enable your app to use AirPods in a new high quality recording mode. Other updates include Spatial Audio capture, plus new features that provide your app with more possibilities for processing audio. And finally, new API unlocks the Audio Mix feature during spatial audio playback.

I'll begin with the input route selection, with updates for how your app interacts with connected devices.

Content creators may use multiple audio devices for different applications, such as recording music or podcasting.

iOS 26 improves how the system manages audio hardware, and those improvements extend to apps as well. New API in AVKit displays the list of available inputs and allows audio source switching from within the app, without a need to navigate over to the System Settings. Here’s an example of this UI.

Your app can have a UI button that brings up the new input selection menu. It shows the list of devices, with live sound level metering. And there’s a microphone mode selection view, for displaying the modes that the input device supports. The audio stack remembers the selected device, and chooses the same input the next time the app goes active. Here's the API that enables this for your app.

First, the Audio Session needs to be configured before calling this API. This ensures that the input selection view shows the correct list of devices.

To present the input picker, create an instance of AVInputPickerInteraction. Do this after setting up the audio session. Then, assign the InputPickerInteraction’s delegate as the presenting view controller. Your app may designate a UI element, like a button, that brings up the picker interaction view.

Finally, in your UI callback function, use the 'present' method to show the audio input menu. Now, when the button is tapped, the picker interaction view appears and lets people select and change the device.

This API provides a nice, intuitive way for users to change inputs while keeping your app active.

For content creators, the best microphone is the one that’s the most readily available. So now, I’ll talk about a popular, convenient input device: AirPods. In iOS 26, there’s a new high quality, high sample rate Bluetooth option for apps that feature audio capture. With a new media tuning designed specifically for content creators, it strikes a great balance between voice and background sounds, just like someone would expect from a lavalier microphone.

When this tuning mode is active, your app uses a more reliable Bluetooth link that is designed specifically for AirPods high quality recording.

Here's how an app configures this feature.

It is supported in both AVAudioSession and AVCaptureSession. For the audio session, there’s a new category option called bluetoothHighQualityRecording. If your app already uses the allowBluetoothHFP option, then adding the high quality option makes your app use it as the default; BluetoothHFP remains a fallback in case the input route doesn’t support Bluetooth high quality. For AVCaptureSession, there’s a similar property that, when set to true, enables this high quality mode without your app having to set up the audio session manually. With both sessions, if this option is enabled, the system-level audio input menu will include high-quality AirPods in the device list. This AirPods feature is a great addition to apps that record audio, and you can support it with minimal code change.
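
As a small sketch of the audio session opt-in: the category and mode below are placeholders, so keep whatever your app already uses and add the high quality option alongside allowBluetoothHFP.

import AVFoundation

let audioSession = AVAudioSession.sharedInstance()
do {
    // Add bluetoothHighQualityRecording alongside the existing HFP option;
    // HFP remains the fallback when the route can't do high quality recording.
    try audioSession.setCategory(.playAndRecord,
                                 mode: .default,
                                 options: [.allowBluetoothHFP, .bluetoothHighQualityRecording])
    try audioSession.setActive(true)
} catch {
    print("Failed to configure audio session: \(error)")
}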

In addition to high quality recording, AirPods also have built-in controls that make it easy to record. People can start and stop by pressing the stem of the AirPods. To learn more about how to support this in your app, check out “Enhancing your camera experience with capture controls” from WWDC25.

Next, I’ll introduce new updates for Spatial Audio capture. In iOS 26, apps that use AVAssetWriter are able to record with Spatial Audio. First, it’s important to define how “Spatial Audio” works. Spatial Audio capture uses an array of microphones, like the ones on an iPhone, to take a recording of the 3D scene, and then the microphone captures are transformed into a format based on spherical harmonics, called Ambisonics. Spatial Audio is stored as First Order Ambisonics, or FOA. FOA uses the first 4 spherical harmonic components: an omni component, and 3 perpendicular dipoles in the X, Y, and Z directions, or front-back, left-right, and up-down. Audio that’s recorded in this format benefits from Spatial Audio playback features, like head tracking on AirPods. In addition, your apps can use new API for the Audio Mix effect, which allows people to easily adjust the balance of foreground and background sounds.

Spatial Audio capture API was introduced in iOS 18. Apps that use AVCaptureMovieFileOutput can record Spatial Audio by setting the multichannelAudioMode property of the AVCaptureDeviceInput to .firstOrderAmbisonics.
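
Here is a minimal sketch of that MovieFileOutput path. Session setup is abbreviated; error handling, video inputs, and the function name are placeholders.

import AVFoundation

func configureSpatialAudioCapture(session: AVCaptureSession) throws {
    guard let audioDevice = AVCaptureDevice.default(for: .audio) else { return }
    let audioInput = try AVCaptureDeviceInput(device: audioDevice)

    // Opt in to Spatial Audio (first order ambisonics) on the audio device input
    if audioInput.isMultichannelAudioModeSupported(.firstOrderAmbisonics) {
        audioInput.multichannelAudioMode = .firstOrderAmbisonics
    }
    if session.canAddInput(audioInput) { session.addInput(audioInput) }

    // MovieFileOutput records the Spatial Audio tracks into the movie file
    let movieOutput = AVCaptureMovieFileOutput()
    if session.canAddOutput(movieOutput) { session.addOutput(movieOutput) }
}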

In iOS 26, audio-only apps, like Voice Memos, now have the option to save data in the QuickTime audio format with the extension .qta.

Similar to QuickTime movies or MPEG files, the QTA format supports multiple audio tracks with alternate track groups, just like how Spatial Audio files are composed.

Here’s an overview of a properly-formatted Spatial Audio asset. There are two audio tracks: a stereo track in AAC format, and a Spatial Audio track in the new Apple Positional Audio Codec (or APAC) format. During ProRes recording, these audio tracks are encoded as PCM. The stereo track is included for compatibility with devices that don’t support Spatial Audio. Lastly, there’s at least one metadata track that contains information for playback. When a recording stops, the capture process creates a data sample that signals that the Audio Mix effect can be used. It also contains tuning parameters that are applied during playback.

I'll expand on this topic in the next section on Audio Mix. If you'd like more information on creating track groups and fallback relationships, please read the TechNote, “Understanding alternate track groups in movie files”. For apps that assemble their own file with AVAssetWriter instead of MovieFileOutput, I’ll go through the elements needed to create a Spatial Audio recording.

There must be two audio tracks and a metadata track. When the multichannelAudioMode property of the CaptureDeviceInput is set to FOA, the AVCaptureSession can support up to two instances of AudioDataOutput (or ADO).

A single ADO can produce either four channels of FOA or two channels in stereo. Spatial Audio, with two tracks, requires two ADOs: one must be configured in FOA, and the other must output stereo. There’s a new channel layout tag property on the ADO object, called spatialAudioChannelLayoutTag. This layout tag can take two possible values: Stereo, or first order ambisonics, which is 4 channels of the ambisonic layout HOA-ACN-SN3D. Your app needs 2 AssetWriterInputs to make the audio tracks: one for stereo, and one for FOA. The final piece is the metadata, and there’s new API to create that sample. Use the helper object AVCaptureSpatialAudioMetadataSampleGenerator. The sample generator object receives the same buffers that are coming from the FOA AudioDataOutput. When the recording stops, after sending the final buffer, the sample generator creates a timed metadata sample that is passed into another AssetWriterInput and then compiled into the final composition as a metadata track.
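
Here is a minimal sketch of the two AudioDataOutputs and the metadata sample generator described above. The layout tag constants and the property spelling follow this transcript; treat the exact names as assumptions and verify them against the “Capturing Spatial Audio in your iOS app” sample code.

import AVFoundation
import CoreAudioTypes

// One ADO configured for FOA (4-channel HOA-ACN-SN3D), one for the stereo fallback track
let foaOutput = AVCaptureAudioDataOutput()
foaOutput.spatialAudioChannelLayoutTag = kAudioChannelLayoutTag_HOA_ACN_SN3D | 4

let stereoOutput = AVCaptureAudioDataOutput()
stereoOutput.spatialAudioChannelLayoutTag = kAudioChannelLayoutTag_Stereo

// The metadata sample generator receives every buffer from the FOA output and,
// when recording stops, produces the timed metadata sample for the metadata track
let metadataGenerator = AVCaptureSpatialAudioMetadataSampleGenerator()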

There’s one more update to AVCaptureSession that affects the MovieFileOutput and the AudioDataOutput, and it’s useful for apps that could benefit from using both objects. AudioDataOutput provides access to audio sample buffers as they’re being received, so your app can apply effects or draw waveforms on screen. In iOS 26, the CaptureSession supports the operation of both the MovieFileOutput and the AudioDataOutput simultaneously. This means your app can record to a file, and process or visualize the audio samples in real-time. This update gives you more freedom to add those “surprise and delight” elements to your app. For an example of spatial audio capture with AVAssetWriter, check out the new “Capturing Spatial Audio in your iOS app” sample app linked to this video. In iOS 26, there’s also the option to record Cinematic Videos, with Spatial Audio included. To learn more, check out “Capture Cinematic video in your app” from WWDC25. In the next section, I’ll discuss one more element of Spatial Audio: playback and editing, using Audio Mix.

New in iOS and macOS 26, the Cinematic framework includes options to control the Audio Mix effect. This is the same as the Photos edit feature for videos recorded with Spatial Audio.

Audio Mix enables control of the balance between foreground sounds, like speech, and background ambient noise. The new API includes the same mix modes that the Photos app uses: Cinematic, Studio, and In-Frame. And there are 6 additional modes available for your app. These other modes can provide the extracted speech by itself, as a mono foreground stem, or only the ambience background stem, in FOA format.

This is a powerful addition to apps that play Spatial Audio content, Like this next demo.

This is a demo to show how to control the Audio Mix effect on Spatial Audio recordings. I'm here at the beautiful Apple Park campus - a wonderful setting for my video. But the unprocessed microphones on my phone are picking up all of the sounds around me. And that's not what I have in mind for my audio recording. Steve has added a UI element to his app for switching between the various audio mix styles: standard, cinematic, studio, or one of the background stem modes. Selecting Cinematic applies the cinematic audio mix style.

There, that sounds a lot better. There's also now a slider for controlling the balance between the speech and ambient noise. I'll find the position where my voice comes through loud and clear.

There, I think that position works pretty well.

If I choose a background mode, my voice would be removed. The audio track will only contain ambient sounds. This can be used for creating a pure ambient track for use in post production later on. I'll select that mode now.

Now, back to voice mode.

Now, Steve will show you how to add this to your apps.

Here’s how you can implement this. First, import the Cinematic framework.

The two primary Audio Mix parameters are the effectIntensity and the renderingStyle; the demo app uses UI elements to change them in real-time. The intensity operates within a range of 0 to 1, and CNSpatialAudioRenderingStyle is an enum that contains the style options. Next, initialize an instance of CNAssetSpatialAudioInfo; this class contains many properties and methods for working with Audio Mix. For example, in the next line, calling audioInfo.audioMix() creates an AVAudioMix using the current mix parameters.

And then set this new mix on the audio mix property of the AVPlayerItem. And that is all you need to start using Audio Mix in your AVPlayer app.

Outside of AVPlayer, you can run the Audio Mix processing with a new AudioUnit called AUAudioMix.

This is the AU that performs the separation between speech and ambience.

Using this AU directly is useful for apps that don’t use AVPlayer, which configures many settings automatically. If your app needs a more specific, customized workflow, AUAudioMix provides more flexibility and tuning options. Here are the different components inside the AU. The input is 4 channels of FOA spatial audio. It flows into the processing block that separates speech and ambience. And the output of that is sent into AUSpatialMixer, which provides other playback options. The first 2 AU parameters are the RemixAmount and the Style, the 2 fundamental elements of audio mix.

There’s also the AUAudioMix property EnableSpatialization, which turns the SpatialMixer on or off. This changes the output format of the entire AU, and I’ll talk more about that shortly.

AudioUnit property SpatialMixerOutputType provides the option to render the output to either headphones, your device’s built-in speakers, or external speakers.

The AU also has a property for the input and output stream formats. Since the AU receives FOA audio, set the input stream with 4 channels.

There is one more property called SpatialAudioMixMetadata. This is a CFData object that contains automatically generated tuning parameters for the dialog and ambience components. Here’s how this works.

Immediately after spatial audio recording is stopped, the capture process analyzes the sounds in the foreground and background. It calculates audio parameters, such as gain and EQ, that get applied during playback. Those values are saved in a metadata track. When configuring AUAudioMix, your app needs to read this data from the input file, and apply those tuning parameters on the AU. Here's an example of how to extract this metadata from a file.

Again, it starts with an instance of CNAssetSpatialAudioInfo. Retrieve the mix metadata by calling audioInfo.spatialAudioMixMetadata. This needs to be of type CFData to set the property on the AU with AudioUnitSetProperty.

Earlier, I mentioned the EnableSpatialization property. It’s turned off by default, and in this mode, the AU outputs the 5 channel result of the sound separation. That is, 4 channels of ambience, in FOA, plus one channel of dialog.

With the spatialization property turned on, the AU supports other common channel layouts, such as 5.1 surround, or 7.1.4.

Lastly, linked to this video, is a command-line tool sample project, called “Editing Spatial Audio with an audio mix”.

SpatialAudioCLI has examples for how to apply an audio mix in three different ways. Preview mode uses 'AVPlayer' to play the input, and apply audio mix parameters. The Bake option uses AVAssetWriter to save a new file with the audio mix params, including a stereo compatibility track. And Process mode sends the input through 'AUAudioMix' and renders the output to a channel layout that you specify.

Now that you know all the new audio features, here’s how to take your app to the next level.

Add the AVInputPickerInteraction to let people select the audio input natively within your app.

Enable the bluetooth high quality recording option for AirPods, so content creators can quickly and easily capture remarkable sound.

Give your app more flexibility by using MovieFileOutput and AudioDataOutput to record, and, apply audio effects.

For utmost control, integrate Spatial Audio capture with AVAssetWriter, and use the new Audio Mix API during playback.

To get started with Spatial Audio, download the related sample code projects.

I look forward to being immersed by everything that people create using your apps. Have a great day!

Code

2:10 - Input route selection

import AVKit
import UIKit

class AppViewController: UIViewController {

    // AVInputPickerInteraction is an NSObject subclass that presents an input picker
    let inputPickerInteraction = AVInputPickerInteraction()

    // UI element that displays the picker
    @IBOutlet weak var selectMicButton: UIButton!

    override func viewDidLoad() {
        super.viewDidLoad()

        // Configure the AVAudioSession before presenting the picker

        inputPickerInteraction.delegate = self

        // Connect the picker interaction to the UI element
        selectMicButton.addInteraction(inputPickerInteraction)
    }

    // Button press callback: present the input picker UI
    @IBAction func handleSelectMicButton(_ sender: UIButton) {
        inputPickerInteraction.present()
    }
}

3:57 - AirPods high quality recording

// AVAudioSession clients opt-in - session category option
AVAudioSessionCategoryOptions.bluetoothHighQualityRecording

// AVCaptureSession clients opt-in - captureSession property
session.configuresApplicationAudioSessionForBluetoothHighQualityRecording = true

13:26 - Audio Mix with AVPlayer

import Cinematic

// Audio Mix parameters (consider using UI elements to change these values)
var intensity: Float32 = 0.5 // values between 0.0 and 1.0
var style = CNSpatialAudioRenderingStyle.cinematic

// Initializes an instance of CNAssetSpatialAudioInfo for an AVAsset asynchronously
let audioInfo = try await CNAssetSpatialAudioInfo(asset: myAVAsset)
    
// Returns an AVAudioMix with effect intensity and rendering style.
let newAudioMix: AVAudioMix = audioInfo.audioMix(effectIntensity: intensity,
                                                 renderingStyle: style)

// Set the new AVAudioMix on your AVPlayerItem
myAVPlayerItem.audioMix = newAudioMix

16:45 - Get remix metadata from input file

// Get Spatial Audio remix metadata from input AVAsset

let audioInfo = try await CNAssetSpatialAudioInfo(asset: myAVAsset)

// extract the remix metadata. Set on AUAudioMix with AudioUnitSetProperty()
let remixMetadata = audioInfo.spatialAudioMixMetadata as CFData

Enhance your app’s multilingual experience

Create a seamless experience for anyone who uses multiple languages. Learn how Language Discovery allows you to optimize your app using a person's preferred languages. Explore advances in support for right-to-left languages, including Natural Selection for selecting multiple ranges in bidirectional text. We'll also cover best practices for supporting multilingual scenarios in your app.

Chapters

Resources

Related Videos

WWDC25

WWDC24

WWDC22

Transcript

مرحبا! (Marhaba) My name is Omar and I’m very excited to talk to you today about how to enhance your app’s multilingual experience. We live in a deeply multilingual world. You're no longer just building apps. You’re building experiences that need to work anywhere for anyone. Whether you’re designing a social app for people in Singapore and Southeast Asia, or a productivity tool for remote teams in London and the rest of Europe, or building apps for Beirut and the Arab world, language is never just text on a screen. It's culture. It's identity. And for millions of people, it’s also the difference between feeling included or feeling left out.

In iOS 26, we’re introducing many new features that make the multilingual experience even better.

And as developers, you can tap into these improvements to build more accessible, globally friendly apps. Let’s walk through some of these new features.

In iOS 26, Arabic speakers can use the new Arabizi transliteration keyboard. This means that they can now type Arabic words in Latin script, and then the keyboard automatically enters Arabic script for them. If you’re used to typing on the English or the French keyboard, this makes it very easy for you to type in Arabic. We’ve also added the ability for the transliteration keyboard to offer bilingual suggestions. This means that when you type an English word on this Hindi keyboard, iOS 26 automatically suggests the translation for it. We’re also introducing a new multi-script bilingual Arabic and English keyboard. If you speak both Arabic and English, it will now auto detect the language as you type, which simplifies the experience of typing in both languages.

Finally, we have a new Thai keyboard with a 24-key layout that makes the experience much easier for Thai speakers.

Internationalization is a fundamental first step in building apps for a global audience. With Apple’s powerful tools and technologies, including Xcode, Foundation APIs, and Unicode support, it's easy to prepare your apps to support multiple languages, even before you know which languages you’d like to add.

Before I get into our new tools and APIs for internationalization, I’d like to share some fundamental best practices on how to make your app multilingual ready.

TextKit2 makes it easy to support multiple languages in your app. It handles complex scripts like Korean or Hindi and gives you more control over layout and styling so that your text looks just right no matter the language. TextKit2 seamlessly manages bidirectional text. Later, my colleague Danny will introduce you to the latest advances for bidirectional text.

Formatters in Swift help you display dates, numbers, and text in a way that automatically adapts to each person’s language and region.

With just a few lines of code, you can localize everything from currency to date formats without writing custom logic for each locale.
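
For instance, here is a small sketch using Foundation’s formatted() APIs; the values are placeholders, and the output adapts automatically to the current locale’s conventions.

import Foundation

// Currency and date output follow the current locale
let price = 1299.99
let localizedPrice = price.formatted(.currency(code: "USD"))
let localizedDate = Date.now.formatted(date: .long, time: .shortened)

// Formatting can also target an explicit locale
let lebaneseArabicDate = Date.now.formatted(Date.FormatStyle(date: .long,
                                                             time: .shortened,
                                                             locale: Locale(identifier: "ar_LB")))

print(localizedPrice, localizedDate, lebaneseArabicDate)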

There are many APIs in Swift that simplify the experience for managing different input modes like software and even hardware keyboards.

For example, you can use inputAccessoryView to place a view directly above the keyboard, or set textInputContextIdentifier to help the keyboard automatically remember its last used language and layout. For more details on TextKit, formatters, and text input, watch the video from last year linked below. You can also check out the associated sample code.

Now, let’s get into some new APIs that enhance your app’s multilingual experience. We’ll introduce a new feature, Language discovery. Learn how to support alternate calendars in your app and explore advances in handling bidirectional text. Let’s start with Language discovery.

Billions of people around the world are multilingual. In many regions, the majority of people use more than one language in their everyday lives. We believe that great apps need to consider the full range of the human experience. It’s crucial for people to be able to interact with them in their own languages. Before iOS 26, the only way for people to choose their languages was by manually adding them in the Settings app. We understand how challenging it is to do that. For example, my own iPhone is set to English, but my native language is Arabic. I listen to music and podcasts in Arabic. I read the news and send messages in both English and Arabic. So even though I’d like to have my iPhone’s user interface in English, I would love to get news, music, and podcast recommendations in Arabic.

In iOS 26, Siri is personal, proactive, and can offer intelligent suggestions to help set up my device in my own languages. Using on-device intelligence, Siri can recognize that even though I set up my iPhone in English, I text, listen, and browse in Arabic as well. So when I tap on Siri’s suggestion, I can choose to switch my iPhone language to Arabic, add an Arabic-English bilingual keyboard, and ask to get content recommendations such as news, music, and podcasts in my language. This experience applies to millions of users around the world who are multilingual and would love to get recommended content in their own languages.

Thanks to the Foundation framework, you can use Locale.preferredLanguages to get the list of the user’s languages and personalize your app’s multilingual experience. If you’re familiar with Locale.preferredLanguages, you will know that it returns an array of language identifiers as Strings. These String values conform to the BCP-47 language tag format.

However, it can be daunting and complex to handle and manipulate these String-based identifiers.

This year, we’re introducing Locale.preferredLocales, which returns an array of Locales instead that contains a superset of information compared to Locale.preferredLanguages. A locale represents both the language and the region, like English UK or English Canada. In my iPhone's use case, for example, the locale would be Arabic Lebanon. Using Locales is crucial for your apps to adapt to the right spellings, date formats, and currencies to better match your users’ preferences and better personalize their experience.

With Locale, you can still access the String-based identifier if desired, and even choose the precise format of the identifier, such as BCP-47, ICU, or CLDR.

You also gain access to a rich array of properties about the Locale. For example, you can get the numberingSystem or use the localizedString APIs to get access to localized names for languages and regions.
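
As a quick sketch, here is how those Locale APIs look in code, using the first preferred locale as an example; the language and region codes are placeholders.

import Foundation

if let locale = Locale.preferredLocales.first {
    // Identifier in a precise format
    let bcp47Identifier = locale.identifier(.bcp47)   // e.g. "ar-LB"
    let icuIdentifier = locale.identifier(.icu)

    // Locale details, such as the numbering system
    let numberingSystem = locale.numberingSystem

    // Localized display names for languages and regions
    let languageName = locale.localizedString(forLanguageCode: "ar")
    let regionName = locale.localizedString(forRegionCode: "LB")

    print(bcp47Identifier, icuIdentifier, numberingSystem, languageName ?? "", regionName ?? "")
}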

Locale.preferredLanguages may be deprecated in the future, so we would encourage you to switch over to Locale.preferredLocales instead. In iOS 26, we’re using preferredLocales extensively.

For example, the Translate app can now show people’s languages at the top instead of showing a long list of languages to select from. In Calendar, when an alternate calendar is set, the app can now show UI elements such as days and months in their languages. And lastly, Apple Music uses preferredLocales to recommend and offer translations for lyrics. This makes the multilingual experience even more personalized. This experience in the Translate app applies to many apps where people have to pick their languages from a long list of languages.

You can personalize the experience in your app by using the preferredLocales API and Foundation.

Let’s get into the Translate app and build this experience together. Let’s consider availableLocales as the array of your app’s available locale objects. We will use matchedLocales as the array that matches people’s preferredLocales with your app’s availableLocales. You can then loop through the Translate app’s availableLocales. For every available locale, you can then check if that locale is in their preferredLocales. Once a match is found, you can add it to matchedLocales and break. matchedLocales can then be used to prioritize the languages and place them at the top of the list for easy access. Depending on your preference, you can choose here to use isEquivalent or hasCommonParent.

Language discovery helps your app feel more personal, natural, and takes away the step of asking people for their languages. The use cases are endless, and we’re excited to see how you will use preferredLocales in your app. Now, let’s talk about the new alternate calendar APIs available in Foundation. With iOS 26, you can select from many new alternative calendars. For example, we now have alternate calendar options for Gujarati, Marathi, and Korean. These new calendars are available on all platforms. Now, in addition to the existing 16 calendar identifiers, we’ve added 11 new calendars that you can access using Calendar.Identifier in the Foundation framework.

We have exciting new updates on internationalization today. We've discussed language discovery and our new calendar identifiers. Now, I’d like to hand off to my colleague, Danny, to walk you through exciting new updates on bidirectional text.

Thanks, Omar. Xin chào. My name is Danny, and I am excited to discuss advancements in bidirectional text on iOS and iPadOS. These innovations have the potential to significantly enhance the multilingual experience in your app.

For a comprehensive understanding of bidirectional text, refer to the “Get it right (to left)” session. To begin, let’s review the definition of bidirectional text. When English is written, text flows from the left and concludes on the right. This writing direction is referred to as left-to-right or LTR.

When a language such as Hebrew is written, text flows from the right and concludes on the left. This writing direction is known as right-to-left or RTL.

When LTR and RTL texts are combined, the written text becomes bidirectional, which has significant implications for text selection.

To understand how text selection works in bidirectional text, we must first consider the order of the text that is displayed versus the order of the text that is stored. In this LTR example, the characters are stored in the order that they are written.

The characters in RTL text are also stored in the order that they are written, just like LTR text. However, the difference lies in how the text is displayed, which is right to left. This doesn’t pose any immediate issues because text can still be selected by simply flipping the selection method to select in descending order in storage as the selection is dragged to the right.

This will still result in a single contiguous range of text in storage that corresponds to what is visually selected.

The issue arises when we combine LTR and RTL text. The text is still stored in the same order that it is written, but it must now accommodate multiple directions when it is displayed.

If we restrict the selection to a single range of text in storage, text selection will no longer behave naturally.

For instance, if we initiate text selection by dragging the selection from the left to the right, the selection will no longer align with the cursor as it crosses the LTR and RTL boundary. Instead, it will begin selecting the text on the right side first, leaving an awkward gap in the middle of the selection. This occurs because despite our intention to select text that is displayed, the selection behaves as if it’s happening in storage. Since the text flow on the screen doesn’t match the text flow in storage, it’s impossible for the selection to be contiguous on screen and in storage at the same time.

In iOS 26, instead of forcing the selection to follow the storage order, we now allow the selection to naturally follow the cursor instead. Consequently, as the selection point is dragged, the selection closely follows behind it. We call this Natural Selection. Natural Selection ensures a seamless and consistent text selection experience, regardless of whether you’re selecting single directional text or bidirectional text.

With Natural Selection, the selection gap is no longer visible but is now concealed within the text storage. So instead of a single selectedRange, multiple selectedRanges are now required.

On macOS, NSTextView already supports Natural Selection by representing the selection as an array of values that are NSRanges. In iOS 18, UITextView has a single selectedRange property to represent the selection which can only express a single contiguous range. That means that in iOS 18, bidirectional text cannot be naturally selected without any visible gaps because the selectedRange will mistakenly encompass the range that wasn’t actually selected. In iOS 26, we have a new property called selectedRanges to represent an array of non-contiguous NSRanges, while the single selectedRange property will be deprecated in a future release.

The new SwiftUI Rich Text Editor also supports Natural Selection, where the ranges are represented as a range set of type attributed string index.

For more details, please see the “Cook up a rich text experience in SwiftUI with AttributedString” talk. Now, as we had previously selected the bidirectional text, the selectedRanges will now accurately represent the text that was chosen. If your app has been performing an action that relies on the selectedRange, like deleting text from the TextView storage, it could inadvertently delete the incorrect range of text. Instead, you should use selectedRanges to ensure that only the text that was actually selected is deleted. In addition, we’ve updated the UITextViewDelegate protocol to accept arrays of NSRanges instead of a single range.

These methods are triggered whenever the text input system needs to modify the text within the specified ranges in response to text input. For instance, if you select text and then paste in a text view, the shouldChangeTextInRanges method will be called to confirm with the delegate that the text can be changed within the specified ranges before the deleted text is replaced with the pasted text.

Since the ranges are non-contiguous, there are multiple potential insertion locations. The text input system determines the appropriate location within the deleted ranges to insert the new text based on various factors, including the currently used keyboard. If the text should not be inserted into any of these locations, the delegate should return false.

Additionally, there are new versions of the editMenuForTextInRange method that now accept an array of ranges instead of a single range to more accurately represent the ranges of text for which the returned edit menu should correspond. By switching to using selectedRanges and implementing the new delegate methods, you can ensure that your apps work seamlessly with Natural Selection, enhancing the bidirectional text experience. To maximize the benefits of Natural Selection, your apps must utilize TextKit2. UITextView and UITextField already employ TextKit2 as their text engine, and now support Natural Selection by default in iOS 26.

However, if your app accesses textView.layoutManager, it will revert the text engine to TextKit1, disabling Natural Selection and other features. If you need to use layoutManager, please use textView.textLayoutManager, which is the layout manager for TextKit2.
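
As a quick sketch, prefer the TextKit 2 object whenever you need layout access; the ensureLayout call and the function name here are just illustrative examples of working with NSTextLayoutManager.

import UIKit

func performLayoutWork(on textView: UITextView) {
    // Accessing textView.layoutManager would silently fall back to TextKit 1
    // and disable Natural Selection. Use the TextKit 2 layout manager instead.
    if let textLayoutManager = textView.textLayoutManager {
        textLayoutManager.ensureLayout(for: textLayoutManager.documentRange)
    }
}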

Next, we will discuss the concept of writing direction.

To understand writing direction, we first need to understand text direction. Text direction refers to the direction of character flow within a continuous single span of text, while writing direction refers to the direction of flow of those texts within a paragraph.

The writing direction is determined by the direction of the first span of text.

For example, when LTR text is written first, the writing direction will be left-to-right. If RTL text like Urdu is subsequently typed, the text will be displayed on the right side of the LTR text because the writing direction is still left-to-right. Even as more Urdu text is typed, forming an Urdu sentence, the writing direction remains the same. This year, when left-to-right text is written first, the writing direction also remains left-to-right after right-to-left text is typed because it’s still considered an English sentence.

However, writing direction will now be dynamically determined based on the content of the text.

This means that this year, as more Urdu text is typed to form an Urdu sentence, the writing direction will change to right-to-left. This automatic adjustment will occur within your app if it utilizes Apple provided text views and text fields on all platforms.

But maybe your app doesn’t use the Apple provided text views, but instead uses its own custom text engine.

If so, refer to the Language Introspector sample code to learn how to use new APIs to determine the writing direction based on the content of the text within your app.

We hope this new way of determining writing direction for bidirectional text will help to further elevate your app’s multilingual experience.

In this session, we’ve talked about some exciting new features that help to enhance your apps for a multilingual audience. Language discovery automatically detects preferred languages, making your app feel more personal and natural. Additional alternate calendars offer even more customization options to enhance the multilingual experience. And with the introduction of Natural Selection support on iOS and iPadOS, your apps now function seamlessly and consistently across a wider range of devices, catering to your multilingual audience.

I’m really looking forward to you creating opportunities to chat with your friends and family from around the world using bidirectional text. Thanks for watching.

Code

5:35 - Language discovery

// Language discovery

let preferredLanguages = Locale.preferredLanguages

let preferredLocales = Locale.preferredLocales

7:49 - Match preferred locales with your app’s available locales

let preferredLocales = Locale.preferredLocales

// array of available Locale objects to translate from
let availableLocales = getAvailableLocalesForTranslatingFrom()

var matchedLocales: [Locale] = []

for locale in availableLocales {
    for preferredLocale in preferredLocales {
        if locale.language.isEquivalent(to:
        preferredLocale.language) {
            matchedLocales.append(locale)
            break
        }
    }
}

14:57 - Delete text in ranges

// Delete back to front so earlier ranges stay valid after each deletion.
let ranges = textView.selectedRanges.reversed()
for range in ranges {
    textView.textStorage.deleteCharacters(in: range)
}

Summary

  • 0:00 - Introduction

  • iOS and iPadOS 26 introduce new features that make the multilingual experience even better, and your apps can tap into these improvements too. Internationalization is a first step in building apps for a global audience, and Apple’s tools and technologies make it easy to prepare your app to support multiple languages.

  • 3:57 - Language discovery

  • With the new language discovery feature, Siri uses on-device intelligence to recognize people's language preferences and helps them enable additional languages. The new preferredLocales API replaces preferredLanguages and provides more details, including language and region, spellings, date formats, currencies, and more. Apps like Translate, Calendar, and Apple Music now utilize preferredLocales to optimize their UI and recommend content. Support preferredLocales to make your apps feel more personal and natural for people worldwide.

  • 8:43 - Alternate calendars

  • All platforms now support 11 new alternate calendars, including Gujarati, Marathi, and Korean, bringing the total to 27 alternate calendars.

  • 9:29 - Bidirectional text

  • iOS and iPadOS introduce improvements to handling bidirectional text, where text combines languages written from left-to-right (LTR), like English, with languages written from right-to-left (RTL), like Arabic and Hebrew. Natural selection is now supported in iOS and iPadOS so people can select text easily and naturally, regardless of the language direction. To support it, you need to use the new selectedRanges property instead of the selectedRange property, as it can handle multiple, noncontiguous ranges of text. Additionally, the writing direction for bidirectional text is now determined by its content. If you type in LTR followed by RTL, the writing direction can automatically switch to RTL.

  • 19:40 - Next steps

  • The video covers new features that help your apps support multilingual users, including language discovery, alternate calendars, and enhanced support for bidirectional text in iOS and iPadOS.

Enhancing your camera experience with capture controls

Learn how to customize capture controls in your camera experiences. We'll show you how to take photos with all physical capture controls, including new AirPods support, and how to adjust settings with Camera Control.

Chapters

Resources

Related Videos

WWDC24

WWDC23

Transcript

Hi, my name is Vitaliy. I'm an engineer on the AVKit team, and welcome to 'Enhancing your camera experience with capture controls'. In this session, we will explore powerful new ways to improve user interactions in your app. With Capture Controls, you can programmatically map physical button gestures to camera actions, giving users the familiar feel of the native iOS Camera app. We'll also be showcasing an amazing new feature introduced in iOS 26, and we think you are going to love it.

To illustrate the power of Capture Controls, let’s use the camera app on iOS. Instead of taking a photo with the UI button on the screen, let’s use physical buttons on the phone.

With just a simple click of the volume up button, we can initiate capture.

But this API isn't just about simple clicks. It also supports long presses. For example, in camera mode, if we press and hold the volume down button, it'll kick off video recording.

Just like that, with Capture Controls, we have given our users more flexibility in using the camera to capture unforgettable moments with their iPhones.

In this session, we will first cover what physical capture is and what physical buttons are supported. Then, we’ll show you how to effectively use the API in your app to handle user interactions and build a robust and responsive camera interface like we saw in the demo. We will also go over an exciting new feature in iOS 26: Remote Camera Control with AirPods.

Finally, my colleague, Nick, will present an overview of Camera Control on iPhone 16.

Before exploring the API's core functionalities, let's review the key frameworks involved in creating a great camera experience in iOS. At the heart of our camera app's architecture is AVFoundation, providing the low-level capture functionality through APIs like AVCapturePhotoOutput for photo and AVCaptureMovieFileOutput for video.
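To ground that in code, here is a condensed sketch of that capture stack: a session with a camera input and a photo output. Error handling and threading details are omitted for brevity.

import AVFoundation

let session = AVCaptureSession()
session.beginConfiguration()

// Add the back wide-angle camera as the video input.
if let camera = AVCaptureDevice.default(.builtInWideAngleCamera, for: .video, position: .back),
   let input = try? AVCaptureDeviceInput(device: camera),
   session.canAddInput(input) {
    session.addInput(input)
}

// Add a photo output for still capture.
let photoOutput = AVCapturePhotoOutput()
if session.canAddOutput(photoOutput) {
    session.addOutput(photoOutput)
}

session.commitConfiguration()
session.startRunning() // call this from a background queue in a real app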

To combine the UI layer with AVFoundation, we have AVKit that sits on top of these frameworks.

One capture-specific API that AVKit includes is AVCaptureEventInteraction.

Now, let’s dive into the key features and capabilities of AVCaptureEventInteraction. This API lets you override the default behavior of physical buttons like the volume up and down or the Camera Control introduced with iPhone 16.

Another button that AVCaptureEventInteraction supports is the Action button. Please check out our session from last year on how to configure the Action button for capture.

Every physical button press triggers a capture event notification. This notification includes an event phase, giving you precise control over the entire press lifecycle. There are three possible phases: began: The moment the button is pressed. Perfect for preparing your app’s camera object.

cancelled: For those moments when the app goes into background or capture is not available.

ended: The moment when the button is released. This is the phase when the camera object should initiate capture.

AVCaptureEventInteraction also gives you the ability to distinguish button presses between primary and secondary actions. A button press triggers a capture notification that is sent only to its designated handler.

Primary actions are triggered by the volume down button, Action button, and the Camera Control, while the secondary action is triggered by the volume up button. The same button cannot trigger both handlers as they are designed to represent distinct actions. The secondary handler is optional; if it's not specified, the volume up click will trigger the primary action. This design introduces modularity into your app and gives you more flexibility in designing a great camera experience for your users. This API is intended for capture use cases. The system sends capture events only to apps that actively use the camera. Adopting this API overrides default physical button behavior, so apps must always respond appropriately to any events received. Failing to handle events results in a nonfunctional button that provides a poor user experience.

Backgrounded capture apps, or those that do not have an active AVCaptureSession running, won’t receive any events. Any button press will trigger its default system action, such as adjusting volume or launching the camera.

Although AVCaptureEventInteraction is an API designed specifically for UIKit, SwiftUI developers can also access its functionality through the onCameraCaptureEvent view modifier. Adopting either of them will result in the same behavior.
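For UIKit apps, a minimal sketch of the interaction-based approach might look like the following. CameraModel and its startOrStopRecording method are placeholders standing in for your own capture code.

import AVKit
import UIKit

final class CameraViewController: UIViewController {
    let camera = CameraModel()

    override func viewDidLoad() {
        super.viewDidLoad()
        // Primary events: volume down, Action button, Camera Control.
        // Secondary events: volume up.
        let interaction = AVCaptureEventInteraction(primary: { [weak self] event in
            if event.phase == .ended {
                self?.camera.capturePhoto()
            }
        }, secondary: { [weak self] event in
            if event.phase == .ended {
                self?.camera.startOrStopRecording()
            }
        })
        view.addInteraction(interaction)
    }
}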

Now that we understand AVCaptureEventInteraction, let's build a simple camera app that uses the SwiftUI onCameraCaptureEvent view modifier to allow photo capture with physical buttons. First, let’s begin by crafting the user interface for our camera app. We will include the CameraView that is responsible for displaying the camera output on the screen.

Next, let's add a button to the view, giving the user a way to capture photos. We'll connect this button to our CameraModel, so tapping it triggers the photo capture.

Now, we have a functional view that takes pictures when the user presses on the on-screen button. However, pressing on any physical buttons will trigger system actions. So, let’s make these hardware button presses camera specific with the API we discussed in this session. First, we will import the AVKit framework. Then, we would attach the onCameraCaptureEvent view modifier to the CameraView and if the event phase is ended, we take a picture.

It’s as simple as that! With only 6 additional lines of code, we've enabled the same intuitive physical button interactions for photo capture as you find in the built-in Camera app. That's the power of leveraging the Capture Controls API. This year, we're excited to announce that AVCaptureEventInteraction will also support primary actions triggered by clicks on either AirPod stem, specifically from AirPods equipped with the H2 chip. This will allow the user to remotely capture unforgettable moments without needing to interact with their phones.

For those who have already adopted the API, this feature will come for free with iOS 26.

Now, let's check out this new feature in action. First, I'll put in my right AirPod, then the left one.

Looking sharp! Then, to configure the AirPods for capture, let's go into settings and scroll down to the “Remote Camera Capture” section.

There, we can choose which type of click will trigger capture. I’ll choose the “Press once” option.

Now, let’s go back into the camera app.

I’ll step back a few steps, and take a photo by clicking on one of the stems.

Great, now I can control my camera without touching the device.

Because AirPods are used for remote camera control, audio feedback is crucial, as users may not be viewing the screen during capture but need confirmation that their command has been recognized.

Therefore, we're introducing a new API for controlling sound playback, specifically for AirPod stem clicks. If a capture event is triggered by something other than an AirPod click, the AVCaptureEventInteraction object will not allow control of sound feedback.

To provide this new feature to AVCaptureEventInteraction users without requiring additional work, we've added a default tone for AirPods clicks.

However, this sound may not be optimal for your use case. So, you can customize the sound playback by providing your own sound within the app’s bundle.

Let’s go back to our camera app, specifically the .onCameraCaptureEvent view modifier. In iOS 26, if we leave this code unchanged, when the user clicks their AirPod stem, they will hear the default sound. Since we are building an app specifically for taking photos, the default sound may not be appropriate for our app.

To tune the capture sound for our specific scenario, we first disable the default sound using defaultSoundDisabled parameter.

If audio feedback is needed, we play the cameraShutter sound using the playSound method on the event object. Note that the shouldPlaySound property will only be true if the capture action was triggered by an AirPod stem click.

To sum up, the AVCaptureEventInteraction API significantly simplifies the process of building high-quality camera experiences for your apps. We reviewed the API's key features and best practices for implementation, including this year's update: a way to control camera capture with AirPod stem clicks. Now, over to Nick to talk about AVCaptureControls. Hi, I’m Nick Lupinetti. I’m a software engineer on the Camera Experience team, and I’m excited to introduce you to AVCaptureControl, a powerful way to turn Camera Control on iPhone 16 into a physical piece of hardware for your camera interface. Camera Control is a versatile capture tool, which can click to launch a camera app, act as a shutter button, and make quick adjustments, all while keeping your finger in one place.

Let’s take those three functions in order, starting with launch. In order to launch your app, Camera Control needs access to it even when the device is locked. That means creating a Lock Screen capture extension. For more details, check out last year’s session on building a Lock Screen camera experience.

Next, it’s easy to use Camera Control as a shutter button for your app. Just set up one of the capture event APIs that Vitaliy showed you earlier, and you’re done! Camera Control will run the same primary event callback you already provided for the volume buttons. Now, my favorite part: adjusting camera settings. Camera Control supports a light press gesture, like on a traditional camera shutter, which activates a clean preview to focus on composing, and presents a setting to adjust by sliding the control. Other settings are available for selection by light pressing twice, then sliding on Camera Control and light pressing again to confirm.

At the highest level, there are two kinds of controls: sliders for numeric values, and pickers for items in a list.

Sliders, in turn, come in two flavors: continuous, and discrete.

Continuous sliders can select any numeric value within a specified range. For example, iOS comes out of the box with a continuous zoom slider. It supports the entire gamut of zoom factors between the recommended minimum and maximum for the device it operates on. Discrete sliders also select numeric values: either from an explicit set you provide, or by fixed steps from one value to another. iOS offers a built-in discrete slider to drive exposure bias, with one-third increments between plus or minus two, which are manageable units from traditional photography.

Pickers, like discrete sliders, allow finite selections, but are instead indexed. Each index maps to a named state for the control, like “On” and “Off” for Flash, or the names of Photographic Styles in Camera app. Okay, now that we understand the various types of controls, let’s take a look at the classes that implement them. Controls are defined in the AVFoundation framework. AVCaptureControl is the abstract base class the others inherit from.

There are two system-defined subclasses, which allow any app to adopt the same behavior as the built-in camera app for zoom or exposure bias. Finally, there are two generic subclasses, for sliders, both continuous and discrete, and pickers, whose behaviors you can define yourself. To adopt controls in your app, you add them to an AVCaptureSession. Usually, the first step in configuring a session is creating a capture device, and adding it to the session wrapped as an input. Next, any system-defined controls are created, with a reference to that same device for them to drive its properties on your behalf. Any controls with app-defined behaviors are also created at this time, and finally, each control is added to the session.
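As a hedged sketch of the generic slider subclass mentioned above (the initializer shapes below follow the iOS 18 AVCaptureSlider API as we understand it; verify them against the current AVFoundation headers), creating app-defined sliders could look like this. The sessionQueue and the setting being applied are placeholders.

// Continuous slider over an arbitrary numeric range.
let intensitySlider = AVCaptureSlider("Intensity", symbolName: "dial.medium", in: 0...1)

// Discrete slider limited to explicit values.
let stepSlider = AVCaptureSlider("Steps", symbolName: "circle.grid.3x3", values: [0.0, 0.25, 0.5, 0.75, 1.0])

// Deliver value changes on the queue that owns your capture state.
intensitySlider.setActionQueue(sessionQueue) { value in
    // Apply the new value to your capture pipeline here.
}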

For your app to respond as a person interacts with Camera Control, there are a couple of flows to consider. System-provided controls apply values from Camera Control directly to the configured device. You don’t need to set videoZoomFactor or exposureTargetBias yourself.

But you may have models or UI that need to be updated. If you already use Key-Value Observing, or KVO, to know when the relevant property changes, you can continue using that mechanism to update your interface.
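If you go the KVO route, a minimal sketch looks like the following, assuming device is the AVCaptureDevice associated with the control and updateUI(zoomFactor:) is your own method. Keep the returned observation alive for as long as you need updates.

// Observe the zoom factor that the system-provided control is driving.
let zoomObservation = device.observe(\.videoZoomFactor, options: [.new]) { _, change in
    guard let zoomFactor = change.newValue else { return }
    DispatchQueue.main.async {
        // Update models or UI on the main thread.
        updateUI(zoomFactor: zoomFactor)
    }
}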

If you’re not using KVO, create the system control with an action handler, which will be called on the main thread as the value changes, so you can update your UI directly.

App-defined controls are always created with an action callback, run on a queue that you specify. If your control needs to drive capture settings, you can do that synchronously by specifying the queue where you manage that state.

Then, you can update your UI on the main queue.

Great, now we’re ready to adopt Camera Control in the app I’m making with Vitaliy. Since he’s added a capture interaction, the control already works as a shutter button to take photos, without any additional code.

But I’d also like to support zooming in and out with Camera Control, so let’s add that next.

Here’s where we’re currently configuring our capture session to use a device. First we’ll check if controls are actually supported, since they’re only available on devices with a Camera Control, like iPhone 16.

System-provided controls are so easy to create, it takes a single line of code.

Before adding your control, make sure the session will accept it. For example, you can’t add a system-provided zoom control for a device that’s already associated with a zoom control. Finally, we add the control to the session.

That’s all it took to make zoom work perfectly in our app using Camera Control! But there’s one problem. I also support a pinch gesture to zoom in. After I zoomed in with Camera Control, the UI wasn’t up to date, so the pinch gesture jumped back to a stale zoom factor.

But that’s an easy fix. We’ll just create the slider with an action closure, which receives the zoom factor as an argument. The new value gets applied to the UI with a delegate callback or an observable model property.

Once the model behind the pinch gesture is in sync, I can zoom with Camera Control, and the pinch always starts from the right zoom factor.

Now I’d like to add a control that isn’t available right out of the box. But before we create our own, let’s give some thought to what makes a great control. Camera Control, true to its name, is meant to be used with the iPhone Camera, so it should affect the appearance or experience of capturing. That makes it confusing if your controls affect unrelated areas of your app. And never introduce a capture session just to adopt Camera Control, since running the camera has power and privacy requirements best reserved for a capture experience. A great example of a custom control is a Picker that iterates through the effects or filters your app supports.

Such a control typically needs to operate closer to capture than to UI. So let’s check out what that looks like. Starting with the code we just wrote, we’ll make the zoom control one of many. Note that there’s a limit to how many controls you can add, reported by the session’s maxControlsCount property. canAddControl will return false once you reach the limit. Now we can define our effects. Your effects might be rendered on a dedicated queue using video sample buffers, but for our app, I’ll use the reaction effects introduced with iOS 17. Here we’ve built an ordered list of effects as well as their names out of the available unordered set. The Picker gets initialized with the effect names to show as it iterates through the options. The Picker also needs a name of its own, as well as an SF Symbol image, to distinguish it from the zoom control. It’s a great idea to disable your control when it’s not currently supported, rather than omitting it. That way, it’s still visible, just not interactive. Otherwise, another control will be selected as a fallback, and people might wonder what happened.

As the user changes the Picker’s selected index, we’ll perform the corresponding reaction. We’re targeting the action to the session queue, since that’s our isolation context for managing the device. And all that’s left is to add the picker to the array of controls. Let’s check out how we did! As I interact with Camera Control, note how zoom is disabled in this new configuration. Let’s change the control. I’ll swipe out on the overlay, and now I can check out my picker in the list. I’ll scroll over to it and press lightly to display its options. Now I can select each effect in turn, and watch as they play in the preview.

Rock on! So, with what Vitaliy and I showed you in this session, you’ve got a ton of great tools to build a world-class camera app.

We’ve seen how to capture media with physical buttons on iOS devices, including how easily this works with AirPods interactions on iOS 26. And we learned how to harness Camera Control as a convenient tool to supercharge your app’s capture interactions. And there are even more great resources at developer.apple.com, including Human Interface Guidelines for Camera Control, guidance on setting up AirPods to test new features, and articles and sample code with more information on how to adopt Camera Control. I can’t wait for you to build a camera that will capture people’s attention. Thanks for watching!

Code

5:35 - Initial PhotoCapture view setup

import SwiftUI

struct PhotoCapture: View {
    var body: some View {
        VStack {
            CameraView()
        }
    }
}

5:37 - Connecting a button to the camera model

import SwiftUI

struct PhotoCapture: View {
    let camera = CameraModel()
    var body: some View {
        VStack {
            CameraView()
            Button(action: camera.capturePhoto) {
                Text("Take a photo")
            }
        }
    }
}

6:10 - Importing AVKit

import AVKit
import SwiftUI

struct PhotoCapture: View {
    let camera = CameraModel()
    var body: some View {
        VStack {
            CameraView()
            Button(action: camera.capturePhoto) {
                Text("Take a photo")
            }
        }
    }
}

6:14 - Setting up onCameraCaptureEvent view modifier

import AVKit
import SwiftUI

struct PhotoCapture: View {
    let camera = CameraModel()
    var body: some View {
        VStack {
            CameraView()
            .onCameraCaptureEvent { event in
                if event.phase == .ended {
                   camera.capturePhoto()
                }
            }
            Button(action: camera.capturePhoto) {
                Text("Take a photo")
            }
        }
    }
}

8:53 - Default sound for onCameraCaptureEvent view modifier

.onCameraCaptureEvent { event in
    if event.phase == .ended {
        camera.capturePhoto()
    }
}

9:13 - Play photo shutter sound on AirPod stem click

.onCameraCaptureEvent(defaultSoundDisabled: true) { event in
    if event.phase == .ended {
        if event.shouldPlaySound {
            event.play(.cameraShutter)
        }
        camera.capturePhoto()
    }
}

14:46 - Add a built-in zoom slider to Camera Control

captureSession.beginConfiguration()

// configure device inputs and outputs

if captureSession.supportsControls {
    let zoomControl = AVCaptureSystemZoomSlider(device: device)

    if captureSession.canAddControl(zoomControl) {
        captureSession.addControl(zoomControl)
    }
}

captureSession.commitConfiguration()

15:40 - Modifying the built-in zoom slider to receive updates when zoom changes

let zoomControl = AVCaptureSystemZoomSlider(device: device) { [weak self] zoomFactor in
    self?.updateUI(zoomFactor: zoomFactor)
}

16:46 - Adding a custom reaction-effects picker alongside zoom slider

let reactions = device.availableReactionTypes.sorted { $0.rawValue < $1.rawValue }
let titles = reactions.map { localizedTitle(reaction: $0) }
let picker = AVCaptureIndexPicker("Reactions", symbolName: "face.smiling.inverted",
    localizedIndexTitles: titles)

picker.isEnabled = device.canPerformReactionEffects
picker.setActionQueue(sessionQueue) { index in
    device.performEffect(for: reactions[index])
}

let controls: [AVCaptureControl] = [zoomControl, picker]

for control in controls {
    if captureSession.canAddControl(control) {
        captureSession.addControl(control)
    }
}

Summary

  • 0:00 - Introduction

  • Learn about using AVKit and AVFoundation to improve user interactions in camera apps. You can now programmatically map physical button gestures, such as volume "up" and "down", to camera actions, enabling people to take photos and start video recording using the phone's buttons. A new feature in iOS 26 is Remote Camera Control with AirPods. Learn the supported physical buttons, how to use the AVCaptureEventInteraction API, and an overview of Camera Control on the iPhone 16.

  • 2:51 - Physical capture

  • AVCaptureEventInteraction supports the Action button; configuration details can be found in last year's session “Extend your app’s controls across the system”.

  • 3:01 - Event handling

  • AVCaptureEventInteraction is an API that enables you to control the lifecycle of physical button presses within camera apps. It provides three event phases: 'began', 'cancelled', and 'ended'. The 'ended' phase is when the camera object should initiate capture. The API distinguishes between primary and secondary actions, triggered by specific buttons (volume down, Action, Camera Control for primary; volume up for secondary). This design allows for modularity and flexibility in app design. SwiftUI developers can access this functionality through the 'onCameraCaptureEvent' view modifier. By importing the AVKit framework and attaching this modifier to the CameraView, you can enable photo capture with physical buttons in just a few lines of code, mimicking the behavior of the built-in Camera app.

  • 6:39 - AirPods remote capture

  • Starting in iOS 26, AirPods equipped with the H2 chip will enable remote camera control through stem clicks. Users can configure this feature in the settings app and choose which stem click action triggers photo capture. A new API has been introduced to provide audio feedback for these stem clicks, with a default tone included, though you can customize this sound for your specific apps. This enhancement allows people to capture moments hands-free and ensures confirmation of their commands through audio cues.

  • 10:00 - Camera control

  • AVCaptureControl is a new feature in iPhone 16 that enables you to create physical hardware controls for your camera interfaces. It allows people to launch camera apps and adjust camera settings quickly, and it can act as a shutter button. The control supports two main types of adjustments: sliders (continuous and discrete) for numeric values, and pickers for items in a list. These controls are defined in the AVFoundation framework. You can add system-defined controls for zoom or exposure bias, or create custom controls with app-defined behaviors. These controls are then added to an AVCaptureSession, and the app responds to interactions through action handlers or Key-Value Observing (KVO), updating the UI accordingly. To enhance the camera app functionality, utilize Camera Control to easily add system-provided controls such as zoom, which requires only a single line of code. However, it is crucial to ensure the session accepts the control and to synchronize the UI with the control's state to avoid issues like stale zoom factors. When creating custom controls, consider that Camera Control is specifically designed for capturing experiences. A good example of a custom control is a Picker that allows people to select effects or filters. This control needs to operate closer to the capture session than the UI. You can define effects, initialize the Picker with effect names, and disable the control when not supported. By targeting the action to the session queue, you can ensure proper isolation and management of the device. With Capture Controls, you can build world-class camera apps that capture people's attention.

Evaluate your app for Accessibility Nutrition Labels

Use Accessibility Nutrition Labels on your App Store product page to highlight the accessibility features supported by your app. You'll learn how to evaluate your app's accessibility features — such as VoiceOver, Larger Text, Captions, and more — and choose accurate and informative Accessibility Nutrition Labels. You'll also find out how to approach accessibility throughout the design phase.

Chapters

Resources

Related Videos

WWDC24

Transcript

Hi, I’m James, a software engineer in Accessibility. Today, I’m going to talk about how you can evaluate your app to add Accessibility Nutrition Labels to your app’s product page. Hi, I’m Lisa. I’m a designer and I also work in Accessibility. I’m going to talk about design and things you can do to make your app more accessible. In this session, we’ll explore how to evaluate your app for Accessibility Nutrition Labels. Lisa and I will introduce core principles of app accessibility and walk through each of the features you can indicate support for. Along the way, we’ll cover how you can design for Accessibility, test your app with VoiceOver and Voice Control, and deliver accessible media.

By the end of this session, you’ll know how to highlight your app’s accessibility with Accessibility Nutrition Labels.

At Apple, we believe that the best technology works for everyone. When you design your app with Accessibility, you open it up to so many more people. And when you build in support for assistive technologies like VoiceOver and Voice Control, you unblock access to your app.

Accessibility Nutrition Labels showcase the features your app supports on the App Store. This helps people to know whether your app supports the key features they rely on. To indicate support for accessibility features in your app, people should be able to complete all of the common tasks of your app.

Start by defining your common tasks. These are your app’s primary functionalities that people download your app for, plus functionality that is fundamental to using an app in general. This includes first launch experience, login, purchase, and settings. Define what people can do in your app.

After you identify your app’s common tasks, evaluate your app for each of the accessibility features available in Accessibility Nutrition Labels. Consider a testing strategy that evaluates each common task in your app with each feature. This strategy will help you discover which features are supported by your app and which ones are not supported yet. Make sure you test on every device that your app supports. This includes iPhone, iPad, Mac products, Apple Watch, and more.

If a feature is not relevant to your app's functionality, don't indicate support. It’s critical to accurately inform people how they can use your app. To ensure accuracy, consult the Accessibility Nutrition Labels documentation. Evaluation criteria are meant to standardize the user experiences an app must provide for that app to be considered as supporting a feature. This ensures that apps have consistent responses across the App Store. When you’re ready, you can add Accessibility Nutrition Labels. You’ll add them to your app's product page in App Store Connect.

All of your app's supported accessibility features will now appear on your product page.

At the center of Accessibility Nutrition Labels are core principles of app accessibility. These principles will help you evaluate each feature.

Hey James, this might be a great opportunity to use the app you and I are working on, Landmarks. Yeah, that’s a great idea. Why don’t you tell them about how we got started? App accessibility starts with design.

You want to design an interface that everyone can enjoy, regardless of how they interact with their device. Like James mentioned, it’s so important to get familiar with the accessibility features, turn them on, like Voice Control and Larger Text, and pay attention to what people might hear and see when using your app. And whenever possible, connect with the disability community. Testing with people who use accessibility features is one of the most effective ways to know how accessible your app is. Like me, for example, I’m legally blind due to a rare form of Macular Degeneration. This affects my central vision, causing everything to be blurry. To give you a better idea, imagine I’m out on one of my favorite hikes. I can see big objects like the trees, the rocks, the trail ahead, but I need to get really close in order to see details, like reading a sign. And when I use my iPhone, I rely on the largest text sizes to do everything, from reading, doing work, and personal tasks. In the disability community, we strongly believe there should be “Nothing about us without us.” People with disabilities should have full participation in the decisions that affect our lives. And there’s no better way to ensure the accessibility of your app than to bring in the people who will be directly impacted by it. And I can tell you firsthand, if an app is designed to support my experience as a blind person, I feel included as a valued customer.

That’s why Accessibility Nutrition Labels are so important. Let’s dive into the first set of features. Yes, let's get started.

This first set of features provides a great opportunity to apply accessible design.

Let's start with color. Color can enhance communication and help people understand information.

There should be a high enough contrast between foreground and background colors. This helps to ensure legibility of text and to help people find what they’re looking for. It is important to design your app using higher contrast color schemes.

If by default your app does not provide minimum contrast, check that it does when the Increase Contrast setting is turned on.
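For a SwiftUI app, one lightweight way to honor that setting is to read the colorSchemeContrast environment value and swap in a higher-contrast color. A minimal sketch, where BrandAccent is a placeholder asset color:

import SwiftUI

struct LandmarkTitle: View {
    @Environment(\.colorSchemeContrast) private var contrast

    var body: some View {
        Text("Landmarks")
            // Fall back to a guaranteed high-contrast color when
            // Increase Contrast is turned on.
            .foregroundStyle(contrast == .increased ? Color.primary : Color("BrandAccent"))
    }
}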

For our app, we intentionally designed using high-contrast colors, but we found some colors didn’t provide minimum contrast, so we selected alternate colors to support the Increase Contrast setting. And this applies to both light and dark appearances. After thoroughly testing the Landmarks app’s common tasks for sufficient contrast, we didn’t find any other issues. I’ll plan on indicating support for sufficient contrast in the Accessibility Nutrition Labels. Up next is Dark Interface. People often use dark mode if they’re sensitive to light or prefer a darker background. Large white or bright areas on the screen can cause discomfort or even physical pain. My eyes are very sensitive to light, so dark interfaces make it much more comfortable for me to read. If your app doesn’t already have a Dark Interface, make sure it does when the dark mode setting is turned on.

We’ve designed our app to support dark mode, but to make sure, we’ll complete some common tasks. We’ll also check with Smart Invert turned on. Smart Invert is an accessibility feature that reverses the colors in the interface. You want to make sure that colors in any media do not get inverted. Once we’ve confirmed that our app supports a mostly dark background, we can add Dark Interface to our Accessibility Nutrition Labels. Some people need to increase the size of text in order to read it. When designing your app, make sure people can increase the text size to at least 200% larger. Ideally, the text can grow even larger. Personally, I need 310%.

So, support text sizes larger than 200% if possible.

It’s important to also design a layout that allows the text to increase in size without any overlapping or severe truncation.

Allow enough space and additional lines to support the larger text sizes. For controls that cannot reasonably increase in size, you can review documentation for other design options.

One of the best ways to support Larger Text in your app is to use Dynamic Type. To learn more, check out the WWDC24 video "Get Started with Dynamic Type." If the text scales up to at least 200% throughout your common tasks in your iOS app, you can add Larger Text to your Accessibility Nutrition Labels.
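As a minimal UIKit illustration of Dynamic Type (SwiftUI text styles scale automatically), the key points are using a text style, opting into automatic adjustment, and letting text wrap instead of truncate:

import UIKit

let label = UILabel()
label.font = UIFont.preferredFont(forTextStyle: .body)
label.adjustsFontForContentSizeCategory = true
label.numberOfLines = 0 // allow extra lines rather than truncating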

Hey James, let's test our app.

That's a great idea. I have questions about how large text works for somebody who needs it. You and I have worked together for a while. I know that you rely on larger text every day. Is it OK if I ask questions? Absolutely.

I'll also document any issues we find. I already have larger accessibility text sizes turned on, and the text size is set to 235% on this iPhone. In the Landmarks app, I can already see that the text is bigger than the default size. One of the features of Landmarks is that it lets you group different landmarks into collections. We have an Accessibility Team collection already started to help us plan hiking trips. Hey, I know! Let’s update the Accessibility team collection. Sounds good. Maybe we can pick a location to take the team on the next hike. Perfect. Great, let’s check out the collections. First, I'll tap on the back button, and then I’ll tap on the collections button. Hey Lisa, I noticed the 2023 trip collection is a little bit misaligned. That’s definitely a bug. What do you think? Well, it’s not polished, but I can still read the collection name. I don’t think it’ll prevent us from indicating support for Larger Text, but we should take a note to fix the layout. I’ll write it down.

Actually, let’s take a deeper look.

Hmm, the description is truncated. We really should make that text wrap into multiple lines. I'll add it to the list of bugs.

I'll edit the collection.

Hold up. Why is the text in the description text field not growing? I'll check if the text can scroll.

Well, it doesn’t look like the description field grows to support the larger text sizes. While I can still scroll, it’s much more work. Lisa, now that we have some more information, what do you think? Honestly, all of those bugs make it difficult to get information about the collection. We really should fix those bugs before we indicate support for Larger Text. I agree. After we finish the rest of our common tasks, we'll look to developer documentation to find solutions for the bugs we found. We can make text fields grow to accommodate larger text and allow for text to wrap to the next line instead of truncating.

For now, these issues mean we will not indicate support for Larger Text in Landmarks. Once we fix these issues and test our other common tasks, we can come back to App Store Connect and indicate support later.

Let's talk more about color. When designing using color, it’s important to consider that people can perceive color differently. Many apps use color to convey information. For example, some apps use red or green to communicate a status like success or failure. People who are colorblind might miss out on this status. When designing, use shapes, icons, or text in addition to color to communicate important information. You can also let people customize how color is used in your app. We’ve designed our app using color and icons to draw attention to important elements like the trash button. This means we do not rely on color alone for these elements.
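As a small sketch of that approach, the status below pairs an icon and text with the color, so the meaning survives even if the color cannot be perceived. isPublished is a placeholder property:

import SwiftUI

struct CollectionStatusView: View {
    let isPublished: Bool

    var body: some View {
        // Icon and text carry the status; color only reinforces it.
        Label(isPublished ? "Published" : "Draft",
              systemImage: isPublished ? "checkmark.circle.fill" : "exclamationmark.circle")
            .foregroundStyle(isPublished ? Color.green : Color.orange)
    }
}

You could additionally read the accessibilityDifferentiateWithoutColor environment value to add further cues when that setting is turned on.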

To make sure we meet evaluation criteria, we’ll walk through common tasks paying attention to the use of color.

Once we’ve confirmed that color isn’t the only means to communicate information in the Landmarks app, we can add Differentiate Without Color Alone to our Accessibility Nutrition Labels. Next, let’s talk about motion.

Motion can enrich an app experience, but it’s important to consider that certain types of motion can cause severe dizziness or nausea. This can prevent people from using your app. Triggers can include: zooming or sliding transitions, flashing or blinking, animations that play automatically, or parallax effects.

To help with this, people can turn on the reduced motion setting.

To support this setting, go through your common tasks and identify any animations that are known triggers. You can make changes to these animations instead of removing them.
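In SwiftUI, one way to adapt rather than remove an animation is to branch on the accessibilityReduceMotion environment value. A minimal sketch that swaps a slide for a cross-fade:

import SwiftUI

struct DetailContainer: View {
    @Environment(\.accessibilityReduceMotion) private var reduceMotion
    @State private var showDetail = false

    var body: some View {
        VStack {
            Button("Toggle detail") {
                withAnimation { showDetail.toggle() }
            }
            if showDetail {
                Text("Landmark details")
                    // Cross-fade instead of sliding when Reduce Motion is on.
                    .transition(reduceMotion ? AnyTransition.opacity
                                             : AnyTransition.move(edge: .trailing))
            }
        }
    }
}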

If your app doesn’t have any of these known triggers to begin with, you can indicate support for Reduced Motion. To evaluate our app, we’ll complete common tasks paying close attention to any motion. We’ll check documentation for guidance for any problematic animations or animations we’re not sure about.

Once you’ve confirmed that motion triggers have been removed, you can add Reduced Motion to your Accessibility Nutrition Labels.
