Talk Runbook: Generative AI in Media Production 2026

Slot: 16:10 - 17:00, June 25, 2026
Room: Afternoon breakout, Spielfeld Digital Hub
Language: English
Slides: ~/code/contiamo/claw/conference2026/slides/index.html (Reveal.js, serve locally or Tailscale)

Pre-talk checklist (do during lunch break)

Slide-by-slide runbook

1. WELCOME (slides 1-2, ~2 min)

Slide 1 - Title (burgundy, big C)

"Welcome, fourth year of this talk."
For returning audience: "Some of you have been here since 2023."

Slide 2 - Era labels: 2023/2024/2025/2026: ?

Quick verbal recap: "Proof of Concept, Coherence Shock, Sensory Integration. This year's label comes at the end."

2. AUDIO (slides 3-6, ~4 min)

Slide 3 - Section: "Audio" (burgundy)

Slide 4 - "Emotion is now a parameter."

ElevenLabs v3: you can now write [laughs] or [whispers] inline and the model performs it
Driven by voice bot deployment AND creative tooling - same model does both

Slide 5 - ElevenLabs demo (placeholder)

ACTION: Play 15-20s audio clip showing emotional range
Let the quality speak

Slide 6 - "Audio lives inside video."

Tease: "Since Veo 3 last May, video models generate their own audio. Dialogue, ambient, foley - one pass. You'll hear plenty of that later."

3. IMAGE (slides 7-9, ~4 min)

Slide 7 - Section: "Image" (burgundy)

Slide 8 - "Generate and hope -> iterate and refine."

Last year showed Flux Kontext. Since then: FLUX.2, GPT Image 2, Nano Banana Pro
The paradigm shifted - professional teams iterate, they don't regenerate from scratch
Both GPT Image and Nano Banana now in Photoshop 27.0

Slide 9 - Image examples (placeholder)

ACTION: Show 2-3 Nano Banana Pro or GPT Image 2 outputs
Point: the quality ceiling is no longer the issue, the workflow is

4. FACE (slides 10-13, ~4 min)

Slide 10 - Section: "Face" (burgundy)

Slide 11 - "Reference images are native now."

Seedance: 16 references. Veo 3.1: 3 angles = solved. Kling: elements system.
The face swap era is over. Models just know who you are.

Slide 12 - Zeitreise vs Lichtspiel side-by-side (placeholder)

ACTION: Play 2025 booth output (5s silent tween) then 2026 booth output (cinematic with sound)
"Last year: face-swapped, Flux Kontext edited, WAN tweened. Silent. 5 seconds. Kind of worked."
"This year: 3 photos in, 4 movie scenes out, native sound, 60 seconds."

Slide 13 - "That's the gap that closed in one year."

Let it sit. This is the most concrete proof-of-progress slide.

5. VIDEO (slides 14-22, ~6 min)

Slide 14 - Section: "Video" (burgundy)

Slides 15-22 - Year-by-year progression + clips

2023: Darth Vader, Will Smith spaghetti. Play a clip, get a laugh.
2024: Runway Gen-3. The moment coherence arrived.
2025: Veo 3 with native audio. The audience heard this last year.
2026: ACTION: Play 2-3 of your own Veo/Seedance clips. Let them run.
Timeline slide: rapid fire verbal delivery. Pause on "Sora shut down."

6. AGENTIC PRODUCTION (slides 23-37, ~9 min including booth)

Slide 23 - Section: "Agentic Media Production" (burgundy)

Slide 24 - "The tools chain themselves."

"Last year I showed you individual tools. This year: coding agents orchestrate them."
Claude Code + skills files = media production studio
Agencies are encoding their processes into skills. Not just for developers.

Slide 25 - Pipeline chips: Scene scouting -> Stills -> Video -> Evaluation -> Rough cut

Walk through verbally: "The agent scouts locations, generates reference stills, produces video segments with retries and fallbacks, evaluates its own output, assembles the rough cut."
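
If someone asks what "retries and fallbacks" means in practice, the walkthrough above can be sketched roughly as follows. This is a minimal illustration, not the Vonovia or booth code: `GenerationError`, `generate_with_fallback`, `run_pipeline`, and the two stub backends are hypothetical names, and a real backend would call a video model API instead of returning a string.

```python
from typing import Callable, List, Sequence, Tuple


class GenerationError(Exception):
    """Raised by a (hypothetical) backend when a generation attempt fails."""


# A backend is a (name, callable) pair; the callable turns a prompt into a clip.
Backend = Tuple[str, Callable[[str], str]]


def generate_with_fallback(
    prompt: str, backends: Sequence[Backend], max_retries: int = 2
) -> Tuple[str, str]:
    """Try each backend in order, retrying each one up to max_retries times."""
    errors: List[str] = []
    for name, backend in backends:
        for attempt in range(1, max_retries + 1):
            try:
                return name, backend(prompt)
            except GenerationError as exc:
                errors.append(f"{name} attempt {attempt}: {exc}")
    raise GenerationError("all backends failed: " + "; ".join(errors))


def run_pipeline(
    scenes: Sequence[str],
    backends: Sequence[Backend],
    evaluate: Callable[[str], bool],
) -> List[str]:
    """Generate one clip per scene and keep only clips that pass evaluation."""
    rough_cut: List[str] = []
    for scene in scenes:
        _, clip = generate_with_fallback(scene, backends)
        if evaluate(clip):
            rough_cut.append(clip)
    return rough_cut


# Stub backends standing in for real model calls (assumptions for the demo).
def flaky_primary(prompt: str) -> str:
    raise GenerationError("quota exceeded")


def stable_fallback(prompt: str) -> str:
    return f"clip<{prompt}>"


BACKENDS: List[Backend] = [("primary", flaky_primary), ("fallback", stable_fallback)]

rough_cut = run_pipeline(
    ["office", "rooftop"],
    BACKENDS,
    evaluate=lambda clip: clip.startswith("clip<"),
)
print(rough_cut)  # ['clip<office>', 'clip<rooftop>']
```

The point to make verbally: the agent's value is in the loop structure (retry, fall back, evaluate), not in any single model call.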

Slide 26 - "Vonovia Prototype"

Brief context: "Real client project. Vonovia wanted a 1-minute AI showcase for a leadership offsite."

Slide 27 - Final video

ACTION: Play the Vonovia video. Let it run ~30s, enough to show it works.

Slide 28 - "Hmm, room for improvement, let's say."

Transition to outtakes: "But getting there was a process."

Slide 29 - First attempt still (clip01 v1)

Let the audience see the gap between first attempt and final

Slide 30 - After feedback still (clip01 v3)

The improvement is visible. You directed it.

Slide 31 - "The coffee mug could be less dirty."

Get a laugh. This is the human art direction moment.

Slide 32 - Aspect ratio iterations (three images fading)

"Three tries to get 16:9 right. The model kept defaulting to cinemascope."

Slide 33 - Screen tapping video (autoplay loop)

Get a laugh. "She keeps tapping the screen instead of using the mouse. I used the first 2-3 seconds and moved on."

Slide 34 - 70% / 30%

"70% of the work is generation. 30% is review. But ALL decisions happen in the 30%."

Slide 35 - 10x / -86%

"Agencies are seeing these numbers. Same team, same clients, 10x the output."

Slide 36 - Booth pipeline: "FULLY AUTONOMOUS" + 85 seconds

"The booth running next door right now does this without any human in the loop. 3 photos, choose a mood, 85 seconds later you have a cinema reel and a printed movie poster."

Slide 37 - Lichtspiel video (placeholder)

ACTION: Play one booth video. ~8-10 seconds. The audience will recognize the movie reference.

7. PROGRESSION TABLE (slides 38-40, ~3 min)

Slide 38 - Section: "Solved?" (burgundy)

Slide 39 - The table

Walk through: "Voice was solved in 2023. Image in 2024. This year: music and video both graduate."
Point at the new column: "Production. Advanced, not solved. That's the frontier."

Slide 40 - "The remaining challenge is orchestration."

8. WHERE NEXT (slides 41-45, ~3 min)

Slide 41 - Section: "Where next?" (burgundy)

Slide 42 - "10 min" (single-take demonstrated)
Slide 43 - "60s" (was 5 minutes a year ago)
Slide 44 - "$0.80" (was $2.40 six months ago)

Rapid fire through these three. The numbers tell the story.
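
For the rapid-fire delivery it can help to have the ratios ready. The arithmetic below uses only the figures on slides 43 and 44. What each figure measures (per clip, per minute of video, or something else) is not stated in this runbook, so the helper name and variable names are placeholders.

```python
def change(before: float, after: float) -> tuple:
    """Return (improvement factor, percent reduction) for a drop from before to after."""
    factor = before / after
    reduction = (1 - after / before) * 100
    return round(factor, 1), round(reduction, 1)


# Slide 43: 5 minutes a year ago, 60 seconds now (both in seconds).
time_factor, time_cut = change(before=5 * 60, after=60)

# Slide 44: $2.40 six months ago, $0.80 now.
cost_factor, cost_cut = change(before=2.40, after=0.80)

print(f"Time: {time_factor}x faster, {time_cut}% less")  # Time: 5.0x faster, 80.0% less
print(f"Cost: {cost_factor}x cheaper, {cost_cut}% less")  # Cost: 3.0x cheaper, 66.7% less
```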

Slide 45 - "Separate tools for each modality will feel archaic within 18 months."

The convergence thesis. Gemini Omni, multimodal foundation models absorbing everything.

9. ACTIONABLE (slide 46, ~1 min)

Slide 46 - "Start here." + tool chips

"You probably already have a coding agent. Claude Code, Cursor, Copilot. Teach it a skill. Write a document describing what you want produced. It handles the coordination, the API calls, the retries. The barrier isn't the APIs anymore - it's the instructions."

10. SUPERCUT (slides 47-49, ~3 min)

Slide 47 - "One more thing." (deep burgundy)

Build tension. Brief pause.
"Everything you saw today at the booth - we can turn that into a trailer."

Slide 48 - Supercut video (placeholder)

ACTION: Play the 75-second supercut. DO NOT TALK OVER IT.
The room will react when they see colleagues they recognize.
Music drop at 1:13, logo at the end.

Slide 49 - Era label reveal: 2026 "Era of Agentic Production" (deep burgundy)

After the trailer ends, one beat of silence.
"Era of Agentic Production."
Pause.

11. Q&A (slide 50)

Slide 50 - "Questions?" (burgundy, big C)

Emergency fallbacks

Supercut not ready: Skip slides 47-48, go straight from "Start here" to era label reveal. The talk still works.
Video won't play: Advance past it, describe verbally. "I have a video here that isn't cooperating - it shows [X]."
Running long: Cut the 3 "where next" stat slides (42-44) and deliver them verbally on the convergence slide.
Running short: Expand on the Vonovia outtakes (you have more material than slides), or let Q&A run longer.
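
To judge how much room the "running long" and "running short" fallbacks have, the per-section budgets from the runbook headings can be summed against the 50-minute slot. The sketch below only restates those numbers; the dictionary and its labels are illustrative.

```python
SLOT_MINUTES = 50  # 16:10 - 17:00

# Approximate budgets as given in the section headings of this runbook.
SECTION_MINUTES = {
    "Welcome": 2,
    "Audio": 4,
    "Image": 4,
    "Face": 4,
    "Video": 6,
    "Agentic production": 9,
    "Progression table": 3,
    "Where next": 3,
    "Actionable": 1,
    "Supercut": 3,
}

planned = sum(SECTION_MINUTES.values())
slack = SLOT_MINUTES - planned

print(f"Planned: {planned} min, left for Q&A and overrun: {slack} min")
# Planned: 39 min, left for Q&A and overrun: 11 min
```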

Key files

What	Where
Slides	`~/code/contiamo/claw/conference2026/slides/index.html`
Outline v2	`~/code/contiamo/claw/conference2026/docs/talk-outline-v2.md`
Vonovia media	`~/code/contiamo/claw/conference2026/slides/media/vonovia/`
Vonovia full inventory	See subagent report in session `390e82c6`
Vonovia on Drive	https://drive.google.com/drive/folders/1fM4W2a6PYv_LtU3CXkiq5m5whIe6tyqc
Lichtspiel booth code	`~/code/contiamo/lichtspiel/`
Supercut builder	`~/code/contiamo/lichtspiel/backend/build_trailer.py`
Trailer music	`~/code/contiamo/lichtspiel/assets/trailer_music.mp3`
Spike handover	`~/code/contiamo/claw/conference2026/docs/video-model-spiking-handover.md`
2024 deck (reference)	`~/Downloads/genai-conf-2024-wow.pdf`
2025 deck (reference)	`~/Downloads/genai-conf-2025-wow.pdf`

dmahlow/talk-runbook.md
