When searching for video content using AI-generated embeddings (like "man with white t-shirt"), current frame-by-frame indexing creates a result clustering problem:
- Current Setup: Each video frame is stored as a separate searchable item
- Search Behavior: Query returns individual frames ranked by similarity score
- Undesired Outcome: Top results often come from the same video, even if other videos contain equally relevant content