Computer Vision

Thinking in Frames: How Visual Context and Test-Time Scaling Empower Video Reasoning
Maestro: Self-Improving Text-to-Image Generation via Agent Orchestration
Self-evolving text-to-image (T2I) generation through iterative evolution of prompts.