Media Coverage

GAMMA Lab & NVIDIA Release Audio Flamingo Next for Open Audio-Language Reasoning

GAMMA Lab researchers collaborated with NVIDIA to release Audio Flamingo Next (AF-Next), a next-generation open audio-language model designed for advanced reasoning over speech, sound, and music. AF-Next introduces Temporal Audio Chain-of-Thought, a reasoning paradigm that grounds intermediate reasoning steps to timestamps in long audio. This enables more faithful and interpretable reasoning over complex audio inputs, including speech, environmental sounds, music, and long-form recordings. The model family includes three specialized variants: AF-Next-Instruct for general audio question answering, AF-Next-Think for multi-step audio reasoning, and AF-Next-Captioner for detailed audio captioning.

Lin and Manocha Receive 2026 IEEE ICRA Most Influential Paper Award

University of Maryland professors Ming Lin and Dinesh Manocha, together with Jur van den Berg, received the 2026 IEEE International Conference on Robotics and Automation Most Influential Paper Award for their paper “Reciprocal Velocity Obstacles for Real-Time Multi-Agent Navigation.” The award recognizes research that has had a lasting impact on the robotics and automation community. The honored work introduced influential methods for real-time multi-agent navigation, helping robots and virtual agents avoid collisions while moving efficiently in shared spaces.

GAMMA Lab & Apple Develop AMUSE to Advance Agentic Multimodal Reasoning

GAMMA Lab researchers collaborated with Apple Machine Learning Research to develop AMUSE (Audio-Visual Benchmark and Alignment framework for Agentic Multi-Speaker Understanding), a new benchmark designed to evaluate and improve multimodal AI systems operating in complex, real-world conversational settings. AMUSE focuses on agentic multi-speaker reasoning, which requires models to track who is speaking over time, ground dialogue in visual context, and generate coherent multimodal summaries. The benchmark reveals significant limitations in existing multimodal large language models when reasoning across audio, vision, and language simultaneously.

GAMMA Collaborates with NVIDIA on Music Flamingo, Adopted by Universal Music Group

GAMMA’s Joint Work on “Sensible Agents” with Google Research

New Research Helps Robots Grasp Situational Context

Sanjoy Chowdhury’s Vision for Smarter, Multimodal AI

Joint Work with NVIDIA on Audio Flamingo 3

Why 'Thinking More' Isn't Always Making Generative AI Smarter

Sreyan Ghosh Receives NVIDIA Fellowship