Talking Papers Podcast

Talking Papers Podcast: deep dives into research papers in computer vision, 3D, machine learning, and AI, with the authors who wrote them. Where research meets conversation. By researchers, for researchers.

Each episode is structured like the paper itself: a TL;DR / abstract to set the stage, then related work, approach, results, conclusions, and future work. We close with a bonus segment called "What did Reviewer 2 say?", where the authors share the candid peer-review story behind the publication.

Hosted by Itzik Ben-Shabat. Guests are PhD students, postdocs, and faculty from leading labs across academia and industry. Aimed at fellow researchers and graduate students who want the candid version of the work, not a polished press release.

Cameras as Rays - Jason Y. Zhang

March 14, 2024 • Itzik Ben-Shabat • Season 1 • Episode 33

Talking Papers Podcast Episode: "Cameras as Rays: Pose Estimation via Ray Diffusion" with Jason Zhang

Welcome to the latest episode of the Talking Papers Podcast! This week's guest is Jason Zhang, a PhD student at the Robotics Institute at Carnegie Mellon University, who joined us to discuss his paper, "Cameras as Rays: Pose Estimation via Ray Diffusion". The paper was published at ICLR 2024.

Jason's research focuses on estimating camera poses for 3D reconstruction, a task that becomes considerably harder when only sparse views are available. His paper proposes an unconventional representation that treats each camera as a bundle of rays. This change of perspective yields promising results even in the sparse-view setting.
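To make the "camera as a bundle of rays" idea concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's implementation: the function name `camera_to_rays`, the world-to-camera convention `x_cam = R @ x_world + t`, and the choice of Plücker coordinates (direction plus moment) to encode each ray are all assumptions made for this example.

```python
import numpy as np

def camera_to_rays(K, R, t, pixels):
    """Turn one pinhole camera into a bundle of rays, one per pixel location.

    Assumed convention (not taken from the episode): x_cam = R @ x_world + t.
    Returns an (N, 6) array whose rows are [direction, moment], i.e. the
    Plücker coordinates of the ray through each pixel.
    """
    K, R, t = np.asarray(K, float), np.asarray(R, float), np.asarray(t, float)
    pixels = np.asarray(pixels, float)                       # (N, 2) pixel coords (u, v)
    center = -R.T @ t                                        # camera center in world frame
    homog = np.hstack([pixels, np.ones((len(pixels), 1))])   # (N, 3) homogeneous pixels
    dirs_cam = homog @ np.linalg.inv(K).T                    # back-project: K^-1 [u, v, 1]
    dirs_world = dirs_cam @ R                                # rotate to world frame: R^T d
    dirs_world /= np.linalg.norm(dirs_world, axis=1, keepdims=True)
    moments = np.cross(center, dirs_world)                   # m = c x d
    return np.hstack([dirs_world, moments])

# Hypothetical camera: 640x480 image, focal length 500 px, shifted 2 units along z.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 2.0])
rays = camera_to_rays(K, R, t, [(320, 240), (0, 0)])
```

Each row of `rays` describes one ray. The appeal of such a distributed representation is that a model can predict rays per image location, and a single camera pose can then be recovered by fitting a camera to the predicted bundle.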

What's particularly exciting is that both the regression-based and the diffusion-based variants of his method achieve state-of-the-art performance on camera pose estimation on CO3D, and generalize to unseen object categories as well as to in-the-wild captures.

Throughout our conversation, Jason explained his approach and how a denoising diffusion model and set-level transformers come together to produce these results. I found his technique a breath of fresh air in the field of camera pose estimation, notably in how it formulates both the regression and the diffusion models.

On a more personal note, Jason and I didn't know each other before this podcast, so it was fantastic to learn about his journey from the Bay Area to Pittsburgh. His experiences truly enriched our discussion and made for one of our most memorable episodes yet.

We hope you find this episode as enlightening to listen to as we found it to record. If you enjoyed our chat, don't forget to subscribe for more thought-provoking discussions with early-career academics and PhD students. Leave a comment below sharing your thoughts on Jason's paper!

Until next time, keep following your curiosity and questioning the status quo.

#TalkingPapersPodcast #ICLR2024 #CameraPoseEstimation #3DReconstruction #RayDiffusion #PhDResearchers #AcademicResearch #CarnegieMellonUniversity #BayArea #Pittsburgh

All links and resources are available in the blogpost: https://www.itzikbs.com/cameras-as-rays

🎧Subscribe on your favourite podcast app: https://talking.papers.podcast.itzikbs.com

📧Subscribe to our mailing list: http://eepurl.com/hRznqb

🐦Follow us on Twitter: https://twitter.com/talking_papers

🎥YouTube Channel: https://bit.ly/3eQOgwP

Yizhak Ben-Shabat (Itzik)

Host