PRG Seminar Series

on

Robotics and Computer Vision


A Talk on

Robust Visual Understanding in the Multimodal Era

Monday, November 13, 2023

Time: 12:00 PM

Room IRB 5105

Tejas Gokhale

Assistant Professor

Department of Computer Science

University of Maryland, Baltimore County


ABSTRACT

Over the last five years or so, computer vision has undergone a paradigm shift: language is now front and center as a source of knowledge about the visual world. Models that learn from such vision+language data have led to an unprecedented rise in the functionality and capabilities of vision systems, including image generation and interactive tasks such as visual question answering. This rise has also brought widespread democratization, extending these systems to non-expert and non-research communities. In this talk, we will start with an overview of reliability challenges in computer vision and see how "data transformation discovery" guided by adversarial training can help improve the robustness of image classifiers. We will then extend this discussion to interactive tasks like visual question answering and review intriguing failure modes and mitigation strategies. We will end with a discussion of the challenges associated with evaluating text-guided image generation and some of our recent efforts in that direction.



ABOUT THE SPEAKER

Tejas Gokhale is an Assistant Professor of Computer Science at the University of Maryland, Baltimore County. He earned his Ph.D. from Arizona State University in 2023 and an M.S. from Carnegie Mellon University in 2017. His work focuses on the design of robust and reliable systems that can understand the visual world. His current focus is on (1) grounded evaluation of machine learning models that learn from multiple data modalities and (2) improving performance when faced with out-of-distribution data and adversaries. His work has been published at several premier venues across AI, CV, and NLP. He is the lead organizer of the CVPR ODRUM workshop (2022, 2023), which discusses reasoning and reliability under multimodal settings. Tejas is a recipient of the ASU Engineering Graduate Fellowship, the SCAI Doctoral Fellowship, the ASU GPSA Outstanding Mentor award, the ASU GPSA Outstanding Researcher award, and top reviewer awards at NeurIPS and ICLR. Website: https://www.tejasgokhale.com/






  info[at]prg.cs.umd.edu