Posts by Collection



Semi‐Supervised Segmentation and VQA on Aerial Flood Images

[ongoing] Designed a semi‐supervised image segmentation and graph‐based Visual Question Answering (VQA) system for the FloodNet challenge.
The Segmentation network is based on CutMix and Cross Pseudo Supervision.
The VQA system applies geodesic dilation and morphological operations to the segmentation maps. Connected components are then counted on 4‐adjacency graphs built from the processed segmentation maps.
Supervised by: Sravan Danda - BITS Pilani
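
As a rough illustration of the counting step described above, the following is a minimal sketch of connected component counting under 4‐adjacency on a toy binary map. It is not the project's implementation: the function name count_components, the target parameter, and the toy_map grid are all hypothetical, and the geodesic dilation and morphological preprocessing are omitted.

```python
from collections import deque


def count_components(mask, target=1):
    """Count 4-connected regions of `target` in a 2D grid (list of lists).

    Hypothetical helper for illustration only; a real segmentation map
    would first go through the morphological preprocessing step.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != target or seen[r][c]:
                continue
            # Found an unvisited pixel of the target class: new component.
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                # 4-adjacency: only up, down, left, right neighbours.
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and mask[ny][nx] == target):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


# Toy binary map (assumed data): 1 marks the class being counted.
toy_map = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
]
print(count_components(toy_map))
```

Because only horizontal and vertical neighbours are linked, two pixels that touch only diagonally are counted as separate components.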

Assessing the Aptitude of Language Models in Comprehending Advertisements

[ongoing] Advertisement media are fundamentally different from typical videos and images. They are more than just their content, persuade users to take certain actions, and often use creative atypicalities to deliver their message.
Advertisement images from the Kovashka Ads Dataset were verbalized as text and presented to text‐based language models such as GPT‐3.5, GPT‐4, and FLAN‐T5, which were evaluated on Action‐Reason pair and Atypicality understanding tasks.
As a comparison between text and vision, Vision‐Language models like BLIP2 were also evaluated on the advertisement images.
Supervised by: Adriana Kovashka - University of Pittsburgh

A Non‐Chaotic Pruning Strategy for Neural Networks

Pruning strategies such as L0, L1, and rank‐based pruning change the feature importance orderings of neural networks in a chaotic manner. This project hypothesizes that Granger‐causality‐based pruning may be a non‐chaotic neural network pruning strategy.
Supervised by: APPCAIR - BITS Pilani
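
For context, here is a minimal sketch of L1 (magnitude) pruning, one of the baseline strategies named above, applied to a flat list of weights. It does not show the Granger‐causality approach the project proposes; the function name l1_prune and the sparsity parameter are hypothetical and chosen only for illustration.

```python
def l1_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with smallest magnitude.

    Hypothetical helper: `weights` is a flat list of floats and
    `sparsity` is a fraction in [0, 1].
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    k = int(len(weights) * sparsity)  # number of weights to remove
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:k]:
        pruned[i] = 0.0
    return pruned


# Assumed toy weights; half of them are pruned.
print(l1_prune([0.5, -0.1, 2.0, 0.05], 0.5))
```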


Large Content And Behavior Models To Understand, Simulate, And Optimize Content And Behavior

International Conference on Learning Representations (ICLR) 2024 — Spotlight

Recommended citation: Ashmit Khandelwal, Aditya Agrawal, Aanisha Bhattacharyya, Yaman K Singla, Somesh Singh, Uttaran Bhattacharya, Ishita Dasgupta, Stefano Petrangeli, Rajiv Ratn Shah, Changyou Chen, Balaji Krishnamurthy, 2023. "Large Content And Behavior Models To Understand, Simulate, And Optimize Content And Behavior", ICLR 2024



QSTP 2022: Introduction to Deep Learning

Instructor, Quark, BITS Pilani - Goa, 2022

Co‐instructed the Introduction to Deep Learning course, which provides introductory knowledge and assignments on Deep Learning, Computer Vision, Natural Language Processing, and Generative Models.

Introduction to ML and DL

Instructor, Center of Technical Education, 2022

Co‐instructed the Introduction to Machine Learning and Deep Learning course, teaching the mathematical theory and providing Python implementations of Machine Learning algorithms and Deep Learning models.


Research Intern - Adobe

Worked on simulating and optimizing behavioral aspects of video/image content, such as memorability and rewatchability, using a large language model. This was the first work to embed content and the elicited human response in the same space. Successfully embedded vision into a Vicuna‐13B LLM and instruction fine‐tuned it to understand the relationship between human behavior and video content. The resulting model beat few‐shot GPT‐4, showing that current SoTA models do not understand behavior.