Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.

Pages

Posts

Future Blog Post

less than 1 minute read

Published:

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.

Blog Post number 4

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

portfolio

projects

Semi‐Supervised Segmentation and VQA on Aerial Flood Images

[ongoing] Designed a semi‐supervised Image Segmentation and Graph based Visual Question Answering system for the FloodNet challenge.
The Segmentation network is based on CutMix and Cross Pseudo Supervision.
The VQA system applies geodesic dilation and morphological operations to the segmentation maps. Connected components are then counted on 4‐adjacency graphs built from the processed segmentation maps.
Supervised by: Sravan Danda - BITS Pilani
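
The counting step described above can be illustrated with a minimal sketch. It assumes a binary mask stored as a list of lists and uses a breadth-first flood fill; the function name count_components and the toy mask are hypothetical and are not taken from the project's code, which may use a different representation or library.

```python
from collections import deque


def count_components(mask):
    """Count 4-connected components of truthy cells in a 2D grid (hypothetical helper)."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # A new, unvisited foreground cell starts a new component.
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                # 4-adjacency: only up, down, left, and right neighbours.
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


# Toy mask (assumed data): diagonal neighbours are NOT joined under 4-adjacency.
toy_mask = [
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 0, 0],
]
print(count_components(toy_mask))
```

Under 4-adjacency the cells at (2, 0) and (3, 1) touch only diagonally, so they are counted as separate components; an 8-adjacency graph would merge them.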

Assessing the Aptitude of Language Models in Comprehending Advertisements

[ongoing] Advertisement media are fundamentally different from typical videos and images. They go beyond their content: they persuade users to take specific actions and often use creative atypicalities to deliver their message.
Advertisement images from the Kovashka Ads Dataset were textually verbalized. These were presented to text‐based language models like GPT‐3.5, GPT‐4, and FLAN‐T5, which were evaluated on Action‐Reason pair and Atypicality understanding tasks.
As a comparison between text and vision, Vision‐Language models like BLIP2 were also evaluated on the advertisement images.
Supervised by: Adriana Kovashka - University of Pittsburgh

A Non‐Chaotic Pruning Strategy for Neural Networks

Pruning strategies such as L0, L1, and rank‐based pruning change the feature importance orderings of neural networks chaotically. This project hypothesizes that Granger‐causality‐based pruning may be a non‐chaotic neural network pruning strategy.
Supervised by: APPCAIR - BITS Pilani
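
For readers unfamiliar with the baselines named above, here is a minimal sketch of magnitude (L1‐style) pruning on a single weight vector, with a simple importance ranking by absolute weight. The function names and numbers are hypothetical; the sketch does not implement the project's Granger‐causality approach, nor the retraining step after which importance orderings may shift.

```python
def l1_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    # Indices sorted from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def importance_ranking(weights):
    """Feature indices ordered from most to least important by |weight|."""
    return sorted(range(len(weights)), key=lambda i: -abs(weights[i]))


# Assumed toy weights, one per input feature.
weights = [0.9, -0.1, 0.4, -0.05]
print(l1_prune(weights, 0.5))
print(importance_ranking(weights))
```

Comparing importance_ranking before and after pruning and retraining is one simple way to check whether a pruning strategy preserves feature orderings.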

publications

Large Content And Behavior Models To Understand, Simulate, And Optimize Content And Behavior

International Conference on Learning Representations (ICLR) 2024 — Spotlight

Recommended citation: Ashmit Khandelwal, Aditya Agrawal, Aanisha Bhattacharyya, Yaman K Singla, Somesh Singh, Uttaran Bhattacharya, Ishita Dasgupta, Stefano Petrangeli, Rajiv Ratn Shah, Changyou Chen, Balaji Krishnamurthy, 2023. "Large Content And Behavior Models To Understand, Simulate, And Optimize Content And Behavior", ICLR 2024 doi.org/10.48550/arXiv.2309.00359

talks

teaching

QSTP 2022: Introduction to Deep Learning

Instructor, Quark, BITS Pilani - Goa, 2022

Co‐instructed the Introduction to Deep Learning course. The course provides introductory knowledge and assignments on Deep Learning, Computer Vision, Natural Language Processing, and Generative Models.

Introduction to ML and DL

Instructor, Center of Technical Education, 2022

Co‐instructed the Introduction to Machine Learning and Deep Learning course. Taught the mathematical theory and provided Python implementations of Machine Learning algorithms and Deep Learning models.

work

Research Intern - Adobe

Simulating and optimizing behavioral aspects of video and image content, such as memorability and rewatchability, using a large language model. The first work to embed content and the elicited human response in the same space. Successfully embedded vision into a Vicuna‐13B LLM and instruction fine‐tuned it to understand the relationship between human behavior and video content. The model beat few‐shot GPT‐4, suggesting that current SoTA models do not understand behavior.