Mahesh's webpage

301B Davis Hall

UB North Campus

Amherst, NY, 14260.

I am a Computer Science PhD candidate advised by Dr. David Doermann. Prior to this, I completed my Master's in CS, also at UB, and my B.Tech at Walchand College of Engineering, Shivaji University, India. Between the two degrees, I developed deep-learning models for filesystems at Veritas Technologies LLC, where I was fortunate to be advised by Anindya Banerjee.

My current research focuses on image and video synthesis with diffusion models, and on visual question answering (VQA) and fairness in Multi-Modal Large Language Models (MLLMs).

Education

PhD in Computer Science - University at Buffalo, The State University of New York (2023 - Expected 2027)
Master's in Computer Science - University at Buffalo, The State University of New York (2021 - 2023)
B.Tech in Information Technology - Walchand College of Engineering, Shivaji University, India (2013 - 2017)

Professional Experience

Visiting Research Scholar - Johns Hopkins University (June 2026 - August 2026)
Research Topics: Multi-video understanding.
Research Assistant & Lab Manager - A2IL Lab, University at Buffalo (2022 - Present)
Research Topics: Multi-modal generative AI.
Software Engineer - Veritas Technologies LLC (2017 - 2021)
Developed deep-learning models for storage filesystems. Reduced execution time of resource-intensive tasks by 56%.

news

Jun 01, 2026	Excited to be joining Johns Hopkins University as a Visiting Research Scholar (June – August 2026), working on multi-video understanding.
May 15, 2026	Two papers accepted to the ACL 2026 Multimodal Augmented Generation Workshop — CRAFT and TRACE.
Mar 15, 2026	One paper accepted as Oral (≤ 8% of submitted papers) at MIDL 2026 — Category-wise Structured Radiology Report Generation with Contrastive Decoding.
Feb 27, 2026	One paper accepted to CVPR 2026 — FairLLaVA: Fairness-Aware Parameter-Efficient Fine-Tuning for Large Vision-Language Assistants.
Sep 25, 2025	One paper accepted to NeurIPS 2025 — AutoEdit: Automatic Hyperparameter Tuning for Image Editing.

latest posts

Nov 28, 2024	Distance metric for fairness in vision language models
Nov 28, 2024	Quotidian commands
Nov 28, 2024	Diffusion Models Background and Code

selected publications

CVPR 2026

FairLLaVA: Fairness-Aware Parameter-Efficient Fine-Tuning for Large Vision-Language Assistants

Mahesh Bhosale, Abdul Wasi, Shantam Shrivastva, and 5 more authors

In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2026

Bib

@inproceedings{Bhosale2026FairLLaVA,
  author = {Bhosale, Mahesh and Wasi, Abdul and Shrivastva, Shantam and Latif, Shifa and Luan, Tianyu and Gao, Mingchen and Doermann, David and Gong, Xuan},
  title = {FairLLaVA: Fairness-Aware Parameter-Efficient Fine-Tuning for Large Vision-Language Assistants},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year = {2026},
  keywords = {published},
}

ACL 2026

CRAFT: Critic-Refined Adaptive Key-Frame Targeting for Multimodal Video Question Answering

Mahesh Bhosale, Abdul Wasi, Vishvesh Trivedi, and 3 more authors

In ACL Multimodal Augmented Generation via MultimodAl Retrieval Workshop, 2026

Bib

@inproceedings{Bhosale2026CRAFT,
  author = {Bhosale, Mahesh and Wasi, Abdul and Trivedi, Vishvesh and Yan, Pengyu and Gorugantu, Akhil V S S and Doermann, David},
  title = {CRAFT: Critic-Refined Adaptive Key-Frame Targeting for Multimodal Video Question Answering},
  booktitle = {ACL Multimodal Augmented Generation via MultimodAl Retrieval Workshop},
  year = {2026},
  keywords = {published},
}

ACL 2026

TRACE: Evidence Grounding-Guided Multi-Video Event Understanding and Claim Generation

Pengyu Yan, Akhil V S S Gorugantu, Mahesh Bhosale, and 3 more authors

In ACL Multimodal Augmented Generation via MultimodAl Retrieval Workshop, 2026

Bib

@inproceedings{Yan2026TRACE,
  author = {Yan, Pengyu and Gorugantu, Akhil V S S and Bhosale, Mahesh and Wasi, Abdul and Trivedi, Vishvesh and Doermann, David},
  title = {TRACE: Evidence Grounding-Guided Multi-Video Event Understanding and Claim Generation},
  booktitle = {ACL Multimodal Augmented Generation via MultimodAl Retrieval Workshop},
  year = {2026},
  keywords = {published},
}

ICCV 2025

PathDiff: Histopathology Image Synthesis with Unpaired Text and Mask Conditions

Mahesh Bhosale, Abdul Wasi, Yuanhao Zhai, and 7 more authors

In IEEE/CVF International Conference on Computer Vision, 2025

Bib

@inproceedings{Bhosale2025PathDiff,
  author = {Bhosale, Mahesh and Wasi, Abdul and Zhai, Yuanhao and Tian, Yunjie and Border, Samuel and Xi, Nan and Sarder, Pinaki and Yuan, Junsong and Doermann, David and Gong, Xuan},
  title = {PathDiff: Histopathology Image Synthesis with Unpaired Text and Mask Conditions},
  booktitle = {IEEE/CVF International Conference on Computer Vision},
  year = {2025},
  keywords = {published},
}

NeurIPS 2025

AutoEdit: Automatic Hyperparameter Tuning for Image Editing

Chau Pham, Quan Dao, Mahesh Bhosale, and 3 more authors

In Neural Information Processing Systems, 2025

Bib

@inproceedings{Pham2025AutoEdit,
  author = {Pham, Chau and Dao, Quan and Bhosale, Mahesh and Tian, Yunjie and Metaxas, Dimitris N. and Doermann, David},
  title = {AutoEdit: Automatic Hyperparameter Tuning for Image Editing},
  booktitle = {Neural Information Processing Systems},
  year = {2025},
  keywords = {published},
}