Divyansh Jhunjhunwala
About Me
Hi! I'm Divyansh, a fifth-year PhD candidate in the Electrical and Computer Engineering department
at Carnegie Mellon University, advised by Dr. Gauri Joshi.
My research focuses on efficiently fine-tuning and training ML models on data distributed across individual users (e.g., mobile devices) via on-device training. I aim to
develop algorithms that are both theoretically grounded and practical, speeding up convergence and improving model accuracy while addressing the communication and computation
constraints inherent to such settings.
Lately, I've also been deeply interested in knowledge transfer: understanding when and why pre-training works,
and developing better methods to merge the knowledge of multiple fine-tuned models into one or more models (check out some recent work
[1, 2] in this area).
During my PhD, I have had the opportunity to intern at IBM Research (summers 2022 and 2023) and Bosch AI Research (summer 2024), working on problems related to accelerating model training and efficient LLM inference.
Prior to CMU, I completed my Bachelor of Technology (B.Tech) in Electronics and Electrical Communication Engineering at IIT Kharagpur, where I
received the Institute Silver Medal for graduating with the highest CGPA in my department.
I am currently on the industry job market. Please feel free to reach out to discuss potential roles!
Email / Google Scholar
Initialization Matters: Unraveling the Impact of Pre-training on Federated Learning
Divyansh Jhunjhunwala, Pranay Sharma, Zheng Xu, Gauri Joshi
Under submission
Provide the first theoretical explanation for why pre-training significantly boosts the performance of FedAvg by introducing the notion of
misaligned filters at initialization and showing that (a) data heterogeneity only affects misaligned filters and (b) pre-training can
reduce the number of misaligned filters at initialization.
Spanning the Accuracy-Size Trade-Off with Flexible Model Merging
Akash Dhasade, Divyansh Jhunjhunwala, Gauri Joshi, Anne-Marie Kermarrec, Milos Vujasinovic
Under submission
Propose FlexMerge, a model-merging approach that offers the flexibility to fuse fine-tuned foundation models into one or more models, balancing task accuracy, model size, and inference latency.
Erasure Coded Neural Network Inference via Fisher Averaging
Divyansh Jhunjhunwala*, Neharika Jali*, Shiqiang Wang, Gauri Joshi
IEEE International Symposium on Information Theory (ISIT) 2024
Develop COIN, a model-fusion framework to approximate the sum of outputs of multiple neural
networks with a single neural network for handling demand uncertainty in multi-model inference.
FedFisher: Leveraging Fisher Information for One-Shot Federated Learning
Divyansh Jhunjhunwala, Shiqiang Wang, Gauri Joshi
International Conference on Artificial Intelligence and Statistics (AISTATS) 2024
Propose FedFisher, an algorithm that learns the global model in federated learning using just one round of communication, with novel theoretical guarantees for two-layer overparameterized ReLU networks.
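To give a flavor of the approach, below is a minimal sketch of diagonal Fisher-weighted averaging of client models; this is an illustrative simplification and not the full FedFisher algorithm, and the function and variable names are my own.

```python
import numpy as np

def fisher_weighted_average(params, fishers, eps=1e-8):
    """Combine client models in one shot by weighting each parameter by its
    diagonal Fisher information, so that parameters which matter more for a
    client's local data dominate the merged model (illustrative sketch)."""
    params = np.stack(params)    # (num_clients, dim) flattened parameters
    fishers = np.stack(fishers)  # (num_clients, dim) diagonal Fisher estimates
    return (fishers * params).sum(axis=0) / (fishers.sum(axis=0) + eps)

# Toy example: two clients whose models are confident about different coordinates.
w1, f1 = np.array([1.0, 0.0]), np.array([10.0, 0.1])
w2, f2 = np.array([0.0, 1.0]), np.array([0.1, 10.0])
print(fisher_weighted_average([w1, w2], [f1, f2]))  # close to [1, 1]
```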
FedExP: Speeding up Federated Averaging via Extrapolation
Divyansh Jhunjhunwala, Shiqiang Wang, Gauri Joshi
International Conference on Learning Representations (ICLR), 2023 (Spotlight, top 25% of accepted papers)
Develop FedExP, a method that adaptively determines the server step size in federated learning based on the dynamically varying pseudo-gradients throughout training.
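As a rough illustration of the extrapolation idea, here is a minimal sketch of how a server might set its step size from the client pseudo-gradients; the constants follow my reading of the FedExP rule and should be checked against the paper.

```python
import numpy as np

def extrapolated_step_size(client_updates, eps=1e-3):
    """Sketch of an extrapolated server step size: when averaging shrinks the
    update (heterogeneous client directions), the server takes a larger step;
    when client updates already agree, it falls back to a step size of 1."""
    num_clients = len(client_updates)
    avg_update = np.mean(client_updates, axis=0)
    sum_sq_norms = sum(np.dot(d, d) for d in client_updates)
    return max(1.0, sum_sq_norms / (2 * num_clients * (np.dot(avg_update, avg_update) + eps)))

# Three clients pulling in orthogonal directions: averaging shrinks the update,
# so the extrapolated step size exceeds 1.
deltas = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0]), np.array([0.0, 0.0, 1.0])]
eta = extrapolated_step_size(deltas)
print(eta, eta * np.mean(deltas, axis=0))  # server applies eta times the averaged update
```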
Maximizing Global Model Appeal in Federated Learning
Yae Jee Cho, Divyansh Jhunjhunwala, Tian Li, Virginia Smith, Gauri Joshi
Transactions on Machine Learning Research (TMLR), 2024
Propose MaxFL, an algorithm that explicitly maximizes the fraction of clients that are incentivized to use the global model in federated learning.
FedVARP: Tackling the Variance Due to Partial Client Participation in Federated Learning
Divyansh Jhunjhunwala, Pranay Sharma, Aushim Nagarkatti, Gauri Joshi
Uncertainty in Artificial Intelligence (UAI), 2022
Propose FedVARP, an algorithm that tackles the variance caused by only a subset of clients participating in each round of federated training.
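The sketch below illustrates the kind of server-side correction involved: the server remembers the most recent update from every client and uses those stale updates to debias the average over the clients that happen to participate. This is my own minimal reading of the idea, not the paper's implementation.

```python
import numpy as np

class VarianceReducedAggregator:
    """Keeps one stale update per client and corrects the average over the
    currently participating clients with this memory, so the aggregated
    direction depends less on which clients showed up this round (sketch)."""

    def __init__(self, num_clients, dim):
        self.memory = np.zeros((num_clients, dim))  # last update seen per client

    def aggregate(self, client_ids, updates):
        correction = np.mean([u - self.memory[i] for i, u in zip(client_ids, updates)], axis=0)
        aggregated = self.memory.mean(axis=0) + correction
        for i, u in zip(client_ids, updates):
            self.memory[i] = u  # refresh memory for participating clients
        return aggregated

agg = VarianceReducedAggregator(num_clients=4, dim=2)
print(agg.aggregate([0, 2], [np.array([1.0, 0.0]), np.array([0.0, 1.0])]))
```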
Leveraging Spatial and Temporal Correlations in Sparsified Mean Estimation
Divyansh Jhunjhunwala, Ankur Mallick, Advait Gadhikar, Swanand Kadhe, Gauri Joshi
Advances in Neural Information Processing Systems (NeurIPS), 2021
Introduce notions of spatial and temporal correlation and show how they can be used to efficiently estimate the mean of a set of vectors in a communication-limited setting.
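As one toy illustration of using temporal correlation, a client could send only a few random coordinates each round and let the server fill in the rest from the previous round's estimate; this is a deliberately simplified sketch, not the estimator analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def rand_k_with_memory(x, prev_estimate, k):
    """Communicate only k random coordinates of x; the receiver reuses last
    round's estimate for the remaining coordinates, exploiting the temporal
    correlation between consecutive vectors (illustrative sketch only)."""
    idx = rng.choice(x.size, size=k, replace=False)
    estimate = prev_estimate.copy()
    estimate[idx] = x[idx]  # received coordinates overwrite the memory
    return estimate

x_prev = np.array([1.0, 2.0, 3.0, 4.0])
x_now = x_prev + 0.1  # vectors change slowly across rounds
print(rand_k_with_memory(x_now, x_prev, k=2))
```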
Adaptive Quantization of Model Updates for Communication-Efficient Federated Learning
Divyansh Jhunjhunwala, Advait Gadhikar, Gauri Joshi, Yonina C. Eldar
International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021
Propose an adaptive quantization strategy that achieves communication efficiency as well as a low error floor by changing the number of quantization levels over the course of federated training.
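For context, the sketch below shows standard unbiased stochastic quantization with an adjustable number of levels, plus a purely illustrative schedule that refines the quantizer across rounds; the paper's actual rule for choosing the number of levels differs.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_quantize(x, num_levels):
    """Unbiased stochastic uniform quantization of x onto num_levels levels per
    coordinate (scaled by ||x||); more levels mean lower error but more bits."""
    norm = np.linalg.norm(x)
    if norm == 0.0:
        return x
    scaled = np.abs(x) / norm * (num_levels - 1)
    lower = np.floor(scaled)
    levels = lower + (rng.random(x.shape) < scaled - lower)  # round up with prob = fractional part
    return np.sign(x) * levels * norm / (num_levels - 1)

# Illustrative schedule only: coarser quantization early in training, finer later.
for round_idx, levels in enumerate([2, 4, 8, 16]):
    update = rng.standard_normal(5)  # stand-in for a client's model update
    print(round_idx, stochastic_quantize(update, levels))
```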