Increase trust in AI systems
Just like medical treatments or economic policies, AI systems can be considered interventions in our society. I build methods that help experts interpret an AI system’s output and estimate its (differential) impact on different subpopulations. One of these methods, Diverse Counterfactual Explanations (DiCE), has been integrated into Microsoft’s Responsible AI platform.
I also work on improving the generalizability of ML models so that they are less sensitive to their training data distribution. This line of work has led to a NASSCOM AI GameChangers Award (2023-24).
Counterfactual explanations
- A method that generates diverse counterfactual explanations for ML classifiers (Mothilal et al., 2020)
- Using LLM-generated counterfactuals to explain an AI model’s predictions (Gat et al., 2024)
- Unifying feature attribution and counterfactual explanation methods: When to use which (Kommiya Mothilal et al., 2021)
- Evaluating and mitigating bias in image classifiers (Dash et al., 2020)
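The core idea behind counterfactual explanations can be sketched in a few lines: find small changes to an input that flip a classifier's prediction, one change per feature so the set is diverse. This is a minimal toy, not the DiCE implementation; the linear classifier, feature values, and greedy single-feature search are all assumptions made for this example.

```python
# Minimal sketch of diverse counterfactual search (toy, not DiCE).
# The classifier and the greedy search strategy are illustrative assumptions.

def predict(x, weights=(1.0, 1.0, 1.0), bias=-1.55):
    """Toy linear classifier: returns 1 if the weighted sum exceeds 0."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else 0

def counterfactuals(x, step=0.1, max_steps=100):
    """For each feature, greedily perturb that feature alone until the
    prediction flips, yielding a diverse set of counterfactuals that
    each differ from x in a single feature."""
    original = predict(x)
    found = []
    for i in range(len(x)):
        cf = list(x)
        for _ in range(max_steps):
            cf[i] += step if original == 0 else -step
            if predict(cf) != original:
                found.append(tuple(round(v, 2) for v in cf))
                break
    return found

x = (0.2, 0.3, 0.4)          # classified as 0
cfs = counterfactuals(x)     # three counterfactuals, each classified as 1
```

Each returned tuple changes exactly one feature just enough to cross the decision boundary; a user can then pick the change that is most actionable for them.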
Building prediction models that generalize better
- The challenges of prediction and explanation in social systems (Hofman et al., 2017)
- Using causal reasoning to build generalizable ML classifiers (missing reference)
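The generalization concern above can be made concrete with a small synthetic sketch (all data and the "causal"/"spurious" framing are assumptions for illustration): a classifier that leans on a feature merely correlated with the label looks strong on its training distribution but collapses when that correlation shifts, while one using the stable, label-determining feature does not.

```python
# Toy illustration of sensitivity to the training data distribution.
# All data is synthetic; the feature roles are assumptions for this sketch.

def make_data(n, spurious_agreement):
    """Each example is (causal_feature, spurious_feature, label).
    The causal feature always equals the label; the spurious feature
    agrees with the label only for the first `spurious_agreement`
    fraction of examples."""
    data = []
    for i in range(n):
        label = i % 2
        spurious = label if i < spurious_agreement * n else 1 - label
        data.append((label, spurious, label))
    return data

def accuracy(data, feature_index):
    """Accuracy of a trivial classifier that just copies one feature."""
    correct = sum(1 for ex in data if ex[feature_index] == ex[2])
    return correct / len(data)

train = make_data(100, spurious_agreement=0.9)  # spurious feature right 90% of the time
test = make_data(100, spurious_agreement=0.2)   # shifted: correlation mostly reversed

print(accuracy(train, 1), accuracy(test, 1))    # spurious: 0.9 -> 0.2
print(accuracy(train, 0), accuracy(test, 0))    # causal:   1.0 -> 1.0
```

The causal-feature classifier is invariant to the shift by construction, which is the intuition behind using causal reasoning to pick features that generalize.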
Role of causality in trustworthy ML
- The necessary role of causality in understanding when ML explanations can improve human understanding (Chen et al., 2023)
- A framework for deploying trustworthy ML systems based on technology readiness levels (Lavin et al., 2022)
- ML fairness estimates can be misleading without modeling data missingness (Goel et al., 2021)
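The missingness point can be illustrated with a minimal synthetic sketch (not the analysis from the paper; the groups, outcomes, and missingness pattern are invented for this example): when positive outcomes are disproportionately unrecorded for one group, a fairness metric computed on observed data alone reports a disparity that does not exist in the true outcomes.

```python
# Toy example of how unmodeled missingness distorts a fairness estimate.
# The groups and the missingness pattern are synthetic assumptions.

records = []  # (group, outcome, observed)
for i in range(100):                       # group "a": half positive, fully recorded
    records.append(("a", i % 2, True))
for i in range(100):                       # group "b": half positive, but most
    outcome = i % 2                        # positive outcomes go unrecorded
    observed = not (outcome == 1 and i < 60)
    records.append(("b", outcome, observed))

def positive_rate(group, observed_only):
    """Fraction of positive outcomes in a group, optionally restricted
    to records that were actually observed."""
    rows = [r for r in records
            if r[0] == group and (r[2] or not observed_only)]
    return sum(r[1] for r in rows) / len(rows)

# True positive rates are equal (0.5 in both groups), but a naive estimate
# on observed records alone makes group "b" look far less positive.
print(positive_rate("a", observed_only=False), positive_rate("b", observed_only=False))
print(positive_rate("a", observed_only=True), positive_rate("b", observed_only=True))
```

The naive observed-data estimate would flag a large disparity between the groups even though the underlying outcome distributions are identical, which is why the missingness mechanism has to be modeled before trusting a fairness metric.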
References
- FAccT 2020: Explaining machine learning classifiers through diverse counterfactual explanations. In Proceedings of the 2020 ACM Conference on Fairness, Accountability, and Transparency (FAccT), Mar 2020.
- ICLR 2024: Faithful Explanations of Black-box NLP Models Using LLM-generated Counterfactuals. In The Twelfth International Conference on Learning Representations (ICLR), Mar 2024.
- AIES 2021: Towards Unifying Feature Attribution and Counterfactual Explanations: Different Means to the Same End. In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society (AIES), Jul 2021.
- WACV 2020: Evaluating and mitigating bias in image classifiers: A causal perspective using counterfactuals. In Proceedings of the IEEE Workshop on Applications of Computer Vision (WACV), Sep 2020.
- Science: Prediction and explanation in social systems. In Science, Sep 2017.
- TMLR 2023: Machine Explanations and Human Understanding. Transactions on Machine Learning Research (TMLR), Sep 2023.
- Nature Commun.: Technology readiness levels for machine learning systems. Nature Communications, Oct 2022.
- AAAI 2021: The importance of modeling data missingness in algorithmic fairness: A causal perspective. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), Oct 2021.