Satyapriya Krishna

Orcid: 0000-0002-5324-5824

According to our database, Satyapriya Krishna authored at least 27 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
The Disagreement Problem in Explainable Machine Learning: A Practitioner's Perspective.
Trans. Mach. Learn. Res., 2024

Insights from the Inverse: Reconstructing LLM Training Goals Through Inverse RL.
CoRR, 2024

Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented Generation.
CoRR, 2024

Operationalizing a Threat Model for Red-Teaming Large Language Models (LLMs).
CoRR, 2024

More RLHF, More Trust? On The Impact of Human Preference Alignment On Language Model Trustworthiness.
CoRR, 2024

Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence.
CoRR, 2024

Understanding the Effects of Iterative Prompting on Truthfulness.
Proceedings of the Forty-first International Conference on Machine Learning, 2024


2023
Explaining machine learning models with interactive natural language conversations using TalkToModel.
Nat. Mac. Intell., August, 2023

On the Intersection of Self-Correction and Trust in Language Models.
CoRR, 2023

Are Large Language Models Post Hoc Explainers?
CoRR, 2023

On the Trade-offs between Adversarial Robustness and Actionable Explanations.
CoRR, 2023

Post Hoc Explanations of Language Models Can Improve Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Towards Bridging the Gaps between the Right to Explanation and the Right to be Forgotten.
Proceedings of the International Conference on Machine Learning, 2023

2022
TalkToModel: Understanding Machine Learning Models With Open Ended Dialogues.
CoRR, 2022

Rethinking Stability for Attribution-based Explanations.
CoRR, 2022

The Disagreement Problem in Explainable Machine Learning: A Practitioner's Perspective.
CoRR, 2022

OpenXAI: Towards a Transparent Evaluation of Model Explanations.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Measuring Fairness of Text Classifiers via Prediction Sensitivity.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Mitigating Gender Bias in Distilled Language Models via Counterfactual Role Reversal.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

2021
Grounding Complex Navigational Instructions Using Scene Graphs.
CoRR, 2021

BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation.
Proceedings of the FAccT '21: 2021 ACM Conference on Fairness, Accountability, and Transparency, 2021

Towards Realistic Single-Task Continuous Learning Research for NER.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

ADePT: Auto-encoder based Differentially Private Text Transformation.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021

Does Robustness Improve Fairness? Approaching Fairness with Word Substitution Robustness Methods for Text Classification.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

2020
Towards classification parity across cohorts.
CoRR, 2020

2019
FineText: Text Classification via Attention-based Language Model Fine-tuning.
CoRR, 2019
