Amanda Askell
According to our database, Amanda Askell authored at least 26 papers between 2019 and 2025.
Timeline
[Chart: number of publications per year, 2019 to 2025, broken down by publication type]
Legend: Book, In proceedings, Article, PhD thesis, Dataset, Other
Bibliography
2025
Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming.
CoRR, January, 2025
2024
CoRR, 2024
Proceedings of the Twelfth International Conference on Learning Representations, 2024
2023
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models.
Trans. Mach. Learn. Res., 2023
Towards Measuring the Representation of Subjective Global Opinions in Language Models.
CoRR, 2023
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
2022
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned.
CoRR, 2022
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback.
CoRR, 2022
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Proceedings of the FAccT '22: 2022 ACM Conference on Fairness, Accountability, and Transparency, Seoul, Republic of Korea, June 21, 2022
2021
Proceedings of the 38th International Conference on Machine Learning, 2021
2020
CoRR, 2020
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020
2019