Steven Basart
According to our database1,
Steven Basart
authored at least 18 papers
between 2019 and 2024.
Collaborative distances:
Collaborative distances:
Timeline
Legend:
Book In proceedings Article PhD thesis Dataset OtherLinks
On csauthors.net:
Bibliography
2024
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
2023
Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark.
CoRR, 2023
Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the Machiavelli Benchmark.
Proceedings of the International Conference on Machine Learning, 2023
2022
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Proceedings of the International Conference on Machine Learning, 2022
2021
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021
Proceedings of the 9th International Conference on Learning Representations, 2021
Proceedings of the 9th International Conference on Learning Representations, 2021
The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
2019