Teun van der Weij

According to our database, Teun van der Weij authored at least 3 papers between 2023 and 2024.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of five.

Bibliography

2024
AI Sandbagging: Language Models can Strategically Underperform on Evaluations.
CoRR, 2024

Extending Activation Steering to Broad Skills and Multiple Behaviours.
CoRR, 2024

2023
Evaluating Shutdown Avoidance of Language Models in Textual Scenarios.
CoRR, 2023

