Samuel Marks
According to our database, Samuel Marks authored at least 3 papers between 2023 and 2024.
Bibliography
2024
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models.
CoRR, 2024
2023
The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets.
CoRR, 2023
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback.
CoRR, 2023