Maksym Andriushchenko

According to our database, Maksym Andriushchenko authored at least 29 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Does Refusal Training in LLMs Generalize to the Past Tense?
CoRR, 2024

Improving Alignment and Robustness with Circuit Breakers.
CoRR, 2024

Is In-Context Learning Sufficient for Instruction Following in LLMs?
CoRR, 2024

Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs.
CoRR, 2024

Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks.
CoRR, 2024

JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models.
CoRR, 2024

Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Layer-wise linear mode connectivity.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
Scaling Compute Is Not All You Need for Adversarial Robustness.
CoRR, 2023

The Effects of Overparameterization on Sharpness-aware Minimization: An Empirical and Theoretical Analysis.
CoRR, 2023

Why Do We Need Weight Decay in Modern Deep Learning?
CoRR, 2023

Transferable Adversarial Robustness for Categorical Data via Universal Robust Embeddings.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Sharpness-Aware Minimization Leads to Low-Rank Features.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

SGD with Large Step Sizes Learns Sparse Features.
Proceedings of the International Conference on Machine Learning, 2023

A Modern Look at the Relationship between Sharpness and Generalization.
Proceedings of the International Conference on Machine Learning, 2023

2022
On the effectiveness of adversarial training against common corruptions.
Proceedings of the Uncertainty in Artificial Intelligence, 2022

Towards Understanding Sharpness-Aware Minimization.
Proceedings of the International Conference on Machine Learning, 2022

ARIA: Adversarially Robust Image Attribution for Content Provenance.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

Sparse-RS: A Versatile Framework for Query-Efficient Sparse Black-Box Adversarial Attacks.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
RobustBench: a standardized adversarial robustness benchmark.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines.
Proceedings of the 9th International Conference on Learning Representations, 2021

2020
RobustBench: a standardized adversarial robustness benchmark.
CoRR, 2020

Understanding and Improving Fast Adversarial Training.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Square Attack: A Query-Efficient Black-Box Adversarial Attack via Random Search.
Proceedings of the Computer Vision - ECCV 2020, 2020

2019
Provably robust boosted decision stumps and trees against adversarial attacks.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Why ReLU Networks Yield High-Confidence Predictions Far Away From the Training Data and How to Mitigate the Problem.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Provable Robustness of ReLU networks via Maximization of Linear Regions.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

2018
Logit Pairing Methods Can Fool Gradient-Based Attacks.
CoRR, 2018

2017
Formal Guarantees on the Robustness of a Classifier against Adversarial Manipulation.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017
