Razvan Pascanu

ORCID: 0000-0002-5470-1238

According to our database, Razvan Pascanu authored at least 150 papers between 2010 and 2024.

Bibliography

2024
Continual Learning: Applications and the Road Forward.
Trans. Mach. Learn. Res., 2024

Promoting Exploration in Memory-Augmented Adam using Critical Momenta.
Trans. Mach. Learn. Res., 2024

A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks.
CoRR, 2024

Retrieval-Augmented Decision Transformer: External Memory for In-context RL.
CoRR, 2024

Round and Round We Go! What makes Rotary Positional Encodings useful?
CoRR, 2024

softmax is not enough (for sharp out-of-distribution).
CoRR, 2024

When can transformers compositionally generalize in-context?
CoRR, 2024

Investigating Low-Rank Training in Transformer Language Models: Efficiency and Scaling Analysis.
CoRR, 2024

Normalization and effective learning rates in reinforcement learning.
CoRR, 2024

Building on Efficient Foundations: Effectively Training LLMs with Structured Feedforward Layers.
CoRR, 2024

Transformers meet Neural Algorithmic Reasoners.
CoRR, 2024

State Soup: In-Context Skill Learning, Retrieval and Mixing.
CoRR, 2024

Attention as a Hypernetwork.
CoRR, 2024

Transformers need glasses! Information over-squashing in language tasks.
CoRR, 2024

Deep Grokking: Would Deep Neural Networks Generalize Better?
CoRR, 2024

No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO.
CoRR, 2024

RecurrentGemma: Moving Past Transformers for Efficient Open Language Models.
CoRR, 2024

Maxwell's Demon at Work: Efficient Pruning by Leveraging Saturation of Neurons.
CoRR, 2024

Revisiting Dynamic Evaluation: Online Adaptation for Large Language Models.
CoRR, 2024

Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models.
CoRR, 2024

Disentangling the Causes of Plasticity Loss in Neural Networks.
CoRR, 2024

Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Universality of Linear Recurrences Followed by Non-linear Projections: Finite-Width Guarantees and Benefits of Complex Eigenvalues.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Improving fine-grained understanding in image-text pre-training.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Kalman Filter for Online Classification of Non-Stationary Data.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Discovering modular solutions that generalize compositionally.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
Nevis'22: A Stream of 100 Tasks Sampled from 30 Years of Computer Vision Research.
J. Mach. Learn. Res., 2023

Uncovering mesa-optimization algorithms in Transformers.
CoRR, 2023

On the Universality of Linear Recurrences Followed by Nonlinear Projections.
CoRR, 2023

Towards Robust and Efficient Continual Language Learning.
CoRR, 2023

Towards Compute-Optimal Transfer Learning.
CoRR, 2023

Learning to Modulate pre-trained Models in RL.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Deep Reinforcement Learning with Plasticity Injection.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

The Tunnel Effect: Building Data Representations in Deep Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Latent Space Representations of Neural Algorithmic Reasoners.
Proceedings of the Learning on Graphs Conference, 27-30 November 2023, Virtual Event., 2023

Asynchronous Algorithmic Alignment With Cocycles.
Proceedings of the Learning on Graphs Conference, 27-30 November 2023, Virtual Event., 2023

Resurrecting Recurrent Neural Networks for Long Sequences.
Proceedings of the International Conference on Machine Learning, 2023

Understanding Plasticity in Neural Networks.
Proceedings of the International Conference on Machine Learning, 2023

Pre-training via Denoising for Molecular Property Prediction.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

SemPPL: Predicting Pseudo-Labels for Better Contrastive Representations.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Continually learning representations at scale.
Proceedings of the Conference on Lifelong Learning Agents, 2023

2022
An empirical study of implicit regularization in deep offline RL.
Trans. Mach. Learn. Res., 2022

Behavior Priors for Efficient Reinforcement Learning.
J. Mach. Learn. Res., 2022

NEVIS'22: A Stream of 100 Tasks Sampled from 30 Years of Computer Vision Research.
CoRR, 2022

Architecture Matters in Continual Learning.
CoRR, 2022

Pushing the limits of self-supervised ResNets: Can we outperform supervised learning without labels on ImageNet?
CoRR, 2022

Disentangling Transfer in Continual Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Reasoning-Modulated Representations.
Proceedings of the Learning on Graphs Conference, 2022


Correlation Based Semantic Transfer with Application to Domain Adaptation.
Proceedings of the Neural Information Processing - 29th International Conference, 2022

The CLRS Algorithmic Reasoning Benchmark.
Proceedings of the International Conference on Machine Learning, 2022

Wide Neural Networks Forget Less Catastrophically.
Proceedings of the International Conference on Machine Learning, 2022

When Does Re-initialization Work?
Proceedings on "I Can't Believe It's Not Better!", 2022

Probing Transfer in Deep Reinforcement Learning without Task Engineering.
Proceedings of the Conference on Lifelong Learning Agents, 2022

Test Sample Accuracy Scales with Training Sample Density in Neural Networks.
Proceedings of the Conference on Lifelong Learning Agents, 2022

2021
Wide Neural Networks Forget Less Catastrophically.
CoRR, 2021

Task-agnostic Continual Learning with Hybrid Probabilistic Models.
CoRR, 2021

Predicting Unreliable Predictions by Shattering a Neural Network.
CoRR, 2021

A study on the plasticity of neural networks.
CoRR, 2021

Drawing Multiple Augmentation Samples Per Image During Training Efficiently Decreases Test Error.
CoRR, 2021

Regularized Behavior Value Estimation.
CoRR, 2021

Continual World: A Robotic Benchmark For Continual Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Powerpropagation: A sparsity inducing weight reparameterisation.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

On the Role of Optimization in Double Descent: A Least Squares Study.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021


Spectral Normalisation for Deep Reinforcement Learning: An Optimisation Perspective.
Proceedings of the 38th International Conference on Machine Learning, 2021

Linear Mode Connectivity in Multitask and Continual Learning.
Proceedings of the 9th International Conference on Learning Representations, 2021

2020
BYOL works even without batch statistics.
CoRR, 2020

Temporal Difference Uncertainties as a Signal for Exploration.
CoRR, 2020

Pointer Graph Networks.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Understanding the Role of Training Regimes in Continual Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Top-KAST: Top-K Always Sparse Training.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Stabilizing Transformers for Reinforcement Learning.
Proceedings of the 37th International Conference on Machine Learning, 2020

Improving the Gating Mechanism of Recurrent Neural Networks.
Proceedings of the 37th International Conference on Machine Learning, 2020

Functional Regularisation for Continual Learning with Gaussian Processes.
Proceedings of the 8th International Conference on Learning Representations, 2020

Multiplicative Interactions and Where to Find Them.
Proceedings of the 8th International Conference on Learning Representations, 2020

Meta-Learning with Warped Gradient Descent.
Proceedings of the 8th International Conference on Learning Representations, 2020

2019
A Deep Neural Network's Loss Surface Contains Every Low-dimensional Pattern.
CoRR, 2019

Improving the Gating Mechanism of Recurrent Neural Networks.
CoRR, 2019

Meta-Learning with Warped Gradient Descent.
CoRR, 2019

Task Agnostic Continual Learning via Meta Learning.
CoRR, 2019

Meta-learning of Sequential Strategies.
CoRR, 2019

Ray Interference: a Source of Plateaus in Deep Reinforcement Learning.
CoRR, 2019

Exploiting Hierarchy for Learning and Transfer in KL-regularized RL.
CoRR, 2019

Functional Regularisation for Continual Learning using Gaussian Processes.
CoRR, 2019

Continual Unsupervised Representation Learning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Deep reinforcement learning with relational inductive biases.
Proceedings of the 7th International Conference on Learning Representations, 2019

Meta-Learning with Latent Embedding Optimization.
Proceedings of the 7th International Conference on Learning Representations, 2019

Hyperbolic Attention Networks.
Proceedings of the 7th International Conference on Learning Representations, 2019

Information asymmetry in KL-regularized RL.
Proceedings of the 7th International Conference on Learning Representations, 2019

A RAD approach to deep mixture models.
Proceedings of the Deep Generative Models for Highly Structured Data, 2019

Distilling Policy Distillation.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

2018
Vector-based navigation using grid-like representations in artificial agents.
Nat., 2018

Adapting Auxiliary Losses Using Gradient Similarity.
CoRR, 2018

Relational Deep Reinforcement Learning.
CoRR, 2018

Relational inductive biases, deep learning, and graph networks.
CoRR, 2018

Low-pass Recurrent Neural Networks - A memory architecture for longer-term correlation discovery.
CoRR, 2018

Block Mean Approximation for Efficient Second Order Optimization.
CoRR, 2018

Learning Deep Generative Models of Graphs.
CoRR, 2018

Relational recurrent neural networks.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Progress & Compress: A scalable framework for continual learning.
Proceedings of the 35th International Conference on Machine Learning, 2018

Been There, Done That: Meta-Learning with Episodic Recall.
Proceedings of the 35th International Conference on Machine Learning, 2018

Mix & Match Agent Curricula for Reinforcement Learning.
Proceedings of the 35th International Conference on Machine Learning, 2018

Memory-based Parameter Adaptation.
Proceedings of the 6th International Conference on Learning Representations, 2018

Model compression via distillation and quantization.
Proceedings of the 6th International Conference on Learning Representations, 2018

2017
Imagination-Augmented Agents for Deep Reinforcement Learning.
CoRR, 2017

Visual Interaction Networks.
CoRR, 2017

Learning model-based planning from scratch.
CoRR, 2017

Visual Interaction Networks: Learning a Physics Simulator from Video.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Distral: Robust multitask reinforcement learning.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

A simple neural network module for relational reasoning.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Imagination-Augmented Agents for Deep Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Sobolev Training for Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Sharp Minima Can Generalize For Deep Nets.
Proceedings of the 34th International Conference on Machine Learning, 2017

Discovering objects and their relations from entangled scene representations.
Proceedings of the 5th International Conference on Learning Representations, 2017

Learning to Navigate in Complex Environments.
Proceedings of the 5th International Conference on Learning Representations, 2017

Metacontrol for Adaptive Imagination-Based Optimization.
Proceedings of the 5th International Conference on Learning Representations, 2017

Sim-to-Real Robot Learning from Pixels with Progressive Nets.
Proceedings of the 1st Annual Conference on Robot Learning, CoRL 2017, Mountain View, 2017

2016
Local minima in training of deep networks.
CoRR, 2016

Progressive Neural Networks.
CoRR, 2016

Policy Distillation.
Proceedings of the 4th International Conference on Learning Representations, 2016

Overcoming catastrophic forgetting in neural networks.
CoRR, 2016

Theano: A Python framework for fast computation of mathematical expressions.
CoRR, 2016

Interaction Networks for Learning about Objects, Relations and Physics.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

2015
Natural Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Malware classification with recurrent networks.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014
Revisiting Natural Gradient for Deep Networks.
Proceedings of the 2nd International Conference on Learning Representations, 2014

On the number of inference regions of deep feed forward networks with piece-wise linear activations.
Proceedings of the 2nd International Conference on Learning Representations, 2014

How to Construct Deep Recurrent Neural Networks.
Proceedings of the 2nd International Conference on Learning Representations, 2014

On the saddle point problem for non-convex optimization.
CoRR, 2014

Learned-Norm Pooling for Deep Feedforward and Recurrent Neural Networks.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2014

On the Number of Linear Regions of Deep Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Identifying and attacking the saddle point problem in high-dimensional non-convex optimization.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

2013
Natural Gradient Revisited.
Proceedings of the 1st International Conference on Learning Representations, 2013

Metric-Free Natural Gradient for Joint-Training of Boltzmann Machines.
Proceedings of the 1st International Conference on Learning Representations, 2013

Learned-norm pooling for deep neural networks.
CoRR, 2013

Pylearn2: a machine learning research library.
CoRR, 2013

On the difficulty of training recurrent neural networks.
Proceedings of the 30th International Conference on Machine Learning, 2013


Advances in optimizing recurrent networks.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
Learning Algorithms for the Classification Restricted Boltzmann Machine.
J. Mach. Learn. Res., 2012

Theano: new features and speed improvements.
CoRR, 2012

Understanding the exploding gradient problem.
CoRR, 2012

2011
Contextual tag inference.
ACM Trans. Multim. Comput. Commun. Appl., 2011

A neurodynamical model for working memory.
Neural Networks, 2011

Deep Learners Benefit More from Out-of-Distribution Examples.
Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011

Autotagging music with conditional restricted Boltzmann machines.
CoRR, 2011

2010
Deep Self-Taught Learning for Handwritten Character Recognition.
CoRR, 2010

Theano: A CPU and GPU Math Compiler in Python.
Proceedings of the 9th Python in Science Conference 2010 (SciPy 2010), Austin, Texas, June 28, 2010

Extraction of quadrics from noisy point-clouds using a sensor noise model.
Proceedings of the IEEE International Conference on Robotics and Automation, 2010

