Kazuki Irie

According to our database, Kazuki Irie authored at least 51 papers between 2014 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Neural networks that overcome classic challenges through practice.
CoRR, 2024

MoEUT: Mixture-of-Experts Universal Transformers.
CoRR, 2024

Dissecting the Interplay of Attention Paths in a Statistical Mechanics Theory of Transformers.
CoRR, 2024

Exploring the Promise and Limits of Real-Time Recurrent Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Self-organising Neural Discrete Representation Learning à la Kohonen.
Artificial Neural Networks and Machine Learning - ICANN 2024, 2024

2023
Unsupervised Learning of Temporal Abstractions With Slot-Based Transformers.
Neural Comput., 2023

SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention.
CoRR, 2023

Automating Continual Learning.
CoRR, 2023

Mindstorms in Natural Language-Based Societies of Mind.
CoRR, 2023

Accelerating Neural Self-Improvement via Bootstrapping.
CoRR, 2023

Topological Neural Discrete Representation Learning à la Kohonen.
CoRR, 2023

Contrastive Training of Complex-Valued Autoencoders for Object Discovery.
Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Images as Weight Matrices: Sequential Image Generation Through Synaptic Learning Rules.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Practical Computational Power of Linear Transformers and Their Recurrent and Self-Referential Extensions.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Approximating Two-Layer Feedforward Networks for Efficient Transformers.
Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

2022
Learning to Control Rapidly Changing Synaptic Connections: An Alternative Type of Memory in Sequence Processing Artificial Neural Networks.
CoRR, 2022

Neural Differential Equations for Learning to Program Neural Nets Through Continuous Learning Rules.
Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

A Modern Self-Referential Weight Matrix That Learns to Modify Itself.
Proceedings of the International Conference on Machine Learning, 2022

The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns via Spotlights of Attention.
Proceedings of the International Conference on Machine Learning, 2022

The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization.
Proceedings of the Tenth International Conference on Learning Representations, 2022

CTL++: Evaluating Generalization on Never-Seen Compositional Patterns of Known Functions, and Compatibility of Neural Representations.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

2021
Improving Baselines in the Wild.
CoRR, 2021

Training and Generating Neural Networks in Compressed Weight Space.
CoRR, 2021

Linear Transformers Are Secretly Fast Weight Memory Systems.
CoRR, 2021

Going Beyond Linear Transformers with Recurrent Fast Weight Programmers.
Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Linear Transformers Are Secretly Fast Weight Programmers.
Proceedings of the 38th International Conference on Machine Learning, 2021

The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

2020
Advancing neural language modeling in automatic speech recognition.
PhD thesis, 2020

The RWTH ASR System for TED-LIUM Release 2: Improving Hybrid HMM with SpecAugment.
Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

How Much Self-Attention Do We Need? Trading Attention for Feed-Forward Layers.
Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

Domain Robust, Fast, and Compact Neural Language Models.
Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

2019
RWTH ASR Systems for LibriSpeech: Hybrid vs Attention - w/o Data Augmentation.
CoRR, 2019

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling.
CoRR, 2019

Model Unit Exploration for Sequence-to-Sequence Speech Recognition.
CoRR, 2019

RWTH ASR Systems for LibriSpeech: Hybrid vs Attention.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Language Modeling with Deep Transformers.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

On the Choice of Modeling Unit for Sequence-to-Sequence Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

A Comparison of Transformer and LSTM Encoder Decoder Models for ASR.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Training Language Models for Long-Span Cross-Sentence Evaluation.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018
Improved Training of End-to-end Attention Models for Speech Recognition.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Investigation on Estimation of Sentence Probability by Combining Forward, Backward and Bi-directional LSTM-RNNs.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Prediction of LSTM-RNN Full Context States as a Subtask for N-Gram Feedforward Language Models.
Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

RADMM: Recurrent Adaptive Mixture Model with Applications to Domain Robust Language Modeling.
Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

2017
The 2016 RWTH Keyword Search System for Low-Resource Languages.
Proceedings of the Speech and Computer - 19th International Conference, 2017

Investigations on byte-level convolutional neural networks for language modeling in low resource speech recognition.
Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017

2016
Automatic Speech Recognition Based on Neural Networks.
Proceedings of the Speech and Computer - 18th International Conference, 2016

LSTM, GRU, Highway and a Bit of Attention: An Empirical Overview for Language Modeling in Speech Recognition.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Investigation on log-linear interpolation of multi-domain neural network language model.
Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, 2016

2015
Bag-of-words input for long history representation in neural network-based language models for speech recognition.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

On efficient training of word classes and their application to recurrent neural network language models.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

2014
The RWTH English lecture recognition system.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2014

