How Transformers Solve Propositional Logic Problems: A Mechanistic Analysis.
CoRR, 2024
Causal Language Modeling Can Elicit Search and Reasoning Capabilities on Logic Puzzles.
CoRR, 2024
LayerNAS: Neural Architecture Search in Polynomial Complexity.
CoRR, 2023
The Power of External Memory in Increasing Predictive Model Capacity.
CoRR, 2023
Alternating Updates for Efficient Transformers.
CoRR, 2023
On the Benefits of Learning to Route in Mixture-of-Experts Models.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
Large Language Models with Controllable Working Memory.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
Sketching based Representations for Robust Image Classification with Provable Guarantees.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
A Theoretical View on Sparsely Activated Networks.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
A Unified Cascaded Encoder ASR Model for Dynamic Model Sizes.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Green Technology Development and Adoption: Competition, Regulation, and Uncertainty - A Global Game Approach.
Manag. Sci., 2021
Sketch based Memory for Neural Networks.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021
Improving Sampling Accuracy of Stochastic Gradient MCMC Methods via Non-uniform Subsampling of Gradients.
CoRR, 2020
Back and forth error compensation and correction method for linear hyperbolic systems with application to the Maxwell's equations.
J. Comput. Phys. X, 2019