2025
Theoretical Benefit and Limitation of Diffusion Language Model.
CoRR, February, 2025
Enhancing Auto-regressive Chain-of-Thought through Loop-Aligned Reasoning.
CoRR, February, 2025
A Foundational Generative Model for Breast Ultrasound Image Analysis.
CoRR, January, 2025
Boosting meta-training with base class information for robust few-shot learning.
Eng. Appl. Artif. Intell., 2025
2024
3D Molecular Generation via Virtual Dynamics.
Trans. Mach. Learn. Res., 2024
How Numerical Precision Affects Mathematical Reasoning Capabilities of LLMs.
CoRR, 2024
Knowledge-driven AI-generated data for accurate and interpretable breast ultrasound diagnoses.
CoRR, 2024
Online Pseudo-Zeroth-Order Training of Neuromorphic Spiking Neural Networks.
CoRR, 2024
Let the Code LLM Edit Itself When You Edit the Code.
CoRR, 2024
DPO Meets PPO: Reinforced Token Optimization for RLHF.
CoRR, 2024
Boosting Meta-Training with Base Class Information for Few-Shot Learning.
CoRR, 2024
DOF: Accelerating High-order Differential Operators with Forward Propagation.
CoRR, 2024
End-to-End Crystal Structure Prediction from Powder X-Ray Diffraction.
CoRR, 2024
LarvSeg: Exploring Image Classification Data for Large Vocabulary Semantic Segmentation via Category-Wise Attentive Classifier.
Proceedings of the Pattern Recognition and Computer Vision - 7th Chinese Conference, 2024
Bridging Geometric States via Geometric Diffusion Bridge.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
REST: Retrieval-Based Speculative Decoding.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024
Do Efficient Transformers Really Save Computation?
Proceedings of the Forty-first International Conference on Machine Learning, 2024
GeoMFormer: A General Architecture for Geometric Molecular Representation Learning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Temporal Spiking Neural Networks with Synaptic Delay for Graph Reasoning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Beyond Weisfeiler-Lehman: A Quantitative Framework for GNN Expressiveness.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Hebbian Learning based Orthogonal Projection for Continual Learning of Spiking Neural Networks.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Learning a Fourier Transform for Linear Relative Positional Encodings in Transformers.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024
2023
CORE: Common Random Reconstruction for Distributed Optimization with Provable Low Communication Complexity.
CoRR, 2023
Forward Laplacian: A New Computational Framework for Neural Network-based Variational Monte Carlo.
CoRR, 2023
Highly Accurate Quantum Chemical Property Prediction with Uni-Mol+.
CoRR, 2023
Towards Revealing the Mystery behind Chain of Thought: A Theoretical Perspective.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
A Complete Expressiveness Hierarchy for Subgraph GNNs via Subgraph Weisfeiler-Lehman Tests.
Proceedings of the International Conference on Machine Learning, 2023
Rethinking the Expressive Power of GNNs via Graph Biconnectivity.
Proceedings of the Eleventh International Conference on Learning Representations, 2023
Denoising Masked Autoencoders Help Robust Classification.
Proceedings of the Eleventh International Conference on Learning Representations, 2023
One Transformer Can Understand Both 2D & 3D Molecular Data.
Proceedings of the Eleventh International Conference on Learning Representations, 2023
DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Adversarial Noises Are Linearly Separable for (Nearly) Random Neural Networks.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023
Learning Physics-Informed Neural Networks without Stacked Back-propagation.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023
Robustness-Aware Word Embedding Improves Certified Robustness to Adversarial Word Substitutions.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
2022
Denoising Masked AutoEncoders are Certifiable Robust Vision Learners.
CoRR, 2022
Is $L^2$ Physics-Informed Loss Always Suitable for Training Physics-Informed Neural Network?
CoRR, 2022
METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals.
CoRR, 2022
Benchmarking Graphormer on Large-Scale Molecular Modeling Datasets.
CoRR, 2022
Rethinking Lipschitz Neural Networks and Certified Robustness: A Boolean Function Perspective.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Online Training Through Time for Spiking Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Is $L^2$ Physics Informed Loss Always Suitable for Training Physics Informed Neural Network?
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Your Transformer May Not be as Powerful as You Expect.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
HousE: Knowledge Graph Embedding with Householder Parameterization.
Proceedings of the International Conference on Machine Learning, 2022
Boosting the Certified Robustness of L-infinity Distance Nets.
Proceedings of the Tenth International Conference on Learning Representations, 2022
Two Coupled Rejection Metrics Can Tell Adversarial Examples Apart.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Finding the Dominant Winning Ticket in Pre-Trained Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022
2021
Can Vision Transformers Perform Convolution?
CoRR, 2021
First Place Solution of KDD Cup 2021 & OGB Large-Scale Challenge Graph Prediction Track.
CoRR, 2021
Do Transformers Really Perform Bad for Graph Representation?
CoRR, 2021
Adversarial Training with Rectified Rejection.
CoRR, 2021
Transformers with Competitive Ensembles of Independent Mechanisms.
CoRR, 2021
LazyFormer: Self Attention with Lazy Update.
CoRR, 2021
Less is More: Pre-training a Strong Siamese Encoder Using a Weak Decoder.
CoRR, 2021
Revisiting Language Encoding in Learning Multilingual Representations.
CoRR, 2021
Towards Certifying $\ell_\infty$ Robustness using Neural Networks with $\ell_\infty$-dist Neurons.
CoRR, 2021
Do Transformers Really Perform Badly for Graph Representation?
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
The Open Catalyst Challenge 2021: Competition Report.
Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track, 2021
Towards Certifying L-infinity Robustness using Neural Networks with L-inf-dist Neurons.
Proceedings of the 38th International Conference on Machine Learning, 2021
How could Neural Networks understand Programs?
Proceedings of the 38th International Conference on Machine Learning, 2021
GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training.
Proceedings of the 38th International Conference on Machine Learning, 2021
Rethinking Positional Encoding in Language Pre-training.
Proceedings of the 9th International Conference on Learning Representations, 2021
Taking Notes on the Fly Helps Language Pre-Training.
Proceedings of the 9th International Conference on Learning Representations, 2021
Less is More: Pretrain a Strong Siamese Encoder for Dense Text Retrieval Using a Weak Decoder.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021
2020
Taking Notes on the Fly Helps BERT Pre-training.
CoRR, 2020
Transferred Discrepancy: Quantifying the Difference Between Representations.
CoRR, 2020
MC-BERT: Efficient Language Pre-Training via a Meta Controller.
CoRR, 2020
I4R: Promoting Deep Reinforcement Learning by the Indicator for Expressive Representations.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020
On Layer Normalization in the Transformer Architecture.
Proceedings of the 37th International Conference on Machine Learning, 2020
Incorporating BERT into Neural Machine Translation.
Proceedings of the 8th International Conference on Learning Representations, 2020
MACER: Attack-free and Scalable Robust Training via Maximizing Certified Radius.
Proceedings of the 8th International Conference on Learning Representations, 2020
Invertible Image Rescaling.
Proceedings of the Computer Vision - ECCV 2020, 2020
2019
Defective Convolutional Layers Learn Robust CNNs.
CoRR, 2019
On the Anomalous Generalization of GANs.
CoRR, 2019
Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View.
CoRR, 2019
Adversarially Robust Generalization Just Requires More Unlabeled Data.
CoRR, 2019
A Gram-Gauss-Newton Method Learning Overparameterized Deep Neural Networks for Regression Problems.
CoRR, 2019
Microsoft Research Asia's Systems for WMT19.
Proceedings of the Fourth Conference on Machine Translation, 2019
Fast Structured Decoding for Sequence Models.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019
Deliberation Learning for Image-to-Image Translation.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019
Towards a Deep and Unified Understanding of Deep Neural Models in NLP.
Proceedings of the 36th International Conference on Machine Learning, 2019
Efficient Training of BERT by Progressively Stacking.
Proceedings of the 36th International Conference on Machine Learning, 2019
Multilingual Neural Machine Translation with Knowledge Distillation.
Proceedings of the 7th International Conference on Learning Representations, 2019
Representation Degeneration Problem in Training Natural Language Generation Models.
Proceedings of the 7th International Conference on Learning Representations, 2019
Machine Translation With Weakly Paired Documents.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019
Multilingual Neural Machine Translation with Language Clustering.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019
Hint-Based Training for Non-Autoregressive Machine Translation.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019
Tied Transformers: Neural Machine Translation with Shared Encoder and Decoder.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019
Non-Autoregressive Machine Translation with Auxiliary Regularization.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019
Non-Autoregressive Neural Machine Translation with Enhanced Decoder Input.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019
Sentence-Wise Smooth Regularization for Sequence to Sequence Learning.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019
2018
Layer-Wise Coordination between Encoder and Decoder for Neural Machine Translation.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018
FRAGE: Frequency-Agnostic Word Representation.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018
Dense Information Flow for Neural Machine Translation.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018
Towards Binary-Valued Gates for Robust LSTM Training.
Proceedings of the 35th International Conference on Machine Learning, 2018
Beyond Error Propagation in Neural Machine Translation: Characteristics of Language Also Matter.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018
Double Path Networks for Sequence to Sequence Learning.
Proceedings of the 27th International Conference on Computational Linguistics, 2018
2017
Scale Effects in Web Search.
Proceedings of the Web and Internet Economics - 13th International Conference, 2017
Decoding with Value Networks for Neural Machine Translation.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017
2016
Sentence Level Recurrent Topic Model: Letting Topics Speak for Themselves.
CoRR, 2016
Dual Learning for Machine Translation.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016
2014
Generalized second price auction with probabilistic broad match.
Proceedings of the ACM Conference on Economics and Computation, 2014
2013
Online learning for auction mechanism in bandit setting.
Decis. Support Syst., 2013
A Theoretical Analysis of NDCG Type Ranking Measures.
CoRR, 2013
A Game-Theoretic Machine Learning Approach for Revenue Maximization in Sponsored Search.
Proceedings of the IJCAI 2013, 2013
A Theoretical Analysis of NDCG Type Ranking Measures.
Proceedings of the COLT 2013, 2013