2025
Tensor neural networks for high-dimensional Fokker-Planck equations.
Neural Networks, 2025
Aligning Large Language Models with Human Opinions through Persona Selection and Value-Belief-Norm Reasoning.
Proceedings of the 31st International Conference on Computational Linguistics, 2025
2024
Multi-view class incremental learning.
Inf. Fusion, February, 2024
Tackling the curse of dimensionality with physics-informed neural networks.
Neural Networks, 2024
Functional Risk Minimization.
CoRR, 2024
Effortless Efficiency: Low-Cost Pruning of Diffusion Models.
CoRR, 2024
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training.
CoRR, 2024
Investigating Layer Importance in Large Language Models.
CoRR, 2024
State-space models are accurate and efficient neural operators for dynamical systems.
CoRR, 2024
LLMs Are Biased Towards Output Formats! Systematically Evaluating and Mitigating Output Format Bias of LLMs.
CoRR, 2024
Self-Evaluation as a Defense Against Adversarial Attacks on LLMs.
CoRR, 2024
Single Character Perturbations Break LLM Alignment.
CoRR, 2024
Tackling the Curse of Dimensionality in Fractional and Tempered Fractional PDEs with Physics-Informed Neural Networks.
CoRR, 2024
Score-fPINN: Fractional Score-Based Physics-Informed Neural Networks for High-Dimensional Fokker-Planck-Levy Equations.
CoRR, 2024
Learning diverse attacks on large language models for robust red-teaming and safety tuning.
CoRR, 2024
FinerCut: Finer-grained Interpretable Layer Pruning for Large Language Models.
CoRR, 2024
Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning.
CoRR, 2024
Enhancing Semantic Fidelity in Text-to-Image Synthesis: Attention Regulation in Diffusion Models.
CoRR, 2024
Accelerating Greedy Coordinate Gradient via Probe Sampling.
CoRR, 2024
AdaMergeX: Cross-Lingual Transfer with Large Language Models via Adaptive Adapter Merging.
CoRR, 2024
Score-Based Physics-Informed Neural Networks for High-Dimensional Fokker-Planck Equations.
CoRR, 2024
The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline.
CoRR, 2024
Can AI Be as Creative as Humans?
CoRR, 2024
Stochastic Taylor Derivative Estimator: Efficient amortization for arbitrary differential operators.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Memory-Efficient Gradient Unrolling for Large-Scale Bi-level Optimization.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Accelerating Greedy Coordinate Gradient and General Prompt Optimization via Probe Sampling.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
How do Large Language Models Handle Multilingualism?
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Deep Regression Representation Learning with Topology.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
The Surprising Effectiveness of Skip-Tuning in Diffusion Sampling.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Referee Can Play: An Alternative Approach to Conditional Generation via Model Inversion.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Drug Discovery with Dynamic Goal-aware Fragments.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Exact Conversion of In-Context Learning to Model Weights in Linearized-Attention Transformers.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
PlanDQ: Hierarchical Plan Orchestration via D-Conductor and Q-Performer.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Unsupervised Concept Discovery Mitigates Spurious Correlations.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Towards Robust Out-of-Distribution Generalization Bounds via Sharpness.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Scalable and Effective Implicit Graph Neural Networks on Large Graphs.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Towards 3D Molecule-Text Interpretation in Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Self-Supervised Dataset Distillation for Transfer Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Simple Hierarchical Planning with Diffusion.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Multi-expert Prompting Improves Reliability, Safety and Usefulness of Large Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Reasoning Robustness of LLMs to Adversarial Typographical Errors.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Enhancing Semantic Fidelity in Text-to-Image Synthesis: Attention Regulation in Diffusion Models.
Proceedings of the Computer Vision - ECCV 2024, 2024
VA3: Virtually Assured Amplification Attack on Probabilistic Copyright Protection for Text-to-Image Generative Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Prompt Optimization via Adversarial In-Context Learning.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
ProtT3: Protein-to-Text Generation for Text-based Protein Understanding.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
ReactXT: Understanding Molecular "Reaction-ship" via Reaction-Contextualized Molecule-Text Pretraining.
Proceedings of the Findings of the Association for Computational Linguistics, 2024
Towards Continual Learning Desiderata via HSIC-Bottleneck Orthogonalization and Equiangular Embedding.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
Augmented Physics-Informed Neural Networks (APINNs): A gating network-based soft domain decomposition methodology.
Eng. Appl. Artif. Intell., November, 2023
Combined scaling for zero-shot transfer learning.
Neurocomputing, October, 2023
Single-Pass Contrastive Learning Can Work for Both Homophilic and Heterophilic Graph.
Trans. Mach. Learn. Res., 2023
Hutchinson Trace Estimation for High-Dimensional and High-Order Physics-Informed Neural Networks.
CoRR, 2023
Prompt Optimization via Adversarial In-Context Learning.
CoRR, 2023
Learning Unorthogonalized Matrices for Rotation Estimation.
CoRR, 2023
Probabilistic Copyright Protection Can Fail for Text-to-Image Generative Models.
CoRR, 2023
Bias-Variance Trade-off in Physics-Informed Neural Networks with Randomized Smoothing for High-Dimensional PDEs.
CoRR, 2023
Investigating Copyright Issues of Diffusion Models under Practical Scenarios.
CoRR, 2023
ChOiRe: Characterizing and Predicting Human Opinions with Chain of Opinion Reasoning.
CoRR, 2023
AttributionLab: Faithfulness of Feature Attribution Under Controllable Environments.
CoRR, 2023
A Dual-Perspective Approach to Evaluating Feature Attribution Methods.
CoRR, 2023
IF2Net: Innately Forgetting-Free Networks for Continual Learning.
CoRR, 2023
Automatic Model Selection with Large Language Models for Reasoning.
CoRR, 2023
Decomposition Enhances Reasoning via Self-Evaluation Guided Decoding.
CoRR, 2023
Last-Layer Fairness Fine-tuning is Simple and Effective for Neural Networks.
CoRR, 2023
An Information-Theoretic Perspective on Variance-Invariance-Covariance Regularization.
CoRR, 2023
MixupE: Understanding and improving Mixup from directional derivative perspective.
Proceedings of the Uncertainty in Artificial Intelligence, 2023
Self-Evaluation Guided Beam Search for Reasoning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
An Information Theory Perspective on Variance-Invariance-Covariance Regularization.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
PICProp: Physics-Informed Confidence Propagation for Uncertainty Quantification.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Rethinking Tokenizer and Decoder in Masked Graph Modeling for Molecules.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Sketch-Based Anomaly Detection in Streaming Graphs.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023
Scalable Set Encoding with Universal Mini-Batch Consistency and Unbiased Full Set Gradient Approximation.
Proceedings of the International Conference on Machine Learning, 2023
Discrete Key-Value Bottleneck.
Proceedings of the International Conference on Machine Learning, 2023
Auxiliary Learning as an Asymmetric Bargaining Game.
Proceedings of the International Conference on Machine Learning, 2023
GFlowOut: Dropout with Generative Flow Networks.
Proceedings of the International Conference on Machine Learning, 2023
How Does Information Bottleneck Help Deep Learning?
Proceedings of the International Conference on Machine Learning, 2023
D4FT: A Deep Learning Approach to Kohn-Sham Density Functional Theory.
Proceedings of the Eleventh International Conference on Learning Representations, 2023
Self-Supervised Set Representation Learning for Unsupervised Meta-Learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023
Self-Distillation for Further Pre-training of Transformers.
Proceedings of the Eleventh International Conference on Learning Representations, 2023
Simplicial Embeddings in Self-Supervised Learning and Downstream Classification.
Proceedings of the Eleventh International Conference on Learning Representations, 2023
Automatic Model Selection with Large Language Models for Reasoning.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023
MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
Understanding and Improving Neural Active Learning on Heteroskedastic Distributions.
Proceedings of the ECAI 2023 - 26th European Conference on Artificial Intelligence, September 30 - October 4, 2023, Kraków, Poland, 2023
Adaptive Discrete Communication Bottlenecks with Dynamic Vector Quantization for Heterogeneous Representational Coarseness.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
2022
When Do Extended Physics-Informed Neural Networks (XPINNs) Improve Generalization?
SIAM J. Sci. Comput., 2022
Interpolation consistency training for semi-supervised learning.
Neural Networks, 2022
Interpolated Adversarial Training: Achieving robust neural networks without sacrificing too much accuracy.
Neural Networks, 2022
Understanding Dynamics of Nonlinear Representation Learning and Its Application.
Neural Comput., 2022
Meta-learning PINN loss functions.
J. Comput. Phys., 2022
Deep Kronecker neural networks: A general framework for neural networks with adaptive activation functions.
Neurocomputing, 2022
MixupE: Understanding and Improving Mixup from Directional Derivative Perspective.
CoRR, 2022
Neural Active Learning on Heteroskedastic Distributions.
CoRR, 2022
Discrete Factorial Representations as an Abstraction for Goal Conditioned Reinforcement Learning.
CoRR, 2022
Simplicial Embeddings in Self-Supervised Learning and Downstream Classification.
CoRR, 2022
Adaptive Discrete Communication Bottlenecks with Dynamic Vector Quantization.
CoRR, 2022
ExpertNet: A Symbiosis of Classification and Clustering.
CoRR, 2022
Training Free Graph Neural Networks for Graph Matching.
CoRR, 2022
MemStream: Memory-Based Streaming Anomaly Detection.
Proceedings of the WWW '22: The ACM Web Conference 2022, Virtual Event, Lyon, France, April 25, 2022
MGNNI: Multiscale Graph Neural Networks with Implicit Layers.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Set-based Meta-Interpolation for Few-Task Meta-Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Discrete Compositional Representations as an Abstraction for Goal Conditioned Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
When and How Mixup Improves Calibration.
Proceedings of the International Conference on Machine Learning, 2022
Multi-Task Learning as a Bargaining Game.
Proceedings of the International Conference on Machine Learning, 2022
Robustness Implies Generalization via Data-Dependent Generalization Bounds.
Proceedings of the International Conference on Machine Learning, 2022
2021
MemStream: Memory-Based Anomaly Detection in Multi-Aspect Streams with Concept Drift.
CoRR, 2021
CAC: A Clustering Based Framework for Classification.
CoRR, 2021
Discrete-Valued Neural Communication.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
EIGNN: Efficient Infinite-Depth Graph Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Understanding End-to-End Model-Based Reinforcement Learning Methods as Implicit Parameterization.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Adversarial Training Helps Transfer Learning via Better Representations.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Noether Networks: meta-learning useful conserved quantities.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Tailoring: encoding inductive biases by optimizing unsupervised objectives at prediction time.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth.
Proceedings of the 38th International Conference on Machine Learning, 2021
Towards Domain-Agnostic Contrastive Learning.
Proceedings of the 38th International Conference on Machine Learning, 2021
How Does Mixup Help With Robustness and Generalization?
Proceedings of the 9th International Conference on Learning Representations, 2021
On the Theory of Implicit Deep Learning: Global Convergence with Implicit Layers.
Proceedings of the 9th International Conference on Learning Representations, 2021
How Shrinking Gradient Noise Helps the Performance of Neural Networks.
Proceedings of the 2021 IEEE International Conference on Big Data (Big Data), 2021
GraphMix: Improved Training of GNNs for Semi-Supervised Learning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
A Recipe for Global Convergence Guarantee in Deep Neural Networks.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
2020
Adaptive activation functions accelerate convergence in deep and physics-informed neural networks.
J. Comput. Phys., 2020
Towards Domain-Agnostic Contrastive Learning.
CoRR, 2020
Tailoring: encoding inductive biases by optimizing unsupervised objectives at prediction time.
CoRR, 2020
Ordered SGD: A New Stochastic Optimization Framework for Empirical Risk Minimization.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020
Elimination of All Bad Local Minima in Deep Learning.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020
2019
Depth with nonlinearity creates no bad local minima in ResNets.
Neural Networks, 2019
Every Local Minimum Value Is the Global Minimum Value of Induced Model in Nonconvex Machine Learning.
Neural Comput., 2019
Effect of Depth and Width on Local Minima in Deep Learning.
Neural Comput., 2019
Locally adaptive activation functions with slope recovery term for deep and physics-informed neural networks.
CoRR, 2019
A Stochastic First-Order Method for Ordered Empirical Risk Minimization.
CoRR, 2019
Every Local Minimum is a Global Minimum of an Induced Model.
CoRR, 2019
Eliminating all bad Local Minima from Loss Landscapes without even adding an Extra Unit.
CoRR, 2019
Gradient Descent Finds Global Minima for Generalizable Deep Neural Networks of Practical Sizes.
Proceedings of the 57th Annual Allerton Conference on Communication, 2019
2018
Generalization in Machine Learning via Analytical Learning Theory.
CoRR, 2018
Theory of Deep Learning III: explaining the non-overfitting puzzle.
CoRR, 2018
Deep Semi-Random Features for Nonlinear Function Approximation.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018
2017
Generalization in Deep Learning.
CoRR, 2017
Depth Creates No Bad Local Minima.
CoRR, 2017
2016
Global Continuous Optimization with Error Bound and Fast Convergence.
J. Artif. Intell. Res., 2016
Streaming Normalization: Towards Simpler and More Biologically-plausible Normalizations for Online and Recurrent Learning.
CoRR, 2016
Deep Learning without Poor Local Minima.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016
Bounded Optimal Exploration in MDP.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016
2015
Application of Bayesian nonparametric models to the uncertainty and sensitivity analysis of source term in a BWR severe accident.
Reliab. Eng. Syst. Saf., 2015
Bayesian Optimization with Exponential Convergence.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015
2013
A Greedy Approximation of Bayesian Reinforcement Learning with Probably Optimistic Transition Model.
CoRR, 2013
Prior-Free Exploration Bonus for and beyond Near Bayes-Optimal Behavior.
Proceedings of the IJCAI 2013, 2013