Nhat Ho

IEEE Trans. Inf. Theory, September, 2024

Statistical and Computational Complexities of BFGS Quasi-Newton Method for Generalized Linear Models.

Trans. Mach. Learn. Res., 2024

A Diffusion Process Perspective on Posterior Contraction Rates for Parameters.

SIAM J. Math. Data Sci., 2024

On the Computational and Statistical Complexity of Over-parameterized Matrix Sensing.

Jiacheng Zhuo

Jeongyeol Kwon

Constantine Caramanis

J. Mach. Learn. Res., 2024

X-Drive: Cross-modality consistent multi-sensor data synthesis for driving scenarios.

CoRR, 2024

Understanding Expert Structures on Minimax Parameter Estimation in Contaminated Mixture of Experts.

CoRR, 2024

Quadratic Gating Functions in Mixture of Experts: A Statistical Insight.

CoRR, 2024

On Barycenter Computation: Semi-Unbalanced Optimal Transport-based Method on Gaussians.

CoRR, 2024

Leveraging Hierarchical Taxonomies in Prompt-based Continual Learning.

CoRR, 2024

Improving Generalization with Flat Hilbert Bayesian Inference.

CoRR, 2024

On Expert Estimation in Hierarchical Mixture of Experts: Beyond Softmax Gating Functions.

CoRR, 2024

LoGra-Med: Long Context Multi-Graph Alignment for Medical Vision-Language Model.

CoRR, 2024

Revisiting Prefix-tuning: Statistical Benefits of Reparameterization among Prompts.

CoRR, 2024

Backdoor Attack in Prompt-Based Continual Learning.

Trang Nguyen

Anh Tran

CoRR, 2024

A Primal-Dual Framework for Transformers and Neural Networks.

CoRR, 2024

Statistical Advantages of Perturbing Cosine Router in Sparse Mixture of Experts.

CoRR, 2024

Mixture of Experts Meets Prompt-Based Continual Learning.

CoRR, 2024

Sigmoid Gating is More Sample Efficient than Softmax Gating in Mixture of Experts.

Alessandro Rinaldo

CoRR, 2024

Borrowing Strength in Distributionally Robust Optimization via Hierarchical Dirichlet Processes.

Nicola Bariletto

CoRR, 2024

Marginal Fairness Sliced Wasserstein Barycenter.

Hai Nguyen

CoRR, 2024

Hierarchical Hybrid Sliced Wasserstein: A Scalable Metric for Heterogeneous Joint Distributions.

CoRR, 2024

FuseMoE: Mixture-of-Experts Transformers for Fleximodal Fusion.

CoRR, 2024

On Least Squares Estimation in Softmax Gating Mixture of Experts.

Alessandro Rinaldo

CoRR, 2024

CompeteSMoE - Effective Training of Sparse Mixture of Experts via Competition.

CoRR, 2024

Bayesian Nonparametrics Meets Data-Driven Robust Optimization.

Nicola Bariletto

CoRR, 2024

Neural Collapse for Cross-entropy Class-Imbalanced Learning with Unconstrained ReLU Feature Model.

CoRR, 2024

Sliced Wasserstein with Random-Path Projecting Directions.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Structure-Aware E(3)-Invariant Molecular Conformer Aggregation Networks.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

On Least Square Estimation in Softmax Gating Mixture of Experts.

Alessandro Rinaldo

Proceedings of the Forty-first International Conference on Machine Learning, 2024

A General Theory for Softmax Gating Multinomial Logistic Mixture of Experts.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Is Temperature Sample Efficient for Softmax Gaussian Mixture of Experts?

Pedram Akbarian

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Improving Computational Complexity in Statistical Models with Local Curvature Information.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Neural Collapse for Cross-entropy Class-Imbalanced Learning with Unconstrained ReLU Features Model.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Sliced Wasserstein Estimation with Control Variates.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Quasi-Monte Carlo for 3D Sliced Wasserstein.

Nicola Bariletto

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Statistical Perspective of Top-K Sparse Softmax Gating Mixture of Experts.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Revisiting Deep Audio-Text Retrieval Through the Lens of Transportation.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Diffeomorphic Mesh Deformation via Efficient Optimal Transport for Cortical Surface Reconstruction.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Beyond Vanilla Variational Autoencoders: Detecting Posterior Collapse in Conditional and Hierarchical Variational Autoencoders.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Fast Approximation of the Generalized Sliced-Wasserstein Distance.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Integrating Efficient Optimal Transport and Functional Maps for Unsupervised Shape Correspondence Learning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Towards Convergence Rates for Parameter Estimation in Gaussian-gated Mixture of Experts.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

On Parameter Estimation in Deviated Gaussian Mixture of Experts.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

2023

On the Out of Distribution Robustness of Foundation Models in Medical Image Segmentation.

CoRR, 2023

Privacy Preserving Bayesian Federated Learning in Heterogeneous Settings.

Disha Makhija

Joydeep Ghosh

CoRR, 2023

Posterior Collapse in Linear Conditional and Hierarchical Variational Autoencoders.

CoRR, 2023

Diffeomorphic Deformation via Sliced Wasserstein Distance Optimization for Cortical Surface Reconstruction.

CoRR, 2023

Demystifying Softmax Gating in Gaussian Mixture of Experts.

TrungTin Nguyen

CoRR, 2023

Control Variate Sliced Wasserstein Estimators.

CoRR, 2023

Markovian Sliced Wasserstein Distances: Beyond Independent Projections.

Tongzheng Ren

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Demystifying Softmax Gating Function in Gaussian Mixture of Experts.

TrungTin Nguyen

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Energy-Based Sliced Wasserstein Distance.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Designing Robust Transformers using Robust Kernel Density Estimation.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Minimax Optimal Rate for Parameter Estimation in Multivariate Deviated Models.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Self-Attention Amortized Distributional Projection Optimization for Sliced Wasserstein Point-Cloud Reconstruction.

Dang Nguyen

Proceedings of the International Conference on Machine Learning, 2023

Revisiting Over-smoothing and Over-squashing Using Ollivier-Ricci Curvature.

Proceedings of the International Conference on Machine Learning, 2023

On Excess Mass Behavior in Gaussian Mixture Models with Orlicz-Wasserstein Distances.

Aritra Guha

XuanLong Nguyen

Proceedings of the International Conference on Machine Learning, 2023

Neural Collapse in Deep Linear Networks: From Balanced to Imbalanced Data.

Proceedings of the International Conference on Machine Learning, 2023

Hierarchical Sliced Wasserstein Distance.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

A Primal-Dual Framework for Transformers and Neural Networks.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

On Cross-Layer Alignment for Model Fusion of Heterogeneous Neural Networks.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

A Probabilistic Framework for Pruning Transformers Via a Finite Admixture of Keys.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Global-Local Regularization Via Distributional Robustness.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

Joint Self-Supervised Image-Volume Representation Learning with Intra-inter Contrastive Clustering.

Duy M. H. Nguyen

Hoang Nguyen

Truong Thanh Nhat Mai

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

On the Efficiency of Entropic Regularized Algorithms for Optimal Transport.

J. Mach. Learn. Res., 2022

On the Complexity of Approximating Multimarginal Optimal Transport.

J. Mach. Learn. Res., 2022

Convergence Rates for Gaussian Mixtures of Experts.

Chiao-Yu Yang

J. Mach. Learn. Res., 2022

Revisiting Over-smoothing and Over-squashing using Ollivier's Ricci Curvature.

CoRR, 2022

Improving Multi-task Learning via Seeking Task-based Flat Regions.

CoRR, 2022

Robustify Transformers with Robust Kernel Density Estimation.

CoRR, 2022

Improving Generative Flow Networks with Path Regularization.

CoRR, 2022

Hierarchical Sliced Wasserstein Distance.

CoRR, 2022

Transformer with Fourier Integral Attentions.

CoRR, 2022

Efficient Forecasting of Large Scale Hierarchical Time Series via Multilevel Clustering.

CoRR, 2022

Federated Self-supervised Learning for Heterogeneous Clients.

Disha Makhija

Joydeep Ghosh

CoRR, 2022

Beyond EM Algorithm on Over-specified Two-Component Location-Scale Gaussian Mixtures.

CoRR, 2022

An Exponentially Increasing Step-size for Parameter Estimation in Statistical Models.

CoRR, 2022

Global-Local Regularization Via Distributional Robustness.

CoRR, 2022

Improving Computational Complexity in Statistical Models with Second-Order Information.

CoRR, 2022

Stochastic Multiple Target Sampling Gradient Descent.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Improving Transformer with an Admixture of Attention Heads.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Amortized Projection Optimization for Sliced Wasserstein Generative Models.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Revisiting Sliced Wasserstein on Images: From Vectorization to Convolution.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

FourierFormer: Transformer Meets Generalized Fourier Integral Theorem.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Beyond Black Box Densities: Parameter Learning for the Deviated Components.

Dat Do

XuanLong Nguyen

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Improving Mini-batch Optimal Transport via Partial Transportation.

Proceedings of the International Conference on Machine Learning, 2022

On Transportation of Mini-batches: A Hierarchical Approach.

Proceedings of the International Conference on Machine Learning, 2022

Improving Transformers with Probabilistic Attention Keys.

Proceedings of the International Conference on Machine Learning, 2022

Refined Convergence Rates for Maximum Likelihood Estimation under Finite Mixture Models.

Tudor A. Manole

Proceedings of the International Conference on Machine Learning, 2022

Architecture Agnostic Federated Learning for Neural Networks.

Proceedings of the International Conference on Machine Learning, 2022

Entropic Gromov-Wasserstein between Gaussian Distributions.

Proceedings of the International Conference on Machine Learning, 2022

Towards Statistical and Computational Complexities of Polyak Step Size Gradient Descent.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

On Multimarginal Partial Optimal Transport: Equivalent Forms and Computational Complexity.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

On Structured Filtering-Clustering: Global Error Bound and Optimal First-Order Algorithms.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

Weak Separation in Mixture Models and Implications for Principal Stratification.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021

On Efficient Multilevel Clustering via Wasserstein Distances.

J. Mach. Learn. Res., 2021

Model Fusion of Heterogeneous Neural Networks via Cross-Layer Alignment.

CoRR, 2021

On Label Shift in Domain Adaptation via Wasserstein Distance.

CoRR, 2021

Transformer with a Mixture of Gaussian Keys.

CoRR, 2021

Entropic Gromov-Wasserstein between Gaussian Distributions.

CoRR, 2021

An Efficient Mini-batch Method via Partial Transportation.

CoRR, 2021

On Multimarginal Partial Optimal Transport: Equivalent Forms and Computational Complexity.

CoRR, 2021

Statistical Analysis from the Fourier Integral Theorem.

Stephen G. Walker

CoRR, 2021

Improving Bayesian Inference in Deep Neural Networks with Variational Structured Dropout.

CoRR, 2021

On Robust Optimal Transport: Computational Complexity, Low-rank Approximation, and Barycenter Computation.

CoRR, 2021

BoMb-OT: On Batch of Mini-batches Optimal Transport.

CoRR, 2021

Structured Dropout Variational Inference for Bayesian Neural Networks.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

On Robust Optimal Transport: Computational Complexity and Barycenter Computation.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

LAMDA: Label Matching Deep Domain Adaptation.

Proceedings of the 38th International Conference on Machine Learning, 2021

Improving Relational Regularized Autoencoders with Spherical Sliced Fused Gromov Wasserstein.

Proceedings of the 9th International Conference on Learning Representations, 2021

Distributional Sliced-Wasserstein and Applications to Generative Modeling.

Proceedings of the 9th International Conference on Learning Representations, 2021

Point-set Distances for Learning Representations of 3D Point Clouds.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Flow-based Alignment Approaches for Probability Measures in Different Spaces.

Tam Le

Makoto Yamada

Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

On the Minimax Optimality of the EM Algorithm for Learning Two-Component Mixed Linear Regression.

Jeongyeol Kwon

Constantine Caramanis

Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020

Instability, Computational Efficiency and Statistical Accuracy.

CoRR, 2020

Revisiting Fixed Support Wasserstein Barycenter: Computational Hardness and Efficient Algorithms.

CoRR, 2020

Fixed-Support Wasserstein Barycenters: Computational Hardness and Fast Algorithm.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Projection Robust Wasserstein Distance and Riemannian Optimization.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

On Unbalanced Optimal Transport: An Analysis of Sinkhorn Algorithm.

Proceedings of the 37th International Conference on Machine Learning, 2020

Fast Algorithms for Computational Optimal Transport and Wasserstein Barycenter.

Wenshuo Guo

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Sharp Analysis of Expectation-Maximization for Weakly Identifiable Models.

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

2019

Singularity Structures and Impacts on Parameter Estimation in Finite Mixtures of Distributions.

XuanLong Nguyen

SIAM J. Math. Data Sci., 2019

Sampling for Bayesian Mixture Models: MCMC with Polynomial-Time Mixing.

CoRR, 2019

On Scalable Variant of Wasserstein Barycenter.

CoRR, 2019

Computationally Efficient Tree Variants of Gromov-Wasserstein.

Tam Le

Makoto Yamada

CoRR, 2019

On the Acceleration of the Sinkhorn and Greenkhorn Algorithms for Optimal Transport.

CoRR, 2019

Posterior Distribution for the Number of Clusters in Dirichlet Process Mixture Models.

Chiao-Yu Yang

CoRR, 2019

Accelerated Primal-Dual Coordinate Descent for Computational Optimal Transport.

Wenshuo Guo

CoRR, 2019

Global Error Bounds and Linear Convergence for Gradient-Based Algorithms for Trend Filtering and ℓ₁-Convex Clustering.

CoRR, 2019

Challenges with EM in Application to Weakly Identifiable Mixture Models.

CoRR, 2019

On Efficient Optimal Transport: An Analysis of Greedy and Accelerated Mirror Descent Algorithms.
