Tong Zhang

ORCID: 0000-0002-5511-2558

Affiliations:
  • University of Illinois Urbana-Champaign, IL, USA
  • Hong Kong University of Science and Technology, China (former)
  • Tencent AI Lab, Shenzhen, China (former)
  • Rutgers University, Department of Statistics, NJ, USA (former)
  • Baidu Inc., Beijing, China (former)
  • Yahoo (former)
  • IBM T. J. Watson Research Center, Yorktown Heights, NY, USA (former)
  • Stanford University, CA, USA (PhD)


According to our database, Tong Zhang authored at least 430 papers between 1995 and 2024.

Bibliography

2024
On the Unreasonable Effectiveness of Federated Averaging with Heterogeneous Data.
Trans. Mach. Learn. Res., 2024

Nonsmooth Optimization over the Stiefel Manifold and Beyond: Proximal Gradient Method and Recent Variants.
SIAM Rev., 2024

Personalized Visual Instruction Tuning.
CoRR, 2024

Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic.
CoRR, 2024

Traversing Pareto Optimal Policies: Provably Efficient Multi-Objective Reinforcement Learning.
CoRR, 2024

SmartQuant: CXL-based AI Model Store in Support of Runtime Configurable Weight Quantization.
CoRR, 2024

TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts.
CoRR, 2024

ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting.
CoRR, 2024

Large Batch Analysis for Adagrad Under Anisotropic Smoothness.
CoRR, 2024

VeraCT Scan: Retrieval-Augmented Fake News Detection with Justifiable Reasoning.
CoRR, 2024

Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs.
CoRR, 2024

Reverse Transition Kernel: A Flexible Framework to Accelerate Diffusion Inference.
CoRR, 2024

RLHF Workflow: From Reward Modeling to Online RLHF.
CoRR, 2024

Distributionally Robust Reinforcement Learning with Interactive Data Collection: Fundamental Hardness and Near-Optimal Algorithm.
CoRR, 2024

LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning.
CoRR, 2024

On the Benefits of Over-parameterization for Out-of-Distribution Generalization.
CoRR, 2024

Do CLIPs Always Generalize Better than ImageNet Models?
CoRR, 2024

An Improved Analysis of Langevin Algorithms with Prior Diffusion for Non-Log-Concave Sampling.
CoRR, 2024

EntailE: Introducing Textual Entailment in Commonsense Knowledge Graph Completion.
CoRR, 2024

A Theoretical Analysis of Nash Learning from Human Feedback under General KL-Regularized Preference.
CoRR, 2024

The Instinctive Bias: Spurious Images lead to Hallucination in MLLMs.
CoRR, 2024

The Surprising Harmfulness of Benign Overfitting for Adversarial Robustness.
CoRR, 2024

MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance.
CoRR, 2024

AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets.
CoRR, 2024

Short-Term Load Forecasting Using Regularized Greedy Forest-Based Ensemble Model.
IEEE Access, 2024

PipeNet: Question Answering with Semantic Pruning over Knowledge Graphs.
Proceedings of the 13th Joint Conference on Lexical and Computational Semantics, 2024

Towards Better Generalization in Open-Domain Question Answering by Mitigating Context Memorization.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

R-Tuning: Instructing Large Language Models to Say 'I Don't Know'.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: System Demonstrations, 2024

MEDL-U: Uncertainty-aware 3D Automatic Annotation based on Evidential Deep Learning.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024

Pessimism Meets Risk: Risk-Sensitive Offline Reinforcement Learning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Towards Robust Model-Based Reinforcement Learning Against Adversarial Corruption.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Faster Sampling via Stochastic Gradient Proximal Sampler.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

The Non-linear F-Design and Applications to Interactive Learning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Accelerated Convergence of Stochastic Heavy Ball Method under Anisotropic Gradient Noise.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Spurious Feature Diversification Improves Out-of-distribution Generalization.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Reverse Diffusion Monte Carlo.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Towards Robust Offline Reinforcement Learning under Diverse Data Corruption.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Mitigating the Alignment Tax of RLHF.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

An Incremental Unified Framework for Small Defect Inspection.
Proceedings of the Computer Vision - ECCV 2024, 2024

Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization.
Proceedings of the Computer Vision - ECCV 2024, 2024

Faster Sampling without Isoperimetry via Diffusion-based Monte Carlo.
Proceedings of the Thirty Seventh Annual Conference on Learning Theory, June 30, 2024

Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Enhancing Dialogue State Tracking Models through LLM-backed User-Agents Simulation.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Active Prompting with Chain-of-Thought for Large Language Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment.
Trans. Mach. Learn. Res., 2023

Black-Box Prompt Learning for Pre-trained Language Models.
Trans. Mach. Learn. Res., 2023

Multi-Consensus Decentralized Accelerated Gradient Descent.
J. Mach. Learn. Res., 2023

Gibbs Sampling from Human Feedback: A Provable KL-constrained Framework for RLHF.
CoRR, 2023

R-Tuning: Teaching Large Language Models to Refuse Unknown Questions.
CoRR, 2023

Plum: Prompt Learning using Metaheuristic.
CoRR, 2023

PerceptionGPT: Effectively Fusing Visual Perception into LLM.
CoRR, 2023

MEDL-U: Uncertainty-aware 3D Automatic Annotator based on Evidential Deep Learning.
CoRR, 2023

Mitigating the Alignment Tax of RLHF.
CoRR, 2023

Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning.
CoRR, 2023

Monte Carlo Sampling without Isoperimetry: A Reverse Diffusion Approach.
CoRR, 2023

What is Essential for Unseen Goal Generalization of Offline Goal-conditioned RL?
CoRR, 2023

RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment.
CoRR, 2023

Environment Invariant Linear Least Squares.
CoRR, 2023

Provable Particle-based Primal-Dual Algorithm for Mixed Nash Equilibrium.
CoRR, 2023

Active Prompting with Chain-of-Thought for Large Language Models.
CoRR, 2023

Probabilistic Bilevel Coreset Selection.
CoRR, 2023

Hashtag-Guided Low-Resource Tweet Classification.
Proceedings of the ACM Web Conference 2023, 2023

Corruption-Robust Offline Reinforcement Learning with General Function Approximation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Posterior Sampling for Competitive RL: Function Approximation and Partial Observation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Double Randomized Underdamped Langevin with Dimension-Independent Convergence Guarantee.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Inconsistency, Instability, and Generalization Gap of Deep Neural Network Training.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Double Pessimism is Provably Efficient for Distributionally Robust Offline Reinforcement Learning: Generic Algorithm and Robust Partial Coverage.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

A Theoretical Analysis of Optimistic Proximal Policy Optimization in Linear Markov Decision Processes.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

ZipKV: In-Memory Key-Value Store with Built-In Data Compression.
Proceedings of the 2023 ACM SIGPLAN International Symposium on Memory Management, 2023

Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes.
Proceedings of the International Conference on Machine Learning, 2023

What is Essential for Unseen Goal Generalization of Offline Goal-conditioned RL?
Proceedings of the International Conference on Machine Learning, 2023

Generalized Polyak Step Size for First Order Optimization with Momentum.
Proceedings of the International Conference on Machine Learning, 2023

Beyond Uniform Lipschitz Condition in Differentially Private Optimization.
Proceedings of the International Conference on Machine Learning, 2023

On the Convergence of Federated Averaging with Cyclic Client Participation.
Proceedings of the International Conference on Machine Learning, 2023

Learning in POMDPs is Sample-Efficient with Hindsight Observability.
Proceedings of the International Conference on Machine Learning, 2023

Particle-based Variational Inference with Preconditioned Functional Gradient Flow.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Nearly Minimax Optimal Offline Reinforcement Learning with Linear Function Approximation: Single-Agent MDP and Markov Game.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

DetGPT: Detect What You Need via Reasoning.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Doolittle: Benchmarks and Corpora for Academic Writing Formalization.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement Learning: Adaptivity and Computational Efficiency.
Proceedings of the Thirty Sixth Annual Conference on Learning Theory, 2023

VOQL: Towards Optimal Regret in Model-free RL with Nonlinear Function Approximation.
Proceedings of the Thirty Sixth Annual Conference on Learning Theory, 2023

Catalyst Acceleration of Error Compensated Methods Leads to Better Communication Complexity.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

Mixture-of-Domain-Adapters: Decoupling and Injecting Domain Knowledge to Pre-trained Language Models' Memories.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Covariate-Shift Generalization via Random Sample Weighting.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Feel-Good Thompson Sampling for Contextual Bandits and Reinforcement Learning.
SIAM J. Math. Data Sci., June, 2022

Convex Formulation of Overparameterized Deep Neural Networks.
IEEE Trans. Inf. Theory, 2022

A stochastic extra-step quasi-Newton method for nonsmooth nonconvex optimization.
Math. Program., 2022

When is the Convergence Time of Langevin Algorithms Dimension Independent? A Composite Optimization Viewpoint.
J. Mach. Learn. Res., 2022

Weakly Supervised Disentangled Generative Causal Representation Learning.
J. Mach. Learn. Res., 2022

ExtremeBERT: A Toolkit for Accelerating Pretraining of Customized BERT.
CoRR, 2022

Normalizing Flow with Variational Latent Representation.
CoRR, 2022

GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP, and Beyond.
CoRR, 2022

Asymptotic Statistical Analysis of f-divergence GAN.
CoRR, 2022

Dimension Independent Generalization of DP-SGD for Overparameterized Smooth Convex Optimization.
CoRR, 2022

Black-box Prompt Learning for Pre-trained Language Models.
CoRR, 2022

Nearly Optimal Algorithms for Linear Contextual Bandits with Adversarial Corruptions.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Model-based RL with Optimistic Posterior Sampling: Structural Conditions and Sample Complexity.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Probabilistic Bilevel Coreset Selection.
Proceedings of the International Conference on Machine Learning, 2022

Sparse Invariant Risk Minimization.
Proceedings of the International Conference on Machine Learning, 2022

Model Agnostic Sample Reweighting for Out-of-Distribution Learning.
Proceedings of the International Conference on Machine Learning, 2022

Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets.
Proceedings of the International Conference on Machine Learning, 2022

A Theoretical Analysis on Independence-driven Importance Weighting for Covariate-shift Generalization.
Proceedings of the International Conference on Machine Learning, 2022

A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games.
Proceedings of the International Conference on Machine Learning, 2022

Benefits of Overparameterized Convolutional Residual Networks: Function Approximation under Smoothness Constraint.
Proceedings of the International Conference on Machine Learning, 2022

Achieving Minimax Rates in Pool-Based Batch Active Learning.
Proceedings of the International Conference on Machine Learning, 2022

Eigencurve: Optimal Learning Rate Schedule for SGD on Quadratic Objectives with Skewed Hessian Spectrums.
Proceedings of the Tenth International Conference on Learning Representations, 2022

HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning.
Proceedings of the Tenth International Conference on Learning Representations, 2022

MICO: A Multi-alternative Contrastive Learning Framework for Commonsense Knowledge Representation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Semi-supervised Monocular 3D Object Detection by Multi-view Consistency.
Proceedings of the Computer Vision - ECCV 2022, 2022

Bayesian Invariant Risk Minimization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Exploring Geometric Consistency for Monocular 3D Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Non-Linear Reinforcement Learning in Large Action Spaces: Structural Conditions and Sample-efficiency of Posterior Sampling.
Proceedings of the Conference on Learning Theory, 2-5 July 2022, London, UK, 2022

Minimax Regret Optimization for Robust Machine Learning under Distribution Shift.
Proceedings of the Conference on Learning Theory, 2-5 July 2022, London, UK, 2022

Multilingual Word Sense Disambiguation with Unified Sense Representation.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

Exploiting Hybrid Semantics of Relation Paths for Multi-hop Question Answering over Knowledge Graphs.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

Rare and Zero-shot Word Sense Disambiguation using Z-Reweighting.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Frequency-Aware Contrastive Learning for Neural Machine Translation.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Local-Global Memory Neural Network for Medication Prediction.
IEEE Trans. Neural Networks Learn. Syst., 2021

Finite-Sample Analysis for Decentralized Batch Multiagent Reinforcement Learning With Networked Agents.
IEEE Trans. Autom. Control., 2021

Mathematical Models of Overparameterized Neural Networks.
Proc. IEEE, 2021

A Framework of Composite Functional Gradient Methods for Generative Adversarial Models.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

DeEPCA: Decentralized Exact PCA with Linear Convergence Rate.
J. Mach. Learn. Res., 2021

Why Stable Learning Works? A Theory of Covariate Shift Generalization.
CoRR, 2021

A Field Guide to Federated Optimization.
CoRR, 2021

Near Optimal Stochastic Algorithms for Finite-Sum Unbalanced Convex-Concave Minimax Optimization.
CoRR, 2021

Adder Neural Networks.
CoRR, 2021

ZEN 2.0: Continue Training and Adaption for N-gram Enhanced Text Encoders.
CoRR, 2021

Geometry-aware data augmentation for monocular 3D object detection.
CoRR, 2021

Efficient Neural Network Training via Forward and Backward Propagation Sparsification.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Error Compensated Distributed SGD Can Be Accelerated.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

A Provably Efficient Model-Free Posterior Sampling Method for Episodic Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Multi-Hop Transformer for Document-Level Machine Translation.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

DiffMG: Differentiable Meta Graph Search for Heterogeneous Graph Neural Networks.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

G-DetKD: Towards General Distillation Framework for Object Detectors via Contrastive and Semantic-guided Feature Imitation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Improving Event Detection by Exploiting Label Hierarchy.
Proceedings of the IEEE International Conference on Acoustics, 2021

Effective Sparsification of Neural Networks With Global Sparsity Constraint.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Joint-DetNAS: Upgrade Your Detector With NAS, Pruning and Dynamic Distillation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Involution: Inverting the Inherence of Convolution for Visual Recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Few-Shot Human Motion Transfer by Personalized Geometry and Texture Modeling.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

TransNAS-Bench-101: Improving Transferability and Generalizability of Cross-Task Neural Architecture Search.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks.
Proceedings of the Conference on Learning Theory, 2021

Point, Disambiguate and Copy: Incorporating Bilingual Dictionaries for Neural Machine Translation.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

Taming Pre-trained Language Models with N-gram Representations for Low-Resource Domain Adaptation.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

TILGAN: Transformer-based Implicit Latent GAN for Diverse and Coherent Text Generation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

2020
Proximal Gradient Method for Nonsmooth Optimization over the Stiefel Manifold.
SIAM J. Optim., 2020

End-to-End Active Object Tracking and Its Real-World Deployment via Reinforcement Learning.
IEEE Trans. Pattern Anal. Mach. Intell., 2020

Accelerated dual-averaging primal-dual method for composite convex minimization.
Optim. Methods Softw., 2020

MAP Inference Via ℓ₂-Sphere Linear Program Reformulation.
Int. J. Comput. Vis., 2020

PMGT-VR: A decentralized proximal-gradient algorithmic framework with variance reduction.
CoRR, 2020

Multi-modal AsynDGAN: Learn From Distributed Medical Image Data without Sharing Private Information.
CoRR, 2020

VEGA: Towards an End-to-End Configurable AutoML Pipeline.
CoRR, 2020

Propagation Model Search for Graph Neural Networks.
CoRR, 2020

Disentangled Generative Causal Representation Learning.
CoRR, 2020

CorrAttack: Black-box Adversarial Attack with Structured Search.
CoRR, 2020

Bidirectional Generative Modeling Using Adversarial Gradient Estimation.
CoRR, 2020

Mean-Field Analysis of Two-Layer Neural Networks: Non-Asymptotic Rates and Generalization Bounds.
CoRR, 2020

Stochastic Recursive Gradient Descent Ascent for Stochastic Nonconvex-Strongly-Concave Minimax Problems.
CoRR, 2020

Decentralized Accelerated Proximal Gradient Descent.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Bridging the Gap between Sample-based and One-shot Neural Architecture Search with BONAS.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Stochastic Recursive Gradient Descent Ascent for Stochastic Nonconvex-Strongly-Concave Minimax Problems.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Residual Distillation: Towards Portable Deep Neural Networks without Shortcuts.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Model Rubik's Cube: Twisting Resolution, Depth and Width for TinyNets.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

How to Characterize The Landscape of Overparameterized Convolutional Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

A Generalized Neural Tangent Kernel Analysis for Two-layer Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Stable Learning via Differentiated Variable Decorrelation.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

Guided Learning of Nonconvex Models through Successive Functional Gradient Optimization.
Proceedings of the 37th International Conference on Machine Learning, 2020

Black-Box Adversarial Attack with Transferable Model-based Embedding.
Proceedings of the 8th International Conference on Learning Representations, 2020

Improving Constituency Parsing with Span Attention.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

ZEN: Pre-training Chinese Text Encoder Enhanced by N-gram Representations.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

CATCH: Context-Based Meta Reinforcement Learning for Transferrable Architecture Search.
Proceedings of the Computer Vision - ECCV 2020, 2020

Leveraging Human Prior Knowledge to Learn Sense Representations.
Proceedings of the ECAI 2020 - 24th European Conference on Artificial Intelligence, 29 August-8 September 2020, Santiago de Compostela, Spain, 2020

MiLeNAS: Efficient Neural Architecture Search via Mixed-Level Reformulation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Synthetic Learning: Learn From Distributed Asynchronized Discriminator GAN Without Sharing Medical Image Data.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Improving Chinese Word Segmentation with Wordhood Memory Networks.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Joint Chinese Word Segmentation and Part-of-speech Tagging via Two-way Attentions of Auto-analyzed Knowledge.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Stable Learning via Sample Reweighting.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Utilizing Second Order Information in Minibatch Stochastic Variance Reduced Proximal Iterations.
J. Mach. Learn. Res., 2019

Layer-Wise Learning Strategy for Nonparametric Tensor Product Smoothing Spline Regression and Graphical Models.
J. Mach. Learn. Res., 2019

Robust Frequent Directions with Application in Online Learning.
J. Mach. Learn. Res., 2019

Picasso: A Sparse Learning Library for High Dimensional Data Analysis in R and Python.
J. Mach. Learn. Res., 2019

Fast Generalized Matrix Regression with Applications in Machine Learning.
CoRR, 2019

Multi-objective Neural Architecture Search via Predictive Network Performance Optimization.
CoRR, 2019

Over Parameterized Two-level Neural Networks Can Learn Near Optimal Feature Representations.
CoRR, 2019

Mirror Natural Evolution Strategies.
CoRR, 2019

DeepSqueeze: Parallel Stochastic Gradient Descent with Double-Pass Error-Compensated Compression.
CoRR, 2019

DoubleSqueeze: Parallel Stochastic Gradient Descent with Double-Pass Error-Compensated Compression.
CoRR, 2019

MAP Inference via L2-Sphere Linear Program Reformulation.
CoRR, 2019

Tencent ML-Images: A Large-Scale Multi-Label Image Database for Visual Representation Learning.
CoRR, 2019

Graph-guided multi-task sparse learning model: a method for identifying antigenic variants of influenza A(H3N2) virus.
Bioinform., 2019

Tencent ML-Images: A Large-Scale Multi-Label Image Database for Visual Representation Learning.
IEEE Access, 2019

Divergence-Augmented Policy Optimization.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

A Hybrid Character Representation for Chinese Event Detection.
Proceedings of the International Joint Conference on Neural Networks, 2019

DoubleSqueeze: Parallel Stochastic Gradient Descent with Double-pass Error-Compensated Compression.
Proceedings of the 36th International Conference on Machine Learning, 2019

NATTACK: Learning the Distributions of Adversarial Examples for an Improved Black-Box Attack on Deep Neural Networks.
Proceedings of the 36th International Conference on Machine Learning, 2019

Grid-Wise Control for Multi-Agent Reinforcement Learning in Video Game AI.
Proceedings of the 36th International Conference on Machine Learning, 2019

DHER: Hindsight Experience Replay for Dynamic Goals.
Proceedings of the 7th International Conference on Learning Representations, 2019

Efficient Decision-Based Black-Box Adversarial Attacks on Face Recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Sharp Analysis for Nonconvex SGD Escaping from Saddle Points.
Proceedings of the Conference on Learning Theory, 2019

Sentiment Analysis Using Autoregressive Language Modeling and Broad Learning System.
Proceedings of the 2019 IEEE International Conference on Bioinformatics and Biomedicine, 2019

Reinforced Training Data Selection for Domain Adaptation.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Neural Machine Translation with Adequacy-Oriented Learning.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Dynamic Layer Aggregation for Neural Machine Translation with Routing-by-Agreement.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Bayesian Model Averaging With Exponentiated Least Squares Loss.
IEEE Trans. Inf. Theory, 2018

Learning to Remember Translation History with a Continuous Cache.
Trans. Assoc. Comput. Linguistics, 2018

Near-optimal stochastic approximation for online principal component estimation.
Math. Program., 2018

An Ensemble Approach for Detecting Anomalous User Behaviors.
Int. J. Softw. Eng. Knowl. Eng., 2018

Hessian-Aware Zeroth-Order Optimization for Black-Box Adversarial Attack.
CoRR, 2018

Finite-Sample Analyses for Fully Decentralized Multi-Agent Reinforcement Learning.
CoRR, 2018

Parametrized Deep Q-Networks Learning: Reinforcement Learning with Discrete-Continuous Hybrid Action Space.
CoRR, 2018

Fully Implicit Online Learning.
CoRR, 2018

TStarBots: Defeating the Cheating Level Builtin AI in StarCraft II in the Full Game.
CoRR, 2018

A convex formulation for high-dimensional sparse sliced inverse regression.
CoRR, 2018

Diffusion Approximations for Online Principal Component Estimation and Global Convergence.
CoRR, 2018

Incorporating Pseudo-Parallel Data for Quantifiable Sequence Editing.
CoRR, 2018

Decentralization Meets Quantization.
CoRR, 2018

Fine-grained Video Attractiveness Prediction Using Multimodal Deep Learning on a Large Real-world Dataset.
Proceedings of the Companion of The Web Conference 2018, 2018

Gradient Sparsification for Communication-Efficient Distributed Optimization.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Exponentially Weighted Imitation Learning for Batched Historical Data.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Communication Compression for Decentralized Training.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Stochastic Primal-Dual Method for Empirical Risk Minimization with O(1) Per-Iteration Complexity.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path-Integrated Differential Estimator.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Stochastic Expectation Maximization with Variance Reduction.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Sketched Follow-The-Regularized-Leader for Online Factorization Machine.
Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018

Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents.
Proceedings of the 35th International Conference on Machine Learning, 2018

Safe Element Screening for Submodular Function Minimization.
Proceedings of the 35th International Conference on Machine Learning, 2018

Error Compensated Quantized SGD and its Applications to Large-scale Distributed Optimization.
Proceedings of the 35th International Conference on Machine Learning, 2018

Graphical Nonconvex Optimization via an Adaptive Convex Relaxation.
Proceedings of the 35th International Conference on Machine Learning, 2018

An Algorithmic Framework of Variable Metric Over-Relaxed Hybrid Proximal Extra-Gradient Method.
Proceedings of the 35th International Conference on Machine Learning, 2018

End-to-end Active Object Tracking via Reinforcement Learning.
Proceedings of the 35th International Conference on Machine Learning, 2018

Composite Functional Gradient Learning of Generative Adversarial Models.
Proceedings of the 35th International Conference on Machine Learning, 2018

Candidates vs. Noises Estimation for Large Multi-Class Classification Problem.
Proceedings of the 35th International Conference on Machine Learning, 2018

Modeling Localness for Self-Attention Networks.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

QuaSE: Sequence Editing under Quantifiable Guidance.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Multi-Head Attention with Disagreement Regularization.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Exploiting Deep Representations for Neural Machine Translation.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Super-Identity Convolutional Neural Network for Face Hallucination.
Proceedings of the Computer Vision - ECCV 2018, 2018

Orthogonal Deep Features Decomposition for Age-Invariant Face Recognition.
Proceedings of the Computer Vision - ECCV 2018, 2018

Modeling Varying Camera-IMU Time Offset in Optimization-Based Visual-Inertial Odometry.
Proceedings of the Computer Vision - ECCV 2018, 2018

Unsupervised Image-to-Image Translation with Stacked Cycle-Consistent Adversarial Networks.
Proceedings of the Computer Vision - ECCV 2018, 2018

Recurrent Fusion Network for Image Captioning.
Proceedings of the Computer Vision - ECCV 2018, 2018

Neural Stereoscopic Image Style Transfer.
Proceedings of the Computer Vision - ECCV 2018, 2018

Video Re-localization.
Proceedings of the Computer Vision - ECCV 2018, 2018

Translating Pro-Drop Languages With Reconstruction Models.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Sparseness Analysis in the Pretraining of Deep Neural Networks.
IEEE Trans. Neural Networks Learn. Syst., 2017

Hierarchical Contextual Attention Recurrent Neural Network for Map Query Suggestion.
IEEE Trans. Knowl. Data Eng., 2017

A General Distributed Dual Coordinate Optimization Framework for Regularized Loss Minimization.
J. Mach. Learn. Res., 2017

Gradient Hard Thresholding Pursuit.
J. Mach. Learn. Res., 2017

Candidates v.s. Noises Estimation for Large Multi-Class Classification Problem.
CoRR, 2017

Improved Optimization of Finite Sums with Minibatch Stochastic Variance Reduced Proximal Iterations.
CoRR, 2017

On Quadratic Convergence of DC Proximal Newton Algorithm for Nonconvex Sparse Learning in High Dimensions.
CoRR, 2017

On Quadratic Convergence of DC Proximal Newton Algorithm in Nonconvex Sparse Learning.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Diffusion Approximations for Online Principal Component Estimation and Global Convergence.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Projection-free Distributed Online Learning in Networks.
Proceedings of the 34th International Conference on Machine Learning, 2017

Efficient Distributed Learning with Sparsity.
Proceedings of the 34th International Conference on Machine Learning, 2017

Deep Pyramid Convolutional Neural Networks for Text Categorization.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

2016
Accelerated proximal stochastic dual coordinate ascent for regularized loss minimization.
Math. Program., 2016

Towards More Efficient SPSD Matrix Approximation and CUR Matrix Decomposition.
J. Mach. Learn. Res., 2016

A General Distributed Dual Coordinate Optimization Framework for Regularized Loss Minimization.
CoRR, 2016

Convolutional Neural Networks for Text Categorization: Shallow Word-level vs. Deep Character-level.
CoRR, 2016

Supervised and Semi-Supervised Text Categorization using One-Hot LSTM for Region Embeddings.
CoRR, 2016

Local Uncertainty Sampling for Large-Scale Multi-Class Logistic Regression.
CoRR, 2016

Learning Additive Exponential Family Graphical Models via \ell_{2, 1}-norm Regularized M-Estimation.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Exact Recovery of Hard Thresholding Pursuit.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Fast Component Pursuit for Large-Scale Inverse Covariance Estimation.
Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016

Generalized Hierarchical Sparse Model for Arbitrary-Order Interactive Antigenic Sites Identification in Flu Virus Data.
Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016

Sparse Nonlinear Regression: Parameter Estimation under Nonconvexity.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Supervised and Semi-Supervised Text Categorization using LSTM for Region Embeddings.
Proceedings of the 33rd International Conference on Machine Learning, 2016

2015
Fundamentals of Predictive Text Mining, Second Edition.
Texts in Computer Science, Springer, ISBN: 978-1-4471-6750-1, 2015

Learning sparse low-threshold linear classifiers.
J. Mach. Learn. Res., 2015

Sparse Nonlinear Regression: Parameter Estimation and Asymptotic Inference.
CoRR, 2015

Improved Analyses of the Randomized Power Method and Block Lanczos Method.
CoRR, 2015

Towards More Efficient Nystrom Approximation and CUR Matrix Decomposition.
CoRR, 2015

Semi-Supervised Learning with Multi-View Embedding: Theory and Application with Convolutional Neural Networks.
CoRR, 2015

Crowd Fraud Detection in Internet Advertising.
Proceedings of the 24th International Conference on World Wide Web, 2015

Local Smoothness in Variance Reduced Optimization.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Quartz: Randomized Dual Coordinate Ascent with Arbitrary Sampling.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Semi-supervised Convolutional Neural Networks for Text Categorization via Region Embedding.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Effective Use of Word Order for Text Categorization with Convolutional Neural Networks.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

Stochastic Optimization with Importance Sampling for Regularized Loss Minimization.
Proceedings of the 32nd International Conference on Machine Learning, 2015

Adaptive Stochastic Alternating Direction Method of Multipliers.
Proceedings of the 32nd International Conference on Machine Learning, 2015

2014
Partial Gaussian Graphical Model Estimation.
IEEE Trans. Inf. Theory, 2014

A Proximal Stochastic Gradient Method with Progressive Variance Reduction.
SIAM J. Optim., 2014

Learning Nonlinear Functions Using Regularized Greedy Forest.
IEEE Trans. Pattern Anal. Mach. Intell., 2014

Random Design Analysis of Ridge Regression.
Found. Comput. Math., 2014

Pathwise Coordinate Optimization for Sparse Learning: Algorithm and Theory.
CoRR, 2014

Accelerating Minibatch Stochastic Gradient Descent using Stratified Sampling.
CoRR, 2014

Stochastic Optimization with Importance Sampling.
CoRR, 2014

Adjusting Leverage Scores by Row Weighting: A Practical Approach to Coherent Matrix Completion.
CoRR, 2014

Randomized Dual Coordinate Ascent with Arbitrary Sampling.
CoRR, 2014

Sparse Recovery with Very Sparse Compressed Counting.
CoRR, 2014

Batch-Mode Active Learning via Error Bound Minimization.
Proceedings of the Thirtieth Conference on Uncertainty in Artificial Intelligence, 2014

Gradient boosting factorization machines.
Proceedings of the Eighth ACM Conference on Recommender Systems, 2014

Efficient mini-batch training for stochastic optimization.
Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014

Gradient Hard Thresholding Pursuit for Sparsity-Constrained Optimization.
Proceedings of the 31st International Conference on Machine Learning, 2014

A Convergence Rate Analysis for LogitBoost, MART and Their Variant.
Proceedings of the 31st International Conference on Machine Learning, 2014

Communication-Efficient Distributed Optimization using an Approximate Newton-type Method.
Proceedings of the 31st International Conference on Machine Learning, 2014

Compressed Counting Meets Compressed Sensing.
Proceedings of The 27th Conference on Learning Theory, 2014

2013
A Proximal-Gradient Homotopy Method for the Sparse Least-Squares Problem.
SIAM J. Optim., 2013

Truncated power method for sparse eigenvalue problems.
J. Mach. Learn. Res., 2013

Stochastic dual coordinate ascent methods for regularized loss.
J. Mach. Learn. Res., 2013

Accelerating Stochastic Alternating Direction Method of Multipliers with Adaptive Subgradient.
CoRR, 2013

Aggregation of Affine Estimators.
CoRR, 2013

High-dimensional Joint Sparsity Random Effects Model for Multi-task Learning.
Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence, 2013

Accelerated Mini-Batch Stochastic Dual Coordinate Ascent.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Accelerating Stochastic Gradient Descent using Predictive Variance Reduction.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes.
Proceedings of the 30th International Conference on Machine Learning, 2013

2012
A spectral algorithm for learning Hidden Markov Models.
J. Comput. Syst. Sci., 2012

Analysis of a randomized approximation scheme for matrix multiplication.
CoRR, 2012

Proximal Stochastic Dual Coordinate Ascent.
CoRR, 2012

Stochastic Dual Coordinate Ascent Methods for Regularized Loss Minimization.
CoRR, 2012

Deviation Optimal Learning using Greedy Q-aggregation.
CoRR, 2012

AntigenMap 3D: an online antigenic cartography resource.
Bioinform., 2012

Selective Labeling via Error Bound Minimization.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

A Proximal-Gradient Homotopy Method for the L1-Regularized Least-Squares Problem.
Proceedings of the 29th International Conference on Machine Learning, 2012

2011
Sparse Recovery With Orthogonal Matching Pursuit Under RIP.
IEEE Trans. Inf. Theory, 2011

Adaptive Forward-Backward Greedy Algorithm for Learning Sparse Representations.
IEEE Trans. Inf. Theory, 2011

Robust Matrix Decomposition With Sparse Corruptions.
IEEE Trans. Inf. Theory, 2011

Integrative Analysis of Many Weighted Co-Expression Networks Using Tensor Computation.
PLoS Comput. Biol., 2011

Learning with Structured Sparsity.
J. Mach. Learn. Res., 2011

A tail inequality for quadratic forms of subgaussian random vectors.
CoRR, 2011

An Analysis of Random Design Linear Regression.
CoRR, 2011

Dimension-free tail inequalities for sums of random matrices.
CoRR, 2011

Efficient Optimal Learning for Contextual Bandits.
Proceedings of the UAI 2011, 2011

Learning to Search Efficiently in High Dimensions.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

Greedy Model Averaging.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

Spectral Methods for Learning Multivariate Latent Tree Structure.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

2010
Fundamentals of Predictive Text Mining.
Texts in Computer Science 41, Springer, ISBN: 978-1-84996-226-1, 2010

Fundamental Statistical Techniques.
Proceedings of the Handbook of Natural Language Processing, Second Edition., 2010

Trading Accuracy for Sparsity in Optimization Problems with Sparsity Constraints.
SIAM J. Optim., 2010

A Computational Framework for Influenza Antigenic Cartography.
PLoS Comput. Biol., 2010

Analysis of Multi-stage Convex Relaxation for Sparse Regularization.
J. Mach. Learn. Res., 2010

Robust Matrix Decomposition with Outliers.
CoRR, 2010

Deep Coding Network.
Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

Agnostic Active Learning Without Constraints.
Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

Improved Local Coordinate Coding using Local Tangents.
Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

Image Classification Using Super-Vector Coding of Local Image Descriptors.
Proceedings of the Computer Vision - ECCV 2010, 2010

2009
Classifying search queries using the Web as a source of knowledge.
ACM Trans. Web, 2009

On the Consistency of Feature Selection using Greedy Least Squares Regression.
J. Mach. Learn. Res., 2009

Sparse Online Learning via Truncated Gradient.
J. Mach. Learn. Res., 2009

Nonlinear Learning using Local Coordinate Coding.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

Multi-Label Prediction via Compressed Sensing.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

Learning nonlinear dynamic models.
Proceedings of the 26th Annual International Conference on Machine Learning, 2009

2008
Graph-Based Semi-Supervised Learning and Spectral Kernel Design.
IEEE Trans. Inf. Theory, 2008

Statistical Analysis of Bayes Optimal Subset Ranking.
IEEE Trans. Inf. Theory, 2008

An Online Relevant Set Algorithm for Statistical Machine Translation.
IEEE Trans. Speech Audio Process., 2008

Multi-stage Convex Relaxation for Learning with Sparse Regularization.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Adaptive Forward-Backward Greedy Algorithm for Sparse Learning with Linear Models.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

2007
A block bigram prediction model for statistical machine translation.
ACM Trans. Speech Lang. Process., 2007

On the Effectiveness of Laplacian Normalization for Graph Semi-supervised Learning.
J. Mach. Learn. Res., 2007

Robust classification of rare queries using web knowledge.
Proceedings of the SIGIR 2007: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2007

A General Boosting Method and its Application to Learning Ranking Functions for Web Search.
Proceedings of the Advances in Neural Information Processing Systems 20, 2007

The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information.
Proceedings of the Advances in Neural Information Processing Systems 20, 2007

Two-view feature generation model for semi-supervised learning.
Proceedings of the Machine Learning, 2007

Margin Based Active Learning.
Proceedings of the Learning Theory, 20th Annual Conference on Learning Theory, 2007

2006
Information-theoretic upper and lower bounds for statistical estimation.
IEEE Trans. Inf. Theory, 2006

Learning on Graph with Laplacian Regularization.
Proceedings of the Advances in Neural Information Processing Systems 19, 2006

Linear prediction models with graph regularization for web-page categorization.
Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2006

Subset Ranking Using Regression.
Proceedings of the Learning Theory, 19th Annual Conference on Learning Theory, 2006

Effectiveness of Meeting Outcomes in Virtual vs. Face-to-Face Teams: A Comparison Study in China.
Proceedings of the Connecting the Americas. 12th Americas Conference on Information Systems, 2006

A Discriminative Global Training Algorithm for Statistical MT.
Proceedings of the ACL 2006, 2006

2005
Learning Bounds for Kernel Regression Using Effective Data Dimensionality.
Neural Comput., 2005

A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data.
J. Mach. Learn. Res., 2005

TREC 2005 Genomics Track Experiments at IBM Watson.
Proceedings of the Fourteenth Text REtrieval Conference, 2005

Analysis of Spectral Kernel Design based Semi-supervised Learning.
Proceedings of the Advances in Neural Information Processing Systems 18, 2005

Localized Upper and Lower Bounds for Some Estimation Problems.
Proceedings of the Learning Theory, 18th Annual Conference on Learning Theory, 2005

Data Dependent Concentration Bounds for Sequential Prediction Algorithms.
Proceedings of the Learning Theory, 18th Annual Conference on Learning Theory, 2005

A Localized Prediction Model for Statistical Machine Translation.
Proceedings of the ACL 2005, 2005

A High-Performance Semi-Supervised Learning Method for Text Chunking.
Proceedings of the ACL 2005, 2005

2004
Statistical Analysis of Some Multi-Category Large Margin Classification Methods.
J. Mach. Learn. Res., 2004

Text categorization for a comprehensive time-dependent benchmark.
Inf. Process. Manag., 2004

Focused named entity recognition using machine learning.
Proceedings of the SIGIR 2004: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2004

Class-size Independent Generalization Analysis of Some Discriminative Multi-Category Classification.
Proceedings of the Advances in Neural Information Processing Systems 17, 2004

Support Vector Classification with Input Data Uncertainty.
Proceedings of the Advances in Neural Information Processing Systems 17, 2004

Column-generation boosting methods for mixture of kernels.
Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2004

Chinese Named Entity Recognition Based on Multilevel Linguistic Features.
Proceedings of the Natural Language Processing, 2004

Solving large scale linear prediction problems using stochastic gradient descent algorithms.
Proceedings of the Machine Learning, 2004

On the Convergence of MDL Density Estimation.
Proceedings of the Learning Theory, 17th Annual Conference on Learning Theory, 2004

2003
Sequential greedy approximation for certain convex optimization problems.
IEEE Trans. Inf. Theory, 2003

Leave-One-Out Bounds for Kernel Methods.
Neural Comput., 2003

Generalization Error Bounds for Bayesian Mixture Algorithms.
J. Mach. Learn. Res., 2003

Greedy Algorithms for Classification -- Consistency, Convergence Rates, and Adaptivity.
J. Mach. Learn. Res., 2003

Learning Bounds for a Generalized Family of Bayesian Posterior Distributions.
Proceedings of the Advances in Neural Information Processing Systems 16, 2003

An Infinity-sample Theory for Multi-category Large Margin Classification.
Proceedings of the Advances in Neural Information Processing Systems 16, 2003

On the Convergence of Boosting Procedures.
Proceedings of the Machine Learning, 2003

HowtogetaChineseName(Entity): Segmentation and Combination Issues.
Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2003

Named Entity Recognition through Classifier Combination.
Proceedings of the Seventh Conference on Natural Language Learning, 2003

A Robust Risk Minimization based Named Entity Recognition System.
Proceedings of the Seventh Conference on Natural Language Learning, 2003

Updating an NLP system to fit new domains: an empirical study on the sentence segmentation problem.
Proceedings of the Seventh Conference on Natural Language Learning, 2003

2002
Two-Sided Arnoldi and Nonsymmetric Lanczos Algorithms.
SIAM J. Matrix Anal. Appl., 2002

Approximation Bounds for Some Sparse Kernel Regression Algorithms.
Neural Comput., 2002

On the Dual Formulation of Regularized Linear Systems with Convex Risks.
Mach. Learn., 2002

Recommender Systems Using Linear Classifier.
J. Mach. Learn. Res., 2002

Text Chunking based on a Generalization of Winnow.
J. Mach. Learn. Res., 2002

Covering Number Bounds of Certain Regularized Linear Function Classes.
J. Mach. Learn. Res., 2002

On the Consistency of Instantaneous Rigid Motion Estimation.
Int. J. Comput. Vis., 2002

A decision-tree-based symbolic rule induction system for text categorization.
IBM Syst. J., 2002

Experiments in high-dimensional text categorization.
Proceedings of the SIGIR 2002: Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2002

Effective Dimension and Generalization of Kernel Learning.
Proceedings of the Advances in Neural Information Processing Systems 15, 2002

Data-Dependent Bounds for Bayesian Mixture Methods.
Proceedings of the Advances in Neural Information Processing Systems 15, 2002

Statistical Behavior and Consistency of Support Vector Machines, Boosting, and Beyond.
Proceedings of the Machine Learning, 2002

The Consistency of Greedy Algorithms for Classification.
Proceedings of the Computational Learning Theory, 2002

2001
Rank-One Approximation to High Order Tensors.
SIAM J. Matrix Anal. Appl., 2001

Text Categorization Based on Regularized Linear Classification Methods.
Inf. Retr., 2001

An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods.
AI Mag., 2001

Empirical Study of Recommender Systems Using Linear Classifiers.
Proceedings of the Knowledge Discovery and Data Mining, 2001

A General Greedy Approximation Algorithm with Applications.
Proceedings of the Advances in Neural Information Processing Systems 14 [Neural Information Processing Systems: Natural and Synthetic], 2001

Generalization Performance of Some Learning Problems in Hilbert Functional Spaces.
Proceedings of the Advances in Neural Information Processing Systems 14 [Neural Information Processing Systems: Natural and Synthetic], 2001

Some Sparse Approximation Bounds for Regression Problems.
Proceedings of the Eighteenth International Conference on Machine Learning (ICML 2001), Williams College, Williamstown, MA, USA, June 28, 2001

A Leave-One-out Cross Validation Bound for Kernel Methods with Applications in Learning.
Proceedings of the Computational Learning Theory, 2001

A Sequential Approximation Bound for Some Sample-Dependent Convex Optimization Problems with Applications in Learning.
Proceedings of the Computational Learning Theory, 2001

Text Chunking using Regularized Winnow.
Proceedings of the Association for Computational Linguistics, 2001

2000
Regularized Winnow Methods.
Proceedings of the Advances in Neural Information Processing Systems 13, 2000

Convergence of Large Margin Separable Linear Classification.
Proceedings of the Advances in Neural Information Processing Systems 13, 2000

Active learning using adaptive resampling.
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, 2000

1999
Some Theoretical Results Concerning the Convergence of Compositions of Regularized Linear Functions.
Proceedings of the Advances in Neural Information Processing Systems 12, NIPS Conference, Denver, Colorado, USA, November 29, 1999

Fast, Robust, and Consistent Camera Motion Estimation.
Proceedings of the 1999 Conference on Computer Vision and Pattern Recognition (CVPR '99), 1999

Theoretical Analysis of a Class of Randomized Regularization Methods.
Proceedings of the Twelfth Annual Conference on Computational Learning Theory, 1999

1998
Methods for computational and statistical estimation with applications.
PhD thesis, 1998

On the Homotopy Method for Perturbed Symmetric Generalized Eigenvalue Problems.
SIAM J. Sci. Comput., 1998

A Linear Algorithm for Optimal Context Clustering with Application to Bi-level Image Coding.
Proceedings of the 1998 IEEE International Conference on Image Processing, 1998

Compression by Model Combination.
Proceedings of the Data Compression Conference, 1998

1997
A progressive Ziv-Lempel algorithm for image compression.
Proceedings of the Compression and Complexity of SEQUENCES 1997, 1997

1996
Optimal Surface Smoothing as Filter Design.
Proceedings of the Computer Vision, 1996

1995
Densities of Self-Similar Measures on the Line.
Exp. Math., 1995

