Wei Han

Orcid: 0000-0002-4201-9645

Affiliations:

Google
University of Illinois at Urbana-Champaign, Department of Electrical and Computer Engineering, Beckman Institute, Urbana, IL, USA

According to our database¹, Wei Han authored at least 65 papers between 2014 and 2024.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of three.

Bibliography

2024

Two are better than one: Context window extension with multi-grained self-injection.

CoRR, 2024

INSTRAUG: Automatic Instruction Augmentation for Multimodal Instruction Fine-tuning.

Wei Han

Hui Chen

Soujanya Poria

CoRR, 2024

Self-Adaptive Sampling for Accurate Video Question Answering on Image Text Models.

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

RoboVQA: Multimodal Long-Horizon Reasoning for Robotics.

Proceedings of the IEEE International Conference on Robotics and Automation, 2024

Retrieval Augmented End-to-End Spoken Dialog Models.

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

Dialogue Relation Extraction with Document-Level Heterogeneous Graph Attention Networks.

Cogn. Comput., March, 2023

SLM: Bridge the thin gap between speech and text foundation models.

CoRR, 2023

Multimodal Modeling For Spoken Language Identification.

CoRR, 2023

SAS Video-QA: Self-Adaptive Sampling for Efficient Video Question-Answering.

CoRR, 2023

AudioPaLM: A Large Language Model That Can Speak and Listen.

CoRR, 2023

Speech-to-Text Adapter and Speech-to-Entity Retriever Augmented LLMs for Speech Understanding.

CoRR, 2023

Domain-Expanded ASTE: Rethinking Generalization in Aspect Sentiment Triplet Extraction.

Sharifah Mahani Aljunied

Soujanya Poria

Lidong Bing

CoRR, 2023

Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages.

CoRR, 2023

Noise2Music: Text-conditioned Music Generation with Diffusion Models.

Christian Havnø Frank

CoRR, 2023

Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations.

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023

Label Aware Speech Representation Learning For Language Identification.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Speech Aware Dialog System Technology Challenge (DSTC11).

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Accelerating RNN-T Training and Inference Using CTC Guidance.

Proceedings of the IEEE International Conference on Acoustics, 2023

Efficient Domain Adaptation for Speech Foundation Models.

Proceedings of the IEEE International Conference on Acoustics, 2023

SLM: Bridge the Thin Gap Between Speech and Text Foundation Models.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition.

IEEE J. Sel. Top. Signal Process., 2022

Accented Speech Recognition: Benchmarking, Pre-training, and Diverse Data.

CoRR, 2022

Unsupervised Data Selection via Discrete Speech Representation for ASR.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Universal Paralinguistic Speech Representations Using self-Supervised Conformers.

Proceedings of the IEEE International Conference on Acoustics, 2022

Improving The Latency And Quality Of Cascaded Encoders.

Proceedings of the IEEE International Conference on Acoustics, 2022

MM-Align: Learning Optimal Transport-based Alignment Dynamics for Fast and Accurate Inference on Missing Modality Sequences.

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

SAT: Improving Semi-Supervised Text Classification with Simple Instance-Adaptive Self-Training.

Hui Chen

Wei Han

Soujanya Poria

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

SANCL: Multimodal Review Helpfulness Prediction with Selective Attention and Natural Contrastive Learning.

Proceedings of the 29th International Conference on Computational Linguistics, 2022

DoubleMix: Simple Interpolation-Based Data Augmentation for Text Classification.

Proceedings of the 29th International Conference on Computational Linguistics, 2022

2021

Bridging the gap between streaming and non-streaming ASR systems by distilling ensembles of CTC and RNN-T models.

CoRR, 2021

RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Exploring Targeted Universal Adversarial Perturbations to End-to-End ASR Models.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Bridging the Gap Between Streaming and Non-Streaming ASR Systems by Distilling Ensembles of CTC and RNN-T Models.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal Sentiment Analysis.

Louis-Philippe Morency

Soujanya Poria

Proceedings of the ICMI '21: International Conference on Multimodal Interaction, 2021

Dual-mode ASR: Unify and Improve Streaming ASR with Full-context Modeling.

Proceedings of the 9th International Conference on Learning Representations, 2021

FastEmit: Low-Latency Streaming ASR with Sequence-Level Emission Regularization.

Proceedings of the IEEE International Conference on Acoustics, 2021

A Better and Faster end-to-end Model for Streaming ASR.

Proceedings of the IEEE International Conference on Acoustics, 2021

Improving Streaming Automatic Speech Recognition with Non-Streaming Model Distillation on Unsupervised Data.

Proceedings of the IEEE International Conference on Acoustics, 2021

Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis.

Wei Han

Hui Chen

Soujanya Poria

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

w2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition.

CoRR, 2020

Universal ASR: Unify and Improve Streaming ASR with Full-context Modeling.

CoRR, 2020

Improved Noisy Student Training for Automatic Speech Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Conformer: Convolution-augmented Transformer for Speech Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Streaming Object Detection for 3-D Point Clouds.

Proceedings of the Computer Vision - ECCV 2020, 2020

Scalability in Perception for Autonomous Driving: Waymo Open Dataset.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

Learning compact neural network representations with structural priors.

Wei Han

PhD thesis, 2019

StarNet: Targeted Computation for Object Detection in Point Clouds.

CoRR, 2019

A Comparison of End-to-End Models for Long-Form Speech Recognition.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018

Learning 3D-FilterMap for Deep Convolutional Neural Networks.

CoRR, 2018

3D-FilterMap: A Compact Architecture for Deep Convolutional Neural Networks.

Proceedings of the 6th International Conference on Learning Representations, 2018

Image Super-Resolution via Dual-State Recurrent Networks.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017

Dilated Recurrent Neural Networks.

Mark A. Hasegawa-Johnson

Thomas S. Huang

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

NTIRE 2017 Challenge on Single Image Super-Resolution: Methods and Results.

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017

Balanced Two-Stage Residual Networks for Image Super-Resolution.

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017

2016

Robust Single Image Super-Resolution via Deep Networks With Sparse Prior.

IEEE Trans. Image Process., 2016

Seq-NMS for Video Object Detection.

CoRR, 2016

2015

Deeply Improved Sparse Coding for Image Super-Resolution.

CoRR, 2015

An Analysis of Unsupervised Pre-training in Light of Recent Advances.

Proceedings of the 3rd International Conference on Learning Representations, 2015

Heterogeneous Network Embedding via Deep Architectures.

Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015

Deep Networks for Image Super-Resolution with Sparse Prior.

Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Self-tuned deep super resolution.

Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2015

2014

Multimedia Classification.

Proceedings of the Data Classification: Algorithms and Applications, 2014
