Wei Han

Orcid: 0000-0002-4201-9645

Affiliations:
  • Google
  • University of Illinois at Urbana-Champaign, Department of Electrical and Computer Engineering, Beckman Institute, Urbana, IL, USA


According to our database, Wei Han authored at least 65 papers between 2014 and 2024.

Bibliography

2024
Two are better than one: Context window extension with multi-grained self-injection.
CoRR, 2024

INSTRAUG: Automatic Instruction Augmentation for Multimodal Instruction Fine-tuning.
CoRR, 2024

Self-Adaptive Sampling for Accurate Video Question Answering on Image Text Models.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024


Retrieval Augmented End-to-End Spoken Dialog Models.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
Dialogue Relation Extraction with Document-Level Heterogeneous Graph Attention Networks.
Cogn. Comput., March, 2023

SLM: Bridge the thin gap between speech and text foundation models.
CoRR, 2023

Multimodal Modeling For Spoken Language Identification.
CoRR, 2023

SAS Video-QA: Self-Adaptive Sampling for Efficient Video Question-Answering.
CoRR, 2023

AudioPaLM: A Large Language Model That Can Speak and Listen.
CoRR, 2023

Speech-to-Text Adapter and Speech-to-Entity Retriever Augmented LLMs for Speech Understanding.
CoRR, 2023

Domain-Expanded ASTE: Rethinking Generalization in Aspect Sentiment Triplet Extraction.
CoRR, 2023

Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages.
CoRR, 2023

Noise2Music: Text-conditioned Music Generation with Diffusion Models.
CoRR, 2023

Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023

Label Aware Speech Representation Learning For Language Identification.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Speech Aware Dialog System Technology Challenge (DSTC11).
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Accelerating RNN-T Training and Inference Using CTC Guidance.
Proceedings of the IEEE International Conference on Acoustics, 2023

Efficient Domain Adaptation for Speech Foundation Models.
Proceedings of the IEEE International Conference on Acoustics, 2023

SLM: Bridge the Thin Gap Between Speech and Text Foundation Models.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition.
IEEE J. Sel. Top. Signal Process., 2022

Accented Speech Recognition: Benchmarking, Pre-training, and Diverse Data.
CoRR, 2022

Unsupervised Data Selection via Discrete Speech Representation for ASR.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Universal Paralinguistic Speech Representations Using self-Supervised Conformers.
Proceedings of the IEEE International Conference on Acoustics, 2022


MM-Align: Learning Optimal Transport-based Alignment Dynamics for Fast and Accurate Inference on Missing Modality Sequences.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

SAT: Improving Semi-Supervised Text Classification with Simple Instance-Adaptive Self-Training.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

SANCL: Multimodal Review Helpfulness Prediction with Selective Attention and Natural Contrastive Learning.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

DoubleMix: Simple Interpolation-Based Data Augmentation for Text Classification.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

2021
Bridging the gap between streaming and non-streaming ASR systems by distilling ensembles of CTC and RNN-T models.
CoRR, 2021

RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Exploring Targeted Universal Adversarial Perturbations to End-to-End ASR Models.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Bridging the Gap Between Streaming and Non-Streaming ASR Systems by Distilling Ensembles of CTC and RNN-T Models.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal Sentiment Analysis.
Proceedings of the ICMI '21: International Conference on Multimodal Interaction, 2021

Dual-mode ASR: Unify and Improve Streaming ASR with Full-context Modeling.
Proceedings of the 9th International Conference on Learning Representations, 2021

FastEmit: Low-Latency Streaming ASR with Sequence-Level Emission Regularization.
Proceedings of the IEEE International Conference on Acoustics, 2021

A Better and Faster end-to-end Model for Streaming ASR.
Proceedings of the IEEE International Conference on Acoustics, 2021

Improving Streaming Automatic Speech Recognition with Non-Streaming Model Distillation on Unsupervised Data.
Proceedings of the IEEE International Conference on Acoustics, 2021

Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

w2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020
Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition.
CoRR, 2020

Universal ASR: Unify and Improve Streaming ASR with Full-context Modeling.
CoRR, 2020

Improved Noisy Student Training for Automatic Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Conformer: Convolution-augmented Transformer for Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Streaming Object Detection for 3-D Point Clouds.
Proceedings of the Computer Vision - ECCV 2020, 2020

Scalability in Perception for Autonomous Driving: Waymo Open Dataset.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Learning compact neural network representations with structural priors.
PhD thesis, 2019

StarNet: Targeted Computation for Object Detection in Point Clouds.
CoRR, 2019

A Comparison of End-to-End Models for Long-Form Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018
Learning 3D-FilterMap for Deep Convolutional Neural Networks.
CoRR, 2018

3D-FilterMap: A Compact Architecture for Deep Convolutional Neural Networks.
Proceedings of the 6th International Conference on Learning Representations, 2018

Image Super-Resolution via Dual-State Recurrent Networks.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017
Dilated Recurrent Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017


Balanced Two-Stage Residual Networks for Image Super-Resolution.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017

2016
Robust Single Image Super-Resolution via Deep Networks With Sparse Prior.
IEEE Trans. Image Process., 2016

Seq-NMS for Video Object Detection.
CoRR, 2016

2015
Deeply Improved Sparse Coding for Image Super-Resolution.
CoRR, 2015

An Analysis of Unsupervised Pre-training in Light of Recent Advances.
Proceedings of the 3rd International Conference on Learning Representations, 2015

Heterogeneous Network Embedding via Deep Architectures.
Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015

Deep Networks for Image Super-Resolution with Sparse Prior.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Self-tuned deep super resolution.
Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2015

2014
Multimedia Classification.
Proceedings of the Data Classification: Algorithms and Applications, 2014

