We stand with Ukraine

Kyu Jeong Han

According to our database, Kyu Jeong Han authored at least 46 papers between 2002 and 2023.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2023

Wav2Seq: Pre-Training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages.

,

,

Shinji Watanabe

,

,

,

Kilian Q. Weinberger

,

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

2022

A review of speaker diarization: Recent advances with deep learning.

,

,

Dimitrios Dimitriadis

,

,

Shinji Watanabe

,

Shrikanth Narayanan

Comput. Speech Lang., 2022

E-Branchformer: Branchformer with Enhanced Merging for Speech Recognition.

,

,

,

,

Prashant Sridhar

,

,

Shinji Watanabe

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

On the Use of External Data for Spoken Named Entity Recognition.

,

,

,

,

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Performance-Efficiency Trade-Offs in Unsupervised Pre-Training for Speech Recognition.

,

,

,

,

Kilian Q. Weinberger

,

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

SLUE: New Benchmark Tasks For Spoken Language Understanding Evaluation on Natural Speech.

,

,

,

,

,

,

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition.

,

,

,

,

Shinji Watanabe

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

2021

Leveraging Pre-Trained Language Model for Speech Sentiment Analysis.

,

,

,

,

Shinji Watanabe

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Multi-Mode Transformer Transducer with Stochastic Future Context.

,

,

Prashant Sridhar

,

,

Shinji Watanabe

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Multistream CNN for Robust Acoustic Modeling.

,

,

Venkata Krishna Naveen Tadala

,

,

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

2020

Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap.

,

,

,

Shrikanth Narayanan

IEEE Signal Process. Lett., 2020

ASAPP-ASR: Multistream CNN and Self-Attentive SRU for SOTA Speech Recognition.

,

,

Jeremy Wohlwend

,

,

,

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2019

State-of-the-Art Speech Recognition Using Multi-Stream Self-Attention With Dilated 1D Convolutions.

,

,

,

CoRR, 2019

Speaker Diarization with Lexical Information.

,

,

,

,

,

Panayiotis G. Georgiou

,

Shrikanth Narayanan

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Survey Talk: When Attention Meets Speech Applications: Speech & Speaker Recognition Perspective.

,

,

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Multi-Stride Self-Attention for Speech Recognition.

,

,

,

,

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

State-of-the-Art Speech Recognition Using Multi-Stream Self-Attention with Dilated 1D Convolutions.

,

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018

The CAPIO 2017 Conversational Speech Recognition System.

,

Akshay Chandrashekaran

,

,

CoRR, 2018

Densely Connected Networks for Conversational Speech Recognition.

,

Akshay Chandrashekaran

,

,

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

2017

Deep Learning-Based Telephony Speech Recognition in the Wild.

,

,

,

,

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2016

Semi-Supervised Speaker Adaptation for In-Vehicle Speech Recognition with Deep Neural Networks.

,

,

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

2014

Robust language identification using convolutional neural network features.

Sriram Ganapathy

,

,

,

Mohamed Kamal Omar

,

Maarten Van Segbroeck

,

Shrikanth S. Narayanan

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

2013

Automatic speaker age and gender recognition using acoustic and prosodic level information fusion.

,

,

Shrikanth S. Narayanan

Comput. Speech Lang., 2013

TRAP language identification system for RATS phase II evaluation.

,

Sriram Ganapathy

,

,

Mohamed Kamal Omar

,

Shrikanth S. Narayanan

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

2012

Frame-based phonotactic Language Identification.

,

Jason W. Pelecanos

Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Keyword-conditioned phone N-gram modeling with contextual information for speaker verification.

,

Jason W. Pelecanos

,

Mohamed Kamal Omar

Proceedings of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, 2012

2011

Forensically inspired approaches to automatic speaker recognition.

,

Mohamed Kamal Omar

,

Jason W. Pelecanos

,

,

,

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2011

2010

Multimodal Speaker Segmentation and Identification in Presence of Overlapped Speech Segments.

,

,

Panayiotis G. Georgiou

,

Shrikanth S. Narayanan

J. Multim., 2010

Robust Multimodal Person Recognition Using Low-Complexity Audio-Visual Feature Fusion Approaches.

,

,

Shrikanth S. Narayanan

Int. J. Semantic Comput., 2010

A cluster-profile representation of emotion using agglomerative hierarchical clustering.

,

,

,

Shrikanth S. Narayanan

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Combining five acoustic level modeling methods for automatic speaker age and gender recognition.

,

,

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

A variable frame length and rate algorithm based on the spectral kurtosis measure for speaker verification.

,

,

,

Shrikanth S. Narayanan

,

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

An improved cluster model selection method for agglomerative hierarchical speaker clustering using incremental Gaussian mixture models.

,

Shrikanth S. Narayanan

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

2009

A Low-Complexity Dynamic Face-Voice Feature Fusion Approach to Multimodal Person Recognition.

,

,

Shrikanth S. Narayanan

Proceedings of the 11th IEEE International Symposium on Multimedia, 2009

Signature cluster model selection for incremental Gaussian mixture cluster modeling in agglomerative hierarchical speaker clustering.

,

Shrikanth S. Narayanan

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Improved speaker diarization of meeting speech with recurrent selection of representative speech segments and participant interaction pattern modeling.

,

Shrikanth S. Narayanan

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

2008

Strategies to Improve the Robustness of Agglomerative Hierarchical Clustering Under Data Source Variation for Speaker Diarization.

,

,

Shrikanth S. Narayanan

IEEE Trans. Speech Audio Process., 2008

The SAIL speaker diarization system for analysis of spontaneous meetings.

,

Panayiotis G. Georgiou

,

Shrikanth S. Narayanan

Proceedings of the International Workshop on Multimedia Signal Processing, 2008

Multimodal Speaker Segmentation in Presence of Overlapped Speech Segments.

,

,

Panayiotis G. Georgiou

,

Shrikanth S. Narayanan

Proceedings of the Tenth IEEE International Symposium on Multimedia (ISM2008), 2008

Agglomerative hierarchical speaker clustering using incremental Gaussian mixture cluster modeling.

,

Shrikanth S. Narayanan

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Novel inter-cluster distance measure combining GLR and ICR for improved agglomerative hierarchical speaker clustering.

,

Shrikanth S. Narayanan

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2008

2007

A robust stopping criterion for agglomerative hierarchical clustering in a speaker diarization system.

,

Shrikanth S. Narayanan

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Robust speaker clustering strategies to data source variation for improved speaker diarization.

,

,

Shrikanth S. Narayanan

Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2004

Robust speech recognition over packet networks: an overview.

Naveen Srinivasamurthy

,

,

Shrikanth S. Narayanan

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

A distributed speech recognition system in multi-user environments.

,

Shrikanth S. Narayanan

,

Naveen Srinivasamurthy

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

2002

Iterative decoding of a differential space-time block code with low complexity.

,

Proceedings of the 55th IEEE Vehicular Technology Conference, 2002
