Xiaodan Zhuang

According to our database, Xiaodan Zhuang authored at least 47 papers between 2007 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Focused Discriminative Training For Streaming CTC-Trained Automatic Speech Recognition Models.
CoRR, 2024

Optimizing Byte-level Representation for End-to-end ASR.
CoRR, 2024

2023
Approximate Nearest Neighbour Phrase Mining for Contextual Speech Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Variable Attention Masking for Configurable Transformer Transducer Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023

2021
Frame-Level Specaugment for Deep Convolutional Neural Networks in Hybrid ASR Systems.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

2020
SNDCNN: Self-Normalizing Deep CNNs with Scaled Exponential Linear Units for Speech Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Exploring Retraining-free Speech Recognition for Intra-sentential Code-switching.
Proceedings of the IEEE International Conference on Acoustics, 2019

2017
Toward a General Distributed Messaging Framework for Online Transaction Processing Applications.
IEEE Access, 2017

Improving DNN Bluetooth Narrowband Acoustic Models by Cross-Bandwidth and Cross-Lingual Initialization.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2014
Improving speech-based PTSD detection via multi-view learning.
Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

Effective representations for leveraging language content in multimedia event detection.
Proceedings of the IEEE International Conference on Acoustics, 2014

Text detection and recognition in natural scenes and consumer videos.
Proceedings of the IEEE International Conference on Acoustics, 2014

Text Classification via iVector Based Feature Representation.
Proceedings of the 11th IAPR International Workshop on Document Analysis Systems, 2014

Zero-Shot Event Detection Using Multi-modal Fusion of Weakly Supervised Concepts.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Compact unsupervised EEG response representation for emotion recognition.
Proceedings of IEEE-EMBS International Conference on Biomedical and Health Informatics, 2014

2013
Saliency-maximized audio visualization and efficient audio-visual browsing for faster-than-real-time human acoustic event detection.
ACM Trans. Appl. Percept., 2013

Scene image categorization and video event detection using Naive Bayes Nearest Neighbor.
Proceedings of the 2013 IEEE Workshop on Applications of Computer Vision, 2013

BBN VISER TRECVID 2013 Multimedia Event Detection and Multimedia Event Recounting Systems.
Proceedings of the 2013 TREC Video Retrieval Evaluation, 2013

Compact bag-of-words visual representation for effective linear classification.
Proceedings of the ACM Multimedia Conference, 2013

Probabilistic trainable segmenter for call center audio using multiple features.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Audio self organized units for high-level event detection.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

2012
BBNVISER : BBN VISER TRECVID 2012 Multimedia Event Detection and Multimedia Event Recounting Systems.
Proceedings of the 2012 TREC Video Retrieval Evaluation, 2012

Compact Audio Representation for Event Detection in Consumer Media.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Robust Event Detection From Spoken Content In Consumer Domain Videos.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Improving faster-than-real-time human acoustic event detection by saliency-maximized audio visualization.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Multi-channel Shape-Flow Kernel Descriptors for Robust Video Event Detection and Retrieval.
Proceedings of the Computer Vision - ECCV 2012, 2012

Multimodal feature fusion for robust event detection in web videos.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

2011
Efficient Object Localization with Variation-Normalized Gaussianized Vectors.
Proceedings of the Intelligent Video Event Analysis and Understanding, 2011

Modeling audio and visual cues for real-world event detection.
PhD thesis, 2011

Unlabeled data and other marginals.
Proceedings of the 2011 Symposium on Machine Learning in Speech and Language Processing, 2011

Synthesizing visual speech trajectory with minimum generation error.
Proceedings of the IEEE International Conference on Acoustics, 2011

Improving acoustic event detection using generalizable visual features and multi-modality modeling.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
Real-world acoustic event detection.
Pattern Recognit. Lett., 2010

Novel Gaussianized vector representation for improved natural scene categorization.
Pattern Recognit. Lett., 2010

A minimum converted trajectory error (MCTE) approach to high quality speech-to-lips conversion.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

FSM-based pronunciation modeling using articulatory phonological code.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

2009
Articulatory phonological code for word classification.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Acoustic fall detection using Gaussian mixture models and GMM supervectors.
Proceedings of the IEEE International Conference on Acoustics, 2009

Long-time span acoustic activity analysis from far-field sensors in smart homes.
Proceedings of the IEEE International Conference on Acoustics, 2009

2008
SIFT-Bag kernel for video event analysis.
Proceedings of the 16th International Conference on Multimedia 2008, 2008

The entropy of the articulatory phonological code: recognizing gestures from tract variables.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Face age estimation using patch-based hidden Markov model supervectors.
Proceedings of the 19th International Conference on Pattern Recognition (ICPR 2008), 2008

A novel Gaussianized vector representation for natural scene categorization.
Proceedings of the 19th International Conference on Pattern Recognition (ICPR 2008), 2008

Feature analysis and selection for acoustic event detection.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
HMM-Based Acoustic Event Detection with AdaBoost Feature Selection.
Proceedings of the Multimodal Technologies for Perception of Humans, 2007

Multichannel and Multimodality Person Identification.
Proceedings of the Multimodal Technologies for Perception of Humans, 2007

