Xiaodan Zhuang

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Variable Attention Masking for Configurable Transformer Transducer Speech Recognition.

Pawel Swietojanski

Stefan Braun

Dogan Can

Thiago Fraga da Silva

Proceedings of the IEEE International Conference on Acoustics, 2023

2021

Frame-Level Specaugment for Deep Convolutional Neural Networks in Hybrid ASR Systems.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

2020

SNDCNN: Self-Normalizing Deep CNNs with Scaled Exponential Linear Units for Speech Recognition.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

Exploring Retraining-free Speech Recognition for Intra-sentential Code-switching.

Sabato Marco Siniscalchi

Proceedings of the IEEE International Conference on Acoustics, 2019

2017

Toward a General Distributed Messaging Framework for Online Transaction Processing Applications.

IEEE Access, 2017

Improving DNN Bluetooth Narrowband Acoustic Models by Cross-Bandwidth and Cross-Lingual Initialization.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2014

Improving speech-based PTSD detection via multi-view learning.

Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

Effective representations for leveraging language content in multimedia event detection.

Proceedings of the IEEE International Conference on Acoustics, 2014

Text detection and recognition in natural scenes and consumer videos.

Proceedings of the IEEE International Conference on Acoustics, 2014

Text Classification via iVector Based Feature Representation.

Proceedings of the 11th IAPR International Workshop on Document Analysis Systems, 2014

Zero-Shot Event Detection Using Multi-modal Fusion of Weakly Supervised Concepts.

Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Compact unsupervised EEG response representation for emotion recognition.

Viktor Rozgic

Michael Crystal

Proceedings of IEEE-EMBS International Conference on Biomedical and Health Informatics, 2014

2013

Saliency-maximized audio visualization and efficient audio-visual browsing for faster-than-real-time human acoustic event detection.

Shiv Naga Prasad Vitaladevuni

ACM Trans. Appl. Percept., 2013

Scene image categorization and video event detection using Naive Bayes Nearest Neighbor.

Proceedings of the 2013 IEEE Workshop on Applications of Computer Vision, 2013

BBN VISER TRECVID 2013 Multimedia Event Detection and Multimedia Event Recounting Systems.

Proceedings of the 2013 TREC Video Retrieval Evaluation, 2013

Compact bag-of-words visual representation for effective linear classification.

Proceedings of the ACM Multimedia Conference, 2013

Probabilistic trainable segmenter for call center audio using multiple features.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Audio self organized units for high-level event detection.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

2012

BBNVISER : BBN VISER TRECVID 2012 Multimedia Event Detection and Multimedia Event Recounting Systems.

Proceedings of the 2012 TREC Video Retrieval Evaluation, 2012

Compact Audio Representation for Event Detection in Consumer Media.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Robust Event Detection From Spoken Content In Consumer Domain Videos.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Improving faster-than-real-time human acoustic event detection by saliency-maximized audio visualization.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Multi-channel Shape-Flow Kernel Descriptors for Robust Video Event Detection and Retrieval.

Shiv Naga Prasad Vitaladevuni

Proceedings of the Computer Vision - ECCV 2012, 2012

Multimodal feature fusion for robust event detection in web videos.

Shiv Naga Prasad Vitaladevuni

Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

2011

Efficient Object Localization with Variation-Normalized Gaussianized Vectors.

Proceedings of the Intelligent Video Event Analysis and Understanding, 2011

Modeling audio and visual cues for real-world event detection.

PhD thesis, 2011

BBN VISER TRECVID 2011 Multimedia Event Detection System.

Proceedings of the 2011 TREC Video Retrieval Evaluation, 2011

Unlabeled data and other marginals.

Jui-Ting Huang

Proceedings of the 2011 Symposium on Machine Learning in Speech and Language Processing, 2011

Synthesizing visual speech trajectory with minimum generation error.

Proceedings of the IEEE International Conference on Acoustics, 2011

Improving acoustic event detection using generalizable visual features and multi-modality modeling.

Po-Sen Huang

Proceedings of the IEEE International Conference on Acoustics, 2011

2010

Real-world acoustic event detection.

Pattern Recognit. Lett., 2010

Novel Gaussianized vector representation for improved natural scene categorization.

Hao Tang

Pattern Recognit. Lett., 2010

A minimum converted trajectory error (MCTE) approach to high quality speech-to-lips conversion.

Lijuan Wang

Frank K. Soong

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

FSM-based pronunciation modeling using articulatory phonological code.

Chi Hu

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

2009

Articulatory phonological code for word classification.

Hosung Nam

Louis Goldstein

Elliot Saltzman

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Acoustic fall detection using Gaussian mixture models and GMM supervectors.

Jing Huang

Gerasimos Potamianos

Proceedings of the IEEE International Conference on Acoustics, 2009

Long-time span acoustic activity analysis from far-field sensors in smart homes.

Proceedings of the IEEE International Conference on Acoustics, 2009

2008

SIFT-Bag kernel for video event analysis.

Proceedings of the 16th International Conference on Multimedia 2008, 2008

The entropy of the articulatory phonological code: recognizing gestures from tract variables.

Hosung Nam

Louis M. Goldstein

Elliot Saltzman

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Face age estimation using patch-based hidden Markov model supervectors.

Proceedings of the 19th International Conference on Pattern Recognition (ICPR 2008), 2008

A novel Gaussianized vector representation for natural scene categorization.

Hao Tang

Proceedings of the 19th International Conference on Pattern Recognition (ICPR 2008), 2008

Feature analysis and selection for acoustic event detection.

Proceedings of the IEEE International Conference on Acoustics, 2008

2007

HMM-Based Acoustic Event Detection with AdaBoost Feature Selection.

Proceedings of the Multimodal Technologies for Perception of Humans, 2007

Multichannel and Multimodality Person Identification.
