Qin Jin

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

M4MM '22: 1st International Workshop on Methodologies for Multimedia.

Xavier Alameda-Pineda

Vincent Oria

Laura Toni

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

PIC'22: 4th Person in Context Workshop.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Muskits: an End-to-end Music Processing Toolkit for Singing Voice Synthesis.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

SingAug: Data Augmentation for Singing Voice Synthesis with Cycle-consistent Training Strategy.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Memobert: Pre-Training Model with Prompt-Based Learning for Multimodal Emotion Recognition.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

Training Strategies for Automatic Song Writing: A Unified Framework Perspective.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

Leveraging Trust Relations to Improve Academic Patent Recommendation.

Proceedings of the 55th Hawaii International Conference on System Sciences, 2022

MovieUN: A Dataset for Movie Understanding and Narrating.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Few-Shot Action Recognition with Hierarchical Matching and Contrastive Learning.

Proceedings of the Computer Vision - ECCV 2022, 2022

Multi-Task Learning Framework for Emotion Recognition In-the-Wild.

Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

Unifying Event Detection and Captioning as Sequence Generation via Pre-training.

Qi Zhang

Yuqing Song

Proceedings of the Computer Vision - ECCV 2022, 2022

TS2-Net: Token Shift and Selection Transformer for Text-Video Retrieval.

Proceedings of the Computer Vision - ECCV 2022, 2022

VRDFormer: End-to-End Video Visual Relation Detection with Transformers.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Valence and Arousal Estimation based on Multimodal Temporal-Aware Features for Videos in the Wild.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

DialogueEIN: Emotion Interaction Network for Dialogue Affective Analysis.

Proceedings of the 29th International Conference on Computational Linguistics, 2022

M3ED: Multi-modal Multi-scene Multi-label Emotional Dialogue Database.

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Image Difference Captioning with Pre-training and Contrastive Learning.

Linli Yao

Weiying Wang

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021

Pre-Trained Models: Past, Present and Future.

CoRR, 2021

Team RUC_AIM3 Technical Report at ActivityNet 2021: Entities Object Localization.

CoRR, 2021

WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training.

CoRR, 2021

Pre-trained models: Past, present and future.

AI Open, 2021

Efficient Proposal Generation with U-shaped Network for Temporal Sentence Grounding.

Ludan Ruan

Proceedings of the MMAsia '21: ACM Multimedia Asia, Gold Coast, Australia, December 1, 2021

Multimodal Fusion Strategies for Physiological-emotion Analysis.

Proceedings of the MuSe '21: Proceedings of the 2nd on Multimodal Sentiment Analysis Challenge, 2021

Question-controlled Text-aware Image Captioning.

Anwen Hu

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Product-oriented Machine Translation with Cross-modal Cross-lingual Pre-training.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

MMPT'21: International Joint Workshop on Multi-Modal Pre-Training for Multimedia Understanding.

Yong Rui

Proceedings of the ICMR '21: International Conference on Multimedia Retrieval, 2021

Speech Emotion Recognition via Multi-Level Cross-Modal Distillation.

Ruichen Li

Jinming Zhao

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Sequence-To-Sequence Singing Voice Synthesis With Perceptual Entropy Loss.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Language Resource Efficient Learning for Captioning.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Towards Diverse Paragraph Captioning for Untrimmed Videos.

Yuqing Song

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Missing Modality Imagination Network for Emotion Recognition with Uncertain Missing Modalities.

Jinming Zhao

Ruichen Li

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

MMGCN: Multimodal Fusion via Deep Graph Convolution Network for Emotion Recognition in Conversation.

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020

The End-of-End-to-End: A Video Understanding Pentathlon Challenge (2020).

CoRR, 2020

Team RUC_AIM3 Technical Report at Activitynet 2020 Task 2: Exploring Sequential Events Detection for Dense Video Captioning.

CoRR, 2020

YouMakeup VQA Challenge: Towards Fine-grained Action Understanding in Domain-Specific Videos.

CoRR, 2020

RUC_AIM3 at TRECVID 2020: Ad-hoc Video Search & Video to Text Description.

Proceedings of the 2020 TREC Video Retrieval Evaluation, 2020

VideoIC: A Video Interactive Comments Dataset and Multimodal Multitask Learning for Comments Generation.

Weiying Wang

Jieting Chen

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Semi-supervised Multi-modal Emotion Recognition with Cross-Modal Distribution Matching.

Jingjun Liang

Ruichen Li

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Multi-modal Fusion for Video Sentiment Analysis.

Proceedings of the MuSe'20: Proceedings of the 1st International on Multimodal Sentiment Analysis in Real-life Media Challenge and Workshop, 2020

ICECAP: Information Concentrated Entity-aware Image Captioning.

Anwen Hu

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Context-Aware Goodness of Pronunciation for Computer-Assisted Pronunciation Training.

Jiatong Shi

Nan Huo

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Skeleton-Based Interactive Graph Network For Human Object Interaction Detection.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

Fine-Grained Video-Text Retrieval With Hierarchical Graph Reasoning.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Say As You Wish: Fine-Grained Control of Image Caption Generation With Abstract Scene Graphs.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Better Captioning With Sequence-Level Exploration.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

Generating Video Descriptions With Latent Topic Guidance.

IEEE Trans. Multim., 2019

Integrating Temporal and Spatial Attentions for VATEX Video Captioning Challenge 2019.

CoRR, 2019

Activitynet 2019 Task 3: Exploring Contexts for Dense Captioning Events in Videos.

CoRR, 2019

RUC_AIM3 at TRECVID 2019: Video to Text.

Proceedings of the 2019 TREC Video Retrieval Evaluation, 2019

Visual Relation Detection with Multi-Level Attention.

Proceedings of the 27th ACM International Conference on Multimedia, 2019

Relation Understanding in Videos.

Proceedings of the 27th ACM International Conference on Multimedia, 2019

Adversarial Domain Adaption for Multi-Cultural Dimensional Emotion Recognition in Dyadic Interactions.

Proceedings of the 9th International on Audio/Visual Emotion Challenge and Workshop, 2019

Neural Storyboard Artist: Visualizing Stories with Coherent Image Sequences.

Proceedings of the 27th ACM International Conference on Multimedia, 2019

Unpaired Cross-lingual Image Caption Generation with Self-Supervised Rewards.

Proceedings of the 27th ACM International Conference on Multimedia, 2019

RUC at MediaEval 2019: Video Memorability Prediction Based on Visual Textual and Concept Related Features.

Proceedings of the Working Notes Proceedings of the MediaEval 2019 Workshop, 2019

Speech Emotion Recognition in Dyadic Dialogues with Attentive Interaction Modeling.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

From Words to Sentences: A Progressive Learning Approach for Zero-resource Machine Translation with Visual Pivots.

Jianlong Fu

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Cross-culture Multimodal Emotion Recognition with Adversarial Learning.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

YouMakeup: A Large-Scale Domain-Specific Multimodal Dataset for Fine-Grained Semantic Comprehension.

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Semi-supervised Multimodal Emotion Recognition with Improved Wasserstein GANs.

Jingjun Liang

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

Unsupervised Bilingual Lexicon Induction from Mono-Lingual Multimodal Data.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

RUC+CMU: System Report for Dense Captioning Events in Videos.

CoRR, 2018

Informedia @ TRECVID 2018: Ad-hoc Video Search, Video to Text Description, Activities in Extended video.

Proceedings of the 2018 TREC Video Retrieval Evaluation, 2018

Multimodal Dimensional and Continuous Emotion Recognition in Dyadic Video Interactions.

Jinming Zhao

Proceedings of the Advances in Multimedia Information Processing - PCM 2018, 2018

iMakeup: Makeup Instructional Video Dataset for Fine-Grained Dense Video Captioning.

Proceedings of the Advances in Multimedia Information Processing - PCM 2018, 2018

Multi-modal Multi-cultural Dimensional Continues Emotion Recognition in Dyadic Interactions.

Proceedings of the 2018 on Audio/Visual Emotion Challenge and Workshop, 2018

Session details: Deep-2 (Recognition).

Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Class-aware Self-Attention for Audio Event Recognition.

Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval, 2018

RUC at MediaEval 2018: Visual and Textual Features Exploration for Predicting Media Memorability.

Proceedings of the Working Notes Proceedings of the MediaEval 2018 Workshop, 2018

2017

Group division based on common weights in cross efficiency evaluation.

Mengying Zhang

Hongwei Liu

Int. J. Inf. Decis. Sci., 2017

Informedia @ TRECVID 2017.

Proceedings of the 2017 TREC Video Retrieval Evaluation, 2017

Knowing Yourself: Improving Video Caption via In-depth Recap.

Proceedings of the 2017 ACM on Multimedia Conference, 2017

Multimodal Multi-task Learning for Dimensional and Continuous Emotion Recognition.

Proceedings of the 7th Annual Workshop on Audio/Visual Emotion Challenge, Mountain View, CA, USA, October 23, 2017

Video Captioning with Guidance of Multimodal Latent Topics.

Proceedings of the 2017 ACM on Multimedia Conference, 2017

Generating Video Descriptions with Topic Guidance.

Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, 2017

RUC at MediaEval 2017: Predicting Media Interestingness Task.

Proceedings of the Working Notes Proceedings of the MediaEval 2017 Workshop co-located with the Conference and Labs of the Evaluation Forum (CLEF 2017), 2017

Emotion recognition with multimodal features and temporal models.

Proceedings of the 19th ACM International Conference on Multimodal Interaction, 2017

Facial Action Units Detection with Multi-Features and -AUs Fusion.

Xinrui Li

Proceedings of the 12th IEEE International Conference on Automatic Face & Gesture Recognition, 2017

2016

Boosting Recommendation in Unexplored Categories by User Price Preference.

ACM Trans. Inf. Syst., 2016

The Study of the Entrepreneurial Leadership Style of Real Estate Industry in China: Based on the Content Analysis of Microblog.

Mengying Zhang

Hongwei Liu

Int. J. Knowl. Based Organ., 2016

Coordinate the Express Delivery Supply Chain with Option Contracts.

Mengying Zhang

Int. J. Inf. Syst. Supply Chain Manag., 2016

A hybrid approach based on stochastic competitive Hopfield neural network and efficient genetic algorithm for frequency assignment problem.

Appl. Soft Comput., 2016

Informedia @ TRECVID 2016.

Proceedings of the 2016 TREC Video Retrieval Evaluation, 2016

Improving Image Captioning by Concept-Based Sentence Reranking.

Xirong Li

Proceedings of the Advances in Multimedia Information Processing - PCM 2016, 2016

History Rhyme: Searching Historic Events by Multimedia Knowledge.

Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Detecting Violence in Video using Subclasses.

Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Describing Videos using Multi-modal Fusion.

Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Semantic Image Profiling for Historic Events: Linking Images to Phrases.

Yifan Xiong

Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Multi-modal Conditional Attention Fusion for Dimensional Emotion Prediction.

Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Video Description Generation using Audio and Visual Cues.

Junwei Liang

Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, 2016

RUC at MediaEval 2016 Emotional Impact of Movies Task: Fusion of Multimodal Features.

Proceedings of the Working Notes Proceedings of the MediaEval 2016 Workshop, 2016

RUC at MediaEval 2016: Predicting Media Interestingness Task.

Yujie Dian

Proceedings of the Working Notes Proceedings of the MediaEval 2016 Workshop, 2016

Generating Natural Video Descriptions via Multimodal Processing.

Junwei Liang

Xiaozhu Lin

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Video emotion recognition in the wild based on fusion of multimodal features.

Proceedings of the 18th ACM International Conference on Multimodal Interaction, 2016

Violent Scene Detection Using Convolutional Neural Networks and Deep Audio Features.

Guankun Mu

Haibing Cao

Proceedings of the Pattern Recognition - 7th Chinese Conference, 2016

Emotion Recognition in Videos via Fusing Multimodal Features.

Proceedings of the Pattern Recognition - 7th Chinese Conference, 2016

2015

Exploitation and Exploration Balanced Hierarchical Summary for Landmark Images.

IEEE Trans. Multim., 2015

Persistent B+-Trees in Non-Volatile Main Memory.

Shimin Chen

Proc. VLDB Endow., 2015

基于声学特征的语言情感识别 (Speech Emotion Recognition Based on Acoustic Features).

计算机科学 (Computer Science), 2015

Lead curve detection in drawings with complex cross-points.

Neurocomputing, 2015

Image Profiling for History Events on the Fly.

Yong Yu

Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Multi-modal Dimensional Emotion Recognition using Recurrent Neural Networks.

Proceedings of the 5th International Workshop on Audio/Visual Emotion Challenge, 2015

Semantic Concept Annotation For User Generated Videos Using Soundtracks.

Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

RUCMM at MediaEval 2015 Affective Impact of Movies Task: Fusion of Audio and Visual Cues.

Proceedings of the Working Notes Proceedings of the MediaEval 2015 Workshop, 2015

Detecting semantic concepts in consumer videos using audio.

Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, 2015

Speech emotion recognition with acoustic and lexical features.

Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, 2015

RUC-Tencent at ImageCLEF 2015: Concept Detection, Localization and Sentence Generation.

Proceedings of the Working Notes of CLEF 2015, 2015

Improving emotion classification on Chinese microblog texts with auxiliary cross-domain data.

Huimin Wu

Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

2014

Special Issue on "Hybrid intelligence for growing internet and its applications".

Future Gener. Comput. Syst., 2014

A guided Hopfield evolutionary algorithm with local search for maximum clique problem.

Proceedings of the 2014 IEEE International Conference on Systems, Man, and Cybernetics, 2014

Does product recommendation meet its waterloo in unexplored categories?: no, price comes to help.

Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2014

Semantic Concept Annotation of Consumer Videos at Frame-Level Using Audio.

Proceedings of the Advances in Multimedia Information Processing - PCM 2014, 2014

Adaptive Tag Selection for Image Annotation.

Proceedings of the Advances in Multimedia Information Processing - PCM 2014, 2014

Emotion Classification of Chinese Microblog Text via Fusion of BoW and eVector Feature Representations.

Chengxin Li

Huimin Wu

Proceedings of the Natural Language Processing and Chinese Computing, 2014

Speech emotion classification using acoustic features.

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Structure Perturbation Optimization for Hopfield-Type Neural Networks.

Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2014, 2014

Renmin University of China at ImageCLEF 2014 Scalable Concept Image Annotation.

Proceedings of the Working Notes for CLEF 2014 Conference, 2014

An overview of robustness related issues in speaker recognition.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

2013

Tell me what happened here in history.

Proceedings of the ACM Multimedia Conference, 2013

Renmin University of China at ImageCLEF 2013 Scalable Concept Image Annotation.

Proceedings of the Working Notes for CLEF 2013 Conference, 2013

2012

Event-based Video Retrieval Using Audio.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

2011

Informedia@TRECVID 2011: Surveillance Event Detection.

Proceedings of the 2011 TREC Video Retrieval Evaluation, 2011

Investigation of Cross-Show Speaker Diarization.

Qian Yang

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Analysis of Dialectal Influence in Pan-Arabic ASR.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Harmonic Structure Transform for Speaker Recognition.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

2010

Modeling Prosody for Speaker Recognition: Why Estimating Pitch May Be a Red Herring.

Proceedings of the Odyssey 2010: The Speaker and Language Recognition Workshop, Brno, Czech Republic, June 28, 2010

The 2010 CMU GALE speech-to-text system.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Speaker identification with distant microphone speech.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2010

2009

Speaker identification using warped MVDR cepstral features.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Improving speaker segmentation via speaker identification and text segmentation.

Runxin Li

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

The I4U system in NIST 2008 speaker recognition evaluation.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2009

Modeling instantaneous intonation for speaker identification using the fundamental frequency variation spectrum.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2009

Voice convergin: Speaker de-identification by voice transformation.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2009

Detecting bandlimited audio in broadcast television shows.

Mark C. Fuhs

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2009

Speaker de-identification via voice transformation.

Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

2008

Robust far-field speaker identification under mismatched conditions.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

The CMU-interACT 2008 Mandarin transcription system.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Is voice transformation a threat to speaker identification?

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2008

2007

Far-Field Speaker Recognition.

IEEE Trans. Speech Audio Process., 2007

Whispering Speaker Identification.

Szu-Chen Stan Jou

Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Multi-modal Person Identification in a Smart Environment.

Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

ISL Person Identification Systems in the CLEAR 2007 Evaluations.

Proceedings of the Multimodal Technologies for Perception of Humans, 2007

2006

Far-Field Speaker Recognition.

Yue Pan

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

ISL Person Identification Systems in the CLEAR Evaluations.

Hazim Kemal Ekenel

Proceedings of the Multimodal Technologies for Perception of Humans, 2006

2005

CMU Informedia's TRECVID 2005 Skirmishes.

Proceedings of the 2005 TREC Video Retrieval Evaluation, 2005

2004

Issues in meeting transcription - the ISL meeting transcription system.

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Crosscorrelation-based multispeaker speech activity detection.

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Speaker segmentation and clustering in meetings.

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

The 2003 ISL rich transcription system for conversational telephony speech.

Proceedings of the 2004 IEEE International Conference on Acoustics, Speech and Signal Processing, 2004

2003

The SuperSID project: exploiting high-level information for high-accuracy speaker recognition.

Proceedings of the 2003 IEEE International Conference on Acoustics, Speech and Signal Processing, 2003

Phonetic speaker recognition using maximum-likelihood binary-decision tree models.

Proceedings of the 2003 IEEE International Conference on Acoustics, Speech and Signal Processing, 2003

Combining cross-stream and time dimensions in phonetic speaker recognition.

Proceedings of the 2003 IEEE International Conference on Acoustics, Speech and Signal Processing, 2003

2002

Phonetic speaker identification.

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Speaker identification using multilingual phone strings.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2002

Improvements in Non-Verbal Cue Identification Using Multilingual Phone Strings.

Proceedings of the Workshop on Speech-to-Speech Translation: Algorithms and Systems@ACL 2002, 2002

2000

A naïve de-lambing method for speaker identification.

Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Application of LDA to speaker recognition.

Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

1998

A high-performance text-independent speaker identification system based on BCDM.
