Javier Hernando

According to our database, Javier Hernando authored at least 142 papers between 1989 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
BSC-UPC at EmoSPeech-IberLEF2024: Attention Pooling for Emotion Recognition.
Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2024) co-located with the Conference of the Spanish Society for Natural Language Processing (SEPLN 2024), 2024

Double Multi-Head Attention Multimodal System for Odyssey 2024 Speech Emotion Recognition Challenge.
Proceedings of the Odyssey 2024: The Speaker and Language Recognition Workshop, 2024

Mass-Editing Memory with Attention in Transformers: A cross-lingual exploration of knowledge.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Language modelling for speaker diarization in telephonic interviews.
Comput. Speech Lang., 2023

2022
Work-Efficient Parallel Non-Maximum Suppression Kernels.
Comput. J., 2022

Speaker Characterization by means of Attention Pooling.
Proceedings of the 6th International Conference, 2022

2021
The AXIOM Project: IoT on Heterogeneous Embedded Platforms.
IEEE Des. Test, 2021

Double Multi-Head Attention for Speaker Verification.
Proceedings of the IEEE International Conference on Acoustics, 2021

Self-supervised Deep Learning Approaches to Speaker Recognition: A Ph.D. Thesis Overview.
Proceedings of the Fifth International Conference, 2021

2020
Deep Learning in Speaker Recognition.
Proceedings of the Development and Analysis of Deep Learning Architectures, 2020

The UPC Speaker Verification System Submitted to VoxCeleb Speaker Recognition Challenge 2020 (VoxSRC-20).
CoRR, 2020

End-to-end User Recognition using Touchscreen Biometrics.
CoRR, 2020

Self-Attention Encoding and Pooling for Speaker Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Unsupervised Training of Siamese Networks for Speaker Verification.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

I-Vector Transformation Using K-Nearest Neighbors for Speaker Verification.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Auto-Encoding Nearest Neighbor i-Vectors for Speaker Verification.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Self Multi-Head Attention for Speaker Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

DNN Speaker Embeddings Using Autoencoder Pre-Training.
Proceedings of the 27th European Signal Processing Conference, 2019

2018
The use of long-term features for GMM- and i-vector-based speaker diarization systems.
EURASIP J. Audio Speech Music. Process., 2018

Restricted Boltzmann machines for vector representation of speech in speaker recognition.
Comput. Speech Lang., 2018

UPC Multimodal Speaker Diarization System for the 2018 Albayzin Challenge.
Proceedings of the Fourth International Conference, 2018

Restricted Boltzmann Machine Vectors for Speaker Clustering.
Proceedings of the Fourth International Conference, 2018

2017
Deep Learning Backend for Single and Multisession i-Vector Speaker Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

LSTM Neural Network-Based Speaker Segmentation Using Acoustic and Language Modelling.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017


2016
The AXIOM software layers.
Microprocess. Microsystems, 2016

Short- and Long-Term Speech Features for Hybrid HMM-i-Vector based Speaker Diarization System.
Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

From Features to Speaker Vectors by means of Restricted Boltzmann Machine Adaptation.
Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

UPC System for the 2016 MediaEval Multimodal Person Discovery in Broadcast TV task.
Proceedings of the Working Notes Proceedings of the MediaEval 2016 Workshop, 2016

The CAMOMILE Collaborative Annotation Platform for Multi-modal, Multi-lingual and Multi-media Documents.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Improving i-Vector and PLDA Based Speaker Clustering with Long-Term Features.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Deep Neural Networks for i-Vector Language Identification of Short Utterances in Cars.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Work-efficient parallel non-maximum suppression for embedded GPU architectures.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Hidden Markov Models.
Proceedings of the Encyclopedia of Biometrics, Second Edition, 2015

A deep analysis on age estimation.
Pattern Recognit. Lett., 2015

Deep Learning for Single and Multi-Session i-Vector Speaker Recognition.
CoRR, 2015

UPC System for the 2015 MediaEval Multimodal Person Discovery in Broadcast TV task.
Proceedings of the Working Notes Proceedings of the MediaEval 2015 Workshop, 2015

Using voice-quality measurements with prosodic and spectral features for speaker diarization.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Restricted Boltzmann Machine supervectors for speaker recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Feature classification by means of deep belief networks for speaker recognition.
Proceedings of the 23rd European Signal Processing Conference, 2015

2014
3D Joint Speaker Position and Orientation Tracking with Particle Filters.
Sensors, 2014

i-Vector Modeling with Deep Belief Networks for Multi-Session Speaker Recognition.
Proceedings of the Odyssey 2014: The Speaker and Language Recognition Workshop, 2014

Deep belief networks for i-vector based speaker recognition.
Proceedings of the IEEE International Conference on Acoustics, 2014

Global Impostor Selection for DBNs in Multi-session i-Vector Speaker Recognition.
Proceedings of the Advances in Speech and Language Technologies for Iberian Languages, 2014

2012
Simultaneous Speech Detection With Spatial Features for Speaker Diarization.
IEEE Trans. Speech Audio Process., 2012

Speaker overlap detection with prosodic features for speaker diarisation.
IET Signal Process., 2012

Speaker Diarization of Broadcast News in Albayzin 2010 Evaluation Campaign.
EURASIP J. Audio Speech Music. Process., 2012

On the use of agglomerative and spectral clustering in speaker diarization of meetings.
Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012

GCC-PHAT based Head Orientation Estimation.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Accelerating Boosting-Based Face Detection on GPUs.
Proceedings of the 41st International Conference on Parallel Processing, 2012

A novel method for low-constrained iris boundary localization.
Proceedings of the 5th IAPR International Conference on Biometrics, 2012

2011
Acoustic Event Detection Based on Feature-Level Fusion of Audio and Video Modalities.
EURASIP J. Adv. Signal Process., 2011

The Detection of Overlapping Speech with Prosodic Features for Speaker Diarization.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Real-time GPU-based face detection in HD video sequences.
Proceedings of the IEEE International Conference on Computer Vision Workshops, 2011

Two-source acoustic event detection and localization: Online implementation in a Smart-room.
Proceedings of the 19th European Signal Processing Conference, 2011

2010
Overlap detection for speaker diarization by fusing spectral and spatial features.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

2009
Automatic Speech Recognition.
Proceedings of the Computers in the Human Interaction Loop, 2009

Multimodal Person Identification.
Proceedings of the Computers in the Human Interaction Loop, 2009

Hidden Markov Models.
Proceedings of the Encyclopedia of Biometrics, 2009

Improving detection of acoustic events using audiovisual data and feature level fusion.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Audiovisual event detection towards scene understanding.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2009

Eigenfeatures and Supervectors in Feature and Score Fusion for SVM Face and Speaker Verification.
Proceedings of the Biometric ID Management and Multimodal Communication, 2009

2008
Multimodal identification and localization of users in a smart environment.
J. Multimodal User Interfaces, 2008

Audiovisual Head Orientation Estimation with Particle Filtering in Multisensor Scenarios.
EURASIP J. Adv. Signal Process., 2008

How vulnerable are prosodic features to professional imitators?
Proceedings of the Odyssey 2008: The Speaker and Language Recognition Workshop, 2008

Agatha: Multimodal Biometric Authentication Platform in Large-Scale Databases.
Proceedings of the ISSE 2008, 2008

Speaker orientation estimation based on hybridation of GCC-PHAT and HLBR.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Clustering initialization based on spatial information for speaker diarization of meetings.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Robustness of prosodic features to voice imitation.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Bi-Gaussian score equalization in an audio-visual SVM-based person verification system.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Multimodal real-time focus of attention estimation in SmartRooms.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2008

2007
Acoustic Beamforming for Speaker Diarization of Meetings.
IEEE Trans. Speech Audio Process., 2007

Assessment of On-Line Model Quality and Threshold Estimation in Speaker Verification.
IEICE Trans. Inf. Syst., 2007

On the use of genuine-impostor statistical information for score fusion in multimodal biometrics/sur l'usage de l'information statistique client-imposteur pour la fusion des scores en biométrie multimodale.
Ann. des Télécommunications, 2007

On the Effect of Score Equalization in SVM Multimodal Biometric Systems.
Proceedings of the SECRYPT 2007, 2007

Jitter and shimmer measurements for speaker recognition.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Audio-based approaches to head orientation estimation in a smart-room.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Score Equalization in SVM Multimodal Fusion for Person Recognition.
Proceedings of the E-business and Telecommunications - 4th International Conference, 2007

Histogram Equalization in SVM Multimodal Person Verification.
Proceedings of the Advances in Biometrics, International Conference, 2007

Multimodal Head Orientation Towards Attention Tracking in Smartrooms.
Proceedings of the IEEE International Conference on Acoustics, 2007

Automatic Weighting for the Combination of TDOA and Acoustic Features in Speaker Diarization for Meetings.
Proceedings of the IEEE International Conference on Acoustics, 2007

Model Complexity Selection and Cross-Validation EM Training for Robust Speaker Diarization.
Proceedings of the IEEE International Conference on Acoustics, 2007

Multispeaker Localization and Tracking in Intelligent Environments.
Proceedings of the Multimodal Technologies for Perception of Humans, 2007

Robust Speaker Identification for Meetings: UPC CLEAR'07 Meeting Room Evaluation System.
Proceedings of the Multimodal Technologies for Perception of Humans, 2007

Speaker Diarization for Conference Room: The UPC RT07s Evaluation System.
Proceedings of the Multimodal Technologies for Perception of Humans, 2007

2006
Architecture and dialogue design for a voice operated information system.
Appl. Intell., 2006

Person Verification by Fusion of Prosodic, Voice Spectral and Facial Parameters.
Proceedings of the SECRYPT 2006, 2006

Threshold Estimation with Continuously Trained Models in Speaker Verification.
Proceedings of the Odyssey 2006: The Speaker and Language Recognition Workshop, 2006

Jacobian Adaptation with Continuous Noise Estimation for Real Speaker Verification Applications.
Proceedings of the Odyssey 2006: The Speaker and Language Recognition Workshop, 2006

Hybrid Speech/non-speech detector applied to Speaker Diarization of Meetings.
Proceedings of the Odyssey 2006: The Speaker and Language Recognition Workshop, 2006

Automatic Cluster Complexity and Quantity Selection: Towards Robust Speaker Diarization.
Proceedings of the Machine Learning for Multimodal Interaction, 2006

On the fusion of prosody, voice spectrum and face features for multimodal person verification.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

On the use of Jacobian adaptation in real speaker verification applications.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Friends and enemies: a novel initialization for speaker diarization.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Audio person tracking in a smart-room environment.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Purity Algorithms for Speaker Diarization of Meetings Data.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Audio, Video and Multimodal Person Identification in a Smart Room.
Proceedings of the Multimodal Technologies for Perception of Humans, 2006

UPC Audio, Video and Multimodal Person Tracking Systems in the Clear Evaluation Campaign.
Proceedings of the Multimodal Technologies for Perception of Humans, 2006

2005
Detection of confusable words in automatic speech recognition.
IEEE Signal Process. Lett., 2005

Improved Jacobian Adaptation for Robust Speaker Verification.
IEICE Trans. Inf. Syst., 2005

Weighting Scores to Improve Speaker-Dependent Threshold Estimation in Text-Dependent Speaker Verification.
Proceedings of the Nonlinear Analyses and Algorithms for Speech Processing, 2005

Variance reduction by using separate genuine-impostor statistics in multimodal biometrics.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Effect of head orientation on the speaker localization performance in smart-room environment.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Automatic Speech Activity Detection, Source Localization, and Speech Recognition on the Chil Seminar Corpus.
Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, 2005

A New On-Line Model Quality Evaluation Method for Speaker Verification.
Proceedings of the Audio- and Video-Based Biometric Person Authentication, 2005

2004
Inter-Phone and Inter-Word Distances for Confusability Prediction in Speech Recognition.
Proces. del Leng. Natural, 2004

Applying speaker verification to certificate revocation.
Proceedings of the Odyssey 2004: The Speaker and Language Recognition Workshop, Toledo, Spain, May 31, 2004

On the use of score pruning in speaker verification for speaker dependent threshold estimation.
Proceedings of the Odyssey 2004: The Speaker and Language Recognition Workshop, Toledo, Spain, May 31, 2004

Model quality evaluation during enrolment for speaker verification.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Word confusability prediction in automatic speech recognition.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Jacobian adaptation with improved noise reference for speaker verification.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Speech enhancement and recognition by integrating adaptive beamforming and wiener filtering.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

2003
Covariation and weighting of harmonically decomposed streams for ASR.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Jacobian adaptation based on the frequency-filtered spectral energies.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Dialogue Management in an Automatic Meteorological Information System.
Proceedings of the Developments in Applied Artificial Intelligence, 2003

VoiceXML in a Real Automatic Meteorological Information System.
Proceedings of the Berliner XML Tage 2003, 13.-15. Oktober 2003 in Berlin, 2003

Automatic Estimation of a Priori Speaker Dependent Thresholds in Speaker Verification.
Proceedings of the Audio- and Video-Based Biometric Person Authentication, 2003

2002
Sistema de Información Meteorológica Automática por Teléfono ATTEMPS.
Proces. del Leng. Natural, 2002

ACIMET: access to meteorological information by telephone.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

2001
Time and frequency filtering of filter-bank energies for robust HMM speech recognition.
Speech Commun., 2001

A VQ speaker identification system in car environment for personalized infotainment.
Proceedings of the 2001: A Speaker Odyssey, 2001

Speaker identification for car infotainment applications.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

2000
On the use of filter-bank energies driven from the autocorrelation sequence for noisy speech recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

1999
Comparison of time & frequency filtering and cepstral-time matrix approaches in ASR.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

1998
Speaker verification on the polycost database using frequency filtered spectral energies.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

1997
Linear prediction of the one-sided autocorrelation sequence for noisy speech recognition.
IEEE Trans. Speech Audio Process., 1997

Speech recognition in a noisy car environment based on LP of the one-sided autocorrelation sequence and robust similarity measuring techniques.
Speech Commun., 1997

CDHMM speaker recognition by means of frequency filtering of filter-bank energies.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Robust speech parameters located in the frequency domain.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Maximum likelihood weighting of dynamic speech features for CDHMM speech recognition.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

1996
Frequency and time filtering of filter-bank energies for HMM speech recognition.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Third-order cumulant-based wiener filtering algorithm applied to robust speech recognition.
Proceedings of the 8th European Signal Processing Conference, 1996

1995
Robust hos-based techniques applied to speech recognition and enhancement.
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

On the decorrelation of filter-bank energies in speech recognition.
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

1994
Some fast higher order AR estimation techniques applied to parametric wiener filtering.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

Speaker identification in noisy conditions using linear prediction of the one-sided autocorrelation sequence.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

Speech recognition in noisy car environment based on OSALPC representation and robust similarity measuring techniques.
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

1993
Multiple multilabeling to improve HMM-based speech recognition in noise.
Proceedings of the Third European Conference on Speech Communication and Technology, 1993

1992
On the AR modelling of the one-sided autocorrelation sequence for noisy speech recognition.
Proceedings of the Second International Conference on Spoken Language Processing, 1992

1991
A comparative study of parameters and distances for noisy speech recognition.
Proceedings of the Second European Conference on Speech Communication and Technology, 1991

Pitch determination using the cepstrum of the one-sided autocorrelation sequence.
Proceedings of the 1991 International Conference on Acoustics, 1991

1989
Modeling of the analytic spectrum for speech recognition.
Proceedings of the First European Conference on Speech Communication and Technology, 1989

