Naomi Harte

Orcid: 0000-0002-9274-209X

According to our database, Naomi Harte authored at least 100 papers between 1996 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
The limits of the Mean Opinion Score for speech synthesis evaluation.
Comput. Speech Lang., March, 2024

Smiling in the Face and Voice of Avatars and Robots: Evidence for a 'Smiling McGurk Effect'.
IEEE Trans. Affect. Comput., 2024

Joint Speech-Text Embeddings for Multitask Speech Processing.
IEEE Access, 2024

2023
Listener sensitivity to deviating obstruents in WaveNet.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Sp1NY: A Quick and Flexible Speech Visualisation Tool in Python.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Query Based Acoustic Summarization for Podcasts.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Learnable Frontends That Do Not Learn: Quantifying Sensitivity To Filterbank Initialisation.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Deep Multi-Scale Feature Learning for Defocus Blur Estimation.
IEEE Trans. Image Process., 2022

Comparison of discrete transforms for deep-neural-networks-based speech enhancement.
IET Signal Process., 2022

Taris: An online speech recognition framework with sequence to sequence neural networks for both audio-only and audio-visual speech.
Comput. Speech Lang., 2022

Fine Grained Spoken Document Summarization Through Text Segmentation.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

To smile or not to smile: The effect of mismatched emotional expressions in a Human-Robot cooperative task.
Proceedings of the 31st IEEE International Conference on Robot and Human Interactive Communication, 2022

RoomReader: A Multimodal Corpus of Online Multiparty Conversational Interactions.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Learnable Acoustic Frontends in Bird Activity Detection.
Proceedings of the 17th International Workshop on Acoustic Signal Enhancement, 2022

Production characteristics of obstruents in WaveNet and older TTS systems.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Back to the Future: Extending the Blizzard Challenge 2013.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021
The Effect of Audio-Visual Smiles on Social Influence in a Cooperative Human-Agent Interaction Task.
ACM Trans. Comput. Hum. Interact., 2021

Mind your p's and k's - Comparing obstruents across TTS voices of the Blizzard Challenge 2013.
Proceedings of the 11th ISCA Speech Synthesis Workshop, 2021

Learning to Count Words in Fluent Speech Enables Online Speech Recognition.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Low Resource Species Agnostic Bird Activity Detection.
Proceedings of the IEEE Workshop on Signal Processing Systems, 2021

Dimensional perception of a 'smiling McGurk effect'.
Proceedings of the 9th International Conference on Affective Computing and Intelligent Interaction, 2021

2020
How to Teach DNNs to Pay Attention to the Visual Modality in Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

AV Taris: Online Audio-Visual Speech Recognition.
CoRR, 2020

Investigation of Auditory Nerve Model Based Analysis for Vocoded Speech Synthesis.
Proceedings of the Twelfth International Conference on Quality of Multimedia Experience, 2020

Should we Hard-Code the Recurrence Concept or Learn it Instead? Exploring the Transformer Architecture for Audio-Visual Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Can Auditory Nerve Models Tell us What's Different About WaveNet Vocoded Speech?
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

CoGANs For Unsupervised Visual Speech Adaptation To New Speakers.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Neural Generation of Dialogue Response Timings.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
The Effect of Multimodal Emotional Expression and Agent Appearance on Trust in Human-Agent Interaction.
Proceedings of the Motion, Interaction and Games, 2019

Mapping Theoretical and Methodological Perspectives for Understanding Speech Interface Interactions.
Proceedings of the Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems, 2019

2018
A longitudinal database of Irish political speech with annotations of speaker ability.
Lang. Resour. Evaluation, 2018

Perception and prediction of speaker appeal - A single speaker study.
Comput. Speech Lang., 2018

Investigating Speech Features for Continuous Turn-Taking Prediction Using LSTMs.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Survival at the Museum: A Cooperation Experiment with Emotionally Expressive Virtual Characters.
Proceedings of the 2018 on International Conference on Multimodal Interaction, 2018

Attention-based Audio-Visual Fusion for Robust Automatic Speech Recognition.
Proceedings of the 2018 on International Conference on Multimodal Interaction, 2018

Multimodal Continuous Turn-Taking Prediction Using Multiscale RNNs.
Proceedings of the 2018 on International Conference on Multimodal Interaction, 2018

Can DNNs Learn to Lipread Full Sentences?
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

The Impact of Reduced Video Quality on Visual Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

Voice Activity Detection Using Neurograms.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Objective Assessment of Perceptual Audio Quality Using ViSQOLAudio.
IEEE Trans. Broadcast., 2017

Speech emotion classification using combined neurogram and INTERSPEECH 2010 paralinguistic challenge features.
IET Signal Process., 2017

Detecting conversational gaze aversion using unsupervised learning.
Proceedings of the 25th European Signal Processing Conference, 2017

Automatic frequency feature extraction for bird species delimitation.
Proceedings of the 25th European Signal Processing Conference, 2017

Towards Lipreading Sentences with Active Appearance Models.
Proceedings of the 14th International Conference on Auditory-Visual Speech Processing, 2017

Thin slicing to predict viewer impressions of TED Talks.
Proceedings of the 14th International Conference on Auditory-Visual Speech Processing, 2017

2016
Bitrate classification of twice-encoded audio using objective quality features.
Proceedings of the Eighth International Conference on Quality of Multimedia Experience, 2016

YIN-Bird: Improved Pitch Tracking for Bird Vocalisations.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Closing Remarks.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Discussion.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Poster Overview Presentations.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Introduction.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

2015
TCD-TIMIT: An Audio-Visual Corpus of Continuous Speech.
IEEE Trans. Multim., 2015

ViSQOL: an objective speech quality model.
EURASIP J. Audio Speech Music. Process., 2015

TCD-VoIP, a research database of degraded speech for assessing quality in VoIP applications.
Proceedings of the Seventh International Workshop on Quality of Multimedia Experience, 2015

Quantifying difference in vocalizations of bird populations.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Measuring and monitoring speech quality for voice over IP with POLQA, ViSQOL and P.563.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

2014
Assessment of Audio/Video synchronisation in streaming media.
Proceedings of the Sixth International Workshop on Quality of Multimedia Experience, 2014

Perceived Audio Quality for Streaming Stereo Music.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Building a Database of Political Speech: Does Culture Matter in Charisma Annotations?
Proceedings of the 4th International Workshop on Audio/Visual Emotion Challenge, 2014

Effect of long-term ageing on i-vector speaker verification.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

2013
Speaker verification in score-ageing-quality classification space.
Comput. Speech Lang., 2013

Auditory detectability of vocal ageing and its effect on forensic automatic speaker recognition.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Eigenageing compensation for speaker verification.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Monitoring the effects of temporal clipping on VoIP speech quality.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Identifying new bird species from differences in birdsong.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Robustness of speech quality metrics to background noise and network degradations: Comparing ViSQOL, PESQ and POLQA.
Proceedings of the IEEE International Conference on Acoustics, 2013

Late integration of features for acoustic emotion recognition.
Proceedings of the 21st European Signal Processing Conference, 2013

2012
Algorithms for the Digital Restoration of Torn Films.
IEEE Trans. Image Process., 2012

Speech intelligibility prediction using a Neurogram Similarity Index Measure.
Speech Commun., 2012

Improving underwater visibility using vignetting correction.
Proceedings of the Visual Information Processing and Communication III, 2012

ViSQOL: The Virtual Speech Quality Objective Listener.
Proceedings of the IWAENC 2012 - International Workshop on Acoustic Signal Enhancement, Proceedings, RWTH Aachen University, Germany, September 4th, 2012

Compensating for Ageing and Quality variation in Speaker Verification.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Improved Speech Intelligibility with a Chimaera Hearing Aid Algorithm.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Phoneme-to-viseme Mapping for Visual Speech Recognition.
Proceedings of the ICPRAM 2012, 2012

Speaker verification with long-term ageing data.
Proceedings of the 5th IAPR International Conference on Biometrics, 2012

2011
Viseme definitions comparison for visual-only speech recognition.
Proceedings of the 19th European Signal Processing Conference, 2011

An extended multiresolution approach to mouth specific AAM fitting for Speech Recognition.
Proceedings of the 19th European Signal Processing Conference, 2011

Simulated performance intensity functions.
Proceedings of the 33rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2011

Effects of Long-Term Ageing on Speaker Verification.
Proceedings of the Biometrics and ID Management, 2011

2010
Speech intelligibility from image processing.
Speech Commun., 2010

Auditory Features Revisited for Robust Speech Recognition.
Proceedings of the 20th International Conference on Pattern Recognition, 2010

A comparison of auditory features for robust speech recognition.
Proceedings of the 18th European Signal Processing Conference, 2010

Evaluating sensorineural hearing loss with an auditory nerve model using a mean structural similarity measure.
Proceedings of the 18th European Signal Processing Conference, 2010

2009
On Parsing Visual Sequences with the Hidden Markov Model.
EURASIP J. Image Video Process., 2009

Error metrics for impaired auditory nerve responses of different phoneme groups.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

2008
Pathological Motion Detection for Robust Missing Data Treatment.
EURASIP J. Adv. Signal Process., 2008

Action Recognition in Multimedia Streams.
Proceedings of the Multimodal Processing and Interaction, Audio, Video, Text, 2008

2007
Rotation Detection using the Curl Equation.
Proceedings of the International Conference on Image Processing, 2007

Automated Segmentation of Torn Frames using the Graph Cuts Technique.
Proceedings of the International Conference on Image Processing, 2007

2006
Pathological Motion Detection for Robust Missing Data Treatment in Degraded Archived Media.
Proceedings of the International Conference on Image Processing, 2006

Exploiting Voicing Cues for Contrast Enhanced Frequency Shaping of Speech for Impaired Listeners.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Automated removal of overshoot artefact from images.
Proceedings of the 14th European Signal Processing Conference, 2006

2005
Towards a hardware realization of time-frequency source separation of speech.
Proceedings of the 2005 European Conference on Circuit Theory and Design, 2005

1999
Combined temporal and spectral multi-resolution phonetic modelling.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Discriminative spectral-temporal multiresolution features for speech recognition.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

1998
Joint recognition and segmentation using phonetically derived features and a hybrid phoneme model.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Multi-resolution cepstral features for phoneme recognition across speech sub-bands.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

A novel model for phoneme recognition using phonetically derived features.
Proceedings of the 9th European Signal Processing Conference, 1998

1997
Multi-resolution phonetic/segmental features and models for HMM-based speech recognition.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

1996
Dynamic features for segmental speech recognition.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996
