Philip N. Garner

ORCID: 0000-0002-0814-1348

According to our database, Philip N. Garner authored at least 109 papers between 1993 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Exploring neural oscillations during speech perception via surrogate gradient spiking neural networks.
CoRR, 2024

Bayesian Parameter-Efficient Fine-Tuning for Overcoming Catastrophic Forgetting.
CoRR, 2024

2023
Can ChatGPT Detect Intent? Evaluating Large Language Models for Spoken Language Understanding.
CoRR, 2023

An investigation into the adaptability of a diffusion-based TTS model.
CoRR, 2023

Vulnerability of Automatic Identity Recognition to Audio-Visual Deepfakes.
Proceedings of the IEEE International Joint Conference on Biometrics, 2023

The Interpreter Understands Your Meaning: End-to-end Spoken Language Understanding Aided by Speech Translation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

2022
Investigating a neural all pass warp in modern TTS applications.
Speech Commun., 2022

Surrogate Gradient Spiking Neural Networks as Encoders for Large Vocabulary Continuous Speech Recognition.
CoRR, 2022

Conversational Speech Recognition Needs Data? Experiments with Austrian German.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Low-Level Physiological Implications of End-to-End Learning for Speech Recognition.
Proceedings of the Interspeech 2022, 2022

Bayesian Recurrent Units and the Forward-Backward Algorithm.
Proceedings of the Interspeech 2022, 2022

2021
A Bayesian Approach to Recurrence in Neural Networks.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

Modeling Dialectal Variation for Swiss German Automatic Speech Recognition.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

A Bayesian Interpretation of the Light Gated Recurrent Unit.
Proceedings of the IEEE International Conference on Acoustics, 2021

Learning to Translate Low-Resourced Swiss German Dialectal Speech into Standard German Text.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

An Evaluation Benchmark for Automatic Speech Recognition of German-English Code-Switching.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020
A $t$-Distribution Based Operator for Enhancing Out of Distribution Robustness of Neural Network Classifiers.
IEEE Signal Process. Lett., 2020

2019
Unbiased Semi-Supervised LF-MMI Training Using Dropout.
Proceedings of the Interspeech 2019, 2019

Self-Attention for Speech Emotion Recognition.
Proceedings of the Interspeech 2019, 2019

An Investigation of Multilingual ASR Using End-to-end LF-MMI.
Proceedings of the IEEE International Conference on Acoustics, 2019

Empirical Evaluation and Combination of Punctuation Prediction Models Applied to Broadcast News.
Proceedings of the IEEE International Conference on Acoustics, 2019

An End-to-end Network to Synthesize Intonation Using a Generalized Command Response Model.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Cross-lingual adaptation of a CTC-based multilingual acoustic model.
Speech Commun., 2018

Intonation modelling using a muscle model and perceptually weighted matching pursuit.
Speech Commun., 2018

A Variational Prosody Model for the decomposition and synthesis of speech prosody.
CoRR, 2018

Context-Aware Attention Mechanism for Speech Emotion Recognition.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Fast Language Adaptation Using Phonological Information.
Proceedings of the Interspeech 2018, 2018

A Neural Model to Predict Parameters for a Generalized Command Response Model of Intonation.
Proceedings of the Interspeech 2018, 2018

2017
Multilingual Training and Cross-lingual Adaptation on CTC-based Acoustic Model.
CoRR, 2017

An Investigation of Deep Neural Networks for Multilingual Speech Recognition Training and Adaptation.
Proceedings of the Interspeech 2017, 2017

2016
Composition of Deep and Spiking Neural Networks for Very Low Bit Rate Speech Coding.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Investigating Spectral Amplitude Modulation Phase Hierarchy Features in Speech Synthesis.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

Emphasis recreation for TTS using intonation atoms.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

Design of a Speech Corpus for Research on Cross-Lingual Prosody Transfer.
Proceedings of the Speech and Computer - 18th International Conference, 2016

An Agonist-Antagonist Pitch Production Model.
Proceedings of the Speech and Computer - 18th International Conference, 2016

Investigating Cross-lingual Multi-level Adaptive Networks: The Importance of the Correlation of Source and Target Languages.
Proceedings of the 13th International Conference on Spoken Language Translation, 2016

Probabilistic Amplitude Demodulation Features in Speech Synthesis for Improving Prosody.
Proceedings of the Interspeech 2016, 2016

The SIWIS Database: A Multilingual Speech Database with Acted Emphasis.
Proceedings of the Interspeech 2016, 2016

PhonVoc: A Phonetic and Phonological Vocoding Toolkit.
Proceedings of the Interspeech 2016, 2016

Sound Pattern Matching for Automatic Prosodic Event Detection.
Proceedings of the Interspeech 2016, 2016

Modeling unvoiced sounds in statistical parametric speech synthesis with a continuous vocoder.
Proceedings of the 24th European Signal Processing Conference, 2016

2015
Incremental Syllable-Context Phonetic Vocoding.
IEEE ACM Trans. Audio Speech Lang. Process., 2015

Ad hoc microphone array calibration: Euclidean distance matrix completion algorithm and theoretical guarantees.
Signal Process., 2015

Spatial Sound Localization via Multipath Euclidean Distance Matrix Recovery.
IEEE J. Sel. Top. Signal Process., 2015

Exploiting foreign resources for DNN-based ASR.
EURASIP J. Audio Speech Music. Process., 2015

DNN-Based Speech Synthesis: Importance of Input Features and Training Data.
Proceedings of the Speech and Computer - 17th International Conference, 2015

Weighted correlation based atom decomposition intonation modelling.
Proceedings of the INTERSPEECH 2015, 2015

Robust microphone placement for source localization from noisy distance measurements.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Atom decomposition-based intonation modelling.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Phonological vocoding using artificial neural networks.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014
Using out-of-language data to improve an under-resourced speech recognizer.
Speech Commun., 2014

Enhanced diffuse field model for ad hoc microphone array calibration.
Signal Process., 2014

Combining Vocal Tract Length Normalization With Hierarchical Linear Transformations.
IEEE J. Sel. Top. Signal Process., 2014

Swiss French Regional Accent Identification.
Proceedings of the Odyssey 2014: The Speaker and Language Recognition Workshop, 2014

Automatic speech recognition and translation of a Swiss German dialect: Walliserdeutsch.
Proceedings of the INTERSPEECH 2014, 2014

Stress and accent transmission in HMM-based syllable-context very low bit rate speech coding.
Proceedings of the INTERSPEECH 2014, 2014

ROCKIT: Roadmap for Conversational Interaction Technologies.
Proceedings of the 2014 Workshop on Roadmapping the Future of Multimodal Interaction Research including Business Opportunities and Challenges, 2014

Ad-hoc microphone array calibration from partial distance measurements.
Proceedings of the 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays, 2014

2013
Applying Multi- and Cross-Lingual Stochastic Phone Space Transformations to Non-Native Speech Recognition.
IEEE Trans. Speech Audio Process., 2013

A Simple Continuous Pitch Estimation Algorithm.
IEEE Signal Process. Lett., 2013

Crosslingual tandem-SGMM: exploiting out-of-language data for acoustic model and feature level adaptation.
Proceedings of the INTERSPEECH 2013, 2013

Syllable-based pitch encoding for low bit rate speech coding with recognition/synthesis architecture.
Proceedings of the INTERSPEECH 2013, 2013

Euclidean distance matrix completion for ad-hoc microphone array calibration.
Proceedings of the 18th International Conference on Digital Signal Processing, 2013

Accent adaptation using Subspace Gaussian Mixture Models.
Proceedings of the IEEE International Conference on Acoustics, 2013

On the (UN)importance of the contextual factors in HMM-based speech synthesis and coding.
Proceedings of the IEEE International Conference on Acoustics, 2013

Evaluating intra- and crosslingual adaptation for non-native speech recognition in a bilingual environment.
Proceedings of the IEEE 4th International Conference on Cognitive Infocommunications, 2013

Impact of deep MLP architecture on different acoustic modeling techniques for under-resourced speech recognition.
Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

2012
Vocal Tract Length Normalization for Statistical Parametric Speech Synthesis.
IEEE Trans. Speech Audio Process., 2012

Transcribing Meetings With the AMIDA Systems.
IEEE Trans. Speech Audio Process., 2012

Boosting under-resourced speech recognizers by exploiting out-of-language data - case study on Afrikaans.
Proceedings of the Third Workshop on Spoken Language Technologies for Under-resourced Languages, 2012

MediaParl: Bilingual mixed language accented speech database.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Combining cepstral normalization and cochlear implant-like speech processing for microphone array-based speech recognition.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Comparing different acoustic modeling techniques for multilingual boosting.
Proceedings of the INTERSPEECH 2012, 2012

Microphone array beampattern characterization for hands-free speech applications.
Proceedings of the IEEE 7th Sensor Array and Multichannel Signal Processing Workshop, 2012

Combining vocal tract length normalization with hierarchial linear transformations.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Using KL-divergence and multilingual information to improve ASR for under-resourced languages.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011
Bayesian approaches to uncertainty in speech processing.
PhD thesis, 2011

Cepstral normalisation and the signal to noise ratio spectrum in automatic speech recognition.
Speech Commun., 2011

A Just-in-Time Document Retrieval System for Dialogues or Monologues.
Proceedings of the SIGDIAL 2011 Conference, 2011

Improving Non-Native ASR Through Stochastic Multilingual Phoneme Space Transformations.
Proceedings of the INTERSPEECH 2011, 2011

A Speech-based Just-in-Time Retrieval System using Semantic Search.
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference, 19-24 June, 2011, Portland, Oregon, USA, 2011

2010
Speaker adaptation and the evaluation of speaker similarity in the EMIME speech-to-speech translation project.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

Implementation of VTLN for statistical speech synthesis.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

English spoken term detection in multilingual recordings.
Proceedings of the INTERSPEECH 2010, 2010

Hands free audio analysis from home entertainment.
Proceedings of the INTERSPEECH 2010, 2010

The AMIDA 2009 meeting transcription system.
Proceedings of the INTERSPEECH 2010, 2010

Tracter: a lightweight dataflow framework.
Proceedings of the INTERSPEECH 2010, 2010

Sparse component analysis for speech recognition in multi-speaker environment.
Proceedings of the INTERSPEECH 2010, 2010

VTLN adaptation for statistical speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2010

Automatic temporal alignment of AV data with confidence estimation.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
Beamforming With a Maximum Negentropy Criterion.
IEEE Trans. Speech Audio Process., 2009

Real-time ASR from meetings.
Proceedings of the INTERSPEECH 2009, 2009

SNR features for automatic speech recognition.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

2008
Maximum kurtosis beamforming with the generalized sidelobe canceller.
Proceedings of the INTERSPEECH 2008, 2008

Silence models in weighted finite-state transducers.
Proceedings of the INTERSPEECH 2008, 2008

Filter bank design based on minimization of individual aliasing terms for minimum mutual information subband adaptive beamforming.
Proceedings of the IEEE International Conference on Acoustics, 2008

2004
A differential spectral voice activity detector.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2001
SpokenContent representation in MPEG-7.
IEEE Trans. Circuits Syst. Video Technol., 2001

2000
Representation and linking mechanisms for audio in MPEG-7.
Signal Process. Image Commun., 2000

Spoken content metadata and MPEG-7.
Proceedings of the ACM Multimedia 2000 Workshops, Los Angeles, CA, USA, October 30, 2000

1998
On the robust incorporation of formant features into hidden Markov models for automatic speech recognition.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

1997
On topic identification and dialogue move recognition.
Comput. Speech Lang., 1997

Using formant frequencies in speech recognition.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

A keyword selection strategy for dialogue move recognition and multi-class topic identification.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

1996
A theory of word frequencies and its application to dialogue move recognition.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Source position estimation using radial basis functions.
Proceedings of the 13th International Conference on Pattern Recognition, 1996

1993
Towards Sonar Based Perception and Modelling for Unmanned Untethered Underwater Vehicles.
Proceedings of the 1993 IEEE International Conference on Robotics and Automation, 1993

