Ozlem Kalinli

According to our database, Ozlem Kalinli authored at least 67 papers between 2007 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Efficient Streaming LLM for Speech Recognition.
CoRR, 2024

M-BEST-RQ: A Multi-Channel Speech Foundation Model for Smart Glasses.
CoRR, 2024

Faster Speech-LLaMA Inference with Multi-token Prediction.
CoRR, 2024

Towards measuring fairness in speech recognition: Fair-Speech dataset.
CoRR, 2024

Token-Weighted RNN-T for Learning from Flawed Data.
CoRR, 2024

AudioChatLlama: Towards General-Purpose Speech Abilities for LLMs.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of a Multilingual ASR Model.
Proceedings of the IEEE International Conference on Acoustics, 2024

Recovering from Privacy-Preserving Masking with Large Language Models.
Proceedings of the IEEE International Conference on Acoustics, 2024

Contextual Biasing of Named-Entities with Large Language Models.
Proceedings of the IEEE International Conference on Acoustics, 2024

TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-Device ASR Models.
Proceedings of the IEEE International Conference on Acoustics, 2024

Correction Focused Language Model Training For Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2024

End-to-End Speech Recognition Contextualization with Large Language Models.
Proceedings of the IEEE International Conference on Acoustics, 2024

Effective Internal Language Model Training and Fusion for Factorized Transducer Model.
Proceedings of the IEEE International Conference on Acoustics, 2024

Prompting Large Language Models with Speech Recognition Abilities.
Proceedings of the IEEE International Conference on Acoustics, 2024

Forgetting Private Textual Sequences in Language Models Via Leave-One-Out Ensemble.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
Towards General-Purpose Speech Abilities for Large Language Models Using Unpaired Data.
CoRR, 2023

Augmenting text for spoken language understanding with Large Language Models.
CoRR, 2023

Recovering from Privacy-Preserving Masking with Large Language Models.
CoRR, 2023

Towards Selection of Text-to-speech Data to Augment ASR Training.
CoRR, 2023

Modality Confidence Aware Training for Robust End-to-End Spoken Language Understanding.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Multi-Head State Space Model for Speech Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Learning ASR Pathways: A Sparse Multilingual ASR Model.
Proceedings of the IEEE International Conference on Acoustics, 2023

Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities.
Proceedings of the IEEE International Conference on Acoustics, 2023

Anchored Speech Recognition with Neural Transducers.
Proceedings of the IEEE International Conference on Acoustics, 2023

Improving fast-slow Encoder based Transducer with Streaming Deliberation.
Proceedings of the IEEE International Conference on Acoustics, 2023

Factorized Blank Thresholding for Improved Runtime Efficiency of Neural Transducers.
Proceedings of the IEEE International Conference on Acoustics, 2023

Joint Federated Learning and Personalization for on-Device ASR.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Joint Audio/Text Training for Transformer Rescorer of Streaming Speech Recognition.
CoRR, 2022

Learning ASR pathways: A sparse multilingual ASR model.
CoRR, 2022

Learning a Dual-Mode Speech Recognition Model via Self-Pruning.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Scaling ASR Improves Zero and Few Shot Learning.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Streaming parallel transducer beam search with fast slow cascaded encoders.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Deliberation Model for On-Device Spoken Language Understanding.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Evaluating User Perception of Speech Recognition System Quality with Semantic Distance Metric.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Federated Domain Adaptation for ASR with Full Self-Supervision.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Omni-Sparsity DNN: Fast Sparsity Optimization for On-Device Streaming E2E ASR Via Supernet.
Proceedings of the IEEE International Conference on Acoustics, 2022

Streaming Transformer Transducer based Speech Recognition Using Non-Causal Convolution.
Proceedings of the IEEE International Conference on Acoustics, 2022

Neural-FST Class Language Model for End-to-End Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022

Joint Audio/Text Training for Transformer Rescorer of Streaming Speech Recognition.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

2021
Transferring Voice Knowledge for Acoustic Event Detection: An Empirical Study.
CoRR, 2021

Noisy Training Improves E2E ASR for the Edge.
CoRR, 2021

Flexi-Transducer: Optimizing Latency, Accuracy and Compute for Multi-Domain On-Device Scenarios.
CoRR, 2021

Transformer-Based Acoustic Modeling for Streaming Speech Synthesis.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Dynamic Encoder Transducer: A Flexible Solution for Trading Off Accuracy for Latency.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Dissecting User-Perceived Latency of On-Device E2E Speech Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Collaborative Training of Acoustic Encoders for Speech Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Flexi-Transducer: Optimizing Latency, Accuracy and Compute for Multi-Domain On-Device Scenarios.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Semantic Distance: A New Metric for ASR Performance Analysis Towards Spoken Language Understanding.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

SapAugment: Learning A Sample Adaptive Policy for Data Augmentation.
Proceedings of the IEEE International Conference on Acoustics, 2021

2019
Bandwidth Embeddings for Mixed-Bandwidth Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Parametric Cepstral Mean Normalization for Robust Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

2016
Analysis of Multi-Lingual Emotion Recognition Using Auditory Attention Features.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

2015
Emotion clustering based on probabilistic linear discriminant analysis.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

2013
Combination of auditory attention features with phone posteriors for better automatic phoneme segmentation.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

2012
Automatic Phoneme Segmentation Using Auditory Attention Features.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

2011
Syllable Segmentation of Continuous Speech Using Auditory Attention Cues.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Tone and pitch accent classification using auditory attention cues.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
Noise Adaptive Training for Robust Automatic Speech Recognition.
IEEE Trans. Speech Audio Process., 2010

2009
Prominence Detection Using Auditory Attention Cues and Task-Dependent High Level Information.
IEEE Trans. Speech Audio Process., 2009

Saliency-driven unstructured acoustic scene classification using latent perceptual indexing.
Proceedings of the 2009 IEEE International Workshop on Multimedia Signal Processing, 2009

Continuous speech recognition using attention shift decoding with soft decision.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Noise adaptive training using a vector taylor series approach for noise robust automatic speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2009

2008
Combining task-dependent information with auditory attention cues for prominence detection in speech.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

A top-down auditory attention model for learning task dependent influences on prominence detection in speech.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
A saliency-based auditory attention model with applications to unsupervised prominent syllable detection in speech.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Early auditory processing inspired features for robust automatic speech recognition.
Proceedings of the 15th European Signal Processing Conference, 2007

