Boris Ginsburg
According to our database, Boris Ginsburg authored at least 100 papers between 2002 and 2024.
Bibliography
2024
CoRR, 2024
VoiceTextBlender: Augmenting Large Language Models with Speech Capabilities via Single-Stage Joint Speech-Text Supervised Fine-Tuning.
CoRR, 2024
CoRR, 2024
Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data.
CoRR, 2024
META-CAT: Speaker-Informed Speech Embeddings via Meta Information Concatenation for Multi-talker ASR.
CoRR, 2024
Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition.
CoRR, 2024
Sortformer: Seamless Integration of Speaker Diarization and ASR by Bridging Timestamps and Tokens.
CoRR, 2024
Longer is (Not Necessarily) Stronger: Punctuated Long-Sequence Training for Enhanced Speech Recognition and Translation.
CoRR, 2024
CoRR, 2024
NEST: Self-supervised Fast Conformer as All-purpose Seasoning to Speech Processing Tasks.
CoRR, 2024
Genetic Instruct: Scaling up Synthetic Generation of Coding Instructions for Large Language Models.
CoRR, 2024
Codec-ASR: Training Performant Automatic Speech Recognition Systems with Discrete Speech Representations.
CoRR, 2024
BESTOW: Efficient and Streamable Speech Language Model with the Best of Two Worlds in GPT and T5.
CoRR, 2024
CoRR, 2024
CoRR, 2024
CoRR, 2024
CoRR, 2024
CoRR, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2024
Stateful Conformer with Cache-Based Inference for Streaming Automatic Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
SALM: Speech-Augmented Language Model with in-Context Learning for Speech Recognition and Translation.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
2023
The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System.
CoRR, 2023
Property-Aware Multi-Speaker Data Simulation: A Probabilistic Modelling Technique for Synthetic Data Generation.
CoRR, 2023
Towards training Bilingual and Code-Switched Speech Recognition models from Monolingual data sources.
CoRR, 2023
CoRR, 2023
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023
Proceedings of the 20th International Conference on Spoken Language Translation, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
A Compact End-to-End Model with Local and Global Context for Spoken Language Identification.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Leveraging Pretrained ASR Encoders for Effective and Efficient End-to-End Speech Intent Classification and Slot Filling.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Text-only domain adaptation for end-to-end ASR using integrated text-to-mel-spectrogram generator.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
SpellMapper: A non-autoregressive neural spellchecker for ASR customization with candidate retrieval based on n-gram mappings.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the International Conference on Machine Learning, 2023
Proceedings of the Eleventh International Conference on Learning Representations, 2023
Conformer-Based Target-Speaker Automatic Speech Recognition For Single-Channel Audio.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
ACE-VC: Adaptive and Controllable Voice Conversion Using Explicitly Disentangled Self-Supervised Speech Representations.
Proceedings of the IEEE International Conference on Acoustics, 2023
Accidental Learners: Spoken Language Identification in Multilingual Self-Supervised Models.
Proceedings of the IEEE International Conference on Acoustics, 2023
Vani: Very-Lightweight Accent-Controllable TTS for Native And Non-Native Speakers With Identity Preservation.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
LibriSpeech-PC: Benchmark for Evaluation of Punctuation and Capitalization Capabilities of End-to-End ASR Models.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
2022
Damage Control During Domain Adaptation for Transducer Based Automatic Speech Recognition.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022
Fast Entropy-Based Methods of Word-Level Confidence Estimation for End-to-End Automatic Speech Recognition.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Shallow Fusion of Weighted Finite-State Transducer and Language Model for Text Normalization.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Mixer-TTS: Non-Autoregressive, Fast and Compact Text-to-Speech Model Conditioned on Language Model Embeddings.
Proceedings of the IEEE International Conference on Acoustics, 2022
TitaNet: Neural Model for Speaker Representation with 1D Depth-Wise Separable Convolutions and Global Context.
Proceedings of the IEEE International Conference on Acoustics, 2022
2021
J. Chem. Inf. Model., 2021
TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction.
CoRR, 2021
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
SPGISpeech: 5,000 Hours of Transcribed Financial Audio for Fully Formatted End-to-End Speech Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
TalkNet: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Cross-Language Transfer Learning and Domain Adaptation for End-to-End Automatic Speech Recognition.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021
MarbleNet: Deep 1D Time-Channel Separable Convolutional Neural Network for Voice Activity Detection.
Proceedings of the IEEE International Conference on Acoustics, 2021
2020
MatchboxNet: 1D Time-Channel Separable Convolutional Neural Network Architecture for Speech Commands Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
QuartzNet: Deep Automatic Speech Recognition with 1D Time-Channel Separable Convolutions.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Correction of Automatic Speech Recognition with Transformer Sequence-To-Sequence Model.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
2019
Stochastic Gradient Methods with Layer-wise Adaptive Moments for Training of Deep Networks.
CoRR, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
2018
CoRR, 2018
OpenSeq2Seq: extensible toolkit for distributed and mixed precision training of sequence-to-sequence models.
CoRR, 2018
Comput. Methods Biomech. Biomed. Eng. Imaging Vis., 2018
Proceedings of the 6th International Conference on Learning Representations, 2018
Proceedings of the 6th International Conference on Learning Representations, 2018
2017
Comparison of Batch Normalization and Weight Normalization Algorithms for the Large-scale Image Classification.
CoRR, 2017
Proceedings of the 5th International Conference on Learning Representations, 2017
Proceedings of the 5th International Conference on Learning Representations, 2017
2016
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016
2002
Proceedings of the Tools and Algorithms for the Construction and Analysis of Systems, 2002