Siddharth Dalmia

Orcid: 0000-0003-0437-5988

According to our database, Siddharth Dalmia authored at least 41 papers between 2015 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.



Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?
CoRR, 2024

Transforming LLMs into Cross-modal and Cross-lingual Retrieval Systems.
CoRR, 2024

LLM Augmented LLMs: Expanding Capabilities through Composition.
CoRR, 2024

LLM Augmented LLMs: Expanding Capabilities through Composition.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Multimodal Modeling for Spoken Language Identification.
Proceedings of the IEEE International Conference on Acoustics, 2024

LegoNN: Building Modular Encoder-Decoder Models.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Multimodal Modeling for Spoken Language Identification.
CoRR, 2023

Align, Write, Re-Order: Explainable End-to-End Speech Translation via Operation Sequence Generation.
Proceedings of the IEEE International Conference on Acoustics, 2023

CTC Alignments Improve Autoregressive Translation.
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2023

A Study on the Integration of Pre-Trained SSL, ASR, LM and SLU Models for Spoken Language Understanding.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

FLEURS: Few-Shot Learning Evaluation of Universal Representations of Speech.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

CMU's IWSLT 2022 Dialect Speech Translation System.
Proceedings of the 19th International Conference on Spoken Language Translation, 2022

Two-Pass Low Latency End-to-End Spoken Language Understanding.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding.
Proceedings of the International Conference on Machine Learning, 2022

Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization.
Proceedings of the IEEE International Conference on Acoustics, 2022

ESPnet-SLU: Advancing Spoken Language Understanding Through ESPnet.
Proceedings of the IEEE International Conference on Acoustics, 2022

Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Searchable Hidden Intermediates for End-to-End Models of Decomposable Sequence Tasks.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

ESPnet-ST IWSLT 2021 Offline Speech Translation System.
Proceedings of the 18th International Conference on Spoken Language Translation, 2021

Differentiable Allophone Graphs for Language-Universal Speech Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Rethinking End-to-End Evaluation of Decomposable Tasks: A Case Study on Spoken Language Understanding.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Transformer-Transducers for Code-Switched Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

NoiseQA: Challenge Set Evaluation for User-Centric Question Answering.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021

Fast-MD: Fast Multi-Decoder End-to-End Speech Translation with Non-Autoregressive Hidden Intermediates.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Universal Phone Recognition with a Multilingual Allophone System.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

On Long-Tailed Phenomena in Neural Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Towards Zero-Shot Learning for Automatic Phonemic Transcription.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Enforcing Encoder-Decoder Modularity in Sequence-to-Sequence Models.
CoRR, 2019

The ARIEL-CMU Systems for LoReHLT18.
CoRR, 2019

SANTLR: Speech Annotation Toolkit for Low Resource Languages.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Multilingual Speech Recognition with Corpus Relatedness Sampling.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Cross-Attention End-to-End ASR for Two-Party Conversations.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Phoneme Level Language Models for Sequence Based Low Resource ASR.
Proceedings of the IEEE International Conference on Acoustics, 2019

Gated Embeddings in End-to-End Speech Recognition for Conversational-Context Fusion.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Domain Robust Feature Extraction for Rapid Low Resource ASR Development.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Epitran: Precision G2P for Many Languages.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

Sequence-Based Multi-Lingual Low Resource Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

A novel similarity measure: Voronoi audio similarity for genre classification.
Int. J. Intell. Syst. Technol. Appl., 2017

An approach for self-training audio event detectors using web data.
Proceedings of the 25th European Signal Processing Conference, 2017

Robust ASR using neural network based speech enhancement and feature simulation.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015
