Jun Zhang

Affiliations:

AI Lab, ByteDance Inc., Speech & Audio Team, Beijing, China
Alibaba Shenma Search, Beijing, China

According to our database, Jun Zhang authored at least 21 papers between 2017 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2024

SALMONN-omni: A Codec-free LLM for Full-duplex Speech Understanding and Generation.

CoRR, 2024

Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation.

CoRR, 2024

Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition.

CoRR, 2024

A Comprehensive Solution to Connect Speech Encoder and Large Language Model for ASR.

CoRR, 2024

Can Large Language Models Understand Spatial Audio?

CoRR, 2024

SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

SA-SOT: Speaker-Aware Serialized Output Training for Multi-Talker ASR.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

2023

Language-specific Acoustic Boundary Learning for Mandarin-English Code-switching Speech Recognition.

CoRR, 2023

Text-only Domain Adaptation using Unified Speech-Text Representation in Transducer.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Language-specific Boundary Learning for Improving Mandarin-English Code-switching Speech Recognition.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Improving Large-Scale Deep Biasing With Phoneme Features and Text-Only Data in Streaming Transducer.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

CIF-PT: Bridging Speech and Text Representations for Spoken Language Understanding via Continuous Integrate-and-Fire Pre-Training.

Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022

Bring dialogue-context into RNN-T for streaming ASR.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Token-level Speaker Change Detection Using Speaker Difference and Speech Content via Continuous Integrate-and-fire.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

The Volcspeech System for the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

2021

HMM-Free Encoder Pre-Training for Streaming RNN Transducer.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

2020

Improving RNN transducer with normalized jointer network.

CoRR, 2020

Dynamic latency speech recognition with asynchronous revision.

CoRR, 2020

2017

Deep LSTM for Large Vocabulary Continuous Speech Recognition.

CoRR, 2017

Frame Stacking and Retaining for Recurrent Neural Network Acoustic Model.

CoRR, 2017

Exponential Moving Average Model in Parallel Speech Recognition Training.

CoRR, 2017
