Jun Zhang

Affiliations:
  • AI Lab, ByteDance Inc., Speech & Audio Team, Beijing, China
  • Alibaba Shenma Search, Beijing, China


According to our database, Jun Zhang authored at least 21 papers between 2017 and 2024.

Timeline: publications per year, 2017 to 2024 (chart not reproduced).

Bibliography

2024
SALMONN-omni: A Codec-free LLM for Full-duplex Speech Understanding and Generation.
CoRR, 2024

Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation.
CoRR, 2024

Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition.
CoRR, 2024

A Comprehensive Solution to Connect Speech Encoder and Large Language Model for ASR.
CoRR, 2024

SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words.
CoRR, 2024

Can Large Language Models Understand Spatial Audio?
CoRR, 2024

SA-SOT: Speaker-Aware Serialized Output Training for Multi-Talker ASR.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

2023
Language-specific Acoustic Boundary Learning for Mandarin-English Code-switching Speech Recognition.
CoRR, 2023

Text-only Domain Adaptation using Unified Speech-Text Representation in Transducer.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Language-specific Boundary Learning for Improving Mandarin-English Code-switching Speech Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Improving Large-Scale Deep Biasing With Phoneme Features and Text-Only Data in Streaming Transducer.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

CIF-PT: Bridging Speech and Text Representations for Spoken Language Understanding via Continuous Integrate-and-Fire Pre-Training.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022
Bring dialogue-context into RNN-T for streaming ASR.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Token-level Speaker Change Detection Using Speaker Difference and Speech Content via Continuous Integrate-and-fire.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

The Volcspeech System for the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

2021
HMM-Free Encoder Pre-Training for Streaming RNN Transducer.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

2020
Improving RNN transducer with normalized jointer network.
CoRR, 2020

Dynamic latency speech recognition with asynchronous revision.
CoRR, 2020

2017
Deep LSTM for Large Vocabulary Continuous Speech Recognition.
CoRR, 2017

Frame Stacking and Retaining for Recurrent Neural Network Acoustic Model.
CoRR, 2017

Exponential Moving Average Model in Parallel Speech Recognition Training.
CoRR, 2017

