Zhijian Ou
Orcid: 0000-0002-9018-5074
Affiliations:
- Tsinghua University, China
According to our database, Zhijian Ou authored at least 90 papers between 2001 and 2024.
Bibliography
2024
Found. Trends Signal Process., 2024
CoRR, 2024
Low-Resourced Speech Recognition for Iu Mien Language via Weakly-Supervised Phoneme-based Multilingual Pre-training.
CoRR, 2024
CUSIDE-T: Chunking, Simulating Future and Decoding for Transducer based Streaming ASR.
CoRR, 2024
Whistle: Data-Efficient Multilingual and Crosslingual Speech Recognition via Weakly Phonetic Supervision.
CoRR, 2024
The 2nd FutureDial Challenge: Dialog Systems with Retrieval Augmented Generation (FutureDial-RAG).
CoRR, 2024
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
2023
IEEE ACM Trans. Audio Speech Lang. Process., 2023
Proceedings of the 44th IEEE Symposium on Security and Privacy, 2023
Exploring Energy-based Language Models with Different Architectures and Training Methods for Speech Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
2022
A Generative User Simulator with GPT-based Architecture and Goal State Tracking for Reinforced Multi-Domain Dialog Systems.
CoRR, 2022
Jointly Reinforced User Simulator and Task-oriented Dialog System with Simplified Generative Architecture.
CoRR, 2022
Information Extraction and Human-Robot Dialogue towards Real-life Tasks: A Baseline Study with the MobileCS Dataset.
CoRR, 2022
CoRR, 2022
Revisiting Markovian Generative Architectures for Efficient Task-Oriented Dialog Systems.
CoRR, 2022
Exploiting Single-Channel Speech for Multi-Channel End-to-End Speech Recognition: A Comparative Study.
CoRR, 2022
Building Markovian Generative Architectures Over Pretrained LM Backbones for Efficient Task-Oriented Dialog Systems.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022
Advancing Semi-Supervised Task Oriented Dialog Systems by JSA Learning of Discrete Latent Variable Models.
Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2022
Exploiting Single-Channel Speech for Multi-Channel End-to-End Speech Recognition: A Comparative Study.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022
An Empirical Study of Language Model Integration for Transducer based Speech Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
2021
Advancing CTC-CRF Based End-to-End Speech Recognition with Wordpieces and Conformers.
CoRR, 2021
Efficient Neural Architecture Search for End-to-End Speech Recognition Via Straight-Through Gradients.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
The SLT 2021 Children Speech Recognition Challenge: Open Datasets, Rules and Baselines.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
An Empirical Comparison of Joint-Training and Pre-Training for Domain-Agnostic Semi-Supervised Learning Via Energy-Based Models.
Proceedings of the 2021 IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP), 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Multilingual and Crosslingual Speech Recognition Using Phonological-Vector Based Phone Embeddings.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
2020
Semi-Supervised Seq2seq Joint-Stochastic-Approximation Autoencoders With Applications to Semantic Parsing.
IEEE Signal Process. Lett., 2020
An empirical study of domain-agnostic semi-supervised learning via energy-based models: joint-training and pre-training.
CoRR, 2020
A Probabilistic End-To-End Task-Oriented Dialog Model with Latent Belief States towards Semi-Supervised Learning.
CoRR, 2020
Joint Stochastic Approximation and Its Application to Learning Discrete Latent Variable Models.
Proceedings of the Thirty-Sixth Conference on Uncertainty in Artificial Intelligence, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
CAT: A CTC-CRF Based ASR Toolkit Bridging the Hybrid and the End-to-End Approaches Towards Data Efficiency and Low Latency.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Integrating Discrete and Neural Features Via Mixed-Feature Trans-Dimensional Random Field Language Models.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
A Probabilistic End-To-End Task-Oriented Dialog Model with Latent Belief States towards Semi-Supervised Learning.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020
Task-Oriented Dialog Systems That Consider Multiple Appropriate Responses under the Same Context.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020
2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
2018
IEEE Trans. Pattern Anal. Mach. Intell., 2018
A Review of Learning with Deep Generative Models from perspective of graphical modeling.
CoRR, 2018
Improved Training Of Neural Trans-Dimensional Random field Language Models with Dynamic Noise-Contrastive Estimation.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018
Learning Sparse Structured Ensembles with stochastic Gradient MCMC Sampling and Network Pruning.
Proceedings of the 28th IEEE International Workshop on Machine Learning for Signal Processing, 2018
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018
Learning Neural Trans-Dimensional Random Field Language Models with Noise-Contrastive Estimation.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
2017
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
2016
Scalable Discovery of Audio Fingerprint Motifs in Broadcast Streams With Determinantal Point Process Based Motif Clustering.
IEEE ACM Trans. Audio Speech Lang. Process., 2016
Block-wise MAP inference for determinantal point processes with application to change-point detection.
Proceedings of the IEEE Statistical Signal Processing Workshop, 2016
Use of particle filtering and MCMC for inference in Probabilistic Acoustic Tube model.
Proceedings of the IEEE Statistical Signal Processing Workshop, 2016
2015
Block-Wise MAP Inference for Determinantal Point Processes with Application to Change-Point Detection.
CoRR, 2015
Proceedings of the 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2015
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015
2014
Low-complexity video encoder for smart eyes based on underdetermined blind signal separation.
CoRR, 2014
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014
Proceedings of the IEEE International Conference on Acoustics, 2014
2012
Probabilistic acoustic tube: a probabilistic generative model of speech for speech analysis/synthesis.
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, 2012
CRF-based confidence measures of recognized candidates for lattice-based audio indexing.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
Combining eigenvoice speaker modeling and VTS-based environment compensation for robust speech recognition.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
2011
Combining HMM-based melody extraction and NMF-based soft masking for separating voice and accompaniment from monaural audio.
Proceedings of the IEEE International Conference on Acoustics, 2011
2010
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010
Spoken English assessment system for non-native speakers using acoustic and prosodic features.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the IEEE International Conference on Acoustics, 2010
2008
Proceedings of the IEEE International Conference on Acoustics, 2008
2007
Closely Coupled Array Processing and Model-Based Compensation for Microphone Array Speech Recognition.
IEEE Trans. Speech Audio Process., 2007
Proceedings of the IEEE International Conference on Acoustics, 2007
2006
Generalized Time-Series Active Search With Kullback-Leibler Distance for Audio Fingerprinting.
IEEE Signal Process. Lett., 2006
Partial-tied-mixture Auxiliary Chain Models for Speech Recognition Based on Dynamic Bayesian Networks.
Proceedings of the IEEE International Conference on Systems, 2006
Switching Auxiliary Chains for Speech Recognition based on Dynamic Bayesian Networks.
Proceedings of the 18th International Conference on Pattern Recognition (ICPR 2006), 2006
2005
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
Closely Coupled Array Processing and Model-Based Compensation for Microphone Array Speech Recognition.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005
2004
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
2002
A combined model of statics-dynamics of speech optimized using maximum mutual information.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002
Proceedings of the IEEE International Conference on Acoustics, 2002
2001
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001