Xiaohai Tian

Orcid: 0000-0001-5219-1249

According to our database, Xiaohai Tian authored at least 60 papers between 2010 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation.
CoRR, 2024

SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words.
CoRR, 2024

CoAVT: A Cognition-Inspired Unified Audio-Visual-Text Pre-Training Model for Multimodal Processing.
CoRR, 2024

2023
Optimization of Cross-Lingual Voice Conversion With Linguistics Losses to Reduce Foreign Accents.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

TTS-Guided Training for Accent Conversion Without Parallel Data.
IEEE Signal Process. Lett., 2023

Disentangling the Contribution of Non-native Speech in Automated Pronunciation Assessment.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Phonetic and Prosody-aware Self-supervised Learning Approach for Non-native Fluency Scoring.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

An ASR-Free Fluency Scoring Approach with Self-Supervised Learning.
Proceedings of the IEEE International Conference on Acoustics, 2023

Leveraging Phone-Level Linguistic-Acoustic Similarity For Utterance-Level Pronunciation Scoring.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Improving Non-native Word-level Pronunciation Scoring with Phone-level Mixup Data Augmentation and Multi-source Information.
CoRR, 2022

A Transfer and Multi-Task Learning based Approach for MOS Prediction.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Using Fluency Representation Learned from Sequential Raw Features for Improving Non-native Fluency Scoring.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021
Language Agnostic Speaker Embedding for Cross-Lingual Personalized Speech Generation.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

NHSS: A speech and singing parallel database.
Speech Commun., 2021

Factorized WaveNet for voice conversion with limited data.
Speech Commun., 2021

Optimizing Voice Conversion Network with Cycle Consistency Loss of Speaker Identity.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Cross-Lingual Voice Conversion with a Cycle Consistency Loss on Linguistic Representation.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

The Multi-Speaker Multi-Style Voice Cloning Challenge 2021.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Multi-Task WaveRNN With an Integrated Architecture for Cross-Lingual Voice Conversion.
IEEE Signal Process. Lett., 2020

Black-box Attacks on Automatic Speaker Verification using Feedback-controlled Voice Conversion.
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Personalized Singing Voice Generation Using WaveRNN.
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

The Attacker's Perspective on Automatic Speaker Verification: An Overview.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

End-to-End Code-Switching TTS with Cross-Lingual Language Model.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Effective Wavenet Adaptation for Voice Conversion with Limited Data.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

The NUS & NWPU system for Voice Conversion Challenge 2020.
Proceedings of the Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020

Predictions of Subjective Ratings and Spoofing Assessments of Voice Conversion Challenge 2020 Submissions.
Proceedings of the Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020

NUS-HLT System for Blizzard Challenge 2020.
Proceedings of the Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020

Voice Conversion Challenge 2020 -- Intra-lingual semi-parallel and cross-lingual voice conversion --.
Proceedings of the Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020

2019
Voice conversion with parallel/non-parallel data and synthetic speech detection.
PhD thesis, 2019

A Vocoder-free WaveNet Voice Conversion with Non-Parallel Data.
CoRR, 2019

A Speaker-Dependent WaveNet for Voice Conversion with Non-Parallel Data.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Cross-lingual Voice Conversion with Bilingual Phonetic Posteriorgram and Average Modeling.
Proceedings of the IEEE International Conference on Acoustics, 2019

A Modularized Neural Network with Language-Specific Output Layers for Cross-Lingual Voice Conversion.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

WaveNet Factorization with Singular Value Decomposition for Voice Conversion.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Many-to-many Cross-lingual Voice Conversion with a Jointly Trained Speaker Embedding Network.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

Speaker-independent Spectral Mapping for Speech-to-Singing Conversion.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018
Average Modeling Approach to Voice Conversion with Non-Parallel Data.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

Usability Analysis of the Novel Functions to Assist the Senior Customers in Online Shopping.
Proceedings of the Social Computing and Social Media. User Experience and Behavior, 2018

The TL-NTU Text-to-speech System for the Blizzard Challenge 2018.
Proceedings of the Blizzard Challenge 2018, Hyderabad, India, September 8, 2018, 2018

2017
An Exemplar-Based Approach to Frequency Warping for Voice Conversion.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

Towards Age-friendly E-commerce Through Crowd-Improved Speech Recognition, Multimodal Search, and Personalized Speech Feedback.
Proceedings of the 2nd International Conference on Crowd Science and Engineering, 2017

Improving air traffic control speech intelligibility by reducing speaking rate effectively.
Proceedings of the 2017 International Conference on Asian Language Processing, 2017

Novel Functional Technologies for Age-Friendly E-commerce.
Proceedings of the Human Aspects of IT for the Aged Population. Applications, Services and Contexts, 2017

An investigation of spectral feature partitioning for replay attacks detection.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016
High quality voice conversion using prosodic and high-resolution spectral features.
Multim. Tools Appl., 2016

Spoofing detection under noisy conditions: a preliminary investigation and an initial database.
CoRR, 2016

An Automatic Voice Conversion Evaluation Strategy Based on Perceptual Background Noise Distortion and Speaker Similarity.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

An Investigation of Spoofing Speech Detection Under Additive Noise and Reverberant Conditions.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Spoofing detection from a feature representation perspective.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Spoofing speech detection using temporal convolutional neural network.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

2015
Spoofing speech detection using high dimensional magnitude and phase features: the NTU approach for ASVspoof 2015 challenge.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

System fusion for high-performance voice conversion.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Personalized synthetic voices for speaking impaired: website and app.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Sparse representation for frequency warping based voice conversion.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Detecting synthetic speech using long term magnitude and phase information.
Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing, 2015

A waveform representation framework for high-quality statistical parametric speech synthesis.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

2014
Correlation-based frequency warping for voice conversion.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

A comparative study of spectral transformation techniques for singing voice synthesis.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

2013
Local partial least square regression for spectral mapping in voice conversion.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

2010
Speech and Auditory Interfaces for Ubiquitous, Immersive and Personalized Applications.
Proceedings of the Symposia and Workshops on Ubiquitous, 2010

