Yu Tsao
ORCID: 0000-0001-6956-0418
Affiliations:
- Academia Sinica, Research Center for Information Technology Innovation, Taipei, Taiwan
According to our database, Yu Tsao authored at least 391 papers between 2001 and 2024.
Bibliography
2024
IEEE Trans. Cogn. Dev. Syst., February, 2024
An SRAM-Based Reconfigurable Cognitive Computation Matrix for Sensor Edge Applications.
IEEE J. Solid State Circuits, February, 2024
Unsupervised Face-Masked Speech Enhancement Using Generative Adversarial Networks With Human-in-the-Loop Assessment Metrics.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
Robust Audio-Visual Speech Enhancement: Correcting Misassignments in Complex Environments with Advanced Post-Processing.
CoRR, 2024
Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement.
CoRR, 2024
CoRR, 2024
Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition.
CoRR, 2024
Temporal Order Preserved Optimal Transport-based Cross-modal Knowledge Transfer Learning for ASR.
CoRR, 2024
Exploiting Consistency-Preserving Loss and Perceptual Contrast Stretching to Boost SSL-based Speech Enhancement.
CoRR, 2024
EMO-Codec: An In-Depth Look at Emotion Preservation capacity of Legacy and Neural Codec Models With Subjective and Objective Evaluations.
CoRR, 2024
SVSNet+: Enhancing Speaker Voice Similarity Assessment Models with Representations from Speech Foundation Models.
CoRR, 2024
CoRR, 2024
Towards Environmental Preference Based Speech Enhancement For Individualised Multi-Modal Hearing Aids.
CoRR, 2024
Audio-Visual Speech Enhancement in Noisy Environments via Emotion-Based Contextual Cues.
CoRR, 2024
A Non-Intrusive Neural Quality Assessment Model for Surface Electromyography Signals.
CoRR, 2024
CoRR, 2024
Prognosticating Lumbar Spinal Surgery Outcomes for Low Back Pain and Sciatica Patients by Utilizing Preoperative Assessments from Western and Eastern Medicine and Multimodal Fusion Learning Techniques.
Proceedings of the 2024 8th International Conference on Medical and Health Informatics, 2024
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Lightly Weighted Automatic Audio Parameter Extraction for the Quality Assessment of Consensus Auditory-Perceptual Evaluation of Voice.
Proceedings of the IEEE International Conference on Consumer Electronics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Scalable Ensemble-Based Detection Method Against Adversarial Attacks For Speaker Verification.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Hierarchical Cross-Modality Knowledge Transfer with Sinkhorn Attention for CTC-Based ASR.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Bridging the Gap: Integrating Pre-Trained Speech Enhancement and Recognition Models for Robust Speech Recognition.
Proceedings of the 32nd European Signal Processing Conference, 2024
The Multilayer Neural Network Implementation Using SRAM-Based Reconfigurable Cognitive Computation Matrices.
Proceedings of the 6th IEEE International Conference on AI Circuits and Systems, 2024
2023
Software Design and User Interface of ESPnet-SE++: Speech Enhancement for Robust Speech Processing.
J. Open Source Softw., November, 2023
IEEE Trans. Biomed. Eng., October, 2023
Software Design and User Interface of ESPnet-SE++: Speech Enhancement for Robust Speech Processing (espnet-v.202310).
Dataset, October, 2023
SRECG: ECG Signal Super-Resolution Framework for Portable/Wearable Devices in Cardiac Arrhythmias Classification.
IEEE Trans. Consumer Electron., August, 2023
Deep Learning-Based Non-Intrusive Multi-Objective Speech Assessment Model With Cross-Domain Features.
IEEE ACM Trans. Audio Speech Lang. Process., 2023
Improving Speech Enhancement Performance by Leveraging Contextual Broad Phonetic Class Information.
IEEE ACM Trans. Audio Speech Lang. Process., 2023
IEEE Signal Process. Lett., 2023
AV-Lip-Sync+: Leveraging AV-HuBERT to Exploit Multimodal Inconsistency for Video Deepfake Detection.
CoRR, 2023
CoRR, 2023
AVTENet: Audio-Visual Transformer-based Ensemble Network Exploiting Multiple Experts for Video Deepfake Detection.
CoRR, 2023
CoRR, 2023
Utilizing Whisper to Enhance Multi-Branched Speech Intelligibility Prediction Model for Hearing Aids.
CoRR, 2023
Noise robust speech emotion recognition with signal-to-noise ratio adapting speech enhancement.
CoRR, 2023
Deep denoising autoencoder-based non-invasive blood flow detection for arteriovenous fistula.
CoRR, 2023
Preoperative Prognosis Assessment of Lumbar Spinal Surgery for Low Back Pain and Sciatica Patients based on Multimodalities and Multimodal Learning.
CoRR, 2023
Self-supervised based general laboratory progress pretrained model for cardiovascular event detection.
CoRR, 2023
BASPRO: a balanced script producer for speech corpus collection based on the genetic algorithm.
CoRR, 2023
Biomed. Signal Process. Control., 2023
Wearable-based Pain Assessment in Patients with Adhesive Capsulitis Using Machine Learning.
Proceedings of the 11th International IEEE/EMBS Conference on Neural Engineering, 2023
Proceedings of the 33rd IEEE International Workshop on Machine Learning for Signal Processing, 2023
Proceedings of the 33rd IEEE International Workshop on Machine Learning for Signal Processing, 2023
Proceedings of the 33rd IEEE International Workshop on Machine Learning for Signal Processing, 2023
Deep Learning-based Fall Detection Algorithm Using Ensemble Model of Coarse-fine CNN and GRU Networks.
Proceedings of the IEEE International Symposium on Medical Measurements and Applications, 2023
Neural Model Reprogramming with Similarity Based Mapping for Low-Resource Spoken Command Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
A Training and Inference Strategy Using Noisy and Enhanced Speech as Target for Speech Enhancement without Clean Speech.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the Eleventh International Conference on Learning Representations, 2023
Proceedings of the Eleventh International Conference on Learning Representations, 2023
ECG Artifact Removal from Single-Channel Surface EMG Using Fully Convolutional Networks.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Towards Individualised Speech Enhancement: An SNR Preference Learning System for Multi-Modal Hearing Aids.
Proceedings of the IEEE International Conference on Acoustics, 2023
T5lephone: Bridging Speech and Text Self-Supervised Models for Spoken Language Understanding Via Phoneme Level T5.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Audio-Visual Speech Enhancement and Separation by Utilizing Multi-Modal Self-Supervised Embeddings.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2023
Abnormal Respiratory Sound Identification Using Audio-Spectrogram Vision Transformer.
Proceedings of the 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2023
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
LC4SV: A Denoising Framework Learning to Compensate for Unseen Speaker Verification Models.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
The VoiceMOS Challenge 2023: Zero-Shot Subjective Speech Quality Prediction for Multiple Domains.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
Study on the Correlation Between Objective Evaluations and Subjective Speech Quality and Intelligibility.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
2022
SEOFP-NET: Compression and Acceleration of Deep Neural Networks for Speech Enhancement Using Sign-Exponent-Only Floating-Points.
IEEE ACM Trans. Audio Speech Lang. Process., 2022
IEEE ACM Trans. Audio Speech Lang. Process., 2022
Deep-Learning-Based Signal Enhancement of Low-Resolution Accelerometer for Fall Detection Systems.
IEEE Trans. Cogn. Dev. Syst., 2022
A Novel Temporal Attentive-Pooling based Convolutional Recurrent Architecture for Acoustic Signal Enhancement.
IEEE Trans. Artif. Intell., 2022
IEEE Signal Process. Lett., 2022
EPG2S: Speech Generation and Speech Enhancement Based on Electropalatography and Audio Signals Using Multimodal Learning.
IEEE Signal Process. Lett., 2022
Neural correlates of individual differences in predicting ambiguous sounds comprehension level.
NeuroImage, 2022
Audio-Visual Speech Enhancement and Separation by Leveraging Multi-Modal Self-Supervised Embeddings.
CoRR, 2022
A Teacher-student Framework for Unsupervised Speech Enhancement Using Noise Remixing Training and Two-stage Inference.
CoRR, 2022
Mandarin Singing Voice Synthesis with Denoising Diffusion Probabilistic Wasserstein GAN.
CoRR, 2022
EPG2S: Speech Generation and Speech Enhancement based on Electropalatography and Audio Signals using Multimodal Learning.
CoRR, 2022
A Novel Speech Intelligibility Enhancement Model based on Canonical Correlation and Deep Learning.
CoRR, 2022
IEEE Access, 2022
Proceedings of the 34th Conference on Computational Linguistics and Speech Processing, 2022
Preservation Of Interaural Level Difference Cue In A Deep Learning-Based Speech Separation System For Bilateral And Bimodal Cochlear Implants Users.
Proceedings of the 17th International Workshop on Acoustic Signal Enhancement, 2022
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
MBI-Net: A Non-Intrusive Multi-Branched Speech Intelligibility Prediction Model for Hearing Aids.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Disentangling the Impacts of Language and Channel Variability on Speech Separation Networks.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
InQSS: a speech intelligibility and quality assessment model using a multi-task learning network.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
When BERT Meets Quantum Temporal Convolution Learning for Text Classification in Heterogeneous Computing.
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
MetricGAN-U: Unsupervised Speech Enhancement/ Dereverberation Based Only on Noisy/ Reverberated Speech.
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE Global Communications Conference, 2022
Recurrent Neural Network-based Estimation and Correction of Relative Transfer Function for Preserving Spatial Cues in Speech Separation.
Proceedings of the 30th European Signal Processing Conference, 2022
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022
A Novel Speech Intelligibility Enhancement Model based on Canonical Correlation and Deep Learning.
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022
XDBERT: Distilling Visual Information to BERT from Cross-Modal Systems to Improve Language Understanding.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2022
2021
Dress With Style: Learning Style From Joint Deep Embedding of Clothing Styles and Body Shapes.
IEEE Trans. Multim., 2021
Coupling a Generative Model With a Discriminative Learning Framework for Speaker Verification.
IEEE ACM Trans. Audio Speech Lang. Process., 2021
A Study of Joint Effect on Denoising Techniques and Visual Cues to Improve Speech Intelligibility in Cochlear Implant Simulation.
IEEE Trans. Cogn. Dev. Syst., 2021
IEEE Trans. Cogn. Dev. Syst., 2021
Sensing ecosystem dynamics via audio source separation: A case study of marine soundscapes off northeastern Taiwan.
PLoS Comput. Biol., 2021
Predicting the Travel Distance of Patients to Access Healthcare using Deep Neural Networks.
CoRR, 2021
InQSS: a speech intelligibility assessment model using a multi-task learning network.
CoRR, 2021
CoRR, 2021
A Study of Low-Resource Speech Commands Recognition based on Adversarial Reprogramming.
CoRR, 2021
Integrating a joint Bayesian generative model in a discriminative learning framework for speaker verification.
CoRR, 2021
Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing, 2021
Investigation of a Single-Channel Frequency-Domain Speech Enhancement Network to Improve End-to-End Bengali Automatic Speech Recognition Under Unseen Noisy Conditions.
Proceedings of the 24th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2021
Unsupervised Noise Adaptive Speech Enhancement by Discriminator-Constrained Optimal Transport.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Deep Learning and Explainable Artificial Intelligence to Predict Patients' Choice of Hospital Levels in Urban and Rural Areas.
Proceedings of the MEDINFO 2021: One World, One Health - Global Partnership for Digital Innovation, 2021
MoEVC: A Mixture of Experts Voice Conversion System With Sparse Gating Mechanism for Online Computation Acceleration.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021
Attention-Based Multi-Task Learning for Speech-Enhancement and Speaker-Identification in Multi-Speaker Dialogue Scenario.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2021
Proceedings of the IEEE International Symposium on Circuits and Systems, 2021
Relational Data Selection for Data Augmentation of Speaker-Dependent Multi-Band MelGAN Vocoder.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
A Preliminary Study of a Two-Stage Paradigm for Preserving Speaker Identity in Dysarthric Voice Conversion.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Improving Perceptual Quality by Phone-Fortified Perceptual Loss Using Wasserstein Distance for Speech Enhancement.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Unsupervised Neural Adaptation Model Based on Optimal Transport for Spoken Language Identification.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the 29th European Signal Processing Conference, 2021
Proceedings of the 29th European Signal Processing Conference, 2021
Instrumented shoulder functional assessment using inertial measurement units for frozen shoulder.
Proceedings of the IEEE EMBS International Conference on Biomedical and Health Informatics, 2021
Mandarin Electrolaryngeal Speech Voice Conversion with Sequence-to-Sequence Modeling.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
An Exploration of Self-Supervised Pretrained Representations for End-to-End Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
MIMO Speech Compression and Enhancement Based on Convolutional Denoising Autoencoder.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
Estimation and Correction of Relative Transfer Function for Binaural Speech Separation Networks to Preserve Spatial Cues.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
2020
Blind Monaural Source Separation on Heart and Lung Sounds Based on Periodic-Coded Deep Autoencoder.
IEEE J. Biomed. Health Informatics, 2020
Unsupervised Representation Disentanglement Using Cross Domain Features and Adversarial Learning in Variational Autoencoder Based Voice Conversion.
IEEE Trans. Emerg. Top. Comput. Intell., 2020
IEEE ACM Trans. Audio Speech Lang. Process., 2020
Multichannel Speech Enhancement by Raw Waveform-Mapping Using Fully Convolutional Networks.
IEEE ACM Trans. Audio Speech Lang. Process., 2020
Subspace-Based Representation and Learning for Phonotactic Spoken Language Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2020
IEEE Trans. Cogn. Dev. Syst., 2020
IEEE Signal Process. Lett., 2020
WaveCRN: An Efficient Convolutional Recurrent Neural Network for End-to-End Speech Enhancement.
IEEE Signal Process. Lett., 2020
Learning With Learned Loss Function: Speech Enhancement With Quality-Net to Improve Perceptual Evaluation of Speech Quality.
IEEE Signal Process. Lett., 2020
ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech.
Comput. Speech Lang., 2020
ECG Signal Super-resolution by Considering Reconstruction and Cardiac Arrhythmias Classification Loss.
CoRR, 2020
Improving Perceptual Quality by Phone-Fortified Perceptual Loss for Speech Enhancement.
CoRR, 2020
CoRR, 2020
Using Deep Learning and Explainable Artificial Intelligence in Patients' Choices of Hospital Levels.
CoRR, 2020
Boosting Objective Scores of Speech Enhancement Model through MetricGAN Post-Processing.
CoRR, 2020
CoRR, 2020
CoRR, 2020
IEEE Access, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
iMetricGAN: Intelligibility Enhancement for Speech-in-Noise Using Generative Adversarial Network-Based Metric Learning.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
SERIL: Noise Adaptive Speech Enhancement Using Regularization-Based Incremental Learning.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Enhancing Intelligibility of Dysarthric Speech Using Gated Convolutional-Based Voice Conversion System.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the IEEE International Conference on Image Processing, 2020
Exponentiated magnitude spectrogram-based relative-to-maximum masking for speech enhancement in adverse environments.
Proceedings of the IEEE International Conference on Consumer Electronics - Taiwan, 2020
Self-Supervised Denoising Autoencoder with Linear Regression Decoder for Speech Enhancement.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Cross-Technology Interference Mitigation Using Fully Convolutional Denoising Autoencoders.
Proceedings of the IEEE Global Communications Conference, 2020
Proceedings of the Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020
STOI-Net: A Deep Learning based Non-Intrusive Speech Intelligibility Assessment Model.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020
Boosting Objective Scores of a Speech Enhancement Model by MetricGAN Post-processing.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020
2019
Computation-Performance Optimization of Convolutional Neural Networks With Redundant Filter Removal.
IEEE Trans. Circuits Syst. I Regul. Pap., 2019
Toward Automating Oral Presentation Scoring During Principal Certification Program Using Audio-Video Low-Level Behavior Profiles.
IEEE Trans. Affect. Comput., 2019
Increasing Compactness of Deep Learning Based Speech Enhancement Models With Parameter Pruning and Quantization Techniques.
IEEE Signal Process. Lett., 2019
MoEVC: A Mixture-of-experts Voice Conversion System with Sparse Gating Mechanism for Accelerating Online Computation.
CoRR, 2019
CoRR, 2019
Seeing Voices in Noise: A Study of Audiovisual-Enhanced Vocoded Speech Intelligibility in Cochlear Implant Simulation.
CoRR, 2019
Improving the Intelligibility of Electric and Acoustic Stimulation Speech Using Fully Convolutional Networks Based Speech Enhancement.
CoRR, 2019
Multichannel Speech Enhancement by Raw Waveform-mapping using Fully Convolutional Networks.
CoRR, 2019
Robust S1 and S2 heart sound recognition based on spectral restoration and multi-style training.
Biomed. Signal Process. Control., 2019
Evaluating Indoor Positioning Systems in a Shopping Mall: The Lessons Learned From the IPIN 2018 Competition.
IEEE Access, 2019
IEEE Access, 2019
Generalization of Spectrum Differential based Direct Waveform Modification for Voice Conversion.
Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019
Speech enhancement based on the integration of fully convolutional network, temporal lowpass filtering and spectrogram masking.
Proceedings of the 31st Conference on Computational Linguistics and Speech Processing, 2019
Proceedings of the 2nd IEEE Conference on Multimedia Information Processing and Retrieval, 2019
Proceedings of the Increasing Naturalness and Flexibility in Spoken Dialogue Interaction, 2019
Comparative Study of Masking and Mapping Based on Hierarchical Extreme Learning Machine for Speech Enhancement.
Proceedings of the 2019 International Symposium on Intelligent Signal Processing and Communication Systems, 2019
Specialized Speech Enhancement Model Selection Based on Learned Non-Intrusive Quality Assessment Metric.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
IA-NET: Acceleration and Compression of Speech Enhancement Using Integer-Adder Deep Neural Network.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Investigation of F0 Conditioning and Fully Convolutional Networks in Variational Autoencoder Based Voice Conversion.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Speaker-Aware Deep Denoising Autoencoder with Embedded Speaker Identity for Speech Enhancement.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Generative Adversarial Networks for Unpaired Voice Transformation on Impaired Speech.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
MetricGAN: Generative Adversarial Networks based Black-box Metric Scores Optimization for Speech Enhancement.
Proceedings of the 36th International Conference on Machine Learning, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the 27th European Signal Processing Conference, 2019
Proceedings of the 27th European Signal Processing Conference, 2019
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
Investigation of Neural Network Approaches for Unified Spectral and Prosodic Feature Enhancement.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
A Pruned-CELP Speech Codec Using Denoising Autoencoder with Spectral Compensation for Quality and Intelligibility Enhancement.
Proceedings of the IEEE International Conference on Artificial Intelligence Circuits and Systems, 2019
2018
IEEE Trans. Emerg. Top. Comput. Intell., 2018
Suppression by Selecting Wavelets for Feature Compression in Distributed Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2018
End-to-End Waveform Utterance Enhancement for Direct Evaluation Metrics Optimization by Fully Convolutional Neural Networks.
IEEE ACM Trans. Audio Speech Lang. Process., 2018
Speech Commun., 2018
SmartHear: A Smartphone-Based Remote Microphone Hearing Assistive System Using Wireless Technologies.
IEEE Syst. J., 2018
Off-Line Evaluation of Mobile-Centric Indoor Positioning Systems: The Experiences from the 2017 IPIN Competition.
Sensors, 2018
J. Inf. Sci. Eng., 2018
Speech Enhancement Based on Reducing the Detail Portion of Speech Spectrograms in Modulation Domain via Discrete Wavelet Transform.
CoRR, 2018
IEEE Access, 2018
A Study on Speech Enhancement Using Exponent-Only Floating Point Quantized Neural Network (EOFP-QNN).
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018
Architecture Design of Convolutional Neural Networks for Face Detection on an FPGA Platform.
Proceedings of the 2018 IEEE International Workshop on Signal Processing Systems, 2018
WaveNet 聲碼器及其於語音轉換之應用 (WaveNet Vocoder and its Applications in Voice Conversion) [In Chinese].
Proceedings of the 30th Conference on Computational Linguistics and Speech Processing, 2018
Automatic Detection of Speech Under Cold Using Discriminative Autoencoders and Strength Modeling with Multiple Sub-Dictionary Generation.
Proceedings of the 16th International Workshop on Acoustic Signal Enhancement, 2018
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018
Speech Enhancement Based on Reducing the Detail Portion of Speech Spectrograms in Modulation Domain via Discrete Wavelet Transform.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Quality-Net: An End-to-End Non-intrusive Speech Quality Assessment Model Based on BLSTM.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the International Conference on Fuzzy Theory and Its Applications, 2018
A Novel LSTM-Based Speech Preprocessor for Speaker Diarization in Realistic Mismatch Conditions.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Congruent Visual Stimulation Facilitates Auditory Frequency Change Detection: An ERP Study.
Proceedings of the 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2018
Improving the performance of hearing aids in noisy environments based on deep learning technology.
Proceedings of the 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2018
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018
2017
A Deep Denoising Autoencoder Approach to Improving the Intelligibility of Vocoded Speech in Cochlear Implant Simulation.
IEEE Trans. Biomed. Eng., 2017
Joint Dictionary Learning-Based Non-Negative Matrix Factorization for Voice Conversion to Improve Speech Intelligibility After Oral Surgery.
IEEE Trans. Biomed. Eng., 2017
IEEE Trans. Biomed. Eng., 2017
IEEE ACM Trans. Audio Speech Lang. Process., 2017
Int. J. Comput. Linguistics Chin. Lang. Process., 2017
Acoustic Echo Cancellation Using an Improved Vector-Space-Based Adaptive Filtering Algorithm.
Int. J. Comput. Linguistics Chin. Lang. Process., 2017
Regularization of neural network model with distance metric learning for i-vector based spoken language identification.
Comput. Speech Lang., 2017
Multi-style learning with denoising autoencoders for acoustic modeling in the internet of things (IoT).
Comput. Speech Lang., 2017
End-to-End Waveform Utterance Enhancement for Direct Evaluation Metrics Optimization by Fully Convolutional Neural Networks.
CoRR, 2017
CoRR, 2017
Audio-Visual Speech Enhancement based on Multimodal Deep Convolutional Neural Network.
CoRR, 2017
IEEE Access, 2017
A Smartphone-Based Multi-Functional Hearing Assistive System to Facilitate Speech Recognition in the Classroom.
IEEE Access, 2017
以軟體為基礎建構語音增強系統使用者介面 (Development of a software-based User-Interface of Speech Enhancement System) [In Chinese].
Proceedings of the 29th Conference on Computational Linguistics and Speech Processing, 2017
以語音能量特性發展即時語速偵測裝置-前導型研究 (Real-time monitoring device of phonation speed and volume based on speech energy: A pilot study) [In Chinese].
Proceedings of the 29th Conference on Computational Linguistics and Speech Processing, 2017
基於鑑別式自編碼解碼器之錄音回放攻擊偵測系統 (A Replay Spoofing Detection System Based on Discriminative Autoencoders) [In Chinese].
Proceedings of the 29th Conference on Computational Linguistics and Speech Processing, 2017
改進的向量空間可適性濾波器用於聲學回聲消除 (Acoustic Echo Cancellation Using an Improved Vector-Space-Based Adaptive Filtering Algorithm) [In Chinese].
Proceedings of the 29th Conference on Computational Linguistics and Speech Processing, 2017
多樣訊雜比之訓練語料於降噪自動編碼器其語音強化功能之初步研究 (A Preliminary Study of Various SNR-level Training Data in the Denoising Auto-encoder (DAE) Technique for Speech Enhancement) [In Chinese].
Proceedings of the 29th Conference on Computational Linguistics and Speech Processing, 2017
Complex spectrogram enhancement by convolutional neural network with multi-metrics learning.
Proceedings of the 27th IEEE International Workshop on Machine Learning for Signal Processing, 2017
Proceedings of the IEEE International Symposium on Circuits and Systems, 2017
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
A Post-Filtering Approach Based on Locally Linear Embedding Difference Compensation for Speech Enhancement.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Voice Conversion from Unaligned Corpora Using Variational Autoencoding Wasserstein Generative Adversarial Networks.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Track-Clustering Error Evaluation for Track-Based Multi-camera Tracking System Employing Human Re-identification.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017
A deep learning based noise reduction approach to improve speech intelligibility for cochlear implant recipients in the presence of competing speech noise.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
Proceedings of the 51st Asilomar Conference on Signals, Systems, and Computers, 2017
2016
IEEE Signal Process. Lett., 2016
Generalized maximum a posteriori spectral amplitude estimation for speech enhancement.
Speech Commun., 2016
Modeling speech intelligibility with recovered envelope from temporal fine structure stimulus.
Speech Commun., 2016
CoRR, 2016
Image Retrieval Using Color-Aware Tag on Progressive Image Search and Recommendation System.
Proceedings of the MultiMedia Modeling - 22nd International Conference, 2016
A pseudo-task design in multi-task learning deep neural network for speaker recognition.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016
Improving the performance of speech perception in noisy environment based on an FAME strategy.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016
Incorporating local environment information with ensemble neural networks to robust automatic speech recognition.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Pair-Wise Distance Metric Learning of Neural Network Model for Spoken Language Identification.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Minimization of Regression and Ranking Losses with Shallow Neural Networks on Automatic Sincerity Evaluation.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Proceedings of the IEEE International Conference on Consumer Electronics-Taiwan, 2016
Leveraging nonnegative matrix factorization in processing the temporal modulation spectrum for speech enhancement.
Proceedings of the IEEE International Conference on Consumer Electronics-Taiwan, 2016
Nonnegative matrix factorization-based frequency lowering technology for Mandarin-speaking hearing aid users.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Proceedings of the IEEE 5th Global Conference on Consumer Electronics, 2016
A linear regression model with dynamic pulse transit time features for noninvasive blood pressure prediction.
Proceedings of the IEEE Biomedical Circuits and Systems Conference, 2016
Proceedings of the IEEE Second International Conference on Multimedia Big Data, 2016
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016
2015
Compensating for Orientation Mismatch in Robust Wi-Fi Localization Using Histogram Equalization.
IEEE Trans. Veh. Technol., 2015
IEEE Signal Process. Lett., 2015
Rapid Converging M-Max Partial Update Least Mean Square Algorithms with New Variable Step-Size Methods.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2015
Robust Voice Activity Detection Algorithm Based on Feature of Frequency Modulation of Harmonics and Its DSP Implementation.
IEICE Trans. Inf. Syst., 2015
類神經網路訓練結合環境群集及專家混合系統於強健性語音辨識 (Automatic Speech Recognition using Neural Network based Acoustic Model with the Environment Clustering and Mixture of Experts Algorithms) [In Chinese].
Proceedings of the 27th Conference on Computational Linguistics and Speech Processing, 2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Proceedings of the IEEE International Conference on Consumer Electronics - Taiwan, 2015
Proceedings of the IEEE International Conference on Consumer Electronics - Taiwan, 2015
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Proceedings of the 2015 IEEE Global Conference on Signal and Information Processing, 2015
Proceedings of the 2015 IEEE Global Conference on Signal and Information Processing, 2015
Improving denoising auto-encoder based speech enhancement with the speech parameter generation algorithm.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015
2014
A MAP-based Online Estimation Approach to Ensemble Speaker and Speaking Environment Modeling.
IEEE ACM Trans. Audio Speech Lang. Process., 2014
IEICE Trans. Inf. Syst., 2014
Incorporating local information of the acoustic environments to MAP-based feature compensation and acoustic model adaptation.
Comput. Speech Lang., 2014
Effect of adaptive envelope compression in simulated electric hearing in reverberation.
Proceedings of the 2014 International Symposium on Integrated Circuits (ISIC), 2014
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
An adaptive envelope compression strategy for speech processing in cochlear implants.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Ensemble of machine learning algorithms for cognitive and physical speaker load detection.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
A Transfer Probabilistic Collective Factorization Model to Handle Sparse Data in Collaborative Filtering.
Proceedings of the 2014 IEEE International Conference on Data Mining, 2014
Sparse representation based on a bag of spectral exemplars for acoustic event detection.
Proceedings of the IEEE International Conference on Acoustics, 2014
Proceedings of the IEEE International Conference on Acoustics, 2014
Robust anchorperson detection based on audio streams using a hybrid I-vector and DNN system.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014
2013
結合I-Vector 及深層神經網路之語者驗證系統 (Text-independent Speaker Verification using a Hybrid I-Vector/DNN Approach) [In Chinese].
Proceedings of the 25th Conference on Computational Linguistics and Speech Processing, 2013
Evaluation of generalized maximum a posteriori spectral amplitude (GMAPA) speech enhancement algorithm in hearing aids.
Proceedings of the IEEE International Symposium on Consumer Electronics, 2013
Recurrent neural network based language model personalization by social network crowdsourcing.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
An investigation of spectral restoration algorithms for deep neural networks based noise robust speech recognition.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Ensemble of machine learning and acoustic segment model techniques for speech emotion and autism spectrum disorders recognition.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Alleviating the over-smoothing problem in GMM-based voice conversion with discriminative training.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Proceedings of the 2013 International Joint Conference on Neural Networks, 2013
Proceedings of the Sixth International Joint Conference on Natural Language Processing, 2013
Filtering on the temporal probability sequence in histogram equalization for robust speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2013
Speech enhancement using generalized maximum a posteriori spectral amplitude estimator.
Proceedings of the IEEE International Conference on Acoustics, 2013
Robust Wi-Fi location fingerprinting against device diversity based on spatial mean normalization.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013
2012
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012
Acoustic space partition based on broad phonetic class for ensemble acoustic modeling.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Discriminative Fuzzy Clustering Maximum a Posterior Linear Regression for Speaker Adaptation.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
2011
Incorporating Regional Information to Enhance MAP-Based Stochastic Feature Compensation for Robust Speech Recognition.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
A sampling-based environment population projection approach for rapid acoustic model adaptation.
Proceedings of the IEEE International Conference on Acoustics, 2011
Increasing discriminative capability on MAP-based mapping function estimation for acoustic model adaptation.
Proceedings of the IEEE International Conference on Acoustics, 2011
2010
An environment structuring framework to facilitating suitable prior density estimation for MAPLR on robust speech recognition.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
An acoustic segment model approach to incorporating temporal information into speaker modeling for text-independent speaker recognition.
Proceedings of the IEEE International Conference on Acoustics, 2010
2009
An Ensemble Speaker and Speaking Environment Modeling Approach to Robust Speech Recognition.
IEEE Trans. Speech Audio Process., 2009
Soft margin estimation on improving environment structures for ensemble speaker and speaking environment modeling.
Proceedings of the 3rd International Universal Communication Symposium, 2009
A study on soft margin estimation of linear regression parameters for speaker adaptation.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Ensemble speaker and speaking environment modeling approach with advanced online estimation process.
Proceedings of the IEEE International Conference on Acoustics, 2009
MAP estimation of online mapping parameters in ensemble speaker and speaking environment modeling.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009
2008
An ensemble speaker and speaking environment modeling approach to robust speech recognition.
PhD thesis, 2008
Improving the ensemble speaker and speaking environment modeling approach by enhancing the precision of the online estimation process.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Proceedings of the IEEE International Conference on Acoustics, 2008
2007
An ensemble modeling approach to joint characterization of speaker and speaking environments.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Two extensions to ensemble speaker and speaking environment modeling for robust automatic speech recognition.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007
2006
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
2005
IEEE Trans. Speech Audio Process., 2005
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
A Study on Knowledge Source Integration for Candidate Rescoring in Automatic Speech Recognition.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005
2001
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001