Masato Akagi
ORCID: 0000-0003-2450-6754
According to our database, Masato Akagi authored at least 155 papers between 1988 and 2024.
Bibliography
2024
Modeling and Estimation of Vocal Tract and Glottal Source Parameters Using ARMAX-LF Model.
CoRR, 2024
Machine Anomalous Sound Detection Using Spectral-temporal Modulation Representations Derived from Machine-specific Filterbanks.
CoRR, 2024
2023
IEEE ACM Trans. Audio Speech Lang. Process., 2023
IEEE Access, 2023
Data-driven Non-uniform Filterbanks Based on F-ratio for Machine Anomalous Sound Detection.
Proceedings of the 31st European Signal Processing Conference, 2023
Increasing Speech Intelligibility by Mimicking Professional Announcers' Voices and Its Physical Correlates.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023
2022
Speech Commun., 2022
Survey on bimodal speech emotion recognition from acoustic and linguistic information fusion.
Speech Commun., 2022
Speech Emotion and Naturalness Recognitions With Multitask and Single-Task Learnings.
IEEE Access, 2022
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
Data Augmentation Using McAdams-Coefficient-Based Speaker Anonymization for Fake Audio Detection.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Speak Like a Professional: Increasing Speech Intelligibility by Mimicking Professional Announcer Voice with Voice Conversion.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Relationship Between Speakers' Physiological Structure and Acoustic Speech Signals: Data-Driven Study Based on Frequency-Wise Attentional Neural Network.
Proceedings of the 30th European Signal Processing Conference, 2022
2021
IEEE ACM Trans. Audio Speech Lang. Process., 2021
Increasing speech intelligibility and naturalness in noise based on concepts of modulation spectrum and modulation transfer function.
Speech Commun., 2021
Two-stage dimensional emotion recognition by fusing predictions of acoustic and text networks using SVM.
Speech Commun., 2021
Multi-resolution modulation-filtered cochleagram feature for LSTM-based dimensional emotion recognition from speech.
Neural Networks, 2021
Comput. Speech Lang., 2021
Cross-Lingual Voice Conversion With Controllable Speaker Individuality Using Variational Autoencoder and Star Generative Adversarial Network.
IEEE Access, 2021
Study on Simultaneous Estimation of Glottal Source and Vocal Tract Parameters by ARMAX-LF Model for Speech Analysis/Synthesis.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
Hierarchical Prosody Analysis Improves Categorical and Dimensional Emotion Recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
2020
Simultaneous Estimation of Glottal Source Waveforms and Vocal Tract Shapes from Speech Signals Based on ARX-LF Model.
J. Signal Process. Syst., 2020
Effect of articulatory and acoustic features on the intelligibility of speech in noise: An articulatory synthesis study.
Speech Commun., 2020
Combining F0 and non-negative constraint robust principal component analysis for singing voice separation.
Signal Process., 2020
IEICE Trans. Inf. Syst., 2020
IEICE Trans. Inf. Syst., 2020
Speech Emotion Recognition Using 3D Convolutions and Attention-Based Sliding Recurrent Networks With Auditory Front-Ends.
IEEE Access, 2020
Predicting Valence and Arousal by Aggregating Acoustic Features for Acoustic-Linguistic Information Fusion.
Proceedings of the 2020 IEEE Region 10 Conference, 2020
On The Differences Between Song and Speech Emotion Recognition: Effect of Feature Sets, Feature Types, and Classifiers.
Proceedings of the 2020 IEEE Region 10 Conference, 2020
Improving Valence Prediction in Dimensional Speech Emotion Recognition Using Linguistic Information.
Proceedings of the 23rd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Segment-Level Effects of Gender, Nationality and Emotion Information on Text-Independent Speaker Verification.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Multitask Learning and Multistage Fusion for Dimensional Audiovisual Emotion Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Non-parallel Voice Conversion based on Hierarchical Latent Embedding Vector Quantized Variational Autoencoder.
Proceedings of the Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020
Enhancement of speech intelligibility under noisy reverberant conditions based on modulation spectrum concept.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020
2019
Improving multilingual speech emotion recognition by combining acoustic features in a three-layer model.
Speech Commun., 2019
Blind monaural singing voice separation using rank-1 constraint robust principal component analysis and vocal activity detection.
Neurocomputing, 2019
The Contribution of Acoustic Features Analysis to Model Emotion Perceptual Process for Language Diversity.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Dimensional Emotion Recognition from Speech Using Modulation Spectral Features and Recurrent Neural Networks.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
Evaluation of the Lombard effect model on synthesizing Lombard speech in varying noise level environments with limited data.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
Non-parallel Voice Conversion with Controllable Speaker Individuality using Variational Autoencoder.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
2018
Voice conversion for emotional speech: Rule-based synthesis with degree of emotion controllable in dimensional space.
Speech Commun., 2018
Estimation of glottal source waveforms and vocal tract shapes from speech signals based on ARX-LF model.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018
A Three-Layer Emotion Perception Model for Valence and Arousal-Based Detection from Multilingual Speech.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Auditory-Inspired End-to-End Speech Emotion Recognition Using 3D Convolutional Recurrent Neural Networks Based on Spectral-Temporal Representation.
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018
Unsupervised Singing Voice Separation Based on Robust Principal Component Analysis Exploiting Rank-1 Constraint.
Proceedings of the 26th European Signal Processing Conference, 2018
Estimation of glottal source waveforms and vocal tract shape for singing voices with wide frequency range.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018
Unsupervised Singing Voice Separation Using Gammatone Auditory Filterbank and Constraint Robust Principal Component Analysis.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018
Maximal Information Coefficient and Predominant Correlation-Based Feature Selection Toward A Three-Layer Model for Speech Emotion Recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018
2017
Method of Blindly Estimating Speech Transmission Index in Noisy Reverberant Environments.
J. Inf. Hiding Multim. Signal Process., 2017
Method of Estimating Signal-to-Noise Ratio Based on Optimal Design for Sub-band Voice Activity Detection.
J. Inf. Hiding Multim. Signal Process., 2017
Proceedings of the 20th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment, 2017
Commonalities of Glottal Sources and Vocal Tract Shapes Among Speakers in Emotional Speech.
Proceedings of the Studies on Speech Production - 11th International Seminar, 2017
Weighted Robust Principal Component Analysis with Gammatone Auditory Filterbank for Singing Voice Separation.
Proceedings of the Neural Information Processing - 24th International Conference, 2017
Study on method for protecting speech privacy by actively controlling speech transmission index in simulated room.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
Speech emotion recognition using multichannel parallel convolutional recurrent neural networks based on gammatone auditory filterbank.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
2016
Robust Voice Activity Detection Based on Concept of Modulation Transfer Function in Noisy Reverberant Environments.
J. Signal Process. Syst., 2016
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Voice conversion to emotional speech based on three-layered model in dimensional approach and parameterization of dynamic features in prosody.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016
Proceedings of the International Conference on Advanced Intelligent Systems and Informatics, 2016
2015
Toward improving estimation accuracy of emotion dimensions in bilingual scenario based on three-layered model.
Proceedings of the 2015 International Conference Oriental COCOSDA held jointly with 2015 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2015
Emotional speech synthesis system based on a three-layered model using a dimensional approach.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015
2014
Binaural Sound Source Localization in Noisy Reverberant Environments Based on Equalization-Cancellation Theory.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2014
Toward relaying an affective Speech-to-Speech translator: Cross-language perception of emotional state represented by emotion dimensions.
Proceedings of the 2014 17th Oriental Chapter of the International Committee for the Co-ordination and Standardization of Speech Databases and Assessment Techniques (COCOSDA), 2014
Proceedings of the Knowledge and Systems Engineering, 2014
Emotional Speech Recognition and Synthesis in Multiple Languages toward Affective Speech-to-Speech Translation System.
Proceedings of the 2014 Tenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2014
A method for emotional speech synthesis based on the position of emotional state in Valence-Activation space.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014
Toward affective speech-to-speech translation: Strategy for emotional speech recognition and synthesis in multiple languages.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014
2013
Improving Naturalness of HMM-Based TTS Trained with Limited Data by Temporal Decomposition.
IEICE Trans. Inf. Syst., 2013
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013
Comparative investigation of objective speech intelligibility prediction measures for noise-reduced signals in Mandarin and Japanese.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Admissible Range for Individualization of Head-Related Transfer Function in Median Plane.
Proceedings of the Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2013
Blind method of estimating speech transmission index from reverberant speech signals.
Proceedings of the 21st European Signal Processing Conference, 2013
Acoustic sound source tracking for a moving object using precise Doppler-Shift measurement.
Proceedings of the 21st European Signal Processing Conference, 2013
Blind method of estimating speech transmission index in room acoustics based on concept of modulation transfer function.
Proceedings of the 2013 IEEE China Summit and International Conference on Signal and Information Processing, 2013
Objective Japanese intelligibility prediction for noisy speech signals before and after noise-reduction processing.
Proceedings of the 2013 IEEE China Summit and International Conference on Signal and Information Processing, 2013
Improve equalization-cancellation-based sound localization in noisy reverberant environments using direct-to-reverberant energy ratio.
Proceedings of the 2013 IEEE China Summit and International Conference on Signal and Information Processing, 2013
Cross-lingual speech emotion recognition system based on a three-layer model for human perception.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013
2012
Evaluation of objective intelligibility prediction measures for noise-reduced signals in mandarin.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012
Speech emotion recognition system based on a dimensional approach using a three-layered model.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012
2011
Two-stage binaural speech enhancement with Wiener filter for high-quality speech communication.
Speech Commun., 2011
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
2010
IEICE Trans. Inf. Syst., 2010
Intelligibility investigation of single-channel noise reduction algorithms for Chinese and Japanese.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
2009
Two-stage binaural speech enhancement with wiener filter based on equalization-cancellation model.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2009
Efficient modeling of temporal structure of speech for applications in voice transformation.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Psychoacoustically-motivated adaptive beta-order generalized spectral subtraction for cochlear implant patients.
Proceedings of the IEEE International Conference on Acoustics, 2009
Proceedings of the 17th European Signal Processing Conference, 2009
2008
Signal Process., 2008
A Two-Microphone Noise Reduction Method in Highly Non-stationary Multiple-Noise-Source Environments.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2008
The Improved TS-BASE Approaches with Interference Compensation and Their Evaluations for Speech Enhancement.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008
Robust front end processing for speech recognition in reverberant environments: utilization of speech characteristics.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
High-quality analysis/synthesis method based on temporal decomposition for speech modification.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Psychoacoustically-motivated adaptive β-order generalized spectral subtraction based on data-driven optimization.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
2007
Limited error based event localizing temporal decomposition and its application to variable-rate speech coding.
Speech Commun., 2007
Speaker Individualities in Speech Spectral Envelopes and Fundamental Frequency Contours.
Proceedings of the Speaker Classification II, 2007
Method of LP-based blind restoration for improving intelligibility of bone-conducted speech.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
A flexible spectral modification method based on temporal decomposition and Gaussian mixture model.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Noise reduction based on adaptive β-order generalized spectral subtraction for speech enhancement.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
2006
A noise reduction system based on hybrid noise estimation technique and post-filtering in arbitrary noise environments.
Speech Commun., 2006
Communication Between Speech Production and Perception Within the Brain-Observation and Simulation.
J. Comput. Sci. Technol., 2006
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006
A robust feature extraction based on the MTF concept for speech recognition in reverberant environment.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Improved hybrid microphone array post-filter by integrating a robust speech absence probability estimator for speech enhancement.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
2005
Development of an F0 control model based on F0 dynamic characteristics for singing-voice synthesis.
Speech Commun., 2005
A model for selective segregation of a target instrument sound from the mixed sound of various instruments.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
A noise reduction system in arbitrary noise environments and its applications to speech enhancement and speech recognition.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005
Toward a Rule-Based Synthesis of Emotional Speech on Linguistic Descriptions of Perception.
Proceedings of the Affective Computing and Intelligent Interaction, 2005
2004
Fundamental Frequency Estimation for Noisy Speech Using Entropy-Weighted Periodic and Harmonic Features.
IEICE Trans. Inf. Syst., 2004
Analysis of acoustic features affecting "singing-ness" and its application to singing-voice synthesis from speaking-voice.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
A speech dereverberation method based on the MTF concept using adaptive time-frequency divisions.
Proceedings of the 2004 12th European Signal Processing Conference, 2004
2003
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003
A model for selective segregation of a target instrument sound from the mixed sound of various instruments.
Proceedings of the 2003 International Computer Music Conference, 2003
A method based on the MTF concept for dereverberating the power envelope from the reverberant signal.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003
2002
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002
Improvement of the restricted temporal decomposition method for line spectral frequency parameters.
Proceedings of the IEEE International Conference on Acoustics, 2002
Noise reduction using a small-scale microphone array in multi noise source environment.
Proceedings of the IEEE International Conference on Acoustics, 2002
Proceedings of the 11th European Signal Processing Conference, 2002
2001
Comput. Speech Lang., 2001
A fundamental frequency estimation method for noisy speech based on instantaneous amplitude and frequency.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001
2000
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000
Perception of synthesized singing voices with fine fluctuations in their fundamental frequency contours.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000
1999
Speech Commun., 1999
Segregation of vowel in background noise using the model of segregating two acoustic sources based on auditory scene analysis.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999
An objective distortion estimator for hearing aids and its application to noise reduction.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999
1998
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998
1997
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997
1996
Proceedings of the 4th International Conference on Spoken Language Processing, 1996
1995
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995
1994
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994
1990
Proceedings of the First International Conference on Spoken Language Processing, 1990
1988
Proceedings of the IEEE International Conference on Acoustics, 1988