Zeyu Jin

Orcid: 0000-0001-8465-8878

According to our database, Zeyu Jin authored at least 51 papers between 2013 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Audiovisual emotion recognition based on bi-layer LSTM and multi-head attention mechanism on RAVDESS dataset.
J. Supercomput., January, 2025

2024
DMDSpeech: Distilled Diffusion Model Surpassing The Teacher in Zero-shot Speech Synthesis via Direct Metric Optimization.
CoRR, 2024

Code Drift: Towards Idempotent Neural Audio Codecs.
CoRR, 2024

Improving Generalization of Speech Separation in Real-World Scenarios: Strategies in Simulation, Optimization, and Evaluation.
CoRR, 2024

VDGD: Mitigating LVLM Hallucinations in Cognitive Prompts by Bridging the Visual Perception Gap.
CoRR, 2024

SpeechCraft: A Fine-Grained Expressive Speech Dataset with Natural Language Description.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

A Closer Look at the Limitations of Instruction Tuning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

GR0: Self-Supervised Global Representation Learning for Zero-Shot Voice Conversion.
Proceedings of the IEEE International Conference on Acoustics, 2024

Maskmark: Robust Neural Watermarking for Real and Synthetic Speech.
Proceedings of the IEEE International Conference on Acoustics, 2024

MDX-GAN: Enhancing Perceptual Quality in Multi-Class Source Separation Via Adversarial Training.
Proceedings of the IEEE International Conference on Acoustics, 2024

SoulSkipper: A Voice-Controlled Emotional Adaptive Game to Complement Therapy for Social Anxiety Disorder.
Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, 2024

2023
HoloSinger: Semantics and Music Driven Motion Generation with Octahedral Holographic Projection.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

White Box Search Over Audio Synthesizer Parameters.
Proceedings of the 24th International Society for Music Information Retrieval Conference, 2023

Efficient Spoken Language Recognition via Multilabel Classification.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

2022
High-order Numerical Homogenization for Dissipative Ordinary Differential Equations.
Multiscale Model. Simul., March, 2022

Stochastic Augmented Projected Gradient Methods for the Large-Scale Precoding Matrix Indicator Selection Problem.
IEEE Trans. Wirel. Commun., 2022

HEAR 2021: Holistic Evaluation of Audio Representations.
CoRR, 2022

Record Once, Post Everywhere: Automatic Shortening of Audio Stories for Social Media.
Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology, 2022

AI Carpet: Automatic Generation of Aesthetic Carpet Pattern.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Audio Similarity is Unreliable as a Proxy for Audio Quality.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Controllable Speech Representation Learning Via Voice Conversion and AIC Loss.
Proceedings of the IEEE International Conference on Acoustics, 2022

SQAPP: No-Reference Speech Quality Assessment Via Pairwise Preference.
Proceedings of the IEEE International Conference on Acoustics, 2022

Music Enhancement via Image Translation and Vocoding.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Neural Pitch-Shifting and Time-Stretching with Controllable LPCNet.
CoRR, 2021

HiFi-GAN-2: Studio-Quality Speech Enhancement via Generative Adversarial Networks Conditioned on Acoustic Features.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2021


Controllable deep melody generation via hierarchical music structure representation.
Proceedings of the 22nd International Society for Music Information Retrieval Conference, 2021

Compare Machine Learning Models in Text Classification Using Steam User Reviews.
Proceedings of the ICSED 2021: 3rd International Conference on Software Engineering and Development, Xiamen, China, November 19, 2021

Bandwidth Extension is All You Need.
Proceedings of the IEEE International Conference on Acoustics, 2021

Context-Aware Prosody Correction for Text-Based Speech Editing.
Proceedings of the IEEE International Conference on Acoustics, 2021

CDPAM: Contrastive Learning for Perceptual Audio Similarity.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Pose2Pose: pose selection and transfer for 2D character animation.
Proceedings of the IUI '20: 25th International Conference on Intelligent User Interfaces, 2020

Metric learning vs classification for disentangled music representation learning.
Proceedings of the 21st International Society for Music Information Retrieval Conference, 2020

HiFi-GAN: High-Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Controllable Neural Prosody Synthesis.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

A Differentiable Perceptual Audio Metric Learned from Just Noticeable Differences.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Acoustic Matching By Embedding Impulse Responses.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

F0-Consistent Many-To-Many Non-Parallel Voice Conversion Via Conditional Autoencoder.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Disentangled Multidimensional Metric Learning for Music Similarity.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Music Creation by Example.
Proceedings of the CHI '20: CHI Conference on Human Factors in Computing Systems, 2020

2019
Text-based editing of talking-head video.
ACM Trans. Graph., 2019

Perceptually-motivated Environment-specific Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2019

Learning Bandwidth Expansion Using Perceptually-motivated Loss.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Speech Synthesis for Text-Based Editing of Audio Narration.
PhD thesis, 2018

FFTNet: A Real-Time Speaker-Dependent Neural Vocoder.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
VoCo: text-based insertion and replacement in audio narration.
ACM Trans. Graph., 2017

2016
Cute: A concatenative method for voice conversion using exemplar-based unit selection.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Mallo: a distributed synchronized musical instrument designed for internet performance.
Proceedings of the 15th International Conference on New Interfaces for Musical Expression, 2015

2014
AudioQuilt: 2D Arrangements of Audio Samples using Metric Learning and Kernelized Sorting.
Proceedings of the 14th International Conference on New Interfaces for Musical Expression, 2014

2013
Formal Semantics for Music Notation Control Flow.
Proceedings of the 39th International Computer Music Conference, 2013

