Ye Wang

ORCID: 0000-0002-0123-1260

Affiliations:
  • National University of Singapore, School of Computing, Singapore
  • Nokia Research Center, Speech and Audio Systems Laboratory, Tampere, Finland


According to our database, Ye Wang authored at least 148 papers between 1999 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Automatic Lyric Transcription and Automatic Music Transcription from Multimodal Singing.
ACM Trans. Multim. Comput. Commun. Appl., July, 2024

Symbolic Music Generation From Graph-Learning-Based Preference Modeling and Textual Queries.
IEEE Trans. Multim., 2024

Drawlody: Sketch-Based Melody Creation With Enhanced Usability and Interpretability.
IEEE Trans. Multim., 2024

SinTechSVS: A Singing Technique Controllable Singing Voice Synthesis System.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

When Attention Sink Emerges in Language Models: An Empirical View.
CoRR, 2024

On Calibration of LLM-based Guard Models for Reliable Content Moderation.
CoRR, 2024

Unlocking Potential in Pre-Trained Music Language Models for Versatile Multi-Track Music Arrangement.
CoRR, 2024

Benchmarking Large Language Models on Communicative Medical Coaching: a Novel System and Dataset.
CoRR, 2024

End-to-End Real-World Polyphonic Piano Audio-to-Score Transcription with Hierarchical Decoding.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

XAI-Lyricist: Improving the Singability of AI-Generated Lyrics with Prosody Explanations.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Advancing Test-Time Adaptation in Wild Acoustic Test Settings.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Benchmarking Large Language Models on Communicative Medical Coaching: A Dataset and a Novel System.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Disentangled Adversarial Domain Adaptation for Phonation Mode Detection in Singing and Speech.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

AccoMontage-3: Full-Band Accompaniment Arrangement via Sequential Style Transfer and Multi-Track Function Prior.
CoRR, 2023

Advancing Test-Time Adaptation for Acoustic Foundation Models in Open-World Shifts.
CoRR, 2023

On Memorization in Diffusion Models.
CoRR, 2023

LOAF-M2L: Joint Learning of Wording and Formatting for Singable Melody-to-Lyric Generation.
CoRR, 2023

Deep Audio-Visual Singing Voice Transcription based on Self-Supervised Learning Models.
CoRR, 2023

Elucidate Gender Fairness in Singing Voice Transcription.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Zero-Shot Automatic Pronunciation Assessment.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Q&A: Query-Based Representation Learning for Multi-Track Symbolic Music re-Arrangement.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

Phonation Mode Detection in Singing: A Singer Adapted Model.
Proceedings of the IEEE International Conference on Acoustics, 2023

Towards Informative Few-Shot Prompt with Maximum Information Gain for In-Context Learning.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Songs Across Borders: Singable and Controllable Neural Lyric Translation.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

FedNP: Towards Non-IID Federated Learning via Federated Neural Propagation.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Unsupervised Mismatch Localization in Cross-Modal Sequential Data with Application to Mispronunciations Localization.
Trans. Mach. Learn. Res., 2022

Towards Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription.
CoRR, 2022

Unsupervised Mismatch Localization in Cross-Modal Sequential Data.
CoRR, 2022

Extrapolative Continuous-time Bayesian Neural Network for Fast Training-free Test-time Adaptation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Content based User Preference Modeling in Music Generation.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

MM-ALT: A Multimodal Automatic Lyric Transcription System.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Domain Adversarial Training on Conditional Variational Auto-Encoder for Controllable Music Generation.
Proceedings of the 23rd International Society for Music Information Retrieval Conference, 2022

Beat Transformer: Demixed Beat and Downbeat Tracking with Dilated Self-Attention.
Proceedings of the 23rd International Society for Music Information Retrieval Conference, 2022

Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription.
Proceedings of the 23rd International Society for Music Information Retrieval Conference, 2022

Exploring Transformer's Potential on Automatic Piano Transcription.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
AI-Lyricist: Generating Music and Vocabulary Constrained Lyrics.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

STRODE: Stochastic Boundary Ordinary Differential Equation.
Proceedings of the 38th International Conference on Machine Learning, 2021

2020
Automatic Evaluation of Song Intelligibility Using Singing Adapted STOI and Vocal-Specific Features.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Automatic Leaderboard: Evaluation of Singing Quality Without a Standard Reference.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Deep Graph Random Process for Relational-Thinking-Based Speech Recognition.
Proceedings of the 37th International Conference on Machine Learning, 2020

A-CRNN: A Domain Adaptation Model for Sound Event Detection.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Mobile Gait Analysis Using Foot-Mounted UWB Sensors.
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 2019

Automatic Lyrics-to-audio Alignment on Polyphonic Music Using Singing-adapted Acoustic Models.
Proceedings of the IEEE International Conference on Acoustics, 2019

SubSpectralNet - Using Sub-spectrogram Based Convolutional Neural Networks for Acoustic Scene Classification.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
SLIONS: A Karaoke Application to Enhance Foreign Language Learning.
Proceedings of the 2018 ACM Multimedia Conference, 2018

Semi-supervised Lyrics and Solo-singing Alignment.
Proceedings of the 19th International Society for Music Information Retrieval Conference, 2018

Empirically Weighting the Importance of Decision Factors for Singing Preference.
Proceedings of the 19th International Society for Music Information Retrieval Conference, 2018

Automatic Pronunciation Evaluation of Singing.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

MANA: Designing and Validating a User-Centered Mobility Analysis System.
Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility, 2018

Automatic Evaluation of Singing Quality without a Reference.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017
Intelligibility of Sung Lyrics: A Pilot Study.
Proceedings of the 18th International Society for Music Information Retrieval Conference, 2017

Towards Automatic Mispronunciation Detection in Singing.
Proceedings of the 18th International Society for Music Information Retrieval Conference, 2017

Discourse Analysis of Lyric and Lyric-Based Classification of Music.
Proceedings of the 18th International Society for Music Information Retrieval Conference, 2017

Perceptual evaluation of singing quality.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016
A Computer Vision-Based System for Stride Length Estimation using a Mobile Phone Camera.
Proceedings of the 18th International ACM SIGACCESS Conference on Computers and Accessibility, 2016

2015
Quantifying Lexical Novelty in Song Lyrics.
Proceedings of the 16th International Society for Music Information Retrieval Conference, 2015

2014
Exploration in Interactive Personalized Music Recommendation: A Reinforcement Learning Approach.
ACM Trans. Multim. Comput. Commun. Appl., 2014

Validating an iOS-based Rhythmic Auditory Cueing Evaluation (iRACE) for Parkinson's Disease.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Bridging the User Intention Gap: an Intelligent and Interactive Multidimensional Music Search Engine.
Proceedings of the First International Workshop on Internet-Scale Multimedia Management, 2014

Improving Content-based and Hybrid Music Recommendation using Deep Learning.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Enhancing Collaborative Filtering Music Recommendation by Balancing Exploration and Exploitation.
Proceedings of the 15th International Society for Music Information Retrieval Conference, 2014

2013
Scalable Content-Based Music Retrieval Using Chord Progression Histogram and Tree-Structure LSH.
IEEE Trans. Multim., 2013

Query-Document-Dependent Fusion: A Case Study of Multimodal Music Retrieval.
IEEE Trans. Multim., 2013

Non-reference audio quality assessment for online live music recordings.
Proceedings of the ACM Multimedia Conference, 2013

Basic Evaluation of Auditory Temporal Stability (Beats): A Novel Rationale and Implementation.
Proceedings of the 14th International Society for Music Information Retrieval Conference, 2013

The NUS sung and spoken lyrics corpus: A quantitative comparison of singing and speech.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

2012
Reducing the Power Consumption of an IMU-Based Gait Measurement System.
Proceedings of the Advances in Multimedia Information Processing - PCM 2012, 2012

A Real-Time On-Chip Algorithm for IMU-Based Gait Measurement.
Proceedings of the Advances in Multimedia Information Processing - PCM 2012, 2012

MOGAT: mobile games with auditory training for children with cochlear implants.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

MOGAT: a cloud-based mobile game system with auditory training for children with cochlear implants.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

A daily, activity-aware, mobile music recommender system.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Context-aware mobile music recommendation for daily activities.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

When music, information technology, and medicine meet.
Proceedings of the second international ACM workshop on Music information retrieval with user-centered and multimodal strategies, 2012

A domain-specific music search engine for gait training.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Recognition and Summarization of Chord Progressions and Their Application to Music Information Retrieval.
Proceedings of the 2012 IEEE International Symposium on Multimedia, 2012

2011
Sensor-Assisted Video Encoding for Mobile Devices in Real-World Environments.
IEEE Trans. Circuits Syst. Video Technol., 2011

A tempo-sensitive music search engine with multimodal inputs.
Proceedings of the 1st international ACM workshop on Music information retrieval with user-centered and multimodal strategies, Scottsdale, AZ, USA, November 28, 2011

Document dependent fusion in multimodal music retrieval.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

MOGCLASS: evaluation of a collaborative system of mobile devices for classroom music education of young children.
Proceedings of the International Conference on Human Factors in Computing Systems, 2011

2010
MOGCLASS: a collaborative system of mobile devices for classroom music education.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Large-scale music tag recommendation with explicit multiple attributes.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Automated sleep quality measurement using EEG signal: first step towards a domain specific music recommendation system.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

A music search engine for therapeutic gait training.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

2009
An optimal speed control scheme supported by media servers for low-power multimedia applications.
Multim. Syst., 2009

A joint encoder-decoder framework for supporting energy efficient audio decoding.
Multim. Syst., 2009

Power Management for Mobile Multimedia: From Audio to Video & Games.
Proceedings of the VLSI Design 2009: Improving Productivity through Higher Abstraction, 2009

CompositeMap: a novel framework for music similarity measure.
Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2009

MOGFUN: musical mObile group for FUN.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

CompositeMap: a novel music similarity measure for personalized multimodal music search.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

Comprehensive query-dependent fusion using regression-on-folksonomies: a case study of multimodal music search.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

SaVE: sensor-assisted motion estimation for efficient H.264/AVC video encoding.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

Cultural style based music classification of audio signals.
Proceedings of the IEEE International Conference on Acoustics, 2009

2008
LyricAlly: Automatic Synchronization of Textual Lyrics to Acoustic Music Signals.
IEEE Trans. Speech Audio Process., 2008

Application-Specific Music Transcription for Tutoring.
IEEE Multim., 2008

Complexity-Scalable Beat Detection with MP3 Audio Bitstreams.
Comput. Music. J., 2008

Watermarking Video Clips with Workload Information for DVS.
Proceedings of the 21st International Conference on VLSI Design (VLSI Design 2008), 2008

Decoding-workload-aware video encoding.
Proceedings of the Network and Operating System Support for Digital Audio and Video, 2008

iDVT: an interactive digital violin tutoring system based on audio-visual fusion.
Proceedings of the 16th International Conference on Multimedia 2008, 2008

SenseCoding: accelerometer-assisted motion estimation for efficient video encoding.
Proceedings of the 16th International Conference on Multimedia 2008, 2008

Multimedia power management on a platter: from audio to video & games.
Proceedings of the 16th International Conference on Multimedia 2008, 2008

Multiple-Feature Fusion Based Onset Detection for Solo Singing Voice.
Proceedings of the ISMIR 2008, 2008

Clustering Music Recordings by Their Keys.
Proceedings of the ISMIR 2008, 2008

Onset detection in pitched non-percussive music using warping-compensated correlation.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
Visual analysis of fingering for pedagogical violin transcription.
Proceedings of the 15th International Conference on Multimedia 2007, 2007

Educational violin transcription by fusing multimedia streams.
Proceedings of the International Workshop on Educational Multimedia and Multimedia Education 2007, 2007

Effective use of multimedia for computer-assisted musical instrument tutoring.
Proceedings of the International Workshop on Educational Multimedia and Multimedia Education 2007, 2007

A workload prediction model for decoding MPEG video and its application to workload-scalable transcoding.
Proceedings of the 15th International Conference on Multimedia 2007, 2007

A compressed domain distortion measure for fast video transcoding.
Proceedings of the 15th International Conference on Multimedia 2007, 2007

Pop Music Beat Detection in the Huffman Coded Domain.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Speaker Verification with Adaptive Spectral Subband Centroids.
Proceedings of the Advances in Biometrics, International Conference, 2007

Interactive digital violin tutor (IDVT): an edutainment system for violin learners.
Proceedings of the International Conference on Advances in Computer Entertainment Technology, 2007

2006
Generic forward error correction of short frames for IP streaming applications.
Multim. Tools Appl., 2006

Syllabic level automatic synchronization of music signals and text lyrics.
Proceedings of the 14th ACM International Conference on Multimedia, 2006

Low Level Descriptors for Automatic Violin Transcription.
Proceedings of the ISMIR 2006, 2006

Efficient Partial Spectrum Reconstruction using an Asymmetric PQMF Algorithm for MPEG-Coded Stereo Audio.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

A Violin Music Transcriber for Personalized Learning.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

2005
Toward bandwidth-efficient and error-robust audio streaming over lossy packet networks.
Multim. Syst., 2005

Key, Chord, and Rhythm Tracking of Popular Music Recordings.
Comput. Music. J., 2005

Effect of packet size on loss rate and delay in wireless links.
Proceedings of the IEEE Wireless Communications and Networking Conference, 2005

Power-efficient streaming for mobile terminals.
Proceedings of the Network and Operating System Support for Digital Audio and Video, 2005

Digital violin tutor: an integrated system for beginning violin learners.
Proceedings of the 13th ACM International Conference on Multimedia, 2005

Power-aware bandwidth and stereo-image scalable audio decoding.
Proceedings of the 13th ACM International Conference on Multimedia, 2005

Using offline bitstream analysis for power-aware video decoding in portable devices.
Proceedings of the 13th ACM International Conference on Multimedia, 2005

Optimization of source and channel coding for voice over IP.
Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, 2005

Music transcription using an instrument model.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

A Perception-Aware Low-Power Software Audio Decoder for Portable Devices.
Proceedings of the 2005 3rd Workshop on Embedded Systems for Real-Time Multimedia, 2005

2004
Semantic Region Detection in Acoustic Music Signals.
Proceedings of the Advances in Multimedia Information Processing - PCM 2004, 5th Pacific Rim Conference on Multimedia, Tokyo, Japan, November 30, 2004

The creation of a music-driven digital violinist.
Proceedings of the 12th ACM International Conference on Multimedia, 2004

LyricAlly: automatic synchronization of acoustic musical signals and textual lyrics.
Proceedings of the 12th ACM International Conference on Multimedia, 2004

A framework for robust and scalable audio streaming.
Proceedings of the 12th ACM International Conference on Multimedia, 2004

Singing voice detection in popular music.
Proceedings of the 12th ACM International Conference on Multimedia, 2004

Automatic Detection Of Vocal Segments In Popular Songs.
Proceedings of the ISMIR 2004, 2004

Singer Identification Based on Vocal and Instrumental Models.
Proceedings of the 17th International Conference on Pattern Recognition, 2004

Key determination of acoustic musical signals.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Singing voice detection using twice-iterated composite Fourier transform.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Automatic music summarization in compressed domain.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
Application of a content-based percussive sound synthesizer to packet loss recovery in music streaming.
Proceedings of the Eleventh ACM International Conference on Multimedia, 2003

Content-based UEP: a new scheme for packet loss recovery in music streaming.
Proceedings of the Eleventh ACM International Conference on Multimedia, 2003

An SVM-based classification approach to musical audio.
Proceedings of the ISMIR 2003, 2003

Parametric vector quantization for coding percussive sounds in music.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Schemes for error resilient streaming of perceptually coded audio.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
A drumbeat-pattern based error concealment method for music streaming applications.
Proceedings of the IEEE International Conference on Acoustics, 2002

2001
A compressed domain beat detector using MP3 audio bitstreams.
Proceedings of the 9th ACM International Conference on Multimedia 2001, Ottawa, Ontario, Canada, September 30, 2001

A Beat-Pattern based Error Concealment Scheme for Music Delivery with Burst Packet Loss.
Proceedings of the 2001 IEEE International Conference on Multimedia and Expo, 2001

2000
DFT, DCT, MDCT, DST and signal Fourier spectrum analysis.
Proceedings of the 10th European Signal Processing Conference, 2000

1999
An excitation level based psychoacoustic model for audio compression.
Proceedings of the 7th ACM International Conference on Multimedia '99, Orlando, FL, USA, October 30, 1999

Audio Signal Representation and Processing in Time-Frequency Domain.
Proceedings of the 1999 International Computer Music Conference, 1999

