Shusuke Takahashi

Marco A. Martínez Ramírez

Trans. Int. Soc. Music. Inf. Retr., January, 2024

SAVGBench: Benchmarking Spatially Aligned Audio-Video Generation.

CoRR, 2024

Music Foundation Model as Generic Booster for Music Downstream Tasks.

CoRR, 2024

OpenMU: Your Swiss Army Knife for Music Understanding.

CoRR, 2024

Mining Your Own Secrets: Diffusion Classifier Scores for Continual Personalization of Text-to-Image Diffusion Models.

Muhammad Jehanzeb Mirza

CoRR, 2024

SpecMaskGIT: Masked Generative Modeling of Audio Spectrograms for Efficient Audio Synthesis and Beyond.

CoRR, 2024

MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training.

CoRR, 2024

Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation.

CoRR, 2024

SpecMaskGIT: Masked Generative Modeling of Audio Spectrogram for Efficient Audio Synthesis and Beyond.

Proceedings of the 25th International Society for Music Information Retrieval Conference, 2024

Zero- and Few-Shot Sound Event Localization and Detection.

Proceedings of the IEEE International Conference on Acoustics, 2024

Diffusion-Based Speech Enhancement with Joint Generative and Predictive Decoders.

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

STARSS23: Sony-TAu Realistic Spatial Soundscapes 2023.

Aapo Hakala

Dataset, March, 2023

STARSS23: Sony-TAu Realistic Spatial Soundscapes 2023.

Aapo Hakala

Alexander L. Stempkovskiy

Dataset, March, 2023

The Sound Demixing Challenge 2023 - Cinematic Demixing Track.

Tatiana Habruseva

Mikhail Sukhovei

CoRR, 2023

The Whole Is Greater than the Sum of Its Parts: Improving DNN-based Music Source Separation.

CoRR, 2023

Diffusion-based Signal Refiner for Speech Separation.

CoRR, 2023

Extending Audio Masked Autoencoders toward Audio Restoration.

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023

STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Diffiner: A Versatile Diffusion-based Generative Refiner for Speech Enhancement.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

An Attention-Based Approach to Hierarchical Multi-Label Music Instrument Classification.

Proceedings of the IEEE International Conference on Acoustics, 2023

Diffroll: Diffusion-Based Generative Music Transcription with Unsupervised Pretraining Capability.

Proceedings of the IEEE International Conference on Acoustics, 2023

2022

STARSS22: Sony-TAu Realistic Spatial Soundscapes 2022 dataset.

Sharath Adavanne

Yuichiro Koyama

Naoya Takahashi

Tuomas Virtanen

Dataset, May, 2022

STARSS22: Sony-TAu Realistic Spatial Soundscapes 2022 dataset.

Adavanne Politis

Dataset, March, 2022

An Approach to Collecting Object Graphs for Data-structure Live Programming Based on a Language Implementation Framework.

J. Inf. Process., 2022

Preventing oversmoothing in VAE via generalized variance parameterization.

Neurocomputing, 2022

A Versatile Diffusion-based Generative Refiner for Speech Enhancement.

CoRR, 2022

SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization.

Proceedings of the International Conference on Machine Learning, 2022

Multi-ACCDOA: Localizing And Detecting Overlapping Sounds From The Same Class With Auxiliary Duplicating Permutation Invariant Training.

Proceedings of the IEEE International Conference on Acoustics, 2022

Improving Character Error Rate is Not Equal to Having Clean Speech: Speech Enhancement for ASR Systems with Black-Box Acoustic Models.

Ryosuke Sawata

Yosuke Kashiwagi

Proceedings of the IEEE International Conference on Acoustics, 2022

Spatial Mixup: Directional Loudness Modification as Data Augmentation for Sound Event Localization and Detection.

Proceedings of the IEEE International Conference on Acoustics, 2022

Spatial Data Augmentation with Simulated Room Impulse Responses for Sound Event Localization and Detection.

Proceedings of the IEEE International Conference on Acoustics, 2022

Music Source Separation With Deep Equilibrium Models.

Proceedings of the IEEE International Conference on Acoustics, 2022

STARSS22: A Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events.

Proceedings of the 7th Workshop on Detection and Classification of Acoustic Scenes and Events 2022, 2022

2021

Ensemble of ACCDOA- and EINV2-based Systems with D3Nets and Impulse Response Simulation for Sound Event Localization and Detection.

CoRR, 2021

Preventing Posterior Collapse Induced by Oversmoothing in Gaussian VAE.

CoRR, 2021

Manifold-Aware Deep Clustering: Maximizing Angles Between Embedding Vectors Based on Regular Simplex.

Keitaro Tanaka

Ryosuke Sawata