Heinrich Dinkel

Orcid: 0000-0003-4330-8980

According to our database, Heinrich Dinkel authored at least 41 papers between 2015 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Enhancing Automated Audio Captioning via Large Language Models with Optimized Audio Encoding.
CoRR, 2024

Bridging Language Gaps in Audio-Text Retrieval.
CoRR, 2024

Scaling up masked audio encoder learning for general audio classification.
CoRR, 2024

CED: Consistent Ensemble Distillation for Audio Tagging.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
Understanding temporally weakly supervised training: A case study for keyword spotting.
CoRR, 2023

Streaming Audio Transformers for Online Audio Tagging.
CoRR, 2023

Focus on the Sound around You: Monaural Target Speaker Extraction via Distance and Speaker Information.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Av-Sepformer: Cross-Attention Sepformer for Audio-Visual Target Speaker Extraction.
Proceedings of the IEEE International Conference on Acoustics, 2023

Unified Keyword Spotting and Audio Tagging on Mobile Devices with Transformers.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
An Empirical Study of Weakly Supervised Audio Tagging Embeddings for General Audio Representations.
Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

UniKW-AT: Unified Keyword Spotting and Audio Tagging.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Category-Adapted Sound Event Enhancement with Weakly Labeled Data.
Proceedings of the IEEE International Conference on Acoustics, 2022

Pseudo Strong Labels for Large Scale Weakly Supervised Audio Tagging.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Voice Activity Detection in the Wild: A Data-Driven Approach Using Teacher-Student Training.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Towards Duration Robust Weakly Supervised Sound Event Detection.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

DEPA: Self-Supervised Audio Embedding for Depression Detection.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Audio Caption in a Car Setting with a Sentence-Level Loss.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

A Lightweight Framework for Online Voice Activity Detection in the Wild.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Investigating Local and Global Information for Automated Audio Captioning with Transfer Learning.
Proceedings of the IEEE International Conference on Acoustics, 2021

Text-to-Audio Grounding: Building Correspondence Between Captions and Sound Events.
Proceedings of the IEEE International Conference on Acoustics, 2021

A Lightweight Approach for Semi-Supervised Sound Event Detection with Unsupervised Data Augmentation.
Proceedings of the 6th Workshop on Detection and Classification of Acoustic Scenes and Events 2021 (DCASE 2021), 2021

A Contrastive Semi-Supervised Learning Framework For Anomaly Sound Detection.
Proceedings of the 6th Workshop on Detection and Classification of Acoustic Scenes and Events 2021 (DCASE 2021), 2021

2020
GPVAD: Towards noise robust voice activity detection via weakly supervised sound event detection.
CoRR, 2020

Dual-Adversarial Domain Adaptation for Generalized Replay Attack Detection.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Voice Activity Detection in the Wild via Weakly Supervised Sound Event Detection.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Duration Robust Weakly Supervised Sound Event Detection.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Multiple Sound Sources Localization from Coarse to Fine.
Proceedings of the Computer Vision - ECCV 2020, 2020

A CRNN-GRU Based Reinforcement Learning Approach to Audio Captioning.
Proceedings of the 5th Workshop on Detection and Classification of Acoustic Scenes and Events 2020 (DCASE 2020), 2020

2019
What does a Car-ssette tape tell?
CoRR, 2019

Text-based Depression Detection: What Triggers An Alert.
CoRR, 2019

Duration robust sound event detection.
CoRR, 2019

The SJTU Robust Anti-Spoofing System for the ASVspoof 2019 Challenge.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Cross-Domain Replay Spoofing Attack Detection Using Domain Adversarial Training.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Audio Caption: Listen and Tell.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Investigating Raw Wave Deep Neural Networks for End-to-End Speaker Spoofing Detection.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

Covariance Based Deep Feature for Text-Dependent Speaker Verification.
Proceedings of the Intelligence Science and Big Data Engineering, 2018

2017
Deep Feature Engineering for Noise Robust Spoofing Detection.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

Small-footprint convolutional neural network for spoofing detection.
Proceedings of the 2017 International Joint Conference on Neural Networks, 2017

End-to-end spoofing detection with raw waveform CLDNNS.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2015
Robust deep feature for spoofing detection - the SJTU system for ASVspoof 2015 challenge.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

