Zhen He

Appl. Soft Comput., 2025

2024

Location Privacy Protection for the Internet of Things with Edge Computing Based on Clustering K-Anonymity.

[DOI]

Sensors, September, 2024

Token-based deep reinforcement learning for Heterogeneous VRP with Service Time Constraints.

[DOI]

Knowl. Based Syst., 2024

Web Semantic-Based Robust Graph Contrastive Learning for Recommendation via Invariant Learning.

[DOI]

Wengui Dai

Int. J. Semantic Web Inf. Syst., 2024

Visual Instruction Tuning with 500x Fewer Parameters through Modality Linear Representation-Steering.

[DOI]

CoRR, 2024

Enhancing Automated Audio Captioning via Large Language Models with Optimized Audio Encoding.

[DOI]

CoRR, 2024

Scaling up masked audio encoder learning for general audio classification.

[DOI]

CoRR, 2024

Towards Expressive Zero-Shot Speech Synthesis with Hierarchical Prosody Modeling.

[DOI]

CoRR, 2024

Multi-UAV Distributed Collaborative Path Planning Based on NTVPPSO.

[DOI]

Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2024

Efficient Extraction of Noise-Robust Discrete Units from Self-Supervised Speech Models.

[DOI]

Jakob Poncelet

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Optimizing Dysarthria Wake-Up Word Spotting: an End-to-End Approach For SLT 2024 LRDWWS Challenge.

[DOI]

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Bridging Language Gaps in Audio-Text Retrieval.

[DOI]

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Speaker Change Detection with Weighted-sum Knowledge Distillation based on Self-supervised Pre-trained Models.

[DOI]

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Enhancing Automated Audio Captioning via Large Language Models with Optimized Audio Encoding.

[DOI]

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Towards Expressive Zero-Shot Speech Synthesis with Hierarchical Prosody Modeling.

[DOI]

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Streaming Audio Transformers for Online Audio Tagging.

[DOI]

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Scaling up masked audio encoder learning for general audio classification.

[DOI]

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Two-Stage Neural Network Model with Packet Loss Detection for ICASSP 2024 PLC Challenge.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2024

TNFormer: Single-Pass Multilingual Text Normalization with a Transformer Decoder Model.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2024

CED: Consistent Ensemble Distillation for Audio Tagging.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2024

Disentangling Speaker Representations from Intuitive Prosodic Features for Speaker-Adaptative and Prosody-Controllable Speech Synthesis.

[DOI]

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

2023

Deep Learning Models in Computer Data Mining for Intrusion Detection.

[DOI]

Informatica (Slovenia), 2023

Understanding temporally weakly supervised training: A case study for keyword spotting.

[DOI]

CoRR, 2023

Streaming Audio Transformers for Online Audio Tagging.

[DOI]

CoRR, 2023

Construction and Application of VS-DBN Anti-Theft Diagnosis Model Based on Neural Architecture Search.

[DOI]

IEEE Access, 2023

Ship Classification Based on Trajectories Data and LightGBM Considering Offshore Distance Feature.

[DOI]

Proceedings of the Spatial Data and Intelligence - 4th International Conference, 2023

The Abnormal Detection Method of Ship Trajectory with Adaptive Transformer Model Based on Migration Learning.

[DOI]

Proceedings of the Spatial Data and Intelligence - 4th International Conference, 2023

Improving Bilingual TTS Using Language And Phonology Embedding With Embedding Strength Modulator.

[DOI]

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

LightClone: Speaker-guided Parallel Subnet Selection for Few-shot Voice Cloning.

[DOI]

Jie Wu

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Focus on the Sound around You: Monaural Target Speaker Extraction via Distance and Speaker Information.

[DOI]

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Improving Weakly Supervised Sound Event Detection with Causal Intervention.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2023

AV-SepFormer: Cross-Attention SepFormer for Audio-Visual Target Speaker Extraction.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2023

Unified Keyword Spotting and Audio Tagging on Mobile Devices with Transformers.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2023

Relate Auditory Speech to EEG by Shallow-Deep Attention-Based Network.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2023

The Xiaomi-ASLP Text-to-speech System for Blizzard Challenge 2023.

[DOI]

Proceedings of the 18th Blizzard Challenge Workshop, Grenoble, France, August 29, 2023

Enhanced Neural Beamformer with Spatial Information for Target Speech Extraction.

[DOI]

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022

Experiment and Analysis of Temperature Sensing of Microstructured Fiber with Silver and PDMS Films.

[DOI]

Sensors, 2022

Improve Bilingual TTS Using Dynamic Language and Phonology Embedding.

[DOI]

Fengyu Yang

CoRR, 2022

AIS Data Driven CNN-BiGRU Model for Ship Target Classification.

[DOI]

Proceedings of the Spatial Data and Intelligence - Third International Conference, 2022

An Empirical Study of Weakly Supervised Audio Tagging Embeddings for General Audio Representations.

[DOI]

Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

J-TranPSP: A Joint Transition-based Model for Prosodic Structure Prediction, Word Segmentation and PoS Tagging.

[DOI]

Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

UniKW-AT: Unified Keyword Spotting and Audio Tagging.

[DOI]

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Exploring representation learning for small-footprint keyword spotting.

[DOI]

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Improving Emotional Speech Synthesis by Using SUS-Constrained VAE and Text Encoder Aggregation.

[DOI]

Fengyu Yang

Proceedings of the IEEE International Conference on Acoustics, 2022

MSDTRON: A High-Capability Multi-Speaker Speech Synthesis System for Diverse Data Using Characteristic Information.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2022

Learning Decoupling Features Through Orthogonality Regularization.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2022

PAMA-TTS: Progression-Aware Monotonic Attention for Stable SEQ2SEQ TTS with Accurate Phoneme Duration Control.

[DOI]

Yunchao He

Proceedings of the IEEE International Conference on Acoustics, 2022

Pseudo Strong Labels for Large Scale Weakly Supervised Audio Tagging.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2022

Multi-Scale Refinement Network Based Acoustic Echo Cancellation.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2022

Research on the Evaluation of the Classical Chinese Difficulty in the Compulsory Education Stage.

[DOI]

Proceedings of the International Conference on Asian Language Processing, 2022

Detect What You Want: Target Sound Detection.

[DOI]

Proceedings of the 7th Workshop on Detection and Classification of Acoustic Scenes and Events 2022, 2022

2021

Study of the key technology on the Geo-hazard spatial information sharing platform in Meizoseismal Region of Wenchuan Earthquake Zone.

[DOI]

J. Ambient Intell. Humaniz. Comput., 2021

Residual error based knowledge distillation.

[DOI]

Mengya Gao

Liang Wan

Neurocomputing, 2021

A Separable Temporal Convolution Neural Network with Attention for Small-Footprint Keyword Spotting.

[DOI]

CoRR, 2021

Effective and Differentiated Use of Control Information for Multi-speaker Speech Synthesis.

[DOI]

CoRR, 2021

GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio.

[DOI]

CoRR, 2021

An Enhanced MAX-SINR Strategy With Interference Leakage Power Constraint in Multiuser Multiantenna SWIPT Systems.

[DOI]

IEEE Access, 2021

Cognitive Radio Primary Network Secure Communication Strategy Based on Energy Harvesting and Destination Assistance.

[DOI]

Proceedings of the 13th International Conference on Wireless Communications and Signal Processing, 2021

Multi-Channel Automatic Speech Recognition Using Deep Complex Unet.

[DOI]

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

speechocean762: An Open-Source Non-Native English Speech Corpus for Pronunciation Assessment.

[DOI]

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Noise Robust Singing Voice Synthesis Using Gaussian Mixture Variational Autoencoder.

[DOI]

Proceedings of the ICMI '21 Companion: Companion Publication of the 2021 International Conference on Multimodal Interaction, Montreal, QC, Canada, October 18, 2021

An Experimental Study on Replay Attack Detection Using Spoofing Clues from both Voiced and Non-Voiced Segments.

[DOI]

Proceedings of the ICDSP 2021: 5th International Conference on Digital Signal Processing, 2021

AutoKWS: Keyword Spotting with Differentiable Architecture Search.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2021

Multi-Channel Speech Enhancement with 2-D Convolutional Time-Frequency Domain Features and a Pre-Trained Acoustic Model.

[DOI]

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

Frequency Axis Pooling Method for Weakly Labeled Sound Event Detection and Classification.

[DOI]

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

Separable Temporal Convolution plus Temporally Pooled Attention for Lightweight High-Performance Keyword Spotting.

[DOI]

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020

Analysis of spatiotemporal influence patterns of toxic gas monitoring concentrations in an urban drainage network based on IoT and GIS.

[DOI]

Pattern Recognit. Lett., 2020

Evaluation of Geological and Ecological Bearing Capacity and Spatial Pattern along Du-Wen Road Based on the Analytic Hierarchy Process (AHP) and the Technique for Order of Preference by Similarity to an Ideal Solution (TOPSIS) Method.

[DOI]

ISPRS Int. J. Geo Inf., 2020

Computer Audition for Healthcare: Opportunities and Challenges.

[DOI]

Frontiers Digit. Health, 2020

Data Augmentation For Children's Speech Recognition - The "Ethiopian" System For The SLT 2021 Children Speech Recognition Challenge.

[DOI]

CoRR, 2020

Exploiting Deep Sentential Context for Expressive End-to-End Speech Synthesis.

[DOI]

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Application of Graph Database for the Storage of Knowledge Map of Power Grid Model.

[DOI]

Proceedings of the IEEE International Conference on Signal Processing, 2020

2019

RawNet: Fast End-to-End Neural Vocoder.

[DOI]

Yunchao He

Haitong Zhang

CoRR, 2019

Research on the comprehensive evaluation system of eco-geological environmental carrying capacity based on the analytic hierarchy process.

[DOI]

Clust. Comput., 2019

Robust Speech Recognition based on Multi-Objective Learning with GRU Network.

[DOI]

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018

End-to-end Models with auditory attention in Multi-channel Keyword Spotting.

[DOI]

Haitong Zhang

Junbo Zhang

CoRR, 2018

Sequence-to-sequence Models for Small-Footprint Keyword Spotting.

[DOI]

Haitong Zhang

Junbo Zhang

CoRR, 2018

Empirical Evaluation of Speaker Adaptation on DNN Based Acoustic Model.

[DOI]

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Investigating Generative Adversarial Networks Based Speech Dereverberation for Robust Speech Recognition.

[DOI]

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Attention-based End-to-End Models for Small-Footprint Keyword Spotting.

[DOI]

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Attention-Based End-to-End Speech Recognition on Voice Search.

[DOI]

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Attention-Based End-to-End Speech Recognition in Mandarin.

[DOI]

CoRR, 2017

Design and Development of Intelligent Learning Companion for Primary School Students Based on the Tour of Well-Known Scenic Spots in Beijing.

[DOI]

Proceedings of the Learning and Collaboration Technologies. Technology in Education, 2017

2016

Light source imitation by using galvanometer scanner and spot light.

[DOI]

Can Fang

Multim. Tools Appl., 2016

Synchronized contention windows-based backoff algorithm in IEEE 802.11 wireless networks.

[DOI]

Proceedings of the International Conference on Computer, 2016

2015

Developing an Ontology-Based Cold Chain Logistics Monitoring and Decision System.

[DOI]

J. Sensors, 2015

Maximal singularity-free orientation workspace over a position region of Gough-Stewart platform.

[DOI]

Adv. Robotics, 2015

2013

Missing Data Solutions for Robust Speech Recognition.

[DOI]

Proceedings of the Essential Speech and Language Technology for Dutch, 2013

Respiration features of Chinese learners under self-narration task - The case of learners from Korea, Japan, America and Thailand.

[DOI]

Yuan Jia

Aijun Li

Proceedings of the 2013 International Conference Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2013

2012

Multi-candidate missing data imputation for robust speech recognition.

[DOI]

EURASIP J. Audio Speech Music. Process., 2012

2011

Consistent Synchronization Schemes for Workload Replay.

[DOI]

Konstantinos Morfonios

Romain Colle

Benoît Dageville

Karl Dias

Proc. VLDB Endow., 2011

Gaussian Selection Using Self-Organizing Map for Automatic Speech Recognition.

[DOI]

Proceedings of the Advances in Self-Organizing Maps - 8th International Workshop, 2011

Automatic Speech Recognition Using Missing Data Techniques: Handling of Real-World Data.

[DOI]

Jort F. Gemmeke

Maarten Van Segbroeck

Bert Cranen

Proceedings of the Robust Speech Recognition of Uncertain or Missing Data, 2011

2009

Oracle Database Replay.

[DOI]

Romain Colle

Proc. VLDB Endow., 2009

Real application testing with database replay.

[DOI]

Romain Colle

Karl Dias

Uri Shaft

Proceedings of the 2nd International Workshop on Testing Database Systems, 2009

Application of noise robust MDT speech recognition on the SPEECON and speechdat-car databases.

[DOI]

Jort F. Gemmeke

Maarten Van Segbroeck

Bert Cranen

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

A Compression Error and Optimize Compression Algorithm for Vector Data.

[DOI]

Guolv Tan

Proceedings of the 2009 International Conference on Environmental Science and Information Application Technology, 2009

2008

Oracle database replay.

[DOI]

Venkateshwaran Venkataramani

Leng Leng Tan

Graham Wood

Proceedings of the ACM SIGMOD International Conference on Management of Data, 2008

Oracle real application testing.

[DOI]

Peter Belknap

Venkateshwaran Venkataramani

Uri Shaft

Leng Leng Tan

Proceedings of the 1st International Workshop on Testing Database Systems, 2008

2005

Efficiently querying spatial histograms.

[DOI]

Simone Santini

Amarnath Gupta

Proceedings of the Storage and Retrieval Methods and Applications for Multimedia 2005, 2005

Distributed Out-of-Core Preprocessing of Very Large Microscopy Images for Efficient Querying.

[DOI]

Proceedings of the 2005 IEEE International Conference on Cluster Computing (CLUSTER 2005), September 26, 2005

2003

FPV: Fast Protein Visualization Using Java 3D™.

[DOI]

Bioinform., 2003

FPV: Fast Protein Visualization Using Java 3D.

[DOI]

Proceedings of the 2003 ACM Symposium on Applied Computing (SAC), 2003

2000

Dynamic Interval Index Structure in Constraint Database Systems.

[DOI]

Wei Wang

Baile Shi

J. Comput. Sci. Technol., 2000

1997

On the expressive power of F-logic language.

[DOI]

J. Comput. Sci. Technol., 1997

Decomposition and Lossless Join in Constraint Databases.

[DOI]