2025
The ICME 2025 Audio Encoder Capability Challenge.
CoRR, January, 2025
Analysis of the Spatiotemporal Changes in Enclosed Aquaculture Areas in Hongze Lake, China, Over the Past 40 Years and Their Impact on the Water Environment.
IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens., 2025
Service defects identification by integrating fuzzy clustering and optimization model with quality function deployment.
Appl. Soft Comput., 2025
2024
Location Privacy Protection for the Internet of Things with Edge Computing Based on Clustering K-Anonymity.
Sensors, September, 2024
Token-based deep reinforcement learning for Heterogeneous VRP with Service Time Constraints.
Knowl. Based Syst., 2024
Web Semantic-Based Robust Graph Contrastive Learning for Recommendation via Invariant Learning.
Int. J. Semantic Web Inf. Syst., 2024
Visual Instruction Tuning with 500x Fewer Parameters through Modality Linear Representation-Steering.
CoRR, 2024
Enhancing Automated Audio Captioning via Large Language Models with Optimized Audio Encoding.
CoRR, 2024
Scaling up masked audio encoder learning for general audio classification.
CoRR, 2024
Towards Expressive Zero-Shot Speech Synthesis with Hierarchical Prosody Modeling.
CoRR, 2024
Multi-UAV Distributed Collaborative Path Planning Based on NTVPPSO.
Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2024
Efficient Extraction of Noise-Robust Discrete Units from Self-Supervised Speech Models.
Proceedings of the IEEE Spoken Language Technology Workshop, 2024
Optimizing Dysarthria Wake-Up Word Spotting: an End-to-End Approach For SLT 2024 LRDWWS Challenge.
Proceedings of the IEEE Spoken Language Technology Workshop, 2024
Bridging Language Gaps in Audio-Text Retrieval.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024
Speaker Change Detection with Weighted-sum Knowledge Distillation based on Self-supervised Pre-trained Models.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024
Enhancing Automated Audio Captioning via Large Language Models with Optimized Audio Encoding.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024
Towards Expressive Zero-Shot Speech Synthesis with Hierarchical Prosody Modeling.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024
Streaming Audio Transformers for Online Audio Tagging.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024
Scaling up masked audio encoder learning for general audio classification.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024
Two-Stage Neural Network Model with Packet Loss Detection for ICASSP 2024 PLC Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2024
TNFormer: Single-Pass Multilingual Text Normalization with a Transformer Decoder Model.
Proceedings of the IEEE International Conference on Acoustics, 2024
CED: Consistent Ensemble Distillation for Audio Tagging.
Proceedings of the IEEE International Conference on Acoustics, 2024
Disentangling Speaker Representations from Intuitive Prosodic Features for Speaker-Adaptative and Prosody-Controllable Speech Synthesis.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024
2023
Deep Learning Models in Computer Data Mining for Intrusion Detection.
Informatica (Slovenia), 2023
Understanding temporally weakly supervised training: A case study for keyword spotting.
CoRR, 2023
Streaming Audio Transformers for Online Audio Tagging.
CoRR, 2023
Construction and Application of VS-DBN Anti-Theft Diagnosis Model Based on Neural Architecture Search.
IEEE Access, 2023
Ship Classification Based on Trajectories Data and LightGBM Considering Offshore Distance Feature.
Proceedings of the Spatial Data and Intelligence - 4th International Conference, 2023
The Abnormal Detection Method of Ship Trajectory with Adaptive Transformer Model Based on Migration Learning.
Proceedings of the Spatial Data and Intelligence - 4th International Conference, 2023
Improving Bilingual TTS Using Language And Phonology Embedding With Embedding Strength Modulator.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
LightClone: Speaker-guided Parallel Subnet Selection for Few-shot Voice Cloning.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Focus on the Sound around You: Monaural Target Speaker Extraction via Distance and Speaker Information.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Improving Weakly Supervised Sound Event Detection with Causal Intervention.
Proceedings of the IEEE International Conference on Acoustics, 2023
AV-SepFormer: Cross-Attention SepFormer for Audio-Visual Target Speaker Extraction.
Proceedings of the IEEE International Conference on Acoustics, 2023
Unified Keyword Spotting and Audio Tagging on Mobile Devices with Transformers.
Proceedings of the IEEE International Conference on Acoustics, 2023
Relate Auditory Speech to EEG by Shallow-Deep Attention-Based Network.
Proceedings of the IEEE International Conference on Acoustics, 2023
The Xiaomi-ASLP Text-to-speech System for Blizzard Challenge 2023.
Proceedings of the 18th Blizzard Challenge Workshop, Grenoble, France, August 29, 2023
Enhanced Neural Beamformer with Spatial Information for Target Speech Extraction.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023
2022
Experiment and Analysis of Temperature Sensing of Microstructured Fiber with Silver and PDMS Films.
Sensors, 2022
Improve Bilingual TTS Using Dynamic Language and Phonology Embedding.
CoRR, 2022
AIS Data Driven CNN-BiGRU Model for Ship Target Classification.
Proceedings of the Spatial Data and Intelligence - Third International Conference, 2022
An Empirical Study of Weakly Supervised Audio Tagging Embeddings for General Audio Representations.
Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022
J-TranPSP: A Joint Transition-based Model for Prosodic Structure Prediction, Word Segmentation and PoS Tagging.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022
UniKW-AT: Unified Keyword Spotting and Audio Tagging.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Exploring representation learning for small-footprint keyword spotting.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Improving Emotional Speech Synthesis by Using SUS-Constrained VAE and Text Encoder Aggregation.
Proceedings of the IEEE International Conference on Acoustics, 2022
MSDTRON: A High-Capability Multi-Speaker Speech Synthesis System for Diverse Data Using Characteristic Information.
Proceedings of the IEEE International Conference on Acoustics, 2022
Learning Decoupling Features Through Orthogonality Regularization.
Proceedings of the IEEE International Conference on Acoustics, 2022
PAMA-TTS: Progression-Aware Monotonic Attention for Stable SEQ2SEQ TTS with Accurate Phoneme Duration Control.
Proceedings of the IEEE International Conference on Acoustics, 2022
Pseudo Strong Labels for Large Scale Weakly Supervised Audio Tagging.
Proceedings of the IEEE International Conference on Acoustics, 2022
Multi-Scale Refinement Network Based Acoustic Echo Cancellation.
Proceedings of the IEEE International Conference on Acoustics, 2022
Research on the Evaluation of the Classical Chinese Difficulty in the Compulsory Education Stage.
Proceedings of the International Conference on Asian Language Processing, 2022
Detect What You Want: Target Sound Detection.
Proceedings of the 7th Workshop on Detection and Classification of Acoustic Scenes and Events 2022, 2022
2021
Study of the key technology on the Geo-hazard spatial information sharing platform in Meizoseismal Region of Wenchuan Earthquake Zone.
J. Ambient Intell. Humaniz. Comput., 2021
Residual error based knowledge distillation.
Neurocomputing, 2021
A Separable Temporal Convolution Neural Network with Attention for Small-Footprint Keyword Spotting.
CoRR, 2021
Effective and Differentiated Use of Control Information for Multi-speaker Speech Synthesis.
CoRR, 2021
GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio.
CoRR, 2021
An Enhanced MAX-SINR Strategy With Interference Leakage Power Constraint in Multiuser Multiantenna SWIPT Systems.
IEEE Access, 2021
Cognitive Radio Primary Network Secure Communication Strategy Based on Energy Harvesting and Destination Assistance.
Proceedings of the 13th International Conference on Wireless Communications and Signal Processing, 2021
Multi-Channel Automatic Speech Recognition Using Deep Complex Unet.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
speechocean762: An Open-Source Non-Native English Speech Corpus for Pronunciation Assessment.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Noise Robust Singing Voice Synthesis Using Gaussian Mixture Variational Autoencoder.
Proceedings of the ICMI '21 Companion: Companion Publication of the 2021 International Conference on Multimodal Interaction, Montreal, QC, Canada, October 18, 2021
An Experimental Study on Replay Attack Detection Using Spoofing Clues from both Voiced and Non-Voiced Segments.
Proceedings of the ICDSP 2021: 5th International Conference on Digital Signal Processing, 2021
AutoKWS: Keyword Spotting with Differentiable Architecture Search.
Proceedings of the IEEE International Conference on Acoustics, 2021
Multi-Channel Speech Enhancement with 2-D Convolutional Time-Frequency Domain Features and a Pre-Trained Acoustic Model.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
Frequency Axis Pooling Method for Weakly Labeled Sound Event Detection and Classification.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
Separable Temporal Convolution plus Temporally Pooled Attention for Lightweight High-Performance Keyword Spotting.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
2020
Analysis of spatiotemporal influence patterns of toxic gas monitoring concentrations in an urban drainage network based on IoT and GIS.
Pattern Recognit. Lett., 2020
Evaluation of Geological and Ecological Bearing Capacity and Spatial Pattern along Du-Wen Road Based on the Analytic Hierarchy Process (AHP) and the Technique for Order of Preference by Similarity to an Ideal Solution (TOPSIS) Method.
ISPRS Int. J. Geo Inf., 2020
Computer Audition for Healthcare: Opportunities and Challenges.
Frontiers Digit. Health, 2020
Data Augmentation For Children's Speech Recognition - The "Ethiopian" System For The SLT 2021 Children Speech Recognition Challenge.
CoRR, 2020
Exploiting Deep Sentential Context for Expressive End-to-End Speech Synthesis.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Application of Graph Database for the Storage of Knowledge Map of Power Grid Model.
Proceedings of the IEEE International Conference on Signal Processing, 2020
2019
RawNet: Fast End-to-End Neural Vocoder.
CoRR, 2019
Research on the comprehensive evaluation system of eco-geological environmental carrying capacity based on the analytic hierarchy process.
Clust. Comput., 2019
Robust Speech Recognition based on Multi-Objective Learning with GRU Network.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
2018
End-to-end Models with auditory attention in Multi-channel Keyword Spotting.
CoRR, 2018
Sequence-to-sequence Models for Small-Footprint Keyword Spotting.
CoRR, 2018
Empirical Evaluation of Speaker Adaptation on DNN Based Acoustic Model.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Investigating Generative Adversarial Networks Based Speech Dereverberation for Robust Speech Recognition.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Attention-based End-to-End Models for Small-Footprint Keyword Spotting.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Attention-Based End-to-End Speech Recognition on Voice Search.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
2017
Attention-Based End-to-End Speech Recognition in Mandarin.
CoRR, 2017
Design and Development of Intelligent Learning Companion for Primary School Students Based on the Tour of Well-Known Scenic Spots in Beijing.
Proceedings of the Learning and Collaboration Technologies. Technology in Education, 2017
2016
Light source imitation by using galvanometer scanner and spot light.
Multim. Tools Appl., 2016
Synchronized contention windows-based backoff algorithm in IEEE 802.11 wireless networks.
Proceedings of the International Conference on Computer, 2016
2015
Developing an Ontology-Based Cold Chain Logistics Monitoring and Decision System.
J. Sensors, 2015
Maximal singularity-free orientation workspace over a position region of Gough-Stewart platform.
Adv. Robotics, 2015
2013
Missing Data Solutions for Robust Speech Recognition.
Proceedings of the Essential Speech and Language Technology for Dutch, 2013
Respiration features of Chinese learners under self-narration task - The case of learners from Korea, Japan, America and Thailand.
Proceedings of the 2013 International Conference Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2013
2012
Multi-candidate missing data imputation for robust speech recognition.
EURASIP J. Audio Speech Music. Process., 2012
2011
Consistent Synchronization Schemes for Workload Replay.
Proc. VLDB Endow., 2011
Gaussian Selection Using Self-Organizing Map for Automatic Speech Recognition.
Proceedings of the Advances in Self-Organizing Maps - 8th International Workshop, 2011
Automatic Speech Recognition Using Missing Data Techniques: Handling of Real-World Data.
Proceedings of the Robust Speech Recognition of Uncertain or Missing Data, 2011
2009
Real application testing with database replay.
Proceedings of the 2nd International Workshop on Testing Database Systems, 2009
Application of noise robust MDT speech recognition on the SPEECON and speechdat-car databases.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
A Compression Error and Optimize Compression Algorithm for Vector Data.
Proceedings of the 2009 International Conference on Environmental Science and Information Application Technology, 2009
2008
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2008
Oracle real application testing.
Proceedings of the 1st International Workshop on Testing Database Systems, 2008
2005
Efficiently querying spatial histograms.
Proceedings of the Storage and Retrieval Methods and Applications for Multimedia 2005, 2005
Distributed Out-of-Core Preprocessing of Very Large Microscopy Images for Efficient Querying.
Proceedings of the 2005 IEEE International Conference on Cluster Computing (CLUSTER 2005), September 26, 2005
2003
FPV: Fast Protein Visualization Using Java 3D™.
Bioinform., 2003
FPV: Fast Protein Visualization Using Java 3D.
Proceedings of the 2003 ACM Symposium on Applied Computing (SAC), 2003
2000
Dynamic Interval Index Structure in Constraint Database Systems.
J. Comput. Sci. Technol., 2000
1997
On the expressive power of F-logic language.
J. Comput. Sci. Technol., 1997
Decomposition and Lossless Join in Constraint Databases.
Proceedings of the Constraint Databases and Their Applications, 1997
1996
Design and implementation of a concurrency control mechanism in an object-oriented database system.
J. Comput. Sci. Technol., 1996