2025
Improved GBNN Guided Multirobot Coverage Search Based on Neuronal Connectivity.
IEEE Syst. J., June, 2025

METOR: A Unified Framework for Mutual Enhancement of Objects and Relationships in Open-vocabulary Video Visual Relationship Detection.
CoRR, May, 2025

K2: On Optimizing Distributed Transactions in a Multi-region Data Store with TrueTime Clocks (Extended Version).
CoRR, April, 2025

Design of an Expression Recognition Solution Based on the Global Channel-Spatial Attention Mechanism and Proportional Criterion Fusion.
CoRR, March, 2025

Solution for 8th Competition on Affective & Behavior Analysis in-the-wild.
CoRR, March, 2025

Technical Approach for the EMI Challenge in the 8th Affective Behavior Analysis in-the-Wild Competition.
CoRR, March, 2025

Interactive Multimodal Fusion with Temporal Modeling.
CoRR, March, 2025

A fault hierarchical propagation reliability improvement method for CNC machine tools based on spatiotemporal factors coupling.
Reliab. Eng. Syst. Saf., 2025

TechSinger: Technique Controllable Multilingual Singing Voice Synthesis via Flow Matching.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
GA-Based Multipopulation Synergistic Gene Screening Strategy on Critical Nodes Detection.
IEEE Trans. Comput. Soc. Syst., June, 2024

A Double Deep Q-Network framework for a flexible job shop scheduling problem with dynamic job arrivals and urgent job insertions.
Eng. Appl. Artif. Intell., 2024

End-to-end Open-vocabulary Video Visual Relationship Detection using Multi-modal Prompting.
CoRR, 2024

Accompanied Singing Voice Synthesis with Fully Text-controlled Melody.
CoRR, 2024

Frieren: Efficient Video-to-Audio Generation with Rectified Flow Matching.
CoRR, 2024

Text-to-Song: Towards Controllable Music Generation Incorporating Vocals and Accompaniment.
CoRR, 2024

AUD-TGN: Advancing Action Unit Detection with Temporal Convolution and GPT-2 in Wild Audiovisual Contexts.
CoRR, 2024

Multimodal Fusion Method with Spatiotemporal Sequences and Relationship Learning for Valence-Arousal Estimation.
CoRR, 2024

Exploring Facial Expression Recognition through Semi-Supervised Pretraining and Temporal Modeling.
CoRR, 2024

Data-free Multi-label Image Recognition via LLM-powered Prompt Tuning.
CoRR, 2024

Trajectory-based Calibration for Optical See-Through Head-Mounted Displays Without Alignment.
Proceedings of the Pattern Recognition and Computer Vision - 7th Chinese Conference, 2024

MoMu-Diffusion: On Learning Long-Term Motion-Music Synchronization and Correspondence.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Frieren: Efficient Video-to-Audio Generation Network with Rectified Flow Matching.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

VoiceTuner: Self-Supervised Pre-training and Efficient Fine-tuning For Voice Generation.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

InstructSpeech: Following Speech Editing Instructions via Large Language Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

AUD-TGN: Advancing Action Unit Detection with Temporal Convolution and GPT-2 in Wild Audiovisual Contexts.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Improving Valence-Arousal Estimation with Spatiotemporal Relationship Learning and Multimodal Fusion.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Exploring Facial Expression Recognition through Semi-Supervised Pre-training and Temporal Modeling.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Remote Registration of Multiple Authenticators.
Proceedings of the Fourteenth ACM Conference on Data and Application Security and Privacy, 2024

Speech-to-Speech Translation with Discrete-Unit-Based Style Transfer.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, 2024

Self-Supervised Singing Voice Pre-Training towards Speech-to-Singing Conversion.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

Robust Singing Voice Transcription Serves Synthesis.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Make-A-Voice: Revisiting Voice Large Language Models as Scalable Multilingual and Multitask Learners.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Text-to-Song: Towards Controllable Music Generation Incorporating Vocal and Accompaniment.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Multi-Modal Prompting for Open-Vocabulary Video Visual Relationship Detection.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Silicon-Cantilever-Enhanced Single-Fiber Photoacoustic Acetylene Gas Sensor.
Sensors, September, 2023

Comprehensive Evaluation of Experimental Teaching Quality Using AHP-TOPSIS Technique.
Int. J. Emerg. Technol. Learn., June, 2023

An Improved Differential Evolution Framework Using Network Topology Information for Critical Nodes Detection.
IEEE Trans. Comput. Soc. Syst., April, 2023

Resilience-oriented design for complex MEP systems in BIM.
Adv. Eng. Informatics, January, 2023

Make-A-Voice: Unified Voice Synthesis With Discrete Representation.
CoRR, 2023

Connecting Multi-modal Contrastive Representations.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

The analysis of pictures and the study on the origin of the plant herbs in Bencao Pinhui Jingyao.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2023

2022
Detecting logical relationships in mechanical, electrical, and plumbing (MEP) systems with BIM using graph matching.
Adv. Eng. Informatics, 2022

FastLTS: Non-Autoregressive End-to-End Unconstrained Lip-to-Speech Synthesis.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

2021
Network Embedding Attack: An Euclidean Distance Based Method.
Proceedings of the MDATA: A New Knowledge Representation Model, 2021

Data-driven quantification of public-private partnership experience levels under uncertainty with Bayesian hierarchical model.
Appl. Soft Comput., 2021

Inclusive Design in the Context of Smart Community.
Proceedings of the Advances in Industrial Design, 2021

2018
Phenomenological Thermodynamics of Irreversible Processes.
Entropy, 2018

2016
Adaptive Collaborative Gaussian Mixture Probability Hypothesis Density Filter for Multi-Target Tracking.
Sensors, 2016

An Effective Correction Method for Seriously Oblique Remote Sensing Images Based on Multi-View Simulation and a Piecewise Model.
Sensors, 2016

2013
Global space-time association for Probability Hypothesis Density filter.
Proceedings of the 16th International Conference on Information Fusion, 2013

2012
Dynamics and Control on Double-drum Roller Maximum Brake Deceleration.
J. Comput., 2012

2011
Application of gesture recognition based on simulated annealing BP neural network.
Proceedings of the International Conference on Electronic and Mechanical Engineering and Information Technology, 2011

2007
Time-dependent magnetohydrodynamic flow induced by non-coaxial rotations of a non-torsionally oscillating porous plate and a third-order fluid at infinity.
Math. Comput. Model., 2007