2025
A Survey on (M)LLM-Based GUI Agents.
CoRR, April, 2025
InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models.
CoRR, March, 2025
MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task.
CoRR, February, 2025
Learning Combinatorial Prompts for Universal Controllable Image Captioning.
Int. J. Comput. Vis., January, 2025
AdaPDTW: An Efficient Abstract-Adaptive Piecewise Dynamic Time Warping for Time Series Classification.
IEEE Access, 2025
Data-Adaptive Dynamic Time Warping-Based Multivariate Time Series Fuzzy Clustering.
IEEE Access, 2025
S^3cMath: Spontaneous Step-Level Self-Correction Makes Large Language Models Better Mathematical Reasoners.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025
2024
Improving Reference-Based Distinctive Image Captioning with Contrastive Rewards.
ACM Trans. Multim. Comput. Commun. Appl., December, 2024
Label Semantic Knowledge Distillation for Unbiased Scene Graph Generation.
IEEE Trans. Circuits Syst. Video Technol., January, 2024
S^3c-Math: Spontaneous Step-level Self-correction Makes Large Language Models Better Mathematical Reasoners.
CoRR, 2024
Stock Movement Prediction with Multimodal Stable Fusion via Gated Cross-Attention Mechanism.
CoRR, 2024
ProSwitch: Knowledge-Guided Language Model Fine-Tuning to Generate Professional and Non-Professional Styled Text.
CoRR, 2024
Triad: A Framework Leveraging a Multi-Role LLM-based Agent to Solve Knowledge Base Question Answering.
CoRR, 2024
Triad: A Framework Leveraging a Multi-Role LLM-based Agent to Solve Knowledge Base Question Answering.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Learning Global Controller in Latent Space for Parameter-Efficient Fine-Tuning.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
2023
Dynamic Finite Element Model Based on Timoshenko Beam Theory for Simulating High-Speed Nonlinear Helical Springs.
Sensors, April, 2023
VL-NMS: Breaking Proposal Bottlenecks in Two-stage Visual-language Matching.
ACM Trans. Multim. Comput. Commun. Appl., 2023
Continuous Glucose Monitoring Time Series Data Analysis: A Time Series Analysis Package for Continuous Glucose Monitoring Data.
J. Comput. Biol., 2023
TreePrompt: Learning to Compose Tree Prompts for Explainable Visual Grounding.
CoRR, 2023
Learning Combinatorial Prompts for Universal Controllable Image Captioning.
CoRR, 2023
Zero-shot Visual Relation Detection via Composite Visual Cues from Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Triple Correlations-Guided Label Supplementation for Unbiased Video Scene Graph Generation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Dark Knowledge Balance Learning for Unbiased Scene Graph Generation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023
2022
An Interference-Tolerant Synchronization Scheme for Wireless Communication Systems Based on Direct Sequence Spread Spectrum.
IEEE Trans. Circuits Syst. I Regul. Pap., 2022
Deep Learning for Weakly-Supervised Object Detection and Localization: A Survey.
Neurocomputing, 2022
Citation Trajectory Prediction via Publication Influence Representation Using Temporal Knowledge Graph.
CoRR, 2022
Label Semantic Knowledge Distillation for Unbiased Scene Graph Generation.
CoRR, 2022
Rethinking the Reference-based Distinctive Image Captioning.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
Explicit Image Caption Editing.
Proceedings of the Computer Vision - ECCV 2022, 2022
Classification-Then-Grounding: Reformulating Video Scene Graphs as Temporal Bipartite Graphs.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Rethinking the Evaluation of Unbiased Scene Graph Generation.
Proceedings of the 33rd British Machine Vision Conference 2022, 2022
2021
Explore Video Clip Order With Self-Supervised and Curriculum Learning for Video Applications.
IEEE Trans. Multim., 2021
Deep Learning for Weakly-Supervised Object Detection and Object Localization: A Survey.
CoRR, 2021
VL-NMS: Breaking Proposal Bottlenecks in Two-Stage Visual-Language Matching.
CoRR, 2021
Instance-wise or Class-wise? A Tale of Neighbor Shapley for Concept-based Explanation.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021
Natural Language Video Localization with Learnable Moment Proposals.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021
Boundary Proposal Network for Two-stage Natural Language Video Localization.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
Empower Distantly Supervised Relation Extraction with Collaborative Adversarial Training.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
2020
An Efficient Sinusoid-Like Pseudo Random Sequence Modulator/Demodulator System With Reduced Adjacent Channel Leakage and High Rejection to Random and Systematic Interference.
IEEE Trans. Circuits Syst., 2020
Hierarchical Temporal Fusion of Multi-grained Attention Features for Video Question Answering.
Neural Process. Lett., 2020
Alleviate Dataset Shift Problem in Fine-grained Entity Typing with Virtual Adversarial Training.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020
2019
Adversarial learning for viewpoints invariant 3D human pose estimation.
J. Vis. Commun. Image Represent., 2019
Efficient Broadband Class AB Amplifier.
Proceedings of the 62nd IEEE International Midwest Symposium on Circuits and Systems, 2019
Self-Supervised Spatiotemporal Learning via Video Clip Order Prediction.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
Research on Error Modeling and Compensation Method of Hot Rolling Shape Setting Model Based on Cluster and Neural Network.
Proceedings of the 2019 4th International Conference on Automation, 2019
2017
SCA-CNN: Spatial and Channel-Wise Attention in Convolutional Networks for Image Captioning.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017
Experiential Interaction Modeling for Virtual Training of Ultra-High Voltage Power System and Its Application.
Proceedings of the Advances in Computer Science and Ubiquitous Computing, 2017
A Model and Application of Collaborative Simulation Training System for Substation Based on Virtual Reality.
Proceedings of the Advances in Computer Science and Ubiquitous Computing, 2017
2016
D-Ocean: an unstructured data management system for data ocean environment.
Frontiers Comput. Sci., 2016
SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning.
CoRR, 2016
2015
Topic aspect-oriented summarization via group selection.
Neurocomputing, 2015
Comprehensive shape control technology for CSP hot strip mills.
Int. J. Autom. Comput., 2015
RAISE: A Whole Process Modeling Method for Unstructured Data Management.
Proceedings of the 2015 IEEE International Conference on Multimedia Big Data, BigMM 2015, 2015
2014
Hashing with List-Wise learning to rank.
Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2014
Cross-Media Hashing with Neural Networks.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014
Jointly Discovering Fine-grained and Coarse-grained Sentiments via Topic Modeling.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014
Geo-informative discriminative image representation by semi-supervised hierarchical topic modeling.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2014
2013
Image annotation by semi-supervised cross-domain learning with group sparsity.
J. Vis. Commun. Image Represent., 2013
Hypergraph Spectral Hashing for image retrieval with heterogeneous social contexts.
Neurocomputing, 2013
πLDA: document clustering with selective structural constraints.
Proceedings of the ACM Multimedia Conference, 2013
Digital Library Engine: Adapting Digital Library for Cloud Computing.
Proceedings of the 2013 IEEE Sixth International Conference on Cloud Computing, Santa Clara, CA, USA, June 28, 2013
2012
Sparse Unsupervised Dimensionality Reduction for Multiple View Data.
IEEE Trans. Circuits Syst. Video Technol., 2012
Pattern Recognit. Lett., 2012
A unified framework for web video topic discovery and visualization.
Pattern Recognit. Lett., 2012
The heterogeneous feature selection with structural sparsity for multimedia annotation and hashing: a survey.
Int. J. Multim. Inf. Retr., 2012
LuSH: A Generic High-Dimensional Index Framework.
Proceedings of the Web-Age Information Management, 2012
Image Ranking via Attribute Boosted Hypergraph.
Proceedings of the Advances in Multimedia Information Processing - PCM 2012, 2012
Logistic Tensor Regression for Classification.
Proceedings of the Intelligent Science and Intelligent Data Engineering, 2012
Nonnegative Matrix Factorization for Multimodality Data from Multi-source Domain.
Proceedings of the Eighth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2012
Graph-guided sparse reconstruction for region tagging.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012
2011
Hypergraph spectral hashing for similarity search of social image.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011
Image annotation by composite kernel learning with group structure.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011
Towards a new reading experience via semantic fusion of text and music.
Proceedings of the 2011 Joint International Conference on Digital Libraries, 2011
Inverse-degree Sampling for Spectral Clustering.
Proceedings of the Sixth International Conference on Image and Graphics, 2011
Tag Clustering and Refinement on Semantic Unity Graph.
Proceedings of the 11th IEEE International Conference on Data Mining, 2011
2010
Multiple hypergraph ranking for video concept detection.
J. Zhejiang Univ. Sci. C, 2010
Multi-video summarization using complex graph clustering and mining.
Comput. Sci. Inf. Syst., 2010
Lasso-Based Tag Expansion and Tag-Boosted Collaborative Filtering.
Proceedings of the Advances in Multimedia Information Processing - PCM 2010, 2010
High Dimensionality Reduction Using CUR Matrix Decomposition and Auto-encoder for Web Image Classification.
Proceedings of the Advances in Multimedia Information Processing - PCM 2010, 2010
Topic discovery of web video using star-structured K-partite graph.
Proceedings of the 18th International Conference on Multimedia 2010, 2010
2008
Development of a Mandarin-English Bilingual Speech Recognition System for Real World Music Retrieval.
IEICE Trans. Inf. Syst., 2008
Effects of the Temporal Fine Structure in Different Frequency Bands on Mandarin Tone Perception.
IEICE Trans. Inf. Syst., 2008
A One-Pass Real-Time Decoder Using Memory-Efficient State Network.
IEICE Trans. Inf. Syst., 2008
Effective Acoustic Modeling for Pronunciation Quality Scoring of Strongly Accented Mandarin Speech.
IEICE Trans. Inf. Syst., 2008
Efficient System Combination for Syllable-Confusion-Network-Based Chinese Spoken Term Detection.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008
Towards vocabulary-independent speech indexing for large-scale repositories.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Addressing the out-of-vocabulary problem for large-scale Chinese spoken term detection.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Spoken Term Detection Using Dynamic Match Subword Confusion Network.
Proceedings of the Fourth International Conference on Natural Computation, 2008
2007
A fast fuzzy keyword spotting algorithm based on syllable confusion network.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Research on a Security Model of Data in Computer Supported Collaborative Design Integrated with PDM System.
Proceedings of the Workshop on Intelligent Information Technology Application, 2007
Keyword Spotting Based on Syllable Confusion Network.
Proceedings of the Third International Conference on Natural Computation, 2007
Real Context Model for Tone Recognition in Mandarin Conversational Telephone Speech.
Proceedings of the Third International Conference on Natural Computation, 2007
2006
Keyword Spotting Based on Phoneme Confusion Matrix.
Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006
Syllable Based Audio Search Using Confusion Network Arc as Indexing Unit.
Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006