Ji Zhang
Orcid: 0000-0002-3835-7975Affiliations:
- Alibaba Group, DAMO Academy, Hangzhou, China
According to our database1,
Ji Zhang
authored at least 100 papers
between 2018 and 2024.
Collaborative distances:
Collaborative distances:
Timeline
Legend:
Book In proceedings Article PhD thesis Dataset OtherLinks
Online presence:
-
on orcid.org
On csauthors.net:
Bibliography
2024
ACM Trans. Multim. Comput. Commun. Appl., August, 2024
DISC: Plug-and-Play Decoding Intervention with Similarity of Characters for Chinese Spelling Check.
CoRR, 2024
mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding.
CoRR, 2024
MaVEn: An Effective Multi-granularity Hybrid Visual Encoding Framework for Multimodal Large Language Model.
CoRR, 2024
mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models.
CoRR, 2024
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration.
CoRR, 2024
TinyChart: Efficient Chart Understanding with Visual Token Merging and Program-of-Thoughts Learning.
CoRR, 2024
CoRR, 2024
CoRR, 2024
Meta Ranking: Less Capable Language Models are Capable for Single Response Judgement.
CoRR, 2024
CoRR, 2024
Bridging the Space Gap: Unifying Geometry Knowledge Graph Embedding with Optimal Transport.
Proceedings of the ACM on Web Conference 2024, 2024
Proceedings of the Natural Language Processing and Chinese Computing, 2024
Revisiting Unsupervised Temporal Action Localization: The Primacy of High-Quality Actionness and Pseudolabels.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
mPLUG-PaperOwl: Scientific Diagram Analysis with the Multimodal Large Language Model.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024
Breaking Barriers of System Heterogeneity: Straggler-Tolerant Multimodal Federated Learning via Knowledge Distillation.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024
VG-Annotator: Vision-Language Models as Query Annotators for Unsupervised Visual Grounding.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024
TinyChart: Efficient Chart Understanding with Program-of-Thoughts Learning and Visual Token Merging.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
A Simple yet Effective Training-free Prompt-free Approach to Chinese Spelling Correction Based on Large Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
mPLUG-OwI2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
Semantics-enhanced Cross-modal Masked Image Modeling for Vision-Language Pre-training.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
IAD: In-Context Learning Ability Decoupler of Large Language Models in Meta-Training.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
SiTunes: A Situational Music Recommendation Dataset with Physiological and Psychological Signals.
Proceedings of the 2024 ACM SIGIR Conference on Human Information Interaction and Retrieval, 2024
Proceedings of the Findings of the Association for Computational Linguistics, 2024
Browse and Concentrate: Comprehending Multimodal Content via Prior-LLM Context Fusion.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
Proceedings of the Findings of the Association for Computational Linguistics, 2024
Towards Better Utilization of Multi-Reference Training Data for Chinese Grammatical Error Correction.
Proceedings of the Findings of the Association for Computational Linguistics, 2024
CycleAlign: Iterative Distillation from Black-box LLM to White-box Models for Better Human Alignment.
Proceedings of the Findings of the Association for Computational Linguistics, 2024
Proceedings of the Findings of the Association for Computational Linguistics, 2024
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
CoRR, 2023
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration.
CoRR, 2023
UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model.
CoRR, 2023
ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models.
CoRR, 2023
CValues: Measuring the Values of Chinese Large Language Models from Safety to Responsibility.
CoRR, 2023
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding.
CoRR, 2023
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks.
CoRR, 2023
AMTSS: An Adaptive Multi-Teacher Single-Student Knowledge Distillation Framework For Multilingual Language Inference.
CoRR, 2023
CoRR, 2023
ChatPLUG: Open-Domain Generative Dialogue System with Internet-Augmented Instruction Tuning for Digital Human.
CoRR, 2023
mPLUG-Octopus: The Versatile Assistant Empowered by A Modularized End-to-End Multimodal LLM.
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Proceedings of the 31st ACM International Conference on Multimedia, 2023
COPA : Efficient Vision-Language Pre-training through Collaborative Object- and Patch-Text Alignment.
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Proceedings of the International Conference on Machine Learning, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023
ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023
DialoGPS: Dialogue Path Sampling in Continuous Semantic Space for Data Augmentation in Multi-Turn Conversations.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
Distinguish Before Answer: Generating Contrastive Explanation as Knowledge for Commonsense Question Answering.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
2022
CoRR, 2022
mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections.
CoRR, 2022
Auto-MLM: Improved Contrastive Learning for Self-supervised Multi-lingual Knowledge Retrieval.
CoRR, 2022
Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
DictBERT: Dictionary Description Knowledge Enhanced Language Model Pre-training via Contrastive Learning.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022
CAT-MNER: Multimodal Named Entity Recognition with Knowledge-Refined Cross-Modal Attention.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022
mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
Shifting More Attention to Visual Backbone: Query-modulated Refinement Networks for End-to-End Visual Grounding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Proceedings of the 29th International Conference on Computational Linguistics, 2022
Proceedings of the 29th International Conference on Computational Linguistics, 2022
Generating Persuasive Responses to Customer Reviews with Multi-Source Prior Knowledge in E-commerce.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022
2021
SPMoE: Generate Multiple Pattern-Aware Outputs with Sparse Pattern Mixture of Experts.
CoRR, 2021
CoRR, 2021
AliMe DA: A Data Augmentation Framework for Question Answering in Cold-start Scenarios.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021
AliMe Avatar: Multi-modal Content Production and Presentation for Live-streaming E-commerce.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021
Segment, Mask, and Predict: Augmenting Chinese Word Segmentation with Self-Supervision.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021
K-AID: Enhancing Pre-trained Language Models with Domain Knowledge for Question Answering.
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021
KACE: Generating Knowledge Aware Contrastive Explanations for Natural Language Inference.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
2020
CoRR, 2020
Keep it Consistent: Topic-Aware Storytelling from an Image Stream via Iterative Multi-agent Communication.
Proceedings of the 28th International Conference on Computational Linguistics, 2020
Proceedings of the CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, 2020
Query-to-Session Matching: Do NOT Forget History and Future during Response Selection for Multi-Turn Dialogue Systems.
Proceedings of the CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, 2020
2019
Keep it Consistent: Topic-Aware Storytelling from an Image Stream via Iterative Multi-agent Communication.
CoRR, 2019
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019
2018
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018