Zheng Liu

Orcid: 0000-0001-7765-8466

Affiliations:
  • Beijing Academy of Artificial Intelligence, China
  • Huawei Poisson Lab, Shenzhen, China
  • Microsoft Research Asia, Beijing, China
  • Hong Kong University of Science & Technology, Department of Computer Science and Engineering, Hong Kong


According to our database, Zheng Liu authored at least 109 papers between 2014 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Timeline

[Chart: number of publications per year by type (Book, In proceedings, Article, PhD thesis, Dataset, Other).]

Bibliography

2025
FineRAG: Fine-grained Retrieval-Augmented Text-to-Image Generation.
Proceedings of the 31st International Conference on Computational Linguistics, 2025

How Credible Is an Answer From Retrieval-Augmented LLMs? Investigation and Evaluation With Multi-Hop QA.
Proceedings of the 31st International Conference on Computational Linguistics, 2025

2024
When large language models meet personalization: perspectives of challenges and opportunities.
World Wide Web (WWW), July, 2024

MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval.
CoRR, 2024

AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark.
CoRR, 2024

Boosting Long-Context Management via Query-Guided Activation Refilling.
CoRR, 2024

Imitate, Explore, and Self-Improve: A Reproduction Report on Slow-thinking Reasoning Systems.
CoRR, 2024

Technical Report: Enhancing LLM Reasoning with Reward-guided Tree Search.
CoRR, 2024

AssistRAG: Boosting the Potential of Large Language Models with an Intelligent Information Assistant.
CoRR, 2024

Feint and Attack: Attention-Based Strategies for Jailbreaking and Protecting LLMs.
CoRR, 2024

Elephant in the Room: Unveiling the Impact of Reward Model Quality in Alignment.
CoRR, 2024

Making Text Embedders Few-Shot Learners.
CoRR, 2024

Lighter And Better: Towards Flexible Context Adaptation For Retrieval Augmented Generation.
CoRR, 2024

Trustworthiness in Retrieval-Augmented Generation Systems: A Survey.
CoRR, 2024

MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery.
CoRR, 2024

SEA-SQL: Semantic-Enhanced Text-to-SQL with Adaptive Refinement.
CoRR, 2024

Are Long-LLMs A Necessity For Long-Context Tasks?
CoRR, 2024

Extending Llama-3's Context Ten-Fold Overnight.
CoRR, 2024

Understanding Privacy Risks of Embeddings Induced by Large Language Models.
CoRR, 2024

Extensible Embedding: A Flexible Multipler For LLM's Context Length.
CoRR, 2024

BGE Landmark Embedding: A Chunking-Free Embedding Method For Retrieval Augmented Long-Context Large Language Models.
CoRR, 2024

BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation.
CoRR, 2024

Flexibly Scaling Large Language Models Contexts Through Extensible Tokenization.
CoRR, 2024

Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon.
CoRR, 2024

Information Retrieval Meets Large Language Models.
Proceedings of the Companion Proceedings of the ACM on Web Conference 2024, 2024


Metacognitive Retrieval-Augmented Large Language Models.
Proceedings of the ACM on Web Conference 2024, 2024

Generative Retrieval via Term Set Generation.
Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024

C-Pack: Packed Resources For General Chinese Embeddings.
Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024

Boosting the Potential of Large Language Models with an Intelligent Information Assistant.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

ChatRetriever: Adapting Large Language Models for Generalized and Robust Conversational Dense Retrieval.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

RAG-Studio: Towards In-Domain Adaptation of Retrieval Augmented Generation Through Self-Alignment.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Large Language Models as Foundations for Next-Gen Dense Retrieval: A Comprehensive Empirical Assessment.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Negating Negatives: Alignment with Human Negative Samples via Distributional Dispreference Optimization.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

A Multi-Task Embedder For Retrieval Augmented LLMs.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Grounding Language Model with Chunking-Free In-Context Retrieval.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Landmark Embedding: A Chunking-Free Embedding Method For Retrieval Augmented Long-Context Large Language Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Llama2Vec: Unsupervised Adaptation of Large Language Models for Dense Retrieval.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

M3-Embedding: Multi-Linguality, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
Semi-Supervised Variational User Identity Linkage via Noise-Aware Self-Learning.
IEEE Trans. Knowl. Data Eng., October, 2023

An Adaptive Graph Pre-training Framework for Localized Collaborative Filtering.
ACM Trans. Inf. Syst., April, 2023

CDSM: Cascaded Deep Semantic Matching on Textual Graphs Leveraging Ad-hoc Neighbor Selection.
ACM Trans. Intell. Syst. Technol., April, 2023

Reinforcement Routing on Proximity Graph for Efficient Recommendation.
ACM Trans. Inf. Syst., January, 2023

Making Large Language Models A Better Foundation For Dense Retrieval.
CoRR, 2023

LM-Cocktail: Resilient Tuning of Language Models via Model Merging.
CoRR, 2023

Retrieve Anything To Augment Large Language Models.
CoRR, 2023

C-Pack: Packaged Resources To Advance General Chinese Embedding.
CoRR, 2023

When Large Language Models Meet Personalization: Perspectives of Challenges and Opportunities.
CoRR, 2023

Term-Sets Can Be Strong Document Identifiers For Auto-Regressive Search Engines.
CoRR, 2023

WebBrain: Learning to Generate Factually Correct Articles for Queries by Grounding on Large Web Corpus.
CoRR, 2023

Cooperative Retriever and Ranker in Deep Recommenders.
Proceedings of the ACM Web Conference 2023, 2023

RecStudio: Towards a Highly-Modularized Recommender System.
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

LibVQ: A Toolkit for Optimizing Vector Quantization and Efficient Neural Retrieval.
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

Constructing Tree-based Index for Efficient and Effective Dense Retrieval.
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

Hybrid Inverted Index Is a Robust Accelerator for Dense Retrieval.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Longtriever: a Pre-trained Long Text Encoder for Dense Document Retrieval.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Towards Efficient and Effective Transformers for Sequential Recommendation.
Proceedings of the Database Systems for Advanced Applications, 2023

RetroMAE-2: Duplex Masked Auto-Encoder For Pre-Training Retrieval-Oriented Language Models.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
RetroMAE v2: Duplex Masked Auto-Encoder For Pre-Training Retrieval-Oriented Language Models.
CoRR, 2022

Bi-Phase Enhanced IVFPQ for Time-Efficient Ad-hoc Retrieval.
CoRR, 2022

Pre-training for Information Retrieval: Are Hyperlinks Fully Explored?
CoRR, 2022

A Neural Corpus Indexer for Document Retrieval.
CoRR, 2022

RetroMAE: Pre-training Retrieval-oriented Transformers via Masked Auto-Encoder.
CoRR, 2022

Distill-VQ: Learning Retrieval Oriented Vector Quantization By Distilling Knowledge from Dense Embeddings.
CoRR, 2022

A Mutually Reinforced Framework for Pretrained Sentence Embeddings.
CoRR, 2022

Uni-Retriever: Towards Learning The Unified Embedding Based Retriever in Bing Sponsored Search.
CoRR, 2022

GateFormer: Speeding Up News Feed Recommendation with Input Gated Transformers.
CoRR, 2022

Progressively Optimized Bi-Granular Document Representation for Scalable Embedding Based Retrieval.
Proceedings of the WWW '22: The ACM Web Conference 2022, Virtual Event, Lyon, France, April 25, 2022

MINDSim: User Simulator for News Recommenders.
Proceedings of the WWW '22: The ACM Web Conference 2022, Virtual Event, Lyon, France, April 25, 2022

Distill-VQ: Learning Retrieval Oriented Vector Quantization By Distilling Knowledge from Dense Embeddings.
Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

Forest-based Deep Recommender.
Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

Ada-Ranker: A Data Distribution Adaptive Ranking Paradigm for Sequential Recommendation.
Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

A Neural Corpus Indexer for Document Retrieval.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Recommender Forest for Efficient Retrieval.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Uni-Retriever: Towards Learning the Unified Embedding Based Retriever in Bing Sponsored Search.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

Training Large-Scale News Recommenders with Pretrained Language Models in the Loop.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Coarse-to-Fine: Hierarchical Multi-task Learning for Natural Language Understanding.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

Anisotropic Additive Quantization for Fast Inner Product Search.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Semi-Supervised Variational User Identity Linkage via Noise-Aware Self-Learning.
CoRR, 2021

GraphFormers: GNN-nested Language Models for Linked Text Representation.
CoRR, 2021

Hybrid Encoder: Towards Efficient and Precise Native AdsRecommendation via Hybrid Transformer Encoding Networks.
CoRR, 2021

Search-oriented Differentiable Product Quantization.
CoRR, 2021

Training Microsoft News Recommenders with Pretrained Language Models in the Loop.
CoRR, 2021

Multi-Interest-Aware User Modeling for Large-Scale Sequential Recommendations.
CoRR, 2021

AdsGNN: Behavior-Graph Augmented Relevance Modeling in Sponsored Search.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

Lighter and Better: Low-Rank Decomposed Self-Attention Networks for Next-Item Recommendation.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

GraphFormers: GNN-nested Transformers for Representation Learning on Textual Graph.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Reinforced Anchor Knowledge Graph Generation for News Recommendation Reasoning.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

Matching-oriented Embedding Quantization For Ad-hoc Retrieval.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Leveraging Bidding Graphs for Advertiser-Aware Relevance Modeling in Sponsored Search.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

2020
LightRec: A Memory and Search-Efficient Recommender System.
Proceedings of the WWW '20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020

Leveraging Demonstrations for Reinforcement Recommendation Reasoning over Knowledge Graphs.
Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020

Octopus: Comprehensive and Elastic User Representation for the Generation of Recommendation Candidates.
Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020

Sampling-Decomposable Generative Adversarial Recommender.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Fine-grained Interest Matching for Neural News Recommendation.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
A Novel User Representation Paradigm for Making Personalized Candidate Retrieval.
CoRR, 2019

Hi-Fi Ark: Deep User Representation via High-Fidelity Archive Network.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Neural News Recommendation with Long- and Short-term User Representations.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
Context-aware Academic Collaborator Recommendation.
Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018

Realtime Traffic Speed Estimation with Sparse Crowdsourced Data.
Proceedings of the 34th IEEE International Conference on Data Engineering, 2018

2017
Worker Recommendation for Crowdsourced Q&A Services: A Triple-Factor Aware Approach.
Proc. VLDB Endow., 2017

Speaker Direction-of-Arrival Estimation Based on Frequency-Independent Beampattern.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Tuning Crowdsourced Human Computation.
Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017

2016
Tuning Crowdsourced Human Computation.
CoRR, 2016

Mutual benefit aware task assignment in a bipartite labor market.
Proceedings of the 32nd IEEE International Conference on Data Engineering, 2016

2015
Cleaning uncertain data with a noisy crowd.
Proceedings of the 31st IEEE International Conference on Data Engineering, 2015

2014
gMission: A General Spatial Crowdsourcing Platform.
Proc. VLDB Endow., 2014

