Marc Najork

Orcid: 0000-0003-1423-0854

Affiliations:
  • Google Research, Mountain View, CA, USA
  • Compaq Systems Research Center, Palo Alto, CA, USA
  • University of Illinois at Urbana-Champaign, Department of Computer Science, IL, USA


According to our database, Marc Najork authored at least 141 papers between 1990 and 2024.

Awards

ACM Fellow

ACM Fellow 2019, "For contributions to web search and web science".

Bibliography

2024
Knowledge Distillation with Perturbed Loss: From a Vanilla Teacher to a Proxy Teacher.
Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024

2023
Do Not Blindly Imitate the Teacher: Using Perturbed Loss for Knowledge Distillation.
CoRR, 2023

"Why is this misleading?": Detecting News Headline Hallucinations with Explanations.
Proceedings of the ACM Web Conference 2023, 2023

STRUM: Extractive Aspect-Based Contrastive Summarization.
Proceedings of the Companion Proceedings of the ACM Web Conference 2023, 2023

Job Type Extraction for Service Businesses.
Proceedings of the Companion Proceedings of the ACM Web Conference 2023, 2023

Generative Information Retrieval.
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

Towards Disentangling Relevance and Bias in Unbiased Learning to Rank.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

End-to-End Query Term Weighting.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

DSI++: Updating Transformer Memory with New Documents.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Creator Context for Tweet Recommendation.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: EMNLP 2023, 2023

Exploring the Viability of Synthetic Query Generation for Relevance Prediction.
Proceedings of the 2023 SIGIR Workshop on eCommerce co-located with the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2023), 2023

Regression Compatible Listwise Objectives for Calibrated Ranking with Binary Relevance.
Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023

2022
Introduction to the Special Section on Graph Technologies for User Modeling and Recommendation, Part 2.
ACM Trans. Inf. Syst., 2022

Graph Technologies for User Modeling and Recommendation: Introduction to the Special Issue - Part 1.
ACM Trans. Inf. Syst., 2022

Regression Compatible Listwise Objectives for Calibrated Ranking.
CoRR, 2022

Data-Efficient Information Extraction from Form-Like Documents.
CoRR, 2022

Revisiting Two-tower Models for Unbiased Learning to Rank.
Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

On Optimizing Top-K Metrics for Neural Ranking Models.
Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

Scale Calibration of Deep Ranking Models.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

Rax: Composable Learning-to-Rank Using JAX.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

Out-of-Domain Semantics to the Rescue! Zero-Shot Hybrid Retrieval Models.
Proceedings of the Advances in Information Retrieval, 2022

2021
Rethinking search: making domain experts out of dilettantes.
SIGIR Forum, 2021

Report on the 2nd international conference on design of experimental search & information retrieval systems (DESIRES 2021).
SIGIR Forum, 2021

Glean: Structured Extractions from Templatic Documents.
Proc. VLDB Endow., 2021

Search and Discovery in Personal Email Collections.
Found. Trends Inf. Retr., 2021

Rank4Class: A Ranking Formulation for Multiclass Classification.
CoRR, 2021

Born Again Neural Rankers.
CoRR, 2021

Rethinking Search: Making Experts out of Dilettantes.
CoRR, 2021

Privacy-Adaptive BERT for Natural Language Understanding.
CoRR, 2021

Improving Cloud Storage Search with User Activity.
Proceedings of the WSDM '21, 2021

WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

Scalable Hierarchical Agglomerative Clustering.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

Dynamic Language Models for Continuously Evolving Content.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

Bootstrapping Recommendations at Chrome Web Store.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

Ensemble Distillation for BERT-Based Ranking Models.
Proceedings of the ICTIR '21: The 2021 ACM SIGIR International Conference on the Theory of Information Retrieval, 2021

Are Neural Rankers still Outperformed by Gradient Boosted Decision Trees?
Proceedings of the 9th International Conference on Learning Representations, 2021

Preface.
Proceedings of the Second International Conference on Design of Experimental Search & Information REtrieval Systems, 2021

Natural Language Understanding with Privacy-Preserving BERT.
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021

2020
Scalable Bottom-Up Hierarchical Clustering.
CoRR, 2020

DiPair: Fast and Accurate Distillation for Trillion-Scale Text Matching and Pair Modeling.
CoRR, 2020

Leveraging Semantic and Lexical Matching to Improve the Recall of Document Retrieval Systems: A Hybrid Approach.
CoRR, 2020

Active Learning for Skewed Data Sets.
CoRR, 2020

Beyond 512 Tokens: Siamese Multi-depth Transformer-based Hierarchical Encoder for Document Matching.
CoRR, 2020

Learning-to-Rank with BERT in TF-Ranking.
CoRR, 2020

Adversarial Bandits Policy for Crawling Commercial Web Content.
Proceedings of the WWW '20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020, 2020

A Stochastic Treatment of Learning to Rank Scoring Functions.
Proceedings of the WSDM '20: The Thirteenth ACM International Conference on Web Search and Data Mining, 2020

Feature Transformation for Neural Ranking Models.
Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020

Learning to Cluster Documents into Workspaces Using Large Scale Activity Logs.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

Permutation Equivariant Document Interaction Network for Neural Learning to Rank.
Proceedings of the ICTIR '20: The 2020 ACM SIGIR International Conference on the Theory of Information Retrieval, 2020

DiPair: Fast and Accurate Distillation for Trillion-Scale Text Matching and Pair Modeling.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Beyond 512 Tokens: Siamese Multi-depth Transformer-based Hierarchical Encoder for Long-Form Document Matching.
Proceedings of the CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, 2020

Migrating a Privacy-Safe Information Extraction System to a Software 2.0 Design.
Proceedings of the 10th Conference on Innovative Data Systems Research, 2020

Representation Learning for Information Extraction from Form-like Documents.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
Online Template Induction for Machine-Generated Emails.
Proc. VLDB Endow., 2019

Self-Attentive Document Interaction Networks for Permutation Equivariant Ranking.
CoRR, 2019

RiSER: Learning Better Representations for Richly Structured Emails.
Proceedings of the World Wide Web Conference, 2019

Semantic Text Matching for Long-Form Documents.
Proceedings of the World Wide Web Conference, 2019

Predictive Crawling for Commercial Web Content.
Proceedings of the World Wide Web Conference, 2019

Addressing Trust Bias for Unbiased Learning-to-Rank.
Proceedings of the World Wide Web Conference, 2019

Uncovering Hidden Structure in Sequence Data via Threading Recurrent Models.
Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, 2019

Estimating Position Bias without Intrusive Interventions.
Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, 2019

Multi-view Embedding-based Synonyms for Email Search.
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019

Revisiting Approximate Metric Optimization in the Age of Deep Neural Networks.
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019

TF-Ranking: Scalable TensorFlow Library for Learning-to-Rank.
Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019

An Analysis of the Softmax Cross Entropy Loss for Learning-to-Rank with Binary Relevance.
Proceedings of the 2019 ACM SIGIR International Conference on Theory of Information Retrieval, 2019

Learning Groupwise Multivariate Scoring Functions Using Deep Neural Networks.
Proceedings of the 2019 ACM SIGIR International Conference on Theory of Information Retrieval, 2019

2018
Web Search Relevance Ranking.
Proceedings of the Encyclopedia of Database Systems, Second Edition, 2018

Web Spam Detection.
Proceedings of the Encyclopedia of Database Systems, Second Edition, 2018

Web Crawler Architecture.
Proceedings of the Encyclopedia of Database Systems, Second Edition, 2018

Learning Groupwise Scoring Functions Using Deep Neural Networks.
CoRR, 2018

Offline Comparison of Ranking Functions using Randomized Data.
CoRR, 2018

Hidden in Plain Sight: Classifying Emails Using Embedded Image Contents.
Proceedings of the 2018 World Wide Web Conference on World Wide Web, 2018

Position Bias Estimation for Unbiased Learning to Rank in Personal Search.
Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, 2018

Semantic Location in Email Query Suggestion.
Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, 2018

Anatomy of a Privacy-Safe Large-Scale Information Extraction System Over Email.
Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018

Learning with Sparse and Biased Feedback for Personal Search.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Training On-Device Ranking Models from Cross-User Interactions in a Privacy-Preserving Fashion.
Proceedings of the First Biennial Conference on Design of Experimental Search & Information Retrieval Systems, 2018

The LambdaLoss Framework for Ranking Metric Optimization.
Proceedings of the 27th ACM International Conference on Information and Knowledge Management, 2018

Learning Effective Embeddings for Machine Generated Emails with Applications to Email Category Prediction.
Proceedings of the IEEE International Conference on Big Data (IEEE BigData 2018), 2018

2017
Email Category Prediction.
Proceedings of the 26th International Conference on World Wide Web Companion, 2017

Learning from User Interactions in Personal Search via Attribute Parameterization.
Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, 2017

Quick Access: Building a Smart Experience for Google Drive.
Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, August 13, 2017

2016
Learning to Rank with Selection Bias in Personal Search.
Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, 2016

Using Machine Learning to Improve the Email Experience.
Proceedings of the 25th ACM International Conference on Information and Knowledge Management, 2016

2015
Debugging a Crowdsourced Task with Low Inter-Rater Agreement.
Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries, 2015

2013
Robust query rewriting using anchor data.
Proceedings of the Sixth ACM International Conference on Web Search and Data Mining, 2013

Boot-Strapping Language Identifiers for Short Colloquial Postings.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2013

A Human-Centered Framework for Ensuring Reliability on Crowdsourced Labeling Tasks.
Proceedings of the Human Computation and Crowdsourcing: Works in Progress and Demonstration Abstracts, 2013

Are Some Tweets More Interesting Than Others? #HardQuestion.
Proceedings of the Symposium on Human-Computer Interaction and Information Retrieval, 2013

2012
Editorial.
ACM Trans. Web, 2012

How user behavior is related to social affinity.
Proceedings of the Fifth International Conference on Web Search and Web Data Mining, 2012

Of hammers and nails: an empirical comparison of three paradigms for processing large graphs.
Proceedings of the Fifth International Conference on Web Search and Web Data Mining, 2012

Detecting quilted web pages at scale.
Proceedings of the 35th International ACM SIGIR conference on research and development in Information Retrieval, 2012

2011
Microsoft Research at TREC 2011 Web Track.
Proceedings of The Twentieth Text REtrieval Conference, 2011

The Power of Peers.
Proceedings of the Advances in Information Retrieval, 2011

2010
Web Crawling.
Found. Trends Inf. Retr., 2010

A sketch-based distance oracle for web-scale graphs.
Proceedings of the Third International Conference on Web Search and Web Data Mining, 2010

Microsoft Research at TREC 2010 Web Track.
Proceedings of The Nineteenth Text REtrieval Conference, 2010

Querying the Web Graph - (Invited Talk).
Proceedings of the String Processing and Information Retrieval, 2010

2009
Web Search Relevance Ranking.
Proceedings of the Encyclopedia of Database Systems, 2009

Web Spam Detection.
Proceedings of the Encyclopedia of Database Systems, 2009

Web Crawler Architecture.
Proceedings of the Encyclopedia of Database Systems, 2009

Less is more: sampling the neighborhood graph makes SALSA better and faster.
Proceedings of the Second International Conference on Web Search and Web Data Mining, 2009

Microsoft Research at TREC 2009: Web and Relevance Feedback Track.
Proceedings of The Eighteenth Text REtrieval Conference, 2009

The scalable hyperlink store.
Proceedings of the HYPERTEXT 2009, Proceedings of the 20th ACM Conference on Hypertext and Hypermedia, Torino, Italy, June 29, 2009

2008
Introduction to special section on adversarial issues in Web search.
ACM Trans. Web, 2008

Computing Information Retrieval Performance Measures Efficiently in the Presence of Tied Scores.
Proceedings of the Advances in Information Retrieval, 2008

Efficient and effective link analysis with precomputed salsa maps.
Proceedings of the 17th ACM Conference on Information and Knowledge Management, 2008

2007
Using Bloom Filters to Speed Up HITS-Like Ranking Algorithms.
Proceedings of the Algorithms and Models for the Web-Graph, 5th International Workshop, 2007

Hits on the web: how does it compare?
Proceedings of the SIGIR 2007: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2007

Comparing the effectiveness of hits and salsa.
Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, 2007

2006
Adversarial information retrieval on the web (AIRWeb 2006).
SIGIR Forum, 2006

Detecting spam web pages through content analysis.
Proceedings of the 15th international conference on World Wide Web, 2006

2005
How search engines shape the web.
Proceedings of the 14th international conference on World Wide Web, 2005

Detecting phrase-level duplication on the world wide web.
Proceedings of the SIGIR 2005: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2005

2004
A large-scale study of the evolution of Web pages.
Softw. Pract. Exp., 2004

On The Evolution of Clusters of Near-Duplicate Web Pages.
J. Web Eng., 2004

Spam, Damn Spam, and Statistics: Using Statistical Analysis to Locate Spam Web Pages.
Proceedings of the Seventh International Workshop on the Web and Databases, 2004

Boxwood: Abstractions as the Foundation for Storage Infrastructure.
Proceedings of the 6th Symposium on Operating System Design and Implementation (OSDI 2004), 2004

2003
Efficient URL caching for world wide web crawling.
Proceedings of the Twelfth International World Wide Web Conference, 2003

2001
Breadth-first crawling yields high-quality pages.
Proceedings of the Tenth International World Wide Web Conference, 2001

Web-based Algorithm Animation.
Proceedings of the 38th Design Automation Conference, 2001

2000
Performance limitations of the Java core libraries.
Concurr. Pract. Exp., 2000

On near-uniform URL sampling.
Comput. Networks, 2000

1999
Mercator: A Scalable, Extensible Web Crawler.
World Wide Web, 1999

Measuring Index Quality Using Random Walks on the Web.
Comput. Networks, 1999

1997
Collaborative Active Textbooks.
J. Vis. Lang. Comput., 1997

A Java-Based Implementation of Collaborative Active Textbooks.
Proceedings of the 1997 IEEE Symposium on Visual Languages, 1997

Distributed Applets.
Proceedings of the Human Factors in Computing Systems, 1997

1996
Programming in Three Dimensions.
J. Vis. Lang. Comput., 1996

Distributed Active Objects.
Comput. Networks, 1996

Collaborative Active Textbooks: A Web-Based Algorithm Animation System for an Electronic Classroom.
Proceedings of the 1996 IEEE Symposium on Visual Languages, 1996

1995
Obliq-3D: A High-Level, Fast-Turnaround 3D Animation System.
IEEE Trans. Vis. Comput. Graph., 1995

1994
A Library for Visualizing Combinatorial Structures.
Proceedings of the 5th IEEE Visualization Conference, 1994

1993
Specifying Visual Languages with Conditional Set Rewrite Systems.
Proceedings of the 1993 IEEE Workshop on Visual Languages, 1993

Algorithm Animation Using 3D Interactive Graphics.
Proceedings of the Sixth ACM Symposium on User Interface Software and Technology, 1993

Cube: Eine dreidimensionale visuelle Programmiersprache.
Proceedings of the Informatik - Wirtschaft - Gesellschaft, 23. GI-Jahrestagung, Dresden, Germany, 27. September, 1993

1992
A Prototype Implementation of the Cube Language.
Proceedings of the 1992 IEEE Workshop on Visual Languages, 1992

1991
The CUBE Language.
Proceedings of the 1991 IEEE Workshop on Visual Languages, Kobe, Japan, October 8-11, 1991, 1991

1990
Roles and their role in posing recursive queries.
Inf. Syst., 1990

Enhancing Show-and-Tell with a polymorphic type system and higher-order functions.
Proceedings of the 1990 IEEE Workshop on Visual Languages, 1990

