Tetsuya Sakai

ORCID: 0000-0002-6720-963X

Affiliations:
  • Waseda University, Japan


According to our database, Tetsuya Sakai authored at least 284 papers between 1998 and 2024.

Bibliography

2024
SSR: Solving Named Entity Recognition Problems via a Single-stream Reasoner.
ACM Trans. Inf. Syst., September, 2024

On the Ordering of Pooled Web Pages, Gold Assessments, and Bronze Assessments.
ACM Trans. Inf. Syst., January, 2024

A Versatile Framework for Evaluating Ranked Lists in Terms of Group Fairness and Relevance.
ACM Trans. Inf. Syst., January, 2024

How Many Crowd Workers Do I Need? On Statistical Power when Crowdsourcing Relevance Judgments.
ACM Trans. Inf. Syst., January, 2024

Data-Efficient Massive Tool Retrieval: A Reinforcement Learning Approach for Query-Tool Alignment with Language Models.
CoRR, 2024

AI Can Be Cognitively Biased: An Exploratory Study on Threshold Priming in LLM-Based Batch Relevance Assessment.
CoRR, 2024

CT-Eval: Benchmarking Chinese Text-to-Table Performance in Large Language Models.
CoRR, 2024

Vector Quantization for Recommender Systems: A Review and Outlook.
CoRR, 2024

Decoy Effect In Search Interaction: Understanding User Behavior and Measuring System Vulnerability.
CoRR, 2024

Modeling Multimodal Uncertainties via Probability Distribution Encoders Included Vision-Language Models.
IEEE Access, 2024

Enhancing Parameter Efficiency in Model Inference Using an Ultralight Inter-Transformer Linear Structure.
IEEE Access, 2024

ONCE: Boosting Content-based Recommendation with Both Open- and Closed-source Large Language Models.
Proceedings of the 17th ACM International Conference on Web Search and Data Mining, 2024

ToolBeHonest: A Multi-level Hallucination Diagnostic Benchmark for Tool-Augmented Large Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

ChatRetriever: Adapting Large Language Models for Generalized and Robust Conversational Dense Retrieval.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

2023
Evaluating Parrots and Sociopathic Liars: A keynote at ICTIR 2023.
SIGIR Forum, December, 2023

On a Few Responsibilities of (IR) Researchers (Fairness, Awareness, and Sustainability): A Keynote at ECIR 2023.
SIGIR Forum, June, 2023

Decoy Effect in Search Interaction: A Pilot Study.
CoRR, 2023

Towards Consistency Filtering-Free Unsupervised Learning for Dense Retrieval.
CoRR, 2023

A Meta-Evaluation of C/W/L/A Metrics: System Ranking Similarity, System Ranking Consistency and Discriminative Power.
CoRR, 2023

SWAN: A Generic Framework for Auditing Textual Conversational Systems.
CoRR, 2023

A First Look at LLM-Powered Generative News Recommendation.
CoRR, 2023

NER-to-MRC: Named-Entity Recognition Completely Solving as Machine Reading Comprehension.
CoRR, 2023

Zero-Shot Learners for Natural Language Understanding via a Unified Multiple-Choice Perspective.
IEEE Access, 2023

Self-Supervised and Few-Shot Contrastive Learning Frameworks for Text Clustering.
IEEE Access, 2023

A Reference-Dependent Model for Web Search Evaluation: Understanding and Measuring the Experience of Boundedly Rational Users.
Proceedings of the ACM Web Conference 2023, 2023

Practice and Challenges in Building a Business-oriented Search Engine Quality Metric.
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

EALM: Introducing Multidimensional Ethical Alignment in Conversational Information Retrieval.
Proceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region, 2023

Open-Domain Dialogue Quality Evaluation: Deriving Nugget-level Scores from Turn-level Scores.
Proceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region, 2023

Chuweb21D: A Deduped English Document Collection for Web Search Tasks.
Proceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region, 2023

Evaluating Parrots and Sociopathic Liars (keynote).
Proceedings of the 2023 ACM SIGIR International Conference on Theory of Information Retrieval, 2023

MAP: Multimodal Uncertainty-Aware Vision-Language Pre-training Model.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Relevance Assessments for Web Search Evaluation: Should We Randomise or Prioritise the Pooled Documents?
ACM Trans. Inf. Syst., 2022

Relevance Assessments for Web Search Evaluation: Should We Randomise or Prioritise the Pooled Documents? (Corrected Version).
CoRR, 2022

Corrected Evaluation Results of the NTCIR WWW-2, WWW-3, and WWW-4 English Subtasks.
CoRR, 2022

MAP: Modality-Agnostic Uncertainty-Aware Vision-Language Pre-training Model.
CoRR, 2022

On Variants of Root Normalised Order-aware Divergence and a Divergence based on Kendall's Tau.
CoRR, 2022

Constructing Better Evaluation Metrics by Incorporating the Anchoring Effect into the User Model.
Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

Understanding the Behavior Transparency of Voice Assistant Applications Using the ChatterBox Framework.
Proceedings of the 25th International Symposium on Research in Attacks, 2022

Evaluating the Effects of Embedding with Speaker Identity Information in Dialogue Summarization.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Do Extractive Summarization Algorithms Amplify Lexical Bias in News Articles?
Proceedings of the ICTIR '22: The 2022 ACM SIGIR International Conference on the Theory of Information Retrieval, Madrid, Spain, July 11, 2022

Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

AxIoU: An Axiomatically Justified Measure for Video Moment Retrieval.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

LayerConnect: Hypernetwork-Assisted Inter-Layer Connector to Enhance Parameter Efficiency.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

2021
Retrieval Evaluation Measures that Agree with Users' SERP Preferences: Traditional, Preference-based, and Diversity Measures.
ACM Trans. Inf. Syst., 2021

DCH-2: A Parallel Customer-Helpdesk Dialogue Corpus with Distributions of Annotators' Labels.
CoRR, 2021

A Simple and Effective Usage of Self-supervised Contrastive Learning for Text Clustering.
Proceedings of the 2021 IEEE International Conference on Systems, Man, and Cybernetics, 2021

Scalable Personalised Item Ranking through Parametric Density Estimation.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

WWW3E8: 259,000 Relevance Labels for Studying the Effect of Document Presentation Order for Relevance Assessors.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

On the Two-Sample Randomisation Test for IR Evaluation.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

A Fast and Exact Randomisation Test for Comparing Two Systems with Paired Data.
Proceedings of the ICTIR '21: The 2021 ACM SIGIR International Conference on the Theory of Information Retrieval, 2021

MIRTT: Learning Multimodal Interaction Representations from Trilinear Transformers for Visual Question Answering.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

On the Instability of Diminishing Return IR Measures.
Proceedings of the Advances in Information Retrieval, 2021

How Do Users Revise Zero-Hit Product Search Queries?
Proceedings of the Advances in Information Retrieval, 2021

A Closer Look at Evaluation Measures for Ordinal Quantification.
Proceedings of the CIKM 2021 Workshops co-located with 30th ACM International Conference on Information and Knowledge Management (CIKM 2021), 2021

Evaluating Relevance Judgments with Pairwise Discriminative Power.
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021

Incorporating Query Reformulating Behavior into Web Search Evaluation.
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021

Evaluating Evaluation Measures for Ordinal Classification and Ordinal Quantification.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
On Fuhr's guideline for IR evaluation.
SIGIR Forum, 2020

Low-cost, bottom-up measures for evaluating search result diversification.
Inf. Retr. J., 2020

Different Types of Voice User Interface Failures May Cause Different Degrees of Frustration.
CoRR, 2020

RealSakaiLab at the TREC 2020 Health Misinformation Track.
Proceedings of the Twenty-Ninth Text REtrieval Conference, 2020

Visual Intents vs. Clicks, Likes, and Purchases in E-commerce.
Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020

Good Evaluation Measures based on Document Preferences.
Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020

How to Measure the Reproducibility of System-oriented IR Experiments.
Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020

Automatic Evaluation of Iconic Image Retrieval based on Colour, Shape, and Texture.
Proceedings of the 2020 on International Conference on Multimedia Retrieval, 2020

A Siamese CNN Architecture for Learning Chinese Sentence Similarity.
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing: Student Research Workshop, 2020

2019
Personalized Reason Generation for Explainable Song Recommendation.
ACM Trans. Intell. Syst. Technol., 2019

Attitude Detection for One-Round Conversation: Jointly Extracting Target-Polarity Pairs.
J. Inf. Process., 2019

Graded Relevance Assessments and Graded Relevance Measures of NTCIR: A Survey of the First Twenty Years.
CoRR, 2019

Voice Input Interface Failures and Frustration: Developer and User Perspectives.
Proceedings of the Adjunct Proceedings of the 32nd Annual ACM Symposium on User Interface Software and Technology, 2019

System Evaluation of Ternary Error-Correcting Output Codes for Multiclass Classification Problems.
Proceedings of the 2019 IEEE International Conference on Systems, Man and Cybernetics, 2019

BM25 Pseudo Relevance Feedback Using Anserini at Waseda University.
Proceedings of the Open-Source IR Replicability Challenge co-located with 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019

Which Diversity Evaluation Measures Are "Good"?
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019

Overview of the 2019 Open-Source IR Replicability Challenge (OSIRRC 2019).
Proceedings of the Open-Source IR Replicability Challenge co-located with 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019

The SIGIR 2019 Open-Source IR Replicability Challenge (OSIRRC 2019).
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019

Evaluating Image-Inspired Poetry Generation.
Proceedings of the Natural Language Processing and Chinese Computing, 2019

RSL19BD at DBDC4: Ensemble of Decision Tree-Based and LSTM-Based Models.
Proceedings of the Increasing Naturalness and Flexibility in Spoken Dialogue Interaction, 2019

Generalising Kendall's Tau for Noisy and Incomplete Preference Judgements.
Proceedings of the 2019 ACM SIGIR International Conference on Theory of Information Retrieval, 2019

Celebrating 20 Years of NTCIR: The Book.
Proceedings of the 9th International Workshop on Evaluating Information Access co-located with the 14th NTCIR Conference on the Evaluation of Information Access Technologies (NTCIR 2019), 2019

CENTRE@CLEF 2019.
Proceedings of the Advances in Information Retrieval, 2019

Overview of CENTRE@CLEF 2019: Sequel in the Systematic Reproducibility Realm.
Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2019

CENTRE@CLEF2019: Overview of the Replicability and Reproducibility Tasks.
Proceedings of the Working Notes of CLEF 2019, 2019

Poster: A First Look at the Privacy Risks of Voice Assistant Apps.
Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, 2019

Generating Short Product Descriptors Based on Very Little Training Data.
Proceedings of the Information Retrieval Technology, 2019

Arc Loss: Softmax with Additive Angular Margin for Answer Retrieval.
Proceedings of the Information Retrieval Technology, 2019

Randomised vs. Prioritised Pools for Relevance Assessments: Sample Size Considerations.
Proceedings of the Information Retrieval Technology, 2019

Towards Automatic Evaluation of Reused Answers in Community Question Answering.
Proceedings of the Information Retrieval Technology, 2019

Unsupervised Answer Retrieval with Data Fusion for Community Question Answering.
Proceedings of the Information Retrieval Technology, 2019

How to Run an Evaluation Task - With a Primary Focus on Ad Hoc Information Retrieval.
Proceedings of the Information Retrieval Evaluation in a Changing World, 2019

2018
Laboratory Experiments in Information Retrieval - Sample Sizes, Effect Sizes, and Statistical Power.
The Information Retrieval Series 40, Springer, ISBN: 978-981-13-1198-7, 2018

U-Measure.
Proceedings of the Encyclopedia of Database Systems, Second Edition, 2018

Q-Measure.
Proceedings of the Encyclopedia of Database Systems, Second Edition, 2018

Expected Reciprocal Rank.
Proceedings of the Encyclopedia of Database Systems, Second Edition, 2018

ERR-IA.
Proceedings of the Encyclopedia of Database Systems, Second Edition, 2018

D-Measure.
Proceedings of the Encyclopedia of Database Systems, Second Edition, 2018

α-nDCG.
Proceedings of the Encyclopedia of Database Systems, Second Edition, 2018

Advanced Information Retrieval Measures.
Proceedings of the Encyclopedia of Database Systems, Second Edition, 2018

Search Result Diversity Evaluation Based on Intent Hierarchies.
IEEE Trans. Knowl. Data Eng., 2018

Report on NTCIR-13: The Thirteenth Round of NII Testbeds and Community for Information Access Research.
SIGIR Forum, 2018

Towards Automatic Evaluation of Customer-Helpdesk Dialogues.
J. Inf. Process., 2018

Understanding the Inconsistency between Behaviors and Descriptions of Mobile Apps.
IEICE Trans. Inf. Syst., 2018

Conducting Laboratory Experiments Properly with Statistical Tools: An Easy Hands-on Tutorial.
Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, 2018

Comparing Two Binned Probability Distributions for Information Access Evaluation.
Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, 2018

Classifying Community QA Questions That Contain an Image.
Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval, 2018

Topic Set Size Design for Paired and Unpaired Data.
Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval, 2018

Why You Should Listen to This Song: Reason Generation for Explainable Recommendation.
Proceedings of the 2018 IEEE International Conference on Data Mining Workshops, 2018

Overview of CENTRE@CLEF 2018: A First Tale in the Systematic Reproducibility Realm.
Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2018

CENTRE@CLEF2018: Overview of the Replicability Task.
Proceedings of the Working Notes of CLEF 2018, 2018

2017
Overview of Special Issue.
SIGIR Forum, 2017

Does Document Relevance Affect the Searcher's Perception of Time?
Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, 2017

The Probability that Your Hypothesis Is Correct, Credible Intervals, and Effect Sizes for IR Evaluation.
Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017

Evaluating Mobile Search with Height-Biased Gain.
Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017

LSTM vs. BM25 for Open-domain QA: A Hands-on Comparison of Effectiveness and Efficiency.
Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017

Test Collections and Measures for Evaluating Customer-Helpdesk Dialogues.
Proceedings of the 8th International Workshop on Evaluating Information Access co-located with the 13th NTCIR Conference on the Evaluation of Information Access Technologies (NTCIR 2017), 2017

SLWWW at the NTCIR-13 WWW Task.
Proceedings of the 13th NTCIR Conference, 2017

Overview of the NTCIR-13 Short Text Conversation Task.
Proceedings of the 13th NTCIR Conference, 2017

SLQAL at the NTCIR-13 QA Lab-3 Task.
Proceedings of the 13th NTCIR Conference, 2017

Unanimity-Aware Gain for Highly Subjective Assessments.
Proceedings of the 8th International Workshop on Evaluating Information Access co-located with the 13th NTCIR Conference on the Evaluation of Information Access Technologies (NTCIR 2017), 2017

The Effect of Inter-Assessor Disagreement on IR System Evaluation: A Case Study with Lancers and Students.
Proceedings of the 8th International Workshop on Evaluating Information Access co-located with the 13th NTCIR Conference on the Evaluation of Information Access Technologies (NTCIR 2017), 2017

Towards Automatic Evaluation of Multi-Turn Dialogues: A Task Design that Leverages Inherently Subjective Annotations.
Proceedings of the 8th International Workshop on Evaluating Information Access co-located with the 13th NTCIR Conference on the Evaluation of Information Access Technologies (NTCIR 2017), 2017

Evaluating Evaluation Measures with Worst-Case Confidence Interval Widths.
Proceedings of the 8th International Workshop on Evaluating Information Access co-located with the 13th NTCIR Conference on the Evaluation of Information Access Technologies (NTCIR 2017), 2017

SLOLQ at the NTCIR-13 OpenLiveQ Task.
Proceedings of the 13th NTCIR Conference, 2017

SLSTC at the NTCIR-13 STC Task.
Proceedings of the 13th NTCIR Conference, 2017

Preface from NTCIR-13 General Chairs.
Proceedings of the 13th NTCIR Conference, 2017

Overview of the NTCIR-13 We Want Web Task.
Proceedings of the 13th NTCIR Conference, 2017

Mobile Vertical Ranking based on Preference Graphs.
Proceedings of the ACM SIGIR International Conference on Theory of Information Retrieval, 2017

Ranking Rich Mobile Verticals based on Clicks and Abandonment.
Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, 2017

Investigating Users' Time Perception during Web Search.
Proceedings of the 2017 Conference on Human Information Interaction and Retrieval, 2017

2016
Report on the First International Workshop on the Evaluation on Collaborative Information Seeking and Retrieval (ECol'2015).
SIGIR Forum, 2016

Report on NTCIR-12: The Twelfth Round of NII Testbeds and Community for Information Access Research.
SIGIR Forum, 2016

Topic set size design.
Inf. Retr. J., 2016

Evaluating Search Result Diversity using Intent Hierarchies.
Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, 2016

Two Sample T-tests for IR Evaluation: Student or Welch?
Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, 2016

Statistical Significance, Power, and Sample Sizes: A Systematic Review of SIGIR and TOIS, 2006-2015.
Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, 2016

Overview of the NTCIR-12 Short Text Conversation Task.
Proceedings of the 12th NTCIR Conference on Evaluation of Information Access Technologies, 2016

On Estimating Variances for Topic Set Size Design.
Proceedings of the Seventh International Workshop on Evaluating Information Access, 2016

NEXTI at NTCIR-12 IMine-2 Task.
Proceedings of the 12th NTCIR Conference on Evaluation of Information Access Technologies, 2016

Overview of the NTCIR-12 MobileClick-2 Task.
Proceedings of the 12th NTCIR Conference on Evaluation of Information Access Technologies, 2016

Two-layered Summaries for Mobile Search: Does the Evaluation Measure Reflect User Preferences?
Proceedings of the Seventh International Workshop on Evaluating Information Access, 2016

Preface from NTCIR-12 General Chairs.
Proceedings of the 12th NTCIR Conference on Evaluation of Information Access Technologies, 2016

SLLL at the NTCIR-12 Lifelog Task: Sleepflower and the LIT Subtask.
Proceedings of the 12th NTCIR Conference on Evaluation of Information Access Technologies, 2016

SLQAL at the NTCIR-12 QALab-2 Task.
Proceedings of the 12th NTCIR Conference on Evaluation of Information Access Technologies, 2016

SLSTC at the NTCIR-12 STC Task.
Proceedings of the 12th NTCIR Conference on Evaluation of Information Access Technologies, 2016

Simple and Effective Approach to Score Standardisation.
Proceedings of the 2016 ACM on International Conference on the Theory of Information Retrieval, 2016

Topic Set Size Design and Power Analysis in Practice.
Proceedings of the 2016 ACM on International Conference on the Theory of Information Retrieval, 2016

The Effect of Score Standardisation on Topic Set Size Design.
Proceedings of the Information Retrieval Technology, 2016

2015
Dynamic author name disambiguation for growing digital libraries.
Inf. Retr. J., 2015

TREC 2015 Temporal Summarization Track Overview.
Proceedings of The Twenty-Fourth Text REtrieval Conference, 2015

Understanding the Inconsistencies between Text Descriptions and the Use of Privacy-sensitive Resources of Mobile Apps.
Proceedings of the Eleventh Symposium On Usable Privacy and Security, 2015

Search Result Diversification Based on Hierarchical Intents.
Proceedings of the 24th ACM International Conference on Information and Knowledge Management, 2015

ECol 2015: First international workshop on the Evaluation on Collaborative Information Seeking and Retrieval.
Proceedings of the 24th ACM International Conference on Information and Knowledge Management, 2015

Topic Set Size Design with the Evaluation Measures for Short Text Conversation.
Proceedings of the Information Retrieval Technology, 2015

2014
Statistical reform in information retrieval?
SIGIR Forum, 2014

TREC 2014 Temporal Summarization Track Overview.
Proceedings of The Twenty-Third Text REtrieval Conference, 2014

Topic Set Size Design with Variance Estimates from Two-Way ANOVA.
Proceedings of the Sixth International Workshop on Evaluating Information Access, 2014

Overview of the NTCIR-11 MobileClick Task.
Proceedings of the 11th NTCIR Conference on Evaluation of Information Access Technologies, 2014

Preface from NTCIR-11 General Chairs.
Proceedings of the 11th NTCIR Conference on Evaluation of Information Access Technologies, 2014

ReviewCollage: a mobile interface for direct comparison using online reviews.
Proceedings of the 16th international conference on Human-computer interaction with mobile devices & services, 2014

Designing Test Collections for Comparing Many Systems.
Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, 2014

2013
Asian summer school in information access (ASSIA 2013).
SIGIR Forum, 2013

Web Search Evaluation with Informational and Navigational Intents.
J. Inf. Process., 2013

Mining subtopics from text fragments for a web query.
Inf. Retr., 2013

Diversified search evaluation: lessons from the NTCIR-9 INTENT task.
Inf. Retr., 2013

Introduction to the special issue on search intents and diversification.
Inf. Retr., 2013

When do people use query suggestion? A query suggestion log analysis.
Inf. Retr., 2013

TREC 2013 Temporal Summarization.
Proceedings of The Twenty-Second Text REtrieval Conference, 2013

Summary of the NTCIR-10 INTENT-2 task: subtopic mining and search result diversification.
Proceedings of the 36th International ACM SIGIR conference on research and development in Information Retrieval, 2013

The impact of intent selection on diversified search evaluation.
Proceedings of the 36th International ACM SIGIR conference on research and development in Information Retrieval, 2013

Summaries, ranked retrieval and sessions: a unified framework for information access evaluation.
Proceedings of the 36th International ACM SIGIR conference on research and development in Information Retrieval, 2013

Time-aware structured query suggestion.
Proceedings of the 36th International ACM SIGIR conference on research and development in Information Retrieval, 2013

Report from the NTCIR-10 1CLICK-2 Japanese subtask: baselines, upperbounds and evaluation robustness.
Proceedings of the 36th International ACM SIGIR conference on research and development in Information Retrieval, 2013

Exploring semi-automatic nugget extraction for Japanese one click access evaluation.
Proceedings of the 36th International ACM SIGIR conference on research and development in Information Retrieval, 2013

Metrics, Statistics, Tests.
Proceedings of the Bridging Between Information Retrieval and Databases, 2013

Microsoft Research Asia at the NTCIR-10 Intent Task.
Proceedings of the 10th NTCIR Conference on Evaluation of Information Access Technologies, 2013

Overview of the NTCIR-10 INTENT-2 Task.
Proceedings of the 10th NTCIR Conference on Evaluation of Information Access Technologies, 2013

The Unreusability of Diversified Search Test Collections.
Proceedings of the 5th International Workshop on Evaluating Information Access, 2013

MSRA at NTCIR-10 1CLICK-2.
Proceedings of the 10th NTCIR Conference on Evaluation of Information Access Technologies, 2013

Overview of the NTCIR-10 1CLICK-2 Task.
Proceedings of the 10th NTCIR Conference on Evaluation of Information Access Technologies, 2013

Wrap Up of NTCIR-10.
Proceedings of the 10th NTCIR Conference on Evaluation of Information Access Technologies, 2013

Overview of NTCIR-10.
Proceedings of the 10th NTCIR Conference on Evaluation of Information Access Technologies, 2013

On the reliability and intuitiveness of aggregated search metrics.
Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, 2013

Dynamic query intent mining from a search log stream.
Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, 2013

User-Aware Advertisability.
Proceedings of the Information Retrieval Technology, 2013

Estimating Intent Types for Search Result Diversification.
Proceedings of the Information Retrieval Technology, 2013

On Labelling Intent Types for Evaluating Search Result Diversification.
Proceedings of the Information Retrieval Technology, 2013

How Intuitive Are Diversified Search Metrics? Concordance Test Results for the Diversity U-Measures.
Proceedings of the Information Retrieval Technology, 2013

2012
Query Snowball: A Co-occurrence-based Approach to Multi-document Summarization for Question Answering.
Inf. Media Technol., 2012

Evaluation with informational and navigational intents.
Proceedings of the 21st World Wide Web Conference 2012, 2012

Structured query suggestion for specialization and parallel movement: effect on search behaviors.
Proceedings of the 21st World Wide Web Conference 2012, 2012

Towards zero-click mobile IR evaluation: knowing what and knowing when.
Proceedings of the 35th International ACM SIGIR conference on research and development in Information Retrieval, 2012

New assessment criteria for query suggestion.
Proceedings of the 35th International ACM SIGIR conference on research and development in Information Retrieval, 2012

AspecTiles: tile-based visualization of diversified web search results.
Proceedings of the 35th International ACM SIGIR conference on research and development in Information Retrieval, 2012

The wisdom of advertisers: mining subgoals via query clustering.
Proceedings of the 21st ACM International Conference on Information and Knowledge Management, 2012

One Click One Revisited: Enhancing Evaluation Based on Information Units.
Proceedings of the Information Retrieval Technology, 2012

The Reusability of a Diversified Search Test Collection.
Proceedings of the Information Retrieval Technology, 2012

Grid-Based Interaction for Exploratory Search.
Proceedings of the Information Retrieval Technology, 2012

2011
Using graded-relevance metrics for evaluating community QA answer selection.
Proceedings of the Fourth International Conference on Web Search and Web Data Mining, 2011

Evaluating diversified search results using per-intent graded relevance.
Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2011

Overview of the NTCIR-9 INTENT Task.
Proceedings of the 9th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2011

Overview of NTCIR-9 1CLICK.
Proceedings of the 9th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2011

Overview of NTCIR-9.
Proceedings of the 9th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2011

Microsoft Research Asia at the NTCIR-9 1CLICK Task.
Proceedings of the 9th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2011

TTOKU Summarization Based Systems at NTCIR-9 1CLICK task.
Proceedings of the 9th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2011

Grid-based Interaction for NTCIR-9 VisEx Task.
Proceedings of the 9th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2011

What Makes a Good Answer in Community Question Answering? An Analysis of Assessors' Criteria.
Proceedings of the 4th International Workshop on Evaluating Information Access, 2011

Microsoft Research Asia at the NTCIR-9 Intent Task.
Proceedings of the 9th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2011

Click the search button and be happy: evaluating direct and immediate information access.
Proceedings of the 20th ACM Conference on Information and Knowledge Management, 2011

2010
EVIA 2010: the third international workshop on evaluating information access.
SIGIR Forum, 2010

Boiling down information retrieval test collections.
Proceedings of the Recherche d'Information Assistée par Ordinateur, 2010

Constructing a Test Collection with Multi-Intent Queries.
Proceedings of the 3rd International Workshop on Evaluating Information Access, 2010

Microsoft Research Asia with Redmond at the NTCIR-8 Community QA Pilot Task.
Proceedings of the 8th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2010

Preface.
Proceedings of the 3rd International Workshop on Evaluating Information Access, 2010

Overview of NTCIR-8 ACLIA IR4QA.
Proceedings of the 8th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2010

Ranking Retrieval Systems without Relevance Assessments: Revisited.
Proceedings of the 3rd International Workshop on Evaluating Information Access, 2010

Overview of the NTCIR-8 Community QA Pilot Task (Part II): System Evaluation.
Proceedings of the 8th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2010

Simple Evaluation Metrics for Diversified Search Results.
Proceedings of the 3rd International Workshop on Evaluating Information Access, 2010

Overview of the NTCIR-8 ACLIA Tasks: Advanced Cross-Lingual Information Access.
Proceedings of the 8th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2010

Overview of the NTCIR-8 Community QA Pilot Task (Part I): The Test Collection and the Task.
Proceedings of the 8th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2010

NTCIR-GeoTime Overview: Evaluating Geographic and Temporal Search.
Proceedings of the 8th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2010

2009
EVIA 2008: the second international workshop on evaluating information access.
SIGIR Forum, 2009

Report on the SIGIR 2009 workshop on the future of IR evaluation.
SIGIR Forum, 2009

On the Robustness of Information Retrieval Metrics to Biased Relevance Assessments.
J. Inf. Process., 2009

Ranking the NTCIR ACLIA IR4QA Systems without Relevance Assessments.
Inf. Media Technol., 2009

Serendipitous search via wikipedia: a query log analysis.
Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2009

People, clouds, and interaction for information access.
Proceedings of the 3rd International Universal Communication Symposium, 2009

2008
Introduction to the NTCIR-6 Special Issue.
ACM Trans. Asian Lang. Inf. Process., 2008

On information retrieval metrics designed for evaluation with incomplete relevance assessments.
Inf. Retr., 2008

Precision-at-ten considered redundant.
Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2008

Comparing metrics across TREC and NTCIR: the robustness to pool depth bias.
Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2008

Modelling A User Population for Designing Information Retrieval Metrics.
Proceedings of the 2nd International Workshop on Evaluating Information Access, 2008

NTCIR-7 ACLIA IR4QA Results based on Qrels Version 2.
Proceedings of the 7th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2008

Overview of the NTCIR-7 ACLIA IR4QA Task.
Proceedings of the 7th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2008

Are Popular Documents More Likely To Be Relevant? A Dive into the ACLIA IR4QA Pools.
Proceedings of the 2nd International Workshop on Evaluating Information Access, 2008

Overview of the NTCIR-7 ACLIA Tasks: Advanced Cross-Lingual Information Access.
Proceedings of the 7th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2008

Comparing metrics across TREC and NTCIR: the robustness to system bias.
Proceedings of the 17th ACM Conference on Information and Knowledge Management, 2008

2007
On the reliability of factoid question answering evaluation.
ACM Trans. Asian Lang. Inf. Process., 2007

EVIA 2007: the First International Workshop on Evaluating Information Access.
SIGIR Forum, 2007

On the reliability of information retrieval metrics based on graded relevance.
Inf. Process. Manag., 2007

On the Properties of Evaluation Metrics for Finding One Highly Relevant Document.
Inf. Media Technol., 2007

Evaluating Information Retrieval Metrics Based on Bootstrap Hypothesis Tests.
Inf. Media Technol., 2007

Alternatives to Bpref.
SIGIR 2007: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2007

Pic-A-Topic: Efficient Viewing of Informative TV Contents on Travel, Cooking, Food and More.
Proceedings of the Computer-Assisted Information Retrieval (Recherche d'Information et ses Applications) - RIAO 2007, 8th International Conference, Carnegie Mellon University, Pittsburgh, PA, USA, May 30, 2007

Toshiba BRIDJE at NTCIR-6 CLIR: The Head/Lead Method and Graded Relevance Feedback.
Proceedings of the 6th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2007

User Satisfaction Task: A Proposal for NTCIR-7.
Proceedings of the 1st International Workshop on Evaluating Information Access, 2007

On Penalising Late Arrival of Relevant Documents in Information Retrieval Evaluation with Graded Relevance.
Proceedings of the 1st International Workshop on Evaluating Information Access, 2007

2006
On the Task of Finding One Highly Relevant Document with High Precision.
Inf. Media Technol., 2006

Give me just one highly relevant document: P-measure.
SIGIR 2006: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2006

Evaluating evaluation metrics based on the bootstrap.
SIGIR 2006: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2006

Improving the Robustness to Recognition Errors in Speech Input Question Answering.
Proceedings of the Information Retrieval Technology, 2006

Pic-A-Topic: Gathering Information Efficiently from Recorded TV Shows on Travel.
Proceedings of the Information Retrieval Technology, 2006

Bootstrap-Based Comparisons of IR Metrics for Finding One Relevant Document.
Proceedings of the Information Retrieval Technology, 2006

2005
Flexible pseudo-relevance feedback via selective sampling.
ACM Trans. Asian Lang. Inf. Process., 2005

Introduction to the special issue: Recent advances in information processing and access for Japanese.
ACM Trans. Asian Lang. Inf. Process., 2005

Advanced Technologies for Information Access.
Int. J. Comput. Process. Orient. Lang., 2005

Toshiba BRIDJE at NTCIR-5 CLIR: Evaluation using Geometric Means.
Proceedings of the Fifth NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2005

The Effect of Topic Sampling on Sensitivity Comparisons of Information Retrieval Metrics.
Proceedings of the Fifth NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2005

The Relationship between Answer Ranking and User Satisfaction in a Question Answering System.
Proceedings of the Fifth NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2005

The Reliability of Metrics Based on Graded Relevance.
Proceedings of the Information Retrieval Technology, 2005

2004
The effect of back-formulating questions in question answering evaluation.
SIGIR 2004: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2004

ASKMi: A Japanese Question Answering System based on Semantic Role Analysis.
Proceedings of the Computer-Assisted Information Retrieval (Recherche d'Information et ses Applications), 2004

Toshiba ASKMi at NTCIR-4 QAC2.
Proceedings of the Fourth NTCIR Workshop on Research in Information Access Technologies Information Retrieval, 2004

Toshiba BRIDJE at NTCIR-4 CLIR: Monolingual/Bilingual IR and Flexible Feedback.
Proceedings of the Fourth NTCIR Workshop on Research in Information Access Technologies Information Retrieval, 2004

New Performance Metrics Based on Multigrade Relevance: Their Application to Question Answering.
Proceedings of the Fourth NTCIR Workshop on Research in Information Access Technologies Information Retrieval, 2004

Ranking the NTCIR Systems Based on Multigrade Relevance.
Proceedings of the Information Retrieval Technology, Asia Information Retrieval Symposium, 2004

2003
Evaluating retrieval performance for Japanese question answering: what are best passages?
SIGIR 2003: Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, July 28, 2003

Average gain ratio: a simple retrieval performance measure for evaluation with multiple relevance levels.
SIGIR 2003: Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, July 28, 2003

BRIDJE over a language barrier: cross-language information access by integrating translation and retrieval.
Proceedings of the Sixth International Workshop on Information Retrieval with Asian Languages, 2003

2002
Generating transliteration rules for cross-language information retrieval from machine translation dictionaries.
Proceedings of the IEEE International Conference on Systems, Man and Cybernetics: Bridging the Digital Divide, Yasmine Hammamet, Tunisia, October 6-9, 2002

The use of external text data in cross-language information retrieval based on machine translation.
Proceedings of the IEEE International Conference on Systems, Man and Cybernetics: Bridging the Digital Divide, Yasmine Hammamet, Tunisia, October 6-9, 2002

Relative and absolute term selection criteria: a comparative study for English and Japanese IR.
SIGIR 2002: Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2002

Toshiba KIDS at NTCIR-3: Japanese and English-Japanese IR.
Proceedings of the Third NTCIR Workshop on Research in Information Retrieval, 2002

2001
A Framework for Cross-Language Information Access: Application to English and Japanese.
Comput. Humanit., 2001

Japanese-English Cross-Language Information Retrieval Using Machine Translation and Pseudo-Relevance Feedback.
Int. J. Comput. Process. Orient. Lang., 2001

Flexible Pseudo-Relevance Feedback Using Optimization Tables.
SIGIR 2001: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2001

Generic Summaries for Indexing in Information Retrieval.
SIGIR 2001: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2001

Flexible Pseudo-Relevance Feedback for NTCIR-2.
Proceedings of the Second Workshop Meeting on Evaluation of Chinese & Japanese Text Retrieval and Text Summarization, 2001

2000
Incremental Relevance Feedback in Japanese Text Retrieval.
Inf. Retr., 2000

A first step towards flexible local feedback for ad hoc retrieval.
Proceedings of the Fifth International Workshop on Information Retrieval with Asian Languages, 2000, Hong Kong, China, September 30, 2000

MT-based Japanese-English cross-language IR experiments using the TREC test collections.
Proceedings of the Fifth International Workshop on Information Retrieval with Asian Languages, 2000, Hong Kong, China, September 30, 2000

1999
BMIR-J2: A Test Collection for Evaluation of Japanese Information Retrieval Systems.
SIGIR Forum, 1999

A Comparison of Query Translation Methods for English-Japanese Cross-Language Information Retrieval (poster abstract).
SIGIR '99: Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 1999

Cross-Language Information Retrieval for NTCIR at Toshiba.
Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, 1999

1998
Lessons from BMIR-J2: A Test Collection for Japanese IR Systems.
SIGIR '98: Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 1998

Experiments in Japanese Text Retrieval and Routing Using the NEAT System.
SIGIR '98: Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 1998

