Mohammad Taher Pilehvar

ORCID: 0000-0003-3694-4006

Affiliations:
  • University of Cambridge, UK


According to our database, Mohammad Taher Pilehvar authored at least 90 papers between 2009 and 2024.

Bibliography

2024
BLEnD: A Benchmark for LLMs on Everyday Knowledge in Diverse Cultures and Languages.
CoRR, 2024

Tell Me Why: Explainable Public Health Fact-Checking with Large Language Models.
CoRR, 2024

RepMatch: Quantifying Cross-Instance Similarities in Representation Space.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Stochastic Fine-Tuning of Language Models Using Masked Gradients.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

SkipPLUS: Skip the First Few Layers to Better Explain Vision Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

TweetTER: A Benchmark for Target Entity Retrieval on Twitter without Knowledge Bases.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Spanning the Spectrum of Hatred Detection: A Persian Multi-Label Hate Speech Dataset with Annotator Rationales.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Pars-OFF: A Benchmark for Offensive Language Detection on Farsi Social Media.
IEEE Trans. Affect. Comput., 2023

SemEval-2023 Task 1: Visual Word Sense Disambiguation.
Proceedings of the 17th International Workshop on Semantic Evaluation, 2023

DiFair: A Benchmark for Disentangled Assessment of Gender Knowledge and Bias.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Guide the Learner: Controlling Product of Experts Debiasing Method Based on Token Attribution Similarities.
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

DecompX: Explaining Transformers Decisions by Propagating Token Decomposition.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
BERT on a Data Diet: Finding Important Examples by Gradient-Based Pruning.
CoRR, 2022

PheneBank: a literature-based database of phenotypes.
Bioinform., 2022

GlobEnc: Quantifying Global Token Attribution by Incorporating the Whole Encoder Layer in Transformers.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Looking at the Overlooked: An Analysis on the Word-Overlap Bias in Natural Language Inference.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

An Empirical Study on the Transferability of Transformer Modules in Parameter-efficient Fine-tuning.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Exploiting Language Model Prompts Using Similarity Measures: A Case Study on the Word-in-Context Task.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2022

An Isotropy Analysis in the Multilingual BERT Embedding Space.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

AdapLeR: Speeding up Inference by Adaptive Length Reduction.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

On the Importance of Data Size in Probing Fine-tuned Models.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

Incorporating Stock Market Signals for Twitter Stance Detection.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
Analysis and Evaluation of Language Models for Word Sense Disambiguation.
Comput. Linguistics, 2021

Synthetic Examples Improve Cross-Target Generalization: A Study on Stance Detection on a Twitter corpus.
Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, 2021

ParsFEVER: a Dataset for Farsi Fact Extraction and Verification.
Proceedings of *SEM 2021: The Tenth Joint Conference on Lexical and Computational Semantics, 2021

How Does Fine-tuning Affect the Geometry of Embedding Space: A Case Study on Isotropy.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Exploring the Role of BERT Token Representations to Explain Sentence Probing Results.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Don't Discard All the Biased Instances: Investigating a Core Assumption in Dataset Bias Mitigation Techniques.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Adversarial Training for News Stance Detection: Leveraging Signals from a Multi-Genre Corpus.
Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, 2021

WiC-TSV: An Evaluation Benchmark for Target Sense Verification of Words in Context.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021

Not All Models Localize Linguistic Knowledge in the Same Place: A Layer-wise Probing on BERToids' Representations.
Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, 2021

A Cluster-based Approach for Improving Isotropy in Contextual Embedding Space.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
PheneBank: Processed Medline Abstracts and PMC full articles + Phenotype-Disease Associations.
Dataset, July, 2020

Embeddings in Natural Language Processing: Theory and Advances in Vector Representations of Meaning.
Synthesis Lectures on Human Language Technologies, Morgan & Claypool Publishers, ISBN: 978-3-031-02177-0, 2020

A pragmatic guide to geoparsing evaluation.
Lang. Resour. Evaluation, 2020

Language Models and Word Sense Disambiguation: An Overview and Analysis.
CoRR, 2020

SemEval-2020 Task 3: Graded Word Similarity in Context.
Proceedings of the Fourteenth Workshop on Semantic Evaluation, 2020

XL-WiC: A Multilingual Benchmark for Evaluating Semantic Contextualization.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

STANDER: An Expert-Annotated Dataset for News Stance Detection and Evidence Retrieval.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Embeddings in Natural Language Processing.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

Will-They-Won't-They: A Very Large Dataset for Stance Detection on Twitter.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
Generating Knowledge Graph Paths from Textual Definitions using Sequence-to-Sequence Models.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

WiC: the Word-in-Context Dataset for Evaluating Context-Sensitive Meaning Representations.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

On the Importance of Distinguishing Word Meaning Representations: A Case Study on Reverse Dictionary Mapping.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

On the Importance of the Kullback-Leibler Divergence Term in Variational Autoencoders for Text Generation.
Proceedings of the 3rd Workshop on Neural Generation and Translation@EMNLP-IJCNLP 2019, 2019

Unseen Word Representation by Aligning Heterogeneous Lexical Semantic Spaces.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
PheneBank: Processed Medline Abstracts and PMC full articles.
Dataset, February, 2018

PheneBank: Processed Medline Abstracts.
Dataset, February, 2018

What's missing in geographical parsing?
Lang. Resour. Evaluation, 2018

From Word To Sense Embeddings: A Survey on Vector Representations of Meaning.
J. Artif. Intell. Res., 2018

Card-660: Cambridge Rare Word Dataset - a Reliable Benchmark for Infrequent Word Representation Models.
CoRR, 2018

WiC: 10,000 Example Pairs for Evaluating Context-Sensitive Representations.
CoRR, 2018

The interplay between lexical resources and Natural Language Processing.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics, 2018

Card-660: A Reliable Evaluation Framework for Rare Word Representation Models.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Large-scale Exploration of Neural Relation Classification Architectures.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Mapping Text to Knowledge Graph Entities using Multi-Sense LSTMs.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Towards Automatic Fake News Detection: Cross-Level Stance Detection in News Articles.
Proceedings of the First Workshop on Fact Extraction and VERification, 2018

On the Role of Text Preprocessing in Neural Network Architectures: An Evaluation Study on Text Categorization and Sentiment Analysis.
Proceedings of the Workshop: Analyzing and Interpreting Neural Networks for NLP, 2018

Modeling the Fake News Challenge as a Cross-Level Stance Detection Task.
Proceedings of the CIKM 2018 Workshops co-located with 27th ACM International Conference on Information and Knowledge Management (CIKM 2018), 2018

Which Melbourne? Augmenting Geocoding with Maps.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

2017
Learning Rare Word Representations using Semantic Bridging.
CoRR, 2017

SemEval-2017 Task 2: Multilingual and Cross-lingual Semantic Word Similarity.
Proceedings of the 11th International Workshop on Semantic Evaluation, 2017

Inducing Embeddings for Rare and Unseen Words by Leveraging Lexical Resources.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017

Towards a Seamless Integration of Word Senses into Downstream NLP Applications.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

Vancouver Welcomes You! Minimalist Location Metonymy Resolution.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

2016
Cross level semantic similarity: an evaluation framework for universal measures of similarity.
Lang. Resour. Evaluation, 2016

Vafa spell-checker for detecting spelling, grammatical, and real-word errors of Persian language.
Digit. Scholarsh. Humanit., 2016

Semantic Representations of Word Senses and Concepts.
CoRR, 2016

Nasari: Integrating explicit knowledge and corpus statistics for a multilingual representation of concepts and entities.
Artif. Intell., 2016

SemEval-2016 Task 14: Semantic Taxonomy Enrichment.
Proceedings of the 10th International Workshop on Semantic Evaluation, 2016

De-Conflated Semantic Representations.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Improved Semantic Representation for Domain-Specific Entities.
Proceedings of the 15th Workshop on Biomedical Natural Language Processing, 2016

Embeddings for Word Sense Disambiguation: An Evaluation Study.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

2015
From senses to texts: An all-in-one graph-based approach for measuring semantic similarity.
Artif. Intell., 2015

An Open-source Framework for Multi-level Semantic Similarity Measurement.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

Reserating the awesometastic: An automatic extension of the WordNet taxonomy for novel terms.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

NASARI: a Novel Approach to a Semantically-Aware Representation of Items.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

SensEmbed: Learning Sense Embeddings for Word and Relational Similarity.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

A Framework for the Construction of Monolingual and Cross-lingual Word Similarity Datasets.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

A Unified Multilingual Semantic Representation of Concepts.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

2014
A Large-Scale Pseudoword-Based Evaluation Framework for State-of-the-Art Word Sense Disambiguation.
Comput. Linguistics, 2014

SemEval-2014 Task 3: Cross-Level Semantic Similarity.
Proceedings of the 8th International Workshop on Semantic Evaluation, 2014

A Robust Approach to Aligning Heterogeneous Lexical Resources.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

2013
Paving the Way to a Large-scale Pseudosense-annotated Dataset.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2013

Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity.
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013

2011
TEP: Tehran English-Persian Parallel Corpus.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2011

2009
Classification of Persian textual documents using learning vector quantization.
Proceedings of the 5th International Conference on Natural Language Processing and Knowledge Engineering, 2009

