2025
SAEs Can Improve Unlearning: Dynamic Sparse Autoencoder Guardrails for Precision Unlearning in LLMs.
CoRR, April, 2025
CoRAG: Collaborative Retrieval-Augmented Generation.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025
Decoding Dark Matter: Specialized Sparse Autoencoders for Interpreting Rare Concepts in Foundation Models.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025
Intrinsic Bias is Predicted by Pretraining Data and Correlates with Downstream Performance in Vision-Language Encoders.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025
2024
Towards Global AI Inclusivity: A Large-Scale Multilingual Terminology Dataset.
CoRR, 2024
BIG5-CHAT: Shaping LLM Personalities Through Training on Human-Grounded Data.
CoRR, 2024
Analyzing the Role of Semantic Representations in the Era of Large Language Models.
CoRR, 2024
Emotion Classification in Low and Moderate Resource Languages.
CoRR, 2024
A Note on Bias to Complete.
CoRR, 2024
The FIGNEWS Shared Task on News Media Narratives.
Proceedings of The Second Arabic Natural Language Processing Conference, 2024
Analyzing the Role of Semantic Representations in the Era of Large Language Models.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024
Automatic Generation of Model and Data Cards: A Step Towards Responsible AI.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024
Can Large Language Models Infer Causation from Correlation?
Proceedings of the Twelfth International Conference on Learning Representations, 2024
GRASS: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Towards a Responsible Thinking in the New Era of Gen AI: Walking the Walk.
Proceedings of the 13th International Conference on Data Science, 2024
Rethinking Machine Learning Benchmarks in the Context of Professional Codes of Conduct.
Proceedings of the Symposium on Computer Science and Law, 2024
Depth-Wise Attention (DWAtt): A Layer Fusion Method for Data-Efficient Classification.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
Semantic Compression for Word and Sentence Embeddings using Discrete Wavelet Transform.
Proceedings of the Findings of the Association for Computational Linguistics, 2024
Evaluating Large Language Model Biases in Persona-Steered Generation.
Proceedings of the Findings of the Association for Computational Linguistics, 2024
Investigating Cultural Alignment of Large Language Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
2023
Author Correction: Arabic natural language processing for Qur'anic research: a systematic review.
Artif. Intell. Rev., November, 2023
OPT-R: Exploring the Role of Explanations in Finetuning and Prompting for Reasoning Skills of Large Language Models.
CoRR, 2023
Arabic natural language processing for Qur'anic research: a systematic review.
Artif. Intell. Rev., 2023
Evaluating Multilingual Speech Translation under Realistic Conditions with Resegmentation and Terminology.
Proceedings of the 20th International Conference on Spoken Language Translation, 2023
Methods for Measuring, Updating, and Visualizing Factual Beliefs in Language Models.
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023
ALERT: Adapt Language Models to Reasoning Tasks.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
Care4Lang at MEDIQA-Chat 2023: Fine-tuning Language Models for Classifying and Summarizing Clinical Dialogues.
Proceedings of the 5th Clinical Natural Language Processing Workshop, 2023
2022
ALERT: Adapting Language Models to Reasoning Tasks.
CoRR, 2022
Enabling Classifiers to Make Judgements Explicitly Aligned with Human Values.
CoRR, 2022
Text Characterization Toolkit.
CoRR, 2022
GisPy: A Tool for Measuring Gist Inference Score in Text.
CoRR, 2022
Meta AI at Arabic Hate Speech 2022: MultiTask Learning with Self-Correction for Hate Speech Classification.
CoRR, 2022
OPT: Open Pre-trained Transformer Language Models.
CoRR, 2022
A Review on Language Models as Knowledge Bases.
CoRR, 2022
CALCS 2021 Shared Task: Machine Translation for Code-Switched Data.
CoRR, 2022
AnswerSumm: A Manually-Curated Dataset and Pipeline for Answer Summarization.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022
BeSt: The Belief and Sentiment Corpus.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022
Text Characterization Toolkit (TCT).
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, 2022
Few-shot Learning with Multilingual Generative Language Models.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
Efficient Large Scale Language Modeling with Mixtures of Experts.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
ToKen: Task Decomposition and Knowledge Infusion for Few-Shot Hate Speech Detection.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
Consistent Human Evaluation of Machine Translation across Language Pairs.
Proceedings of the 15th biennial conference of the Association for Machine Translation in the Americas (Volume 1: Research Track), 2022
Towards Responsible Natural Language Annotation for the Varieties of Arabic.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022
A Quantitative and Qualitative Analysis of Schizophrenia Language.
Proceedings of the 13th International Workshop on Health Text Mining and Information Analysis, 2022
2021
Efficient Large Scale Language Modeling with Mixtures of Experts.
CoRR, 2021
Few-shot Learning with Multilingual Language Models.
CoRR, 2021
Commonsense Knowledge-Augmented Pretrained Language Models for Causal Reasoning Classification.
CoRR, 2021
Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs.
CoRR, 2021
Multi-Perspective Abstractive Answer Summarization.
CoRR, 2021
Predicting Directionality in Causal Relations in Text.
CoRR, 2021
White Paper: Challenges and Considerations for the Creation of a Large Labelled Repository of Online Videos with Questionable Content.
CoRR, 2021
Active Learning for Rumor Identification on Social Media.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021
Detecting Hallucinated Content in Conditional Neural Sequence Generation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021
Gender bias amplification during Speed-Quality optimization in Neural Machine Translation.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
Adapting High-resource NMT Models to Translate Low-resource Related Languages without Parallel Data.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
Discrete Cosine Transform as Universal Sentence Encoder.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
2020
Detecting Hallucinated Content in Conditional Neural Sequence Generation.
CoRR, 2020
Mutlitask Learning for Cross-Lingual Transfer of Semantic Dependencies.
CoRR, 2020
Learning to Classify Intents and Slot Labels Given a Handful of Examples.
CoRR, 2020
Information propagation in an era of Infodemics: The role of language content.
Proceedings of the Seventh International Conference on Social Networks Analysis, 2020
Diversity, Density, and Homogeneity: Quantitative Characteristic Metrics for Text Collections.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020
Data Paucity and Low Resource Scenarios: Challenges and Opportunities.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020
Multitask Learning for Cross-Lingual Transfer of Broad-coverage Semantic Dependencies.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020
Detecting Urgency Status of Crisis Tweets: A Transfer Learning Approach for Low Resource Languages.
Proceedings of the 28th International Conference on Computational Linguistics, 2020
DeSePtion: Dual Sequence Prediction and Adversarial Examples for Improved Fact-Checking.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020
FEQA: A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020
A Multitask Learning Approach for Diacritic Restoration.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020
2019
Rumor Detection and Classification for Twitter Data.
CoRR, 2019
Identifying Nuances in Fake News vs. Satire: Using Semantic and Linguistic Cues.
CoRR, 2019
Leveraging Pretrained Word Embeddings for Part-of-Speech Tagging of Code Switching Data.
CoRR, 2019
GWU NLP Lab at SemEval-2019 Task 3: EmoContext: Effective Contextual Information in Models for Emotion Detection in Sentence-level in a Multigenre Corpus.
CoRR, 2019
The ARIEL-CMU Systems for LoReHLT18.
CoRR, 2019
Homograph Disambiguation through Selective Diacritic Restoration.
Proceedings of the Fourth Arabic Natural Language Processing Workshop, 2019
Scalable Cross-Lingual Transfer of Neural Sentence Embeddings.
Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics, 2019
GWU NLP Lab at SemEval-2019 Task 3: EmoContext: Effectiveness of Contextual Information in Models for Emotion Detection in Sentence-level at Multi-genre Corpus.
Proceedings of the 13th International Workshop on Semantic Evaluation, 2019
GWU NLP at SemEval-2019 Task 7: Hybrid Pipeline for Rumour Veracity and Stance Classification on Social Media.
Proceedings of the 13th International Workshop on Semantic Evaluation, 2019
Does Causal Coherence Predict Online Spread of Social Media?
Proceedings of the Social, Cultural, and Behavioral Modeling, 2019
Context-Aware Cross-Lingual Mapping.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019
Cross-Lingual Transfer of Semantic Roles: From Raw Text to Semantic Roles.
Proceedings of the 13th International Conference on Computational Semantics, 2019
Investigating Input and Output Units in Diacritic Restoration.
Proceedings of the 18th IEEE International Conference On Machine Learning And Applications, 2019
Understanding Cohesion in Writings and Speech of Schizophrenia Patients.
Proceedings of the 18th IEEE International Conference On Machine Learning And Applications, 2019
Multi-Domain Goal-Oriented Dialogues (MultiDoGO): Strategies toward Curating and Annotating Large Scale Dialogue Data.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019
CASA-NLU: Context-Aware Self-Attentive Natural Language Understanding for Task-Oriented Chatbots.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019
Efficient Convolutional Neural Networks for Diacritic Restoration.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019
Efficient Sentence Embedding using Discrete Cosine Transform.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019
2018
Unsupervised Word Mapping Using Structural Similarities in Monolingual Embeddings.
Trans. Assoc. Comput. Linguistics, 2018
Sentence and Clause Level Emotion Annotation, Detection, and Classification in a Multi-Genre Corpus.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018
WASA: A Web Application for Sequence Annotation.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018
Team SWEEPer: Joint Sentence Extraction and Fact Checking with Pointer Networks.
Proceedings of the First Workshop on Fact Extraction and VERification, 2018
Emotion Detection and Classification in a Multigenre Corpus with Joint Multi-Task Deep Learning.
Proceedings of the 27th International Conference on Computational Linguistics, 2018
Evaluation of Unsupervised Compositional Representations.
Proceedings of the 27th International Conference on Computational Linguistics, 2018
Named Entity Recognition on Code-Switched Data: Overview of the CALCS 2018 Shared Task.
Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching@ACL 2018, 2018
2017
SemEval-2017 Task 1: Semantic Textual Similarity - Multilingual and Cross-lingual Focused Evaluation.
CoRR, 2017
Arabic Textual Entailment with Word Embeddings.
Proceedings of the Third Arabic Natural Language Processing Workshop, 2017
A Layered Language Model based Hybrid Approach to Automatic Full Diacritization of Arabic.
Proceedings of the Third Arabic Natural Language Processing Workshop, 2017
The Columbia-GWU System at the 2017 TAC KBP BeSt Evaluation.
Proceedings of the 2017 Text Analysis Conference, 2017
Predictive Linguistic Features of Schizophrenia.
Proceedings of the 6th Joint Conference on Lexical and Computational Semantics, 2017
SemEval-2017 Task 1: Semantic Textual Similarity Multilingual and Crosslingual Focused Evaluation.
Proceedings of the 11th International Workshop on Semantic Evaluation, 2017
GW_QA at SemEval-2017 Task 3: Question Answer Re-ranking on Arabic Fora.
Proceedings of the 11th International Workshop on Semantic Evaluation, 2017
Transferring Semantic Roles Using Translation and Syntactic Information.
Proceedings of the Eighth International Joint Conference on Natural Language Processing, 2017
2016
AMPN: a semantic resource for Arabic morphological patterns.
Int. J. Speech Technol., 2016
Rumor Identification and Belief Investigation on Twitter.
Proceedings of the 7th Workshop on Computational Approaches to Subjectivity, 2016
The GW/LT3 VarDial 2016 Shared Task System for Dialects and Similar Languages Detection.
Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects, 2016
Processing Dialectal Arabic: Exploiting Variability and Similarity to Overcome Challenges and Discover Opportunities.
Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects, 2016
Automatic Verification and Augmentation of Multilingual Lexicons.
Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects, 2016
The Columbia-GWU System at the 2016 TAC KBP BeSt Evaluation.
Proceedings of the 2016 Text Analysis Conference, 2016
The 2016 TAC KBP BeSt Evaluation.
Proceedings of the 2016 Text Analysis Conference, 2016
CU-GWU Perspective at SemEval-2016 Task 6: Ideological Stance Detection in Informal Text.
Proceedings of the 10th International Workshop on Semantic Evaluation, 2016
GWU NLP at SemEval-2016 Shared Task 1: Matrix Factorization for Crosslingual STS.
Proceedings of the 10th International Workshop on Semantic Evaluation, 2016
SemEval-2016 Task 1: Semantic Textual Similarity, Monolingual and Cross-Lingual Evaluation.
Proceedings of the 10th International Workshop on Semantic Evaluation, 2016
Guidelines and Framework for a Large Scale Arabic Diacritized Corpus.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016
Explicit Fine grained Syntactic and Semantic Annotation of the Idafa Construction in Arabic.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016
Creating a Large Multi-Layered Representational Repository of Linguistic Code Switched Arabic Data.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016
SPLIT: Smart Preprocessing (Quasi) Language Independent Tool.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016
Computational Approaches to Linguistic Code Switching.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
LILI: A Simple Language Independent Approach for Language Identification.
Proceedings of the COLING 2016, 2016
The Power of Language Music: Arabic Lemmatization through Patterns.
Proceedings of the 5th Workshop on Cognitive Aspects of the Lexicon, 2016
Investigating the Impact of Various Partial Diacritization Schemes on Arabic-English Statistical Machine Translation.
Proceedings of the 12th Conferences of the Association for Machine Translation in the Americas: MT Researchers' Track, 2016
Addressing Annotation Complexity: The Case of Annotating Ideological Perspective in Egyptian Social Media.
Proceedings of the 10th Linguistic Annotation Workshop held in conjunction with ACL 2016, 2016
Overview for the Second Shared Task on Language Identification in Code-Switched Data.
Proceedings of the Second Workshop on Computational Approaches to Code Switching@EMNLP 2016, 2016
Part of Speech Tagging for Code Switched Data.
Proceedings of the Second Workshop on Computational Approaches to Code Switching@EMNLP 2016, 2016
The George Washington University System for the Code-Switching Workshop Shared Task 2016.
Proceedings of the Second Workshop on Computational Approaches to Code Switching@EMNLP 2016, 2016
Using Ambiguity Detection to Streamline Linguistic Annotation.
Proceedings of the Workshop on Computational Linguistics for Linguistic Complexity, 2016
SAMER: A Semi-Automatically Created Lexical Resource for Arabic Verbal Multiword Expressions Tokens Paradigm and their Morphosyntactic Features.
Proceedings of the 12th Workshop on Asian Language Resources, 2016
2015
A Pilot Study on Arabic Multi-Genre Corpus Diacritization.
Proceedings of the Second Workshop on Arabic Natural Language Processing, 2015
GWU-HASP-2015@QALB-2015 Shared Task: Priming Spelling Candidates with Probability.
Proceedings of the Second Workshop on Arabic Natural Language Processing, 2015
Robust Part-of-speech Tagging of Arabic Text.
Proceedings of the Second Workshop on Arabic Natural Language Processing, 2015
GWU English TAC-KBP EL Diagnostic Task with Name Mention.
Proceedings of the 2015 Text Analysis Conference, 2015
A New Dataset and Evaluation for Belief/Factuality.
Proceedings of the Fourth Joint Conference on Lexical and Computational Semantics, 2015
Ideological Perspective Detection Using Semantic Features.
Proceedings of the Fourth Joint Conference on Lexical and Computational Semantics, 2015
Unsupervised False Friend Disambiguation Using Contextual Word Clusters and Parallel Word Alignments.
Proceedings of the Ninth Workshop on Syntax, 2015
SemEval-2015 Task 2: Semantic Textual Similarity, English, Spanish and Pilot on Interpretability.
Proceedings of the 9th International Workshop on Semantic Evaluation, 2015
Named Entity Recognition for Arabic Social Media.
Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing, 2015
AIDA2: A Hybrid Approach for Token and Sentence Level Dialect Identification in Arabic.
Proceedings of the 19th Conference on Computational Natural Language Learning, 2015
Tharawat: A Vision for a Comprehensive Resource for Arabic Computational Processing.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2015
2014
Semantic Processing of Semitic Languages.
Proceedings of the Natural Language Processing of Semitic Languages, 2014
SAMAR: Subjectivity and sentiment analysis for Arabic social media.
Comput. Speech Lang., 2014
A hybrid system for code switch point detection in informal Arabic text.
XRDS, 2014
Named Entity Recognition System for Dialectal Arabic.
Proceedings of the EMNLP 2014 Workshop on Arabic Natural Language Processing, 2014
A Framework for the Classification and Annotation of Multiword Expressions in Dialectal Arabic.
Proceedings of the EMNLP 2014 Workshop on Arabic Natural Language Processing, 2014
GWU-HASP: Hybrid Arabic Spelling and Punctuation Corrector.
Proceedings of the EMNLP 2014 Workshop on Arabic Natural Language Processing, 2014
SemEval-2014 Task 10: Multilingual Semantic Textual Similarity.
Proceedings of the 8th International Workshop on Semantic Evaluation, 2014
MADAMIRA: A Fast, Comprehensive Tool for Morphological Analysis and Disambiguation of Arabic.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014
Tharwa: A Large Scale Dialectal Arabic - Standard Arabic - English Lexicon.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014
SANA: A Large Scale Multi-Genre, Multi-Dialect Lexicon for Arabic Subjectivity and Sentiment Analysis.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014
Fast Tweet Retrieval with Compact Binary Codes.
Proceedings of the COLING 2014, 2014
Arabic Multiword Expressions.
Proceedings of the Language, Culture, Computation. Computing of the Humanities, 2014
Sentence Level Dialect Identification for Machine Translation System Selection.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014
Overview for the First Shared Task on Language Identification in Code-Switched Data.
Proceedings of the First Workshop on Computational Approaches to Code Switching@EMNLP 2014, 2014
AIDA: Identifying Code Switching in Informal Arabic Text.
Proceedings of the First Workshop on Computational Approaches to Code Switching@EMNLP 2014, 2014
2013
LDC Arabic Treebanks and Associated Corpora: Data Divisions Manual.
CoRR, 2013
*SEM 2013 shared task: Semantic Textual Similarity.
Proceedings of the Second Joint Conference on Lexical and Computational Semantics, 2013
ASMA: A System for Automatic Segmentation and Morpho-Syntactic Disambiguation of Modern Standard Arabic.
Proceedings of the Recent Advances in Natural Language Processing, 2013
ANEAR: Automatic Named Entity Aliasing Resolution.
Proceedings of the Natural Language Processing and Information Systems, 2013
Code Switch Point Detection in Arabic.
Proceedings of the Natural Language Processing and Information Systems, 2013
Improving Lexical Semantics for Sentential Semantics: Modeling Selectional Preference and Similar Words in a Latent Variable Model.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2013
DIRA: Dialectal Arabic Information Retrieval Assistant.
Proceedings of the Sixth International Joint Conference on Natural Language Processing, 2013
Multiword Expressions in the Context of Statistical Machine Translation.
Proceedings of the Sixth International Joint Conference on Natural Language Processing, 2013
Reranking with Linguistic and Semantic Features for Arabic Optical Character Recognition.
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013
Linking Tweets to News: A Framework to Enrich Short Text Data in Social Media.
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013
Sentence Level Dialect Identification in Arabic.
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013
Identifying Opinion Subgroups in Arabic Online Discussions.
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013
Semantic Textual Similarity: past present and future.
Proceedings of the Joint Symposium on Semantic Processing. Textual Inference and Structures in Corpora, 2013
2012
SAMAR: A System for Subjectivity and Sentiment Analysis of Arabic Social Media.
Proceedings of the 3rd Workshop in Computational Approaches to Subjectivity and Sentiment Analysis, 2012
Weiwei: A Simple Unsupervised Latent Semantics based Approach for Sentence Similarity.
Proceedings of the 6th International Workshop on Semantic Evaluation, 2012
SemEval-2012 Task 6: A Pilot on Semantic Textual Similarity.
Proceedings of the 6th International Workshop on Semantic Evaluation, 2012
Predicting Overt Display of Power in Written Dialogs.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2012
Annotations for Power Relations on Email Threads.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012
Conventional Orthography for Dialectal Arabic.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012
Simplified guidelines for the creation of Large Scale Dialectal Arabic Annotations.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012
AWATIF: A Multi-Genre Corpus for Modern Standard Arabic Subjectivity and Sentiment Analysis.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012
Who's (Really) the Boss? Perception of Situational Power in Written Interactions.
Proceedings of the COLING 2012, 2012
Token Level Identification of Linguistic Code Switching.
Proceedings of the COLING 2012, 2012
A Pilot PropBank Annotation for Quranic Arabic.
Proceedings of the Workshop on Computational Linguistics for Literature, 2012
Statistical Modality Tagging from Rule-based Annotations and Crowdsourcing.
Proceedings of the Workshop on Extra-Propositional Aspects of Meaning in Computational Linguistics, 2012
Learning the Latent Semantics of a Concept from its Definition.
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, July 8-14, 2012, Jeju Island, Korea, 2012
Modeling Sentences in the Latent Space.
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, July 8-14, 2012, Jeju Island, Korea, 2012
Genre Independent Subgroup Detection in Online Discussion Threads: A Study of Implicit Attitude using Textual Latent Semantics.
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, July 8-14, 2012, Jeju Island, Korea, 2012
Subgroup Detection in Ideological Discussions.
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, July 8-14, 2012, Jeju Island, Korea, 2012
Building an Arabic Multiword Expressions Repository.
Proceedings of the Joint Workshop on Statistical Parsing and Semantic Processing of Morphologically Rich Languages, 2012
2011
Introduction to the Special Issue on Arabic Computational Linguistics.
ACM Trans. Asian Lang. Inf. Process., 2011
CODACT: Towards Identifying Orthographic Variants in Dialectal Arabic.
Proceedings of the Fifth International Joint Conference on Natural Language Processing, 2011
Semantic Topic Models: Combining Word Distributional Statistics and Dictionary Definitions.
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, 2011
Named Entity Transliteration Generation Leveraging Statistical Machine Translation Technology.
Proceedings of the 3rd Named Entities Workshop, 2011
Subjectivity and Sentiment Annotation of Modern Standard Arabic Newswire.
Proceedings of the Fifth Linguistic Annotation Workshop, 2011
Subjectivity and Sentiment Analysis of Modern Standard Arabic.
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference, 19-24 June, 2011, Portland, Oregon, USA, 2011
Feasibility of Leveraging Crowd Sourcing for the Creation of a Large Scale Annotated Resource for Hindi English Code Switched Data: A Pilot Annotation.
Proceedings of the 9th Workshop on Asian Language Resources, 2011
2010
COLEPL and COLSLM: An Unsupervised WSD Approach to Multilingual Lexical Substitution, Tasks 2 and 3 SemEval 2010.
Proceedings of the 5th International Workshop on Semantic Evaluation, 2010
Task-based Evaluation of Multiword Expressions: a Pilot Study in Statistical Machine Translation.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2010
Automatic Committed Belief Tagging.
Proceedings of the COLING 2010, 2010
The Revised Arabic PropBank.
Proceedings of the Fourth Linguistic Annotation Workshop, 2010
Combining Orthogonal Monolingual and Multilingual Sources of Evidence for All Words WSD.
Proceedings of the ACL 2010, 2010
Arabic Named Entity Recognition: Using Features Extracted from Noisy Data.
Proceedings of the ACL 2010, 2010
2009
Arabic Named Entity Recognition: A Feature-Driven Study.
IEEE Trans. Speech Audio Process., 2009
Using Language Independent and Language Specific Features to Enhance Arabic Named Entity Recognition.
Int. Arab J. Inf. Technol., 2009
Improvements To Monolingual English Word Sense Disambiguation.
Proceedings of the Workshop on Semantic Evaluations: Recent Achievements and Future Directions, 2009
Verb Noun Construction MWE Token Classification.
Proceedings of the Workshop on Multiword Expressions: Identification, 2009
Unsupervised Classification of Verb Noun Multi-Word Expression Tokens.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2009
Committed Belief Annotation and Tagging.
Proceedings of the Third Linguistic Annotation Workshop, 2009
Who, What, When, Where, Why? Comparing Multiple Approaches to the Cross-Lingual 5W Task.
Proceedings of the ACL 2009, 2009
2008
Proceedings of the International Conference on Language Resources and Evaluation, 2008
Arabic Named Entity Recognition using Optimized Feature Sets.
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, 2008
Arabic Morphological Tagging, Diacritization, and Lemmatization Using Lexeme Models and Feature Ranking.
Proceedings of the ACL 2008, 2008
Semantic Role Labeling Systems for Arabic using Kernel Methods.
Proceedings of the ACL 2008, 2008
2007
CUNIT: A Semantic Role Labeling System for Modern Standard Arabic.
Proceedings of the 4th International Workshop on Semantic Evaluations, 2007
SemEval-2007 Task 18: Arabic Semantic Labeling.
Proceedings of the 4th International Workshop on Semantic Evaluations, 2007
Arabic Dialect Processing Tutorial.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, 2007
Semi-automatic error analysis for large-scale statistical machine translation.
Proceedings of Machine Translation Summit XI: Papers, 2007
Arabic diacritization in the context of statistical machine translation.
Proceedings of Machine Translation Summit XI: Papers, 2007
Improved Arabic Base Phrase Chunking with a new enriched POS tag set.
Proceedings of the 2007 Workshop on Computational Approaches to Semitic Languages: Common Issues and Resources, 2007
2006
Unsupervised Induction of Modern Standard Arabic Verb Classes.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, 2006
Developing and Using a Pilot Dialectal Arabic Treebank.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006
Proceedings of the EACL 2006, 2006
Unsupervised Induction of Modern Standard Arabic Verb Classes Using Syntactic Frames and LSA.
Proceedings of the ACL 2006, 2006
2004
Automatic Tagging of Arabic Text: From Raw Text to Base Phrase Chunks.
Proceedings of HLT-NAACL 2004: Short Papers, Boston, Massachusetts, USA, May 2-7, 2004, 2004
Relieving the data Acquisition Bottleneck in Word Sense Disambiguation.
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, 2004
2003
Word Sense Disambiguation within a Multilingual Framework.
PhD thesis, 2003
2002
An Unsupervised Method for Word Sense Tagging using Parallel Corpora.
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 2002
2000
A statistical translation model using comparable corpora.
Proceedings of the Computer-Assisted Information Retrieval (Recherche d'Information et ses Applications), 2000
1999
The Bible as a Parallel Corpus: Annotating the 'Book of 2000 Tongues'.
Comput. Humanit., 1999