Anya Belz

Orcid: 0000-0002-0552-8096

According to our database, Anya Belz authored at least 92 papers between 1998 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Reproducing the Metric-Based Evaluation of a Set of Controllable Text Generation Techniques.
CoRR, 2024

Assessing the Portability of Parameter Matrices Trained by Parameter-Efficient Finetuning Methods.
CoRR, 2024

(Mostly) Automatic Experiment Execution for Human Evaluations of NLP Systems.
Proceedings of the 17th International Natural Language Generation Conference, 2024

Filling Gaps in Wikipedia: Leveraging Data-to-Text Generation to Improve Encyclopedic Coverage of Underrepresented Groups.
Proceedings of the 17th International Natural Language Generation Conference, 2024

Differences in Semantic Errors Made by Different Types of Data-to-text Systems.
Proceedings of the 17th International Natural Language Generation Conference, 2024

QCET: An Interactive Taxonomy of Quality Criteria for Comparable and Repeatable Evaluation of NLP Systems.
Proceedings of the 17th International Natural Language Generation Conference, 2024

Assessing the Portability of Parameter Matrices Trained by Parameter-Efficient Finetuning Methods.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2024, 2024

High-quality Data-to-Text Generation for Severely Under-Resourced Languages with Out-of-the-box Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2024, 2024

Beyond Abstracts: A New Dataset, Prompt Design Strategy and Method for Biomedical Synthesis Generation.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, 2024

2023
Data-to-text Generation for Severely Under-Resourced Languages with GPT-3.5: A Bit of Help Needed from Google Translate.
CoRR, 2023

Missing Information, Unresponsive Authors, Experimental Flaws: The Impossibility of Assessing the Reproducibility of Previous Human Evaluations in NLP.
CoRR, 2023

PEFT-Ref: A Modular Reference Architecture and Typology for Parameter-Efficient Finetuning Techniques.
CoRR, 2023

How to Control Sentiment in Text Generation: A Survey of the State-of-the-Art in Sentiment-Control Techniques.
Proceedings of the 13th Workshop on Computational Approaches to Subjectivity, 2023

Towards a Consensus Taxonomy for Annotating Errors in Automatically Generated Text.
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing, 2023

Mod-D2T: A Multi-layer Dataset for Modular Data-to-Text Generation.
Proceedings of the 16th International Natural Language Generation Conference, 2023

Exploring Variation of Results from Different Experimental Conditions.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Non-Repeatable Experiments and Non-Reproducible Results: The Reproducibility Crisis in Human Evaluation in NLP.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022
A Metrological Perspective on Reproducibility in NLP.
Comput. Linguistics, 2022

User-Driven Research of Medical Note Generation Software.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Consultation Checklists: Standardising the Human Evaluation of Medical Note Generation.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: EMNLP 2022 - Industry Track, Abu Dhabi, UAE, December 7, 2022

Human Evaluation and Correlation with Automatic Metrics in Consultation Note Generation.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Quantified Reproducibility Assessment of NLP Results.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
Quantifying Reproducibility in NLP and ML.
CoRR, 2021

The Human Evaluation Datasheet 1.0: A Template for Recording Details of Human Evaluation Experiments in NLP.
CoRR, 2021

A Reproduction Study of an Annotation-based Human Evaluation of MT Outputs.
Proceedings of the 14th International Conference on Natural Language Generation, 2021

Another PASS: A Reproduction Study of the Human Evaluation of a Football Report Generation System.
Proceedings of the 14th International Conference on Natural Language Generation, 2021

The ReproGen Shared Task on Reproducibility of Human Evaluations in NLG: Overview and Results.
Proceedings of the 14th International Conference on Natural Language Generation, 2021

A Systematic Review of Reproducibility Research in Natural Language Processing.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021

2020
Twenty Years of Confusion in Human Evaluation: NLG Needs Evaluation Sheets and Standardised Definitions.
Proceedings of the 13th International Conference on Natural Language Generation, 2020

Disentangling the Properties of Human Evaluation Methods: A Classification System to Support Comparability, Meta-Evaluation and Reproducibility Testing.
Proceedings of the 13th International Conference on Natural Language Generation, 2020

ReproGen: Proposal for a Shared Task on Reproducibility of Human Evaluations in NLG.
Proceedings of the 13th International Conference on Natural Language Generation, 2020

2019
Fully Automatic Journalism: We Need to Talk About Nonfake News Generation.
Proceedings of the 2019 Truth and Trust Online Conference (TTO 2019), 2019

The Second Multilingual Surface Realisation Shared Task (SR'19): Overview and Evaluation Results.
Proceedings of the 2nd Workshop on Multilingual Surface Realisation, 2019

Conceptualisation and Annotation of Drug Nonadherence Information for Knowledge Extraction from Patient-Generated Texts.
Proceedings of the 5th Workshop on Noisy User-generated Text, 2019

2018
From image to language and back again.
Nat. Lang. Eng., 2018

Underspecified Universal Dependency Structures as Inputs for Multilingual Surface Realisation.
Proceedings of the 11th International Conference on Natural Language Generation, 2018

Adding the Third Dimension to Spatial Relation Detection in 2D Images.
Proceedings of the 11th International Conference on Natural Language Generation, 2018

SpatialVOC2K: A Multilingual Dataset of Images with Annotations and Features for Spatial Relations between Objects.
Proceedings of the 11th International Conference on Natural Language Generation, 2018

2017
Learning to Generate Descriptions of Visual Data Anchored in Spatial Relations.
IEEE Comput. Intell. Mag., 2017

Shared Task Proposal: Multilingual Surface Realization Using Universal Dependency Trees.
Proceedings of the 10th International Conference on Natural Language Generation, 2017

2016
Effect of Data Annotation, Feature Selection and Model Choice on Spatial Description Generation in French.
Proceedings of the INLG 2016, 2016

Analysis of Twitter Data for Postmarketing Surveillance in Pharmacovigilance.
Proceedings of the 2nd Workshop on Noisy User-generated Text, 2016

Exploring Different Preposition Sets, Models and Feature Sets in Automatic Generation of Spatial Image Descriptions.
Proceedings of the 5th Workshop on Vision and Language, 2016

2015
Generating Descriptions of Spatial Relations between Objects in Images.
Proceedings of the ENLG 2015, 2015

Describing Spatial Relationships between Objects in Images in English and French.
Proceedings of the Fourth Workshop on Vision and Language, 2015

2014
A Comparative Evaluation Methodology for NLG in Interactive Systems.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

The Last 10 Metres: Using Visual Analysis and Verbal Communication in Guiding Visually Impaired Smartphone Users to Entrances.
Proceedings of the Third Workshop on Vision and Language, 2014

Comparative evaluation and shared tasks for NLG in interactive systems.
Proceedings of the Natural Language Generation in Interactive Systems, 2014

2012
LG-Eval: A Toolkit for Creating Online Language Evaluation Experiments.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

A Repository of Data and Evaluation Resources for Natural Language Generation.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

The Surface Realisation Task: Recent Developments and Future Plans.
Proceedings of the INLG 2012 - Proceedings of the Seventh International Natural Language Generation Conference, 30 May 2012, 2012

2011
The First Surface Realisation Shared Task: Overview and Evaluation Results.
Proceedings of the ENLG 2011, 2011

Generation Challenges 2011 Preface.
Proceedings of the ENLG 2011, 2011

Discrete vs. Continuous Rating Scales for Language Evaluation in NLP.
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference, 19-24 June, 2011, Portland, Oregon, USA, 2011

Unsupervised Alignment of Comparable Data and Text Resources.
Proceedings of the 4th Workshop on Building and Using Comparable Corpora: Comparable Corpora and the Web, 2011

2010
A Game-based Approach to Transcribing Images of Text.
Proceedings of the International Conference on Language Resources and Evaluation, 2010

Finding Common Ground: Towards a Surface Realisation Shared Task.
Proceedings of the INLG 2010, 2010

The GREC Challenges 2010: Overview and Evaluation Results.
Proceedings of the INLG 2010, 2010

Extracting Parallel Fragments from Comparable Corpora for Data-to-text Generation.
Proceedings of the INLG 2010, 2010

Comparing Rating Scales and Preference Judgements in Language Evaluation.
Proceedings of the INLG 2010, 2010

Generation Challenges 2010 Preface.
Proceedings of the INLG 2010, 2010

Introducing Shared Tasks to NLG: The TUNA Shared Task Evaluation Challenges.
Proceedings of the Empirical Methods in Natural Language Generation: Data-oriented Methods and Empirical Evaluation, 2010

Generating Referring Expressions in Context: The GREC Task Evaluation Challenges.
Proceedings of the Empirical Methods in Natural Language Generation: Data-oriented Methods and Empirical Evaluation, 2010

Assessing the Trade-Off between System Building Cost and Output Quality in Data-to-Text Generation.
Proceedings of the Empirical Methods in Natural Language Generation: Data-oriented Methods and Empirical Evaluation, 2010

2009
An Investigation into the Validity of Some Metrics for Automatically Evaluating Natural Language Generation Systems.
Comput. Linguistics, 2009

That's Nice ... What Can You Do With It?
Comput. Linguistics, 2009

The TUNA-REG Challenge 2009: Overview and Evaluation Results.
Proceedings of the ENLG 2009, 2009

System Building Cost vs. Output Quality in Data-to-Text Generation.
Proceedings of the ENLG 2009, 2009

Generation Challenges 2009: Preface.
Proceedings of the ENLG 2009, 2009

2008
Automatic generation of weather forecast texts using comprehensive probabilistic generation-space models.
Nat. Lang. Eng., 2008

The TUNA Challenge 2008: Overview and Evaluation Results.
Proceedings of the INLG 2008, 2008

Attribute Selection for Referring Expression Generation: New Algorithms and Evaluation Methods.
Proceedings of the INLG 2008, 2008

The GREC Challenge 2008: Overview and Evaluation Results.
Proceedings of the INLG 2008, 2008

REG Challenge Preface.
Proceedings of the INLG 2008, 2008

Intrinsic vs. Extrinsic Evaluation Measures for Referring Expression Generation.
Proceedings of the ACL 2008, 2008

2007
Probabilistic Generation of Weather Forecast Texts.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, 2007

Modelling control in generation.
Proceedings of the Eleventh European Workshop on Natural Language Generation, 2007

Generation of repeated references to discourse entities.
Proceedings of the Eleventh European Workshop on Natural Language Generation, 2007

2006
GENEVAL: A Proposal for Shared-task Evaluation in NLG.
Proceedings of the INLG 2006, 2006

Shared-Task Evaluations in HLT: Lessons for NLG.
Proceedings of the INLG 2006, 2006

Introduction to the INLG'06 Special Session on Sharing Data and Comparative Evaluation.
Proceedings of the INLG 2006, 2006

Comparing Automatic and Human Evaluation of NLG Systems.
Proceedings of the EACL 2006, 2006

2005
Statistical Generation: Three Methods Compared and Evaluated.
Proceedings of the Tenth European Workshop on Natural Language Generation, 2005

2002
PILLS: Multilingual generation of medical information documents with overlapping content.
Proceedings of the Third International Conference on Language Resources and Evaluation, 2002

PCFG Learning by Nonterminal Partition Search.
Proceedings of the Grammatical Inference: Algorithms and Applications, 2002

Learning Grammars for Different Parsing Tasks by Partition Search.
Proceedings of the 19th International Conference on Computational Linguistics, 2002

2001
Multi-Syllable Phonotactic Modelling.
CoRR, 2001

Learning Computational Grammars.
Proceedings of the ACL 2001 Workshop on Computational Natural Language Learning, 2001

2000
Computational learning of finite-state models for natural language processing.
PhD thesis, 2000

1998
An Approach to the Automatic Acquisition of Phonotactic Constraints.
Proceedings of the Workshop on Computation of Phonological Constraints, 1998

A Few English Words Can Help Improve Your Russian.
Proceedings of the 13th European Conference on Artificial Intelligence, 1998

Discovering Phonotactic Finite-State Automata by Generic Search.
Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, 1998

