Paul Rayson

Orcid: 0000-0002-1257-2191

Affiliations:
  • Lancaster University, School of Computing and Communications, UK


According to our database, Paul Rayson authored at least 109 papers between 1998 and 2024.

Bibliography

2024
A Comparative Study on Automatic Coding of Medical Letters with Explainability.
CoRR, 2024

The IgboAPI Dataset: Empowering Igbo Language Technologies through Multi-dialectal Enrichment.
CoRR, 2024

The Geography of 'Fear', 'Sadness', 'Anger' and 'Joy': Exploring the Emotional Landscapes in the Holocaust Survivors' Testimonies.
Proceedings of Text2Story, 2024

The IgboAPI Dataset: Empowering Igbo Language Technologies through Multi-dialectal Enrichment.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

2023
UNLT: Urdu Natural Language Toolkit.
Nat. Lang. Eng., July, 2023

Cross-lingual Text Reuse Detection at Document Level for English-Urdu Language Pair.
ACM Trans. Asian Low Resour. Lang. Inf. Process., June, 2023

Semantic Tagging for the Urdu Language: Annotated Corpus and Multi-Target Classification Methods.
ACM Trans. Asian Low Resour. Lang. Inf. Process., June, 2023

The ParlaMint corpora of parliamentary proceedings.
Lang. Resour. Evaluation, March, 2023

A Comparative Study of Evaluation Metrics for Long-Document Financial Narrative Summarization with Transformers.
Proceedings of the Natural Language Processing and Information Systems, 2023

FinAraT5: A text to text model for financial Arabic text understanding and generation.
Proceedings of the 4th Conference on Language, Data and Knowledge, 2023

Open-Source Thesaurus Development for Under-Resourced Languages: a Welsh Case Study.
Proceedings of the 4th Conference on Language, Data and Knowledge, 2023

Towards an Extensible Framework for Understanding Spatial Narratives.
Proceedings of the 7th ACM SIGSPATIAL International Workshop on Geospatial Humanities, 2023

Extracting Imprecise Geographical and Temporal References from Journey Narratives.
Proceedings of Text2Story, 2023

Igboner 2.0: Expanding Named Entity Recognition Datasets via Projection.
Proceedings of the 4th Workshop on African Natural Language Processing, 2023

2022
Textual variations affect human judgements of sentiment values.
Electron. Commer. Res. Appl., 2022

CoFiF Plus: A French Financial Narrative Summarisation Corpus.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

IgboBERT Models: Building and Training Transformer Models for the Igbo Language.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

2021
Textual Variations Affect Human Judgements of Sentiment Values.
Dataset, April, 2021

MasakhaNER: Named Entity Recognition for African Languages.
Trans. Assoc. Comput. Linguistics, 2021

Uncovering Environmental Change in the English Lake District: Using Computational Techniques to Trace the Presence and Documentation of Historical Flora.
Digit. Scholarsh. Humanit., 2021

GIBBON: General-purpose Information-Based Bayesian Optimisation.
J. Mach. Learn. Res., 2021

Understanding who uses Reddit: Profiling individuals with a self-reported bipolar disorder diagnosis.
CoRR, 2021

Multilingual Financial Word Embeddings for Arabic, English and French.
Proceedings of the 2021 IEEE International Conference on Big Data (Big Data), 2021

2020
Known and unknown requirements in healthcare.
Requir. Eng., 2020

The National Corpus of Contemporary Welsh: Project Report | Y Corpws Cenedlaethol Cymraeg Cyfoes: Adroddiad y Prosiect.
CoRR, 2020

BOSH: Bayesian Optimization by Sampling Hierarchically.
CoRR, 2020

Igbo-English Machine Translation: An Evaluation Benchmark.
Proceedings of the 1st AfricaNLP Workshop Proceedings, 2020

MUMBO: MUlti-task Max-Value Bayesian Optimization.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2020

BOSS: Bayesian Optimization over String Spaces.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Infrastructure for Semantic Annotation in the Genomics Domain.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

LexiDB: Patterns & Methods for Corpus Linguistic Database Management.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Developing an Arabic Infectious Disease Ontology to Include Non-Standard Terminology.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

2019
A Sense Annotated Corpus for All-Words Urdu Word Sense Disambiguation.
ACM Trans. Asian Low Resour. Lang. Inf. Process., 2019

A word sense disambiguation corpus for Urdu.
Lang. Resour. Evaluation, 2019

CLEU - A Cross-language English-Urdu corpus and benchmark for text reuse experiments.
J. Assoc. Inf. Sci. Technol., 2019

In Search of Meaning: Lessons, Resources and Next Steps for Computational Analysis of Financial Discourse.
CoRR, 2019

Leveraging Pre-Trained Embeddings for Welsh Taggers.
Proceedings of the 4th Workshop on Representation Learning for NLP, 2019

FIESTA: Fast IdEntification of State-of-The-Art models using adaptive bandit algorithms.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
Measuring Short Text Reuse for the Urdu Language.
IEEE Access, 2018

Deep Mapping Tarn Hows: Automated Generation of 3D Historic Landscapes.
Proceedings of the GCH 2018, 2018

Towards a Welsh Semantic Annotation System.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

Profiling Medical Journal Articles Using a Gene Ontology Semantic Tagger.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

Arabic Dialect Identification in the Context of Bivalency and Code-Switching.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

Using J-K-fold Cross Validation To Reduce Variance When Tuning NLP Models.
Proceedings of the 27th International Conference on Computational Linguistics, 2018

Bringing replication and reproduction together with generalisability in NLP: Three reproduction studies for Target Dependent Sentiment Analysis.
Proceedings of the 27th International Conference on Computational Linguistics, 2018

2017
COUNTER: corpus of Urdu news text reuse.
Lang. Resour. Evaluation, 2017

A time-sensitive historical thesaurus-based semantic tagger for deep semantic annotation.
Comput. Speech Lang., 2017

Lancaster A at SemEval-2017 Task 5: Evaluation metrics matter: predicting sentiment from financial news headlines.
Proceedings of the 11th International Workshop on Semantic Evaluation, 2017

A deeply annotated testbed for geographical text analysis: The Corpus of Lake District Writing.
Proceedings of the 1st ACM SIGSPATIAL Workshop on Geospatial Humanities, 2017

Exploring Deep Mapping Concepts: Crosthwaite's Map and West's Picturesque Stations.
Proceedings of Workshops and Posters at the 13th International Conference on Spatial Information Theory, 2017

2016
Towards Interactive Multidimensional Visualisations for Corpus Linguistics.
J. Lang. Technol. Comput. Linguistics, 2016

Reversing the Polarity with Emoticons.
Proceedings of the Natural Language Processing and Information Systems, 2016

UPPC - Urdu Paraphrase Plagiarism Corpus.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Lexical Coverage Evaluation of Large-scale Multilingual Semantic Lexicons for Twelve Languages.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Learning Tone and Attribution for Financial Text Mining.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

OSMAN ― A Novel Arabic Readability Metric.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Combining Mouse and Keyboard Events with Higher Level Desktop Actions to Detect Mild Cognitive Impairment.
Proceedings of the 2016 IEEE International Conference on Healthcare Informatics, 2016

Sampling labelled profile data for identity resolution.
Proceedings of the 2016 IEEE International Conference on Big Data (IEEE BigData 2016), 2016

lexiDB: A scalable corpus database management system.
Proceedings of the 2016 IEEE International Conference on Big Data (IEEE BigData 2016), 2016

2015
Automatically Analyzing Large Texts in a GIS Environment: The Registrar General's Reports and Cholera in the 19th Century.
Trans. GIS, 2015

Metaphor, Popular Science, and Semantic Tagging: Distant Reading with the Historical Thesaurus of English.
Digit. Scholarsh. Humanit., 2015

Geoparsing, GIS, and Textual Analysis: Current Developments in Spatial Humanities Research.
Int. J. Humanit. Arts Comput., 2015

A Systematic Survey of Online Data Mining Technology Intended for Law Enforcement.
ACM Comput. Surv., 2015

Development of the Multilingual Semantic Annotation System.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

Sentiment analysis tools should take account of the number of exclamation marks!!!
Proceedings of the 17th International Conference on Information Integration and Web-based Applications & Services, 2015

Dementia and Social Sustainability: Challenges for Software Engineering.
Proceedings of the 37th IEEE/ACM International Conference on Software Engineering, 2015

Scaling out for extreme scale corpus data.
Proceedings of the 2015 IEEE International Conference on Big Data (IEEE BigData 2015), Santa Clara, CA, USA, October 29, 2015

2014
Language Independent Evaluation of Translation Style and Consistency: Comparing Human and Machine Translations of Camus' Novel "The Stranger".
Proceedings of the Text, Speech and Dialogue - 17th International Conference, 2014

Discovering affect-laden requirements to achieve system acceptance.
Proceedings of the IEEE 22nd International Requirements Engineering Conference, 2014

Concept Vocabularies in Programmer Sociolects.
Proceedings of the 25th Annual Workshop of the Psychology of Programming Interest Group, 2014

Experiences with Parallelisation of an Existing NLP Pipeline: Tagging Hansard.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

Detecting Document Structure in a Very Large Corpus of UK Financial Reports.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

A Service-Independent Model for Linking Online User Profile Information.
Proceedings of the IEEE Joint Intelligence and Security Informatics Conference, 2014

Digital approaches to understanding the geographies in literary and historical texts.
Proceedings of the 9th Annual International Conference of the Alliance of Digital Humanities Organizations, 2014

Metaphor, Popular Science and Semantic Tagging: Distant Reading with the Historical Thesaurus of English.
Proceedings of the 9th Annual International Conference of the Alliance of Digital Humanities Organizations, 2014

Dealing with heterogeneous big data when geoparsing historical corpora.
Proceedings of the 2014 IEEE International Conference on Big Data (IEEE BigData 2014), 2014

2013
Who Am I? Analyzing Digital Personas in Cybercrime Investigations.
Computer, 2013

Customising geoparsing and georeferencing for historical texts.
Proceedings of the 2013 IEEE International Conference on Big Data (IEEE BigData 2013), 2013

2012
Experiments in 17th century English: manual versus automatic conceptual history.
Lit. Linguistic Comput., 2012

What is middleware made of?: exploring abstractions, concepts, and class names in modern middleware.
Proceedings of the 11th Workshop on Adaptive and Reflective Middleware, 2012

Document Attrition in Web Corpora: an Exploration.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

Understanding Actionable Knowledge in Social Media: BBC Question Time and Twitter, a Case Study.
Proceedings of the Sixth International Conference on Weblogs and Social Media, 2012

2011
Analyzing the semantic content and persuasive composition of extremist media: A case study of texts produced during the Gaza conflict.
Inf. Syst. Frontiers, 2011

2010
Multiword expressions: hard going or plain sailing?
Lang. Resour. Evaluation, 2010

Classification of Short Text Comments by Sentiment and Actionability for VoiceYourView.
Proceedings of the 2010 IEEE Second International Conference on Social Computing, 2010

2008
Corpus Tools and Methods, Today and Tomorrow: Incorporating Linguists' Manual Annotations.
Lit. Linguistic Comput., 2008

The Identification of Spelling Variants in English and German Historical Texts: Manual or Automatic?
Lit. Linguistic Comput., 2008

A flexible framework to experiment with ontology learning techniques.
Knowl. Based Syst., 2008

A framework for P2P application development.
Comput. Commun., 2008

An Exploratory Study of Information Retrieval Techniques in Domain Analysis.
Proceedings of the Software Product Lines, 12th International Conference, 2008

Supporting Law Enforcement in Digital Communities through Natural Language Analysis.
Proceedings of the Computational Forensics, Second International Workshop, 2008

2007
EA-Miner: Towards Automation in Aspect-Oriented Requirements Engineering.
LNCS Trans. Aspect Oriented Softw. Dev., 2007

Semantics-based composition for aspect-oriented requirements engineering.
Proceedings of the 6th International Conference on Aspect-Oriented Software Development, 2007

2006
ASSIST: Automated Semantic Assistance for Translators.
Proceedings of the EACL 2006, 2006

Tagging Historical Corpora - the problem of spelling variation.
Proceedings of the Digital Historical Corpora - Architecture, Annotation, and Retrieval, 03.12., 2006

2005
Shallow Knowledge as an Aid to Deep Understanding in Early Phase Requirements Engineering.
IEEE Trans. Software Eng., 2005

Artefacts as designed, artefacts as used: resources for uncovering activity dynamics.
Cogn. Technol. Work., 2005

Comparing and combining a semantic tagger and a statistical tool for MWE extraction.
Comput. Speech Lang., 2005

Early-AIM: An Approach for Identifying Aspects in Requirements.
Proceedings of the 13th IEEE International Conference on Requirements Engineering (RE 2005), 29 August, 2005

EA-Miner: a tool for automating aspect-oriented requirements identification.
Proceedings of the 20th IEEE/ACM International Conference on Automated Software Engineering (ASE 2005), 2005

2004
P2P-4-DL: Digital Library over Peer-to-Peer.
Proceedings of the 4th International Conference on Peer-to-Peer Computing (P2P 2004), 2004

Language Resources and Tools for Supporting the System Engineering Process.
Proceedings of the Natural Language Processing and Information Systems, 2004

Evaluating Lexical Resources for a Semantic Tagger.
Proceedings of the Fourth International Conference on Language Resources and Evaluation, 2004

2003
Matrix: a statistical method and software tool for linguistic analysis through corpus comparison.
PhD thesis, 2003

Tracker: A Framework to Support Reducing Rework Through Decision Management.
Proceedings of the ICEIS 2003, 2003

2002
REVERE: Support for Requirements Synthesis from Documents.
Inf. Syst. Frontiers, 2002

2000
Assisting requirements engineering with semantic document analysis.
Proceedings of the Computer-Assisted Information Retrieval (Recherche d'Information et ses Applications), 2000

The REVERE Project: Experiments with the Application of Probabilistic NLP to Systems Engineering.
Proceedings of the Natural Language Processing and Information Systems, 2000

1998
Supporting Information Evolution on the WWW.
World Wide Web, 1998

