Marcos André Gonçalves

ORCID: 0000-0002-2075-3363

  • Federal University of Minas Gerais, Belo Horizonte, Brazil

According to our database, Marcos André Gonçalves authored at least 335 papers between 1998 and 2025.

A Human-Centered Multiperspective and Interactive Visual Tool For Explainable Machine Learning.
J. Braz. Comput. Soc., 2025

Intellectual dark web, alt-lite and alt-right: Are they really that different? a multi-perspective analysis of the textual content produced by contrarians.
Soc. Netw. Anal. Min., December, 2024

A network-driven study of hyperprolific authors in computer science.
Scientometrics, April, 2024

Pipelining Semantic Expansion and Noise Filtering for Sentiment Analysis of Short Documents - CluSent Method.
J. Interact. Syst., 2024

On Representation Learning-based Methods for Effective, Efficient, and Scalable Code Retrieval.
Neurocomputing, 2024

A Strategy to Combine 1stGen Transformers and Open LLMs for Automatic Text Classification.
CoRR, 2024

A Novel Two-Step Fine-Tuning Pipeline for Cold-Start Active Learning in Text Classification Tasks.
CoRR, 2024

A Quantum Annealing-Based Instance Selection Approach for Transformer Fine-Tuning.
Proceedings of the 14th Italian Information Retrieval Workshop, 2024

A Quantum Annealing Instance Selection Approach for Efficient and Effective Transformer Fine-Tuning.
Proceedings of the 2024 ACM SIGIR International Conference on Theory of Information Retrieval, 2024

Explainable Stacking Models based on Complementary Traffic Embeddings.
Proceedings of the IEEE European Symposium on Security and Privacy Workshops, 2024

Reducing the user labeling effort in effective high recall tasks by fine-tuning active learning.
J. Intell. Inf. Syst., October, 2023

Using Active Learning for Segmentation and Semantic Classification of Legal Acts Extracted from Official Diaries.
J. Inf. Data Manag., October, 2023

Contextual Reinforcement, Entity Delimitation and Generative Data Augmentation for Entity Recognition and Relation Extraction in Official Documents.
J. Inf. Data Manag., October, 2023

The rise of hyperprolific authors in computer science: characterization and implications.
Scientometrics, May, 2023

On the class separability of contextual embeddings representations - or "The classifier does not matter when the (text) representation is so good!".
Inf. Process. Manag., 2023

A Comparative Survey of Instance Selection Methods applied to Non-Neural and Transformer-Based Text Classification.
ACM Comput. Surv., 2023

TPDR: A Novel Two-Step Transformer-based Product and Class Description Match and Retrieval Method.
CoRR, 2023

CluSent - Combining Semantic Expansion and De-Noising for Dataset-Oriented Sentiment Analysis of Short Texts.
Proceedings of the 29th Brazilian Symposium on Multimedia and the Web, 2023

Evaluating the Limits of the Current Evaluation Metrics for Topic Modeling.
Proceedings of the 29th Brazilian Symposium on Multimedia and the Web, 2023

An Effective, Efficient, and Scalable Confidence-based Instance Selection Framework for Transformer-Based Text Classification.
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

Uma Metodologia para Tratamento do Viés da Maioria em Modelos de Stacking via Identificação de Documentos Difíceis.
Proceedings of the 38th Brazilian Symposium on Databases, 2023

PromptNER: Uma Abordagem para Reconhecimento de Entidades Nomeadas em Dados Sensíveis a Partir de Instâncias Rotuladas Automaticamente.
Proceedings of the 38th Brazilian Symposium on Databases, 2023

Contrasting Explain-ML with Interpretability Machine Learning Tools in Light of Interactive Machine Learning Principles.
J. Interact. Syst., November, 2022

Semantic Academic Profiler (SAP): a framework for researcher assessment based on semantic topic modeling.
Scientometrics, 2022

How to build high quality L2R training data: Unsupervised compression-based selective sampling for learning to rank.
Inf. Sci., 2022

A reinforcement learning approach for single redundant view co-training text classification.
Inf. Sci., 2022

On the Presence of Abusive Language in Mis/Disinformation.
Proceedings of the Social Informatics - 13th International Conference, 2022

Risk-Sensitive Deep Neural Learning to Rank.
Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

DedupeGov: Uma Plataforma para Integração de Grandes Volumes de Dados de Pessoas Físicas e Jurídicas em Âmbito Governamental.
Proceedings of the 37th Brazilian Symposium on Databases, 2022

Segmentação e Classificação Semântica de Trechos de Diários Oficiais Usando Aprendizado Ativo.
Proceedings of the 37th Brazilian Symposium on Databases, 2022

Reforço e Delimitação Contextual para Reconhecimento de Entidades e Relações em Documentos Oficiais.
Proceedings of the 37th Brazilian Symposium on Databases, 2022

Characterizing and Understanding Temporal Effects in COVID-19 Data.
Proceedings of the 1st Workshop on Healthcare AI and COVID-19, 2022

Deduplicating Large Volumes of Data from Natural and Legal Entities in the Governmental Field.
Proceedings of the IEEE International Conference on Big Data, 2022

Individualized extreme dominance (IndED): A new preference-based method for multi-objective recommender systems.
Inf. Sci., 2021

On the cost-effectiveness of neural and non-neural approaches and representations for text classification: A comprehensive comparative study.
Inf. Process. Manag., 2021

A bias-variance analysis of state-of-the-art random forest text classifiers.
Adv. Data Anal. Classif., 2021

Evaluating Recognizing Question Entailment Methods for a Portuguese Community Question-Answering System about Diabetes Mellitus.
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), 2021

Analysis of the User Experience with a Multiperspective Tool for Explainable Machine Learning in Light of Interactive Principles.
Proceedings of the IHC '21: XX Brazilian Symposium on Human Factors in Computing Systems, 2021

Profiling Hate Speech Spreaders on Twitter: Exploiting Textual Analysis of Tweets and Combination of Textual Representations.
Proceedings of the Working Notes of CLEF 2021 - Conference and Labs of the Evaluation Forum, Bucharest, Romania, September 2021

Analyzing topic attention in online small groups.
Proceedings of the ASONAM '21: International Conference on Advances in Social Networks Analysis and Mining, Virtual Event, The Netherlands, November 8, 2021

On the Cost-Effectiveness of Stacking of Neural and Non-Neural Methods for Text Classification: Scenarios and Performance Prediction.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

Automatic Disambiguation of Author Names in Bibliographic Repositories
Synthesis Lectures on Information Concepts, Retrieval, and Services, Morgan & Claypool Publishers, ISBN: 978-3-031-02322-4, 2020

A pragmatic approach to hierarchical categorization of research expertise in the presence of scarce information.
Int. J. Digit. Libr., 2020

Exploiting semantic relationships for unsupervised expansion of sentiment lexicons.
Inf. Syst., 2020

Fine-grained tourism prediction: Impact of social and environmental features.
Inf. Process. Manag., 2020

Extended pre-processing pipeline for text classification: On the role of meta-feature representations, sparsification and selective sampling.
Inf. Process. Manag., 2020

"Fixing the curse of the bad product descriptions" - Search-boosted tag recommendation for E-commerce products.
Inf. Process. Manag., 2020

Automatic Content Quality Estimation Using Deep Neural Networks in Collaborative Encyclopedias on the Web.
Proceedings of the WebMedia '20: Brazilian Symposium on Multimedia and the Web, São Luís, Brazil, November 30, 2020

"Keep it Simple, Lazy" - MetaLazy: A New MetaStrategy for Lazy Text Classification.
Proceedings of the CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, 2020

CluHTM - Semantic Hierarchical Topic Modeling based on CluWords.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Risk-Sensitive Learning to Rank with Evolutionary Multi-Objective Feature Selection.
ACM Trans. Inf. Syst., 2019

Hierarchical Clustering-Based Graphs for Large Scale Approximate Nearest Neighbor Search.
Pattern Recognit., 2019

10SENT: A stable sentiment analysis method based on the combination of off-the-shelf approaches.
J. Assoc. Inf. Sci. Technol., 2019

Bag of textual graphs (BoTG): A general graph-based text representation model.
J. Assoc. Inf. Sci. Technol., 2019

Exploiting syntactic and neighbourhood attributes to address cold start in tag recommendation.
Inf. Process. Manag., 2019

Quality assessment of collaboratively-created web content with no manual intervention based on soft multi-view generation.
Expert Syst. Appl., 2019

Parallel rule-based selective sampling and on-demand learning to rank.
Concurr. Comput. Pract. Exp., 2019

CluWords: Exploiting Semantic Word Clustering Representation for Enhanced Topic Modeling.
Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, 2019

Characterizing Attention Cascades in WhatsApp Groups.
Proceedings of the 11th ACM Conference on Web Science, 2019

Similarity-Based Synthetic Document Representations for Meta-Feature Generation in Text Classification.
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019

Image Aesthetics and its Effects on Product Clicks in E-Commerce Search.
Proceedings of the SIGIR 2019 Workshop on eCommerce, 2019

Automatic Generation of Initial Reading Lists: Requirements and Solutions.
Proceedings of the 19th ACM/IEEE Joint Conference on Digital Libraries, 2019

Document Performance Prediction for Automatic Text Classification.
Proceedings of the Advances in Information Retrieval, 2019

A Thorough Evaluation of Distance-Based Meta-Features for Automated Text Classification.
IEEE Trans. Knowl. Data Eng., 2018

NetClass: A network-based relational model for document classification.
Inf. Sci., 2018

Improving random forests by neighborhood projection for effective text classification.
Inf. Syst., 2018

BLOSS: Effective meta-blocking with almost no effort.
Inf. Syst., 2018

Exploiting efficient and effective lazy Semi-Bayesian strategies for text classification.
Neurocomputing, 2018

A Genetic Programming approach for feature selection in highly dimensional skewed data.
Neurocomputing, 2018

A Feature-Oriented Sentiment Rating for Mobile App Reviews.
Proceedings of the 2018 World Wide Web Conference on World Wide Web, 2018

User-Oriented Objective Prioritization for Meta-Featured Multi-Objective Recommender Systems.
Proceedings of the Adjunct Publication of the 26th Conference on User Modeling, 2018

Improving Tourism Prediction Models Using Climate and Social Media Data: A Fine-Grained Approach.
Proceedings of the Twelfth International Conference on Web and Social Media, 2018

Semantically-Enhanced Topic Modeling.
Proceedings of the 27th ACM International Conference on Information and Knowledge Management, 2018

Incremental author name disambiguation by exploiting domain-specific heuristics.
J. Assoc. Inf. Sci. Technol., 2017

A general multiview framework for assessing the quality of collaboratively created content on web 2.0.
J. Assoc. Inf. Sci. Technol., 2017

A survey on tag recommendation methods.
J. Assoc. Inf. Sci. Technol., 2017

Ranked batch-mode active learning.
Inf. Sci., 2017

A Two-Stage Machine learning approach for temporally-robust text classification.
Inf. Syst., 2017

Diversity-based interactive learning meets multimodality.
Neurocomputing, 2017

Stacking Bagged and Boosted Forests for Effective Automated Classification.
Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017

A framework for unexpectedness evaluation in recommendation.
Proceedings of the Symposium on Applied Computing, 2017

Rank Fusion and Multimodal Per-topic Adaptiveness for Diverse Image Retrieval.
Working Notes Proceedings of the MediaEval 2017 Workshop co-located with the Conference and Labs of the Evaluation Forum (CLEF 2017), 2017

A Multicriteria Evaluation of Hybrid Recommender Systems: On the Usefulness of Input Data Characteristics.
Proceedings of the ICEIS 2017, 2017

Automatic Hierarchical Categorization of Research Expertise Using Minimum Information.
Proceedings of the Research and Advanced Technology for Digital Libraries, 2017

Beyond Relevance: Explicitly Promoting Novelty and Diversity in Tag Recommendation.
ACM Trans. Intell. Syst. Technol., 2016

A quantitative analysis of the temporal effects on automatic text classification.
J. Assoc. Inf. Sci. Technol., 2016

On cold start for associative tag recommendation.
J. Assoc. Inf. Sci. Technol., 2016

TrendLearner: Early prediction of popularity trends of user generated content.
Inf. Sci., 2016

A multimodal query expansion based on genetic programming for visually-oriented e-commerce applications.
Inf. Process. Manag., 2016

On interactive learning-to-rank for IR: Overview, recent advances, challenges, and directions.
Neurocomputing, 2016

SentiBench - a benchmark comparison of state-of-the-practice sentiment analysis methods.
EPJ Data Sci., 2016

Exploiting New Sentiment-Based Meta-level Features for Effective Sentiment Analysis.
Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, 2016

Generalized BROOF-L2R: A General Framework for Learning to Rank Based on Boosting and Random Forests.
Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, 2016

The LExR Collection for Expertise Retrieval in Academia.
Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, 2016

BERT: Melhorando Classificação de Texto com Árvores Extremamente Aleatórias, Bagging e Boosting.
Proceedings of the 31º Simpósio Brasileiro de Banco de Dados, 2016

On the combination of "off-the-shelf" sentiment analysis methods.
Proceedings of the 31st Annual ACM Symposium on Applied Computing, 2016

Early Prediction of Scholar Popularity.
Proceedings of the 16th ACM/IEEE-CS Joint Conference on Digital Libraries, 2016

Dissecting a Scholar Popularity Ranking into Different Knowledge Areas.
Proceedings of the Research and Advanced Technology for Digital Libraries, 2016

Incorporating Risk-Sensitiveness into Feature Selection for Learning to Rank.
Proceedings of the 25th ACM International Conference on Information and Knowledge Management, 2016

Compression-Based Selective Sampling for Learning to Rank.
Proceedings of the 25th ACM International Conference on Information and Knowledge Management, 2016

A Practical and Effective Sampling Selection Strategy for Large Scale Deduplication.
IEEE Trans. Knowl. Data Eng., 2015

On the combination of domain-specific heuristics for author name disambiguation: the nearest cluster method.
Int. J. Digit. Libr., 2015

Predicting the popularity of micro-reviews: A Foursquare case study.
Inf. Sci., 2015

A Benchmark Comparison of State-of-the-Practice Sentiment Analysis Methods.
CoRR, 2015

On Tag Recommendation for Expertise Profiling: A Case Study in the Scientific Domain.
Proceedings of the Eighth ACM International Conference on Web Search and Data Mining, 2015

BROOF: Exploiting Out-of-Bag Errors, Boosting and Random Forests for Effective Automated Classification.
Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2015

An Efficient and Scalable MetaFeature-based Document Classification Approach based on Massively Parallel Computing.
Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2015

G-KNN: an efficient document classification algorithm for sparse datasets on GPUs using KNN.
Proceedings of the 30th Annual ACM Symposium on Applied Computing, 2015

Recod @ MediaEval 2015: Diverse Social Images Retrieval.
Working Notes Proceedings of the MediaEval 2015 Workshop, 2015

Combining Classifiers and User Feedback for Disambiguating Author Names.
Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries, 2015

Automatic Methods for Disambiguating Author Names in Bibliographic Data Repositories.
Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries, 2015

On the Impact of Academic Factors on Scholar Popularity: A Cross-Area Study.
Proceedings of the Research and Advanced Technology for Digital Libraries, 2015

A genealogy of the work of collector: The document and its image.
Proceedings of the 2nd Digital Heritage International Congress, 2015

Parallel Lazy Semi-Naive Bayes Strategies for Effective and Efficient Document Classification.
Proceedings of the 24th ACM International Conference on Information and Knowledge Management, 2015

A Soft Computing Approach for Learning to Aggregate Rankings.
Proceedings of the 24th ACM International Conference on Information and Knowledge Management, 2015

Disambiguating Author Names using Minimum Bibliographic Information.
World Digit. Libr., 2014

On the Dynamics of Social Media Popularity: A YouTube Case Study.
ACM Trans. Internet Techn., 2014

Multimodal retrieval with relevance feedback based on genetic programming.
Multim. Tools Appl., 2014

Reducing Fragmentation in Incremental Author Name Disambiguation.
J. Inf. Data Manag., 2014

A Two-stage active learning method for learning to rank.
J. Assoc. Inf. Sci. Technol., 2014

Self-training author name disambiguation for information scarce scenarios.
J. Assoc. Inf. Sci. Technol., 2014

Personalized and object-centered tag recommendation methods for Web 2.0 applications.
Inf. Process. Manag., 2014

Improving the Effectiveness of Content Popularity Prediction Methods using Time Series Trends.
CoRR, 2014

Noticing the other gender on Google+.
Proceedings of the ACM Web Science Conference, 2014

What makes your opinion popular?: predicting the popularity of micro-reviews in foursquare.
Proceedings of the Symposium on Applied Computing, 2014

Combining domain-specific heuristics for author name disambiguation.
Proceedings of the IEEE/ACM Joint Conference on Digital Libraries, 2014

Characterizing scholar popularity: A case study in the Computer Science research community.
Proceedings of the IEEE/ACM Joint Conference on Digital Libraries, 2014

Quality assessment of collaborative content with minimal information.
Proceedings of the IEEE/ACM Joint Conference on Digital Libraries, 2014

Diversity-driven learning for multimodal image retrieval with relevance feedback.
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

How you post is who you are: characterizing google+ status updates across social groups.
Proceedings of the 25th ACM Conference on Hypertext and Social Media, 2014

Popularity dynamics of foursquare micro-reviews.
Proceedings of the second ACM conference on Online social networks, 2014

On Efficient Meta-Level Features for Effective Text Classification.
Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, 2014

Key Issues Regarding Digital Libraries: Evaluation and Integration
Synthesis Lectures on Information Concepts, Retrieval, and Services, Morgan & Claypool Publishers, ISBN: 978-3-031-02283-8, 2013

Is Learning to Rank Worth it? A Statistical Analysis of Learning to Rank Methods in the LETOR Benchmarks.
J. Inf. Data Manag., 2013

A Comparative Study of Learning-to-Rank Techniques for Tag Recommendation.
J. Inf. Data Manag., 2013

Evaluation of parameters for combining multiple textual sources of evidence for Web image retrieval using genetic programming.
J. Braz. Comput. Soc., 2013

Temporal contexts: Effective text classification in evolving document collections.
Inf. Syst., 2013

An evolutionary approach to complex schema matching.
Inf. Syst., 2013

Assessing the quality of textual features in social media.
Inf. Process. Manag., 2013

Using early view patterns to predict the popularity of youtube videos.
Proceedings of the Sixth ACM International Conference on Web Search and Data Mining, 2013

A formal approach for the specification of digital complex objects.
Proceedings of the 19th Brazilian Symposium on Multimedia and the Web, 2013

Polarity analysis of micro reviews in foursquare.
Proceedings of the 19th Brazilian Symposium on Multimedia and the Web, 2013

Measuring and addressing the impact of cold start on associative tag recommenders.
Proceedings of the 19th Brazilian Symposium on Multimedia and the Web, 2013

Tuning large scale deduplication with reduced effort.
Proceedings of the Conference on Scientific and Statistical Database Management, 2013

Polarity Detection of Foursquare Tips.
Proceedings of the Social Informatics - 5th International Conference, 2013

Exploiting user feedback to learn to rank answers in q&a forums: a case study with stack overflow.
Proceedings of the 36th International ACM SIGIR conference on research and development in Information Retrieval, 2013

UDRB: Uma Nova Heurística Eficaz para Deduplicação de Referências Bibliográficas.
Proceedings of the XXVIII Simpósio Brasileiro de Banco de Dados - Short Papers, Recife, Pernambuco, Brasil, September 30, 2013

GPU-NB: A Fast CUDA-Based Implementation of Naïve Bayes.
Proceedings of the 25th International Symposium on Computer Architecture and High Performance Computing, 2013

Topic diversity in tag recommendation.
Proceedings of the Seventh ACM Conference on Recommender Systems, 2013

Multimodal Image Geocoding: The 2013 RECOD's Approach.
Proceedings of the MediaEval 2013 Multimedia Benchmark Workshop, 2013

A relevance feedback approach for the author name disambiguation problem.
Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries, 2013

Adaptive spammer detection at the source network.
Proceedings of the 2013 IEEE Global Communications Conference, 2013

Exploiting Novelty and Diversity in Tag Recommendation.
Proceedings of the Advances in Information Retrieval, 2013

Theoretical Foundations for Digital Libraries: The 5S (Societies, Scenarios, Spaces, Structures, Streams) Approach
Synthesis Lectures on Information Concepts, Retrieval, and Services, Morgan & Claypool Publishers, ISBN: 978-3-031-02279-1, 2012

Practical Detection of Spammers and Content Promoters in Online Video Sharing Systems.
IEEE Trans. Syst. Man Cybern. Part B, 2012

A Genetic Programming Approach to Record Deduplication.
IEEE Trans. Knowl. Data Eng., 2012

A brief survey of automatic methods for author name disambiguation.
SIGMOD Rec., 2012

Time-Aware Ranking in Sport Social Networks.
J. Inf. Data Manag., 2012

Improving Author Name Disambiguation with User Relevance Feedback.
J. Inf. Data Manag., 2012

A Multi-view Approach for the Quality Assessment of Wiki Articles.
J. Inf. Data Manag., 2012

Sentiment-based influence detection on Twitter.
J. Braz. Comput. Soc., 2012

A tool for generating synthetic authorship records for evaluating author name disambiguation methods.
Inf. Sci., 2012

Cost-effective on-demand associative author name disambiguation.
Inf. Process. Manag., 2012

Improving On-Demand Learning to Rank through Parallelism.
Proceedings of the Web Information Systems Engineering - WISE 2012, 2012

Advertisement selection for online videos.
Proceedings of the Brazilian Symposium on Multimedia and the Web, 2012

Analysis of vulnerability to facebook users.
Proceedings of the Brazilian Symposium on Multimedia and the Web, 2012

Exploiting relevance, novelty and diversity in tag recommendation.
Proceedings of the Brazilian Symposium on Multimedia and the Web, 2012

Seleção de Atributos Utilizando Algoritmos Genéticos para Detecção do Vandalismo na Wikipedia.
Proceedings of the XXVII Simpósio Brasileiro de Banco de Dados, 2012

Is Learning to Rank Worth it? A Statistical Analysis of Learning to Rank Methods.
Proceedings of the XXVII Simpósio Brasileiro de Banco de Dados, 2012

Ranqueamento Supervisionado de Autores em Redes de Colaboração Científica.
Proceedings of the XXVII Simpósio Brasileiro de Banco de Dados, 2012

UNICAMP-UFMG at MediaEval 2012: Genre Tagging Task.
Working Notes Proceedings of the MediaEval 2012 Workshop, 2012

Active associative sampling for author name disambiguation.
Proceedings of the 12th ACM/IEEE-CS Joint Conference on Digital Libraries, 2012

A gender based study of tagging behavior in twitter.
Proceedings of the 23rd ACM Conference on Hypertext and Social Media, 2012

Automatic Vandalism Detection in Wikipedia with Active Associative Classification.
Proceedings of the Theory and Practice of Digital Libraries, 2012

On MultiView-Based Meta-learning for Automatic Quality Assessment of Wiki Articles.
Proceedings of the Theory and Practice of Digital Libraries, 2012

Automatic query expansion based on tag recommendation.
Proceedings of the 21st ACM International Conference on Information and Knowledge Management, 2012

Aggressive and effective feature selection using genetic programming.
Proceedings of the IEEE Congress on Evolutionary Computation, 2012

Relevance feedback based on genetic programming for image retrieval.
Pattern Recognit. Lett., 2011

Information Retrieval Research at UFMG.
J. Inf. Data Manag., 2011

Competence-Conscious Associative Rank Aggregation.
J. Inf. Data Manag., 2011

Tackling Temporal Effects in Automatic Document Classification.
J. Inf. Data Manag., 2011

Towards a Formal Theory for Complex Objects and Content-Based Image Retrieval.
J. Inf. Data Manag., 2011

Evaluating Retrieval Effectiveness of Descriptors for Searching in Large Image Databases.
J. Inf. Data Manag., 2011

Incremental Unsupervised Name Disambiguation in Cleaned Digital Libraries.
J. Inf. Data Manag., 2011

Automatic Assessment of Document Quality in Web Collaborative Digital Libraries.
ACM J. Data Inf. Qual., 2011

A generic Web-based entity resolution framework.
J. Assoc. Inf. Sci. Technol., 2011

Calibrated lazy associative classification.
Inf. Sci., 2011

A relevance feedback method based on genetic programming for classification of remote sensing images.
Inf. Sci., 2011

Word co-occurrence features for text classification.
Inf. Syst., 2011

An unsupervised heuristic-based approach for bibliographic metadata deduplication.
Inf. Process. Manag., 2011

GreenMeter: a tool for assessing the quality and recommending tags for web 2.0 applications.
Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2011

Associative tag recommendation exploiting multiple textual features.
Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2011

Uma Abordagem Multi-Visão para a Estimativa de Qualidade de Artigos de Wikis.
Proceedings of the XXVI Simpósio Brasileiro de Banco de Dados, 2011

Rule-Based Active Sampling for Learning to Rank.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2011

A source independent framework for research paper recommendation.
Proceedings of the 2011 Joint International Conference on Digital Libraries, 2011

GreenWiki: a tool to support users' assessment of the quality of Wikipedia articles.
Proceedings of the 2011 Joint International Conference on Digital Libraries, 2011

Assessing documents' credibility with genetic programming.
Proceedings of the IEEE Congress on Evolutionary Computation, 2011

Assessing the quality of scientific conferences based on bibliographic citations.
Scientometrics, 2010

Learning to Rank using Query-Level Rules.
J. Inf. Data Manag., 2010

Automatic Document Classification Temporally Robust.
J. Inf. Data Manag., 2010

Estimating the Credibility of Examples in Automatic Document Classification.
J. Inf. Data Manag., 2010

A Multi-view Approach for Detecting Non-Cooperative Users in Online Video Sharing Systems.
J. Inf. Data Manag., 2010

Automatic Selection of Training Examples for a Record Deduplication Method Based on Genetic Programming.
J. Inf. Data Manag., 2010

Characterization and Analysis of User Profiles in Online Video Sharing Systems.
J. Inf. Data Manag., 2010

WCL2R: A Benchmark Collection for Learning to Rank Research with Clickthrough Data.
J. Inf. Data Manag., 2010

Equal but different: a contextual analysis of duplicated videos on YouTube.
J. Braz. Comput. Soc., 2010

Using structural information to improve search in Web collections.
J. Assoc. Inf. Sci. Technol., 2010

An unsupervised heuristic-based hierarchical method for name disambiguation in bibliographic citations.
J. Assoc. Inf. Sci. Technol., 2010

PaMS: A component-based service for finding the missing full text of articles cataloged in a digital library.
Inf. Syst., 2010

Classifying documents with link-based bibliometric measures.
Inf. Retr., 2010

On Popularity in the Blogosphere.
IEEE Internet Comput., 2010

On the Quality of Information for Web 2.0 Services.
IEEE Internet Comput., 2010

Video Pollution on the Web.
First Monday, 2010

Assessing the Value of Contributions in Tagging Systems.
Proceedings of the 2010 IEEE Second International Conference on Social Computing, 2010

ONDUX: on-demand unsupervised learning for information extraction.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2010

Temporally-aware algorithms for document classification.
Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2010

Demand-Driven Tag Recommendation.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2010

Learning to rank for content-based image retrieval.
Proceedings of the 11th ACM SIGMM International Conference on Multimedia Information Retrieval, 2010

Effective self-training author name disambiguation in scholarly digital libraries.
Proceedings of the 2010 Joint International Conference on Digital Libraries, 2010

Geographical classification of documents using evidence from Wikipedia.
Proceedings of the 6th Workshop on Geographic Information Retrieval, 2010

Exploiting co-occurrence and information quality metrics to recommend tags in web 2.0 applications.
Proceedings of the 19th ACM Conference on Information and Knowledge Management, 2010

Tuning Genetic Programming parameters with factorial designs.
Proceedings of the IEEE Congress on Evolutionary Computation, 2010

Active Learning Genetic programming for record deduplication.
Proceedings of the IEEE Congress on Evolutionary Computation, 2010

A Genre-Aware Approach to Focused Crawling.
World Wide Web, 2009

Competence-conscious associative classification.
Stat. Anal. Data Min., 2009

A genetic programming framework for content-based image retrieval.
Pattern Recognit., 2009

Automatic evaluation of digital libraries with 5SQual.
J. Informetrics, 2009

A flexible approach for extracting metadata from bibliographic citations.
J. Assoc. Inf. Sci. Technol., 2009

An evolutionary approach for combining different sources of evidence in search engines.
Inf. Syst., 2009

Finding what is missing from a digital library: A case study in the Computer Science field.
Inf. Process. Manag., 2009

A contextual analysis of the YouTube duplicate content.
Proceedings of the XV Brazilian Symposium on Multimedia and the Web, 2009

Characterizing use and quality of textual attributes in Web 2.0 applications.
Proceedings of the XV Brazilian Symposium on Multimedia and the Web, 2009

Evaluation of users access and navigation profiles on web video sharing environments.
Proceedings of the XV Brazilian Symposium on Multimedia and the Web, 2009

On-Demand Associative Cross-Language Information Retrieval.
Proceedings of the String Processing and Information Retrieval, 2009

Detecting spammers and content promoters in online video social networks.
Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2009

The Metric Dilemma: Competence-Conscious Associative Classification.
Proceedings of the SIAM International Conference on Data Mining, 2009

Recuperação de Imagens da Web Utilizando Múltiplas Evidências Textuais e Programação Genética.
Proceedings of the XXIV Simpósio Brasileiro de Banco de Dados, 2009

Classificação Automática de Documentos Robusta Temporalmente.
Proceedings of the XXIV Simpósio Brasileiro de Banco de Dados, 2009

Seleção Automática de Exemplos de Treino para um Método de Deduplicação de Registros baseado em Programação Genética.
Proceedings of the XXIV Simpósio Brasileiro de Banco de Dados, 2009

Exploiting contexts to deal with uncertainty in classification.
Proceedings of the 1st ACM SIGKDD Workshop on Knowledge Discovery from Uncertain Data, 2009

Using web information for author name disambiguation.
Proceedings of the 2009 Joint International Conference on Digital Libraries, 2009

Learning to assess the quality of scientific conferences: a case study in computer science.
Proceedings of the 2009 Joint International Conference on Digital Libraries, 2009

Automatic quality assessment of content created collaboratively by web communities: a case study of Wikipedia.
Proceedings of the 2009 Joint International Conference on Digital Libraries, 2009

SyGAR - A Synthetic Data Generator for Evaluating Name Disambiguation Methods.
Proceedings of the Research and Advanced Technology for Digital Libraries, 2009

Evidence of quality of textual features on the web 2.0.
Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009

A Web services-based framework for building componentized digital libraries.
J. Syst. Softw., 2008

Towards a digital library theory: a formal digital library ontology.
Int. J. Digit. Libr., 2008

A digital library environment for integrating, disseminating and exploring ecological data.
Ecol. Informatics, 2008

Understanding temporal aspects in document classification.
Proceedings of the International Conference on Web Search and Web Data Mining, 2008

Detectando usuários maliciosos em interações via vídeos no YouTube.
Proceedings of the 14th Brazilian Symposium on Multimedia and Web Systems, 2008

Learning to rank at query-time using association rules.
Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2008

From concepts to implementation and visualization: tools from a team-based approach to IR.
Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2008

Image Retrieval with Relevance Feedback based on Genetic Programming.
Proceedings of the XXIII Simpósio Brasileiro de Banco de Dados, 2008

The Impact of Parameters Setup on a Genetic Programming Approach to Record Deduplication.
Proceedings of the XXIII Simpósio Brasileiro de Banco de Dados, 2008

Uma Abordagem Efetiva e Eficiente para Deduplicação de Metadados Bibliográficos de Objetos Digitais.
Proceedings of the XXIII Simpósio Brasileiro de Banco de Dados, 2008

Replica identification using genetic programming.
Proceedings of the 2008 ACM Symposium on Applied Computing (SAC), 2008

Analyzing the impact of churn and malicious behavior on the quality of peer-to-peer web search.
Proceedings of the 2008 ACM Symposium on Applied Computing (SAC), 2008

The impact of term selection in genre-aware focused crawling.
Proceedings of the 2008 ACM Symposium on Applied Computing (SAC), 2008

Keeping a digital library clean: new solutions to old problems.
Proceedings of the 2008 ACM Symposium on Document Engineering, 2008

Exploiting temporal contexts in text classification.
Proceedings of the 17th ACM Conference on Information and Knowledge Management, 2008

Transitioning from the ecological fieldwork to an online repository: a digital library solution and evaluation.
Int. J. Digit. Libr., 2007

Evaluating a digital library self-archiving service: The BDBComp user case study.
Inf. Process. Manag., 2007

"What is a good digital library?" - A quality model for digital libraries.
Inf. Process. Manag., 2007

Exploiting Genre in Focused Crawling.
Proceedings of the String Processing and Information Retrieval, 2007

A combined component approach for finding collection-adapted ranking functions based on genetic programming.
Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2007

A Component-Based Digital Library Service for Finding Missing Documents.
Proceedings of the XXII Simpósio Brasileiro de Banco de Dados, 2007

A Heuristic-based Hierarchical Clustering Method for Author Name Disambiguation in Digital Libraries.
Proceedings of the XXII Simpósio Brasileiro de Banco de Dados, 2007

Multi-label Lazy Associative Classification.
Proceedings of the Knowledge Discovery in Databases: PKDD 2007, 2007

FLUX-CIM: flexible unsupervised extraction of citation metadata.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2007

5SQual: a quality assessment tool for digital libraries.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2007

PIM through a 5S perspective.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2007

Evaluating Digital Libraries with 5SQual.
Proceedings of the Research and Advanced Technology for Digital Libraries, 2007

Personal digital library: PIM through a 5S perspective.
Proceedings of the First Ph.D. Workshop in CIKM, 2007

Computing block importance for searching on web sites.
Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, 2007

A Genetic Programming Approach for Combining Structural and Citation-Based Evidence for Text Classification in Web Digital Libraries.
Proceedings of the Soft Computing in Web Information Retrieval - Models and Applications, 2006

A digital library framework for biodiversity information systems.
Int. J. Digit. Libr., 2006

Link-based similarity measures for the classification of Web documents.
J. Assoc. Inf. Sci. Technol., 2006

Learning to advertise.
Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2006

On RDBMS and Workflow Support for Componentized Digital Libraries.
Proceedings of the XXI Simpósio Brasileiro de Banco de Dados, 2006

A comparative study of citations and links in document classification.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2006

Learning to deduplicate.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2006

A Content-Based Image Retrieval Service for Archaeology Collections.
Proceedings of the Research and Advanced Technology for Digital Libraries, 2006

Design, Implementation, and Evaluation of a Wizard Tool for Setting Up Component-Based Digital Libraries.
Proceedings of the Research and Advanced Technology for Digital Libraries, 2006

Multi-evidence, multi-criteria, lazy associative document classification.
Proceedings of the 2006 ACM CIKM International Conference on Information and Knowledge Management, 2006

Schema Mapper: A Visualization Tool for Digital Library Integration.
Bull. IEEE Tech. Comm. Digit. Libr., 2005

Intelligent fusion of structural and citation-based evidence for text classification.
Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2005

Remoção de Ambiguidades na Identificação de Autoria de Objetos Bibliográficos.
Proceedings of the 20º Simpósio Brasileiro de Bancos de Dados, 2005

Bibliotecas Digitais/Digital Libraries.
Proceedings of the 20º Simpósio Brasileiro de Bancos de Dados, 2005

Uma Biblioteca Digital Georreferenciada para Dados Ecológicos.
Proceedings of the 20º Simpósio Brasileiro de Bancos de Dados, 2005

A usability evaluation study of a digital library self-archiving service.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2005

Schema mapper: a visualization tool for DL integration.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2005

Introduction to (teaching/learning about) digital libraries.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2005

Requirements Gathering and Modeling of Domain-Specific Digital Libraries with the 5S Framework: An Archaeological Case Study with ETANA.
Proceedings of the Research and Advanced Technology for Digital Libraries, 2005

Incremental, Semi-automatic, Mapping-Based Integration of Heterogeneous Collections into Archaeological Digital Libraries: Megiddo Case Study.
Proceedings of the Research and Advanced Technology for Digital Libraries, 2005

Intelligent GP fusion from multiple sources for text classification.
Proceedings of the 2005 ACM CIKM International Conference on Information and Knowledge Management, Bremen, Germany, October 31, 2005

A new framework to combine descriptors for content-based image retrieval.
Proceedings of the 2005 ACM CIKM International Conference on Information and Knowledge Management, Bremen, Germany, October 31, 2005

The Role of Digital Libraries in Moving Toward Knowledge Environments.
Proceedings of the From Integrated Publication and Information Systems to Virtual Information and Knowledge Environments, 2005

Streams, Structures, Spaces, Scenarios, and Societies (5S): A Formal Digital Library Framework and Its Applications.
PhD thesis, 2004

Streams, structures, spaces, scenarios, societies (5S): A formal model for digital libraries.
ACM Trans. Inf. Syst., 2004

Recommender Systems Research: A Connection-Centric Survey.
J. Intell. Inf. Syst., 2004

Modelagem de Bibliotecas Digitais usando a Abordagem 5S: Um Estudo de Caso.
Proceedings of the XIX Simpósio Brasileiro de Bancos de Dados, 2004

An OAI compliant content-based image search component.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2004

Using digital library components for biodiversity systems.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2004

ETANA-DL: managing complex information applications - an archaeology digital library.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2004

ETANA-DL: a digital library for integrated handling of heterogeneous archaeological data.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2004

BDBComp: building a digital library for the Brazilian computer science community.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2004

The effectiveness of automatically structured queries in digital libraries.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2004

Prototyping Digital Libraries Handling Heterogeneous Data Sources - The ETANA-DL Case Study.
Proceedings of the Research and Advanced Technology for Digital Libraries, 2004

Combining structural and citation-based evidence for text classification.
Proceedings of the 2004 ACM CIKM International Conference on Information and Knowledge Management, 2004

Harvesting: Broadening the Field of Distributed Information Retrieval.
Proceedings of the Distributed Multimedia Information Retrieval, 2003

5SGraph Demo: A Graphical Modeling Tool for Digital Libraries.
Proceedings of the ACM/IEEE 2003 Joint Conference on Digital Libraries (JCDL 2003), 2003

The XML Log Standard for Digital Libraries: Analysis, Evolution, and Deployment.
Proceedings of the ACM/IEEE 2003 Joint Conference on Digital Libraries (JCDL 2003), 2003

The Web-DL Environment for Building Digital Libraries from the Web.
Proceedings of the ACM/IEEE 2003 Joint Conference on Digital Libraries (JCDL 2003), 2003

An OAI-Based Filtering Service for CITIDEL from NDLTD.
Proceedings of the Digital Libraries: Technology and Management of Indigenous Knowledge for Global Access, 2003

Visual Semantic Modeling of Digital Libraries.
Proceedings of the Research and Advanced Technology for Digital Libraries, 2003

Scenario-Based Generation of Digital Library Services.
Proceedings of the Research and Advanced Technology for Digital Libraries, 2003

Combining link-based and content-based methods for web document classification.
Proceedings of the 2003 ACM CIKM International Conference on Information and Knowledge Management, 2003

The networked digital library of theses and dissertations: Changes in the university community.
J. Comput. High. Educ., 2002

A Connection-Centric Survey of Recommender Systems Research.
CoRR, 2002

Java MARIAN: From an OPAC to a Modern Digital Library System.
Proceedings of the String Processing and Information Retrieval, 2002

5SL: a language for declarative specification and generation of digital libraries.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2002

An XML Log Standard and Tool for Digital Library Logging Analysis.
Proceedings of the Research and Advanced Technology for Digital Libraries, 2002

Web-DL: an experience in building digital libraries from the web.
Proceedings of the 2002 ACM CIKM International Conference on Information and Knowledge Management, 2002

Networked Digital Library of Theses and Dissertations: Bridging the Gaps for Global Access - Part 2: Services and Research.
D Lib Mag., 2001

Networked Digital Library of Theses and Dissertations: Bridging the Gaps for Global Access - Part 1: Mission and Progress.
D Lib Mag., 2001

Building Interoperable Digital Library Services: MARIAN, Open Archives and NDLTD.
Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2001

MARIAN: Flexible Interoperability for Federated Digital Libraries.
Proceedings of the Research and Advanced Technology for Digital Libraries, 2001

Modeling and Building Personalized Digital Libraries with PIPE and 5SL.
Proceedings of the Second DELOS Network of Excellence Workshop on Personalisation and Recommender Systems in Digital Libraries, 2001

NDLTD: una biblioteca digital global de tesis doctorales y de licenciatura.
Proceedings of the I Jornadas de Bibliotecas Digitales, 2000

MARIAN Searching and Querying across Heterogeneous Federated Digital Libraries.
Proceedings of the First DELOS Network of Excellence Workshop on Information Seeking, 2000

Constructing Geographic Digital Libraries using a Hypermedia Framework.
Multim. Tools Appl., 1999

A Framework for Designing and Implementing the User Interface of a Geographic Digital Library.
Int. J. Digit. Libr., 1999

Initiatives That Center on Scientific Dissemination.
Commun. ACM, 1998
