SENAI: Towards Software Engineering Native Generative Artificial Intelligence.
CoRR, March, 2025
On Inter-Dataset Code Duplication and Data Leakage in Large Language Models.
IEEE Trans. Software Eng., January, 2025
ALPINE: An adaptive language-agnostic pruning method for language models for code.
CoRR, 2024
CONCORD: Towards a DSL for Configurable Graph Code Representation.
CoRR, 2024
Enhancing Identifier Naming Through Multi-Mask Fine-Tuning of Language Models of Code.
Proceedings of the IEEE International Conference on Source Code Analysis and Manipulation, 2024
Naturalness of Attention: Revisiting Attention in Code Language Models.
Proceedings of the 2024 ACM/IEEE 44th International Conference on Software Engineering: New Ideas and Emerging Results, 2024
On Inter-dataset Code Duplication and Data Leakage in Large Language Models.
Dataset, December, 2023
Calibrating Deep Learning-based Code Smell Detection using Human Feedback.
Proceedings of the 23rd IEEE International Working Conference on Source Code Analysis and Manipulation, 2023
DACOS - A Manually Annotated Dataset of Code Smells.
Proceedings of the 20th IEEE/ACM International Conference on Mining Software Repositories, 2023