Nghi D. Q. Bui

Orcid: 0000-0003-1984-4329

According to our database, Nghi D. Q. Bui authored at least 38 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
CodeMMLU: A Multi-Task Benchmark for Assessing Code Understanding Capabilities of CodeLLMs.
CoRR, 2024

HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale.
CoRR, 2024

XMainframe: A Large Language Model for Mainframe Modernization.
CoRR, 2024

Learning to Predict Program Execution by Modeling Dynamic Dependency on Code Graphs.
CoRR, 2024

REPOEXEC: Evaluate Code Generation with a Repository-Level Executable Benchmark.
CoRR, 2024

AgileCoder: Dynamic Collaborative Agents for Software Development based on Agile Methodology.
CoRR, 2024

Envisioning the Next-Generation AI Coding Assistants: Insights & Proposals.
CoRR, 2024

RepoHyper: Better Context Retrieval Is All You Need for Repository-Level Code Completion.
CoRR, 2024

Dopamin: Transformer-based Comment Classifiers through Domain Post-Training and Multi-level Layer Aggregation.
Proceedings of the Third ACM/IEEE International Workshop on NL-based Software Engineering, 2024

DocChecker: Bootstrapping Code Large Language Model for Detecting and Resolving Code-Comment Inconsistencies.
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

2023
Neural Rankers for Code Generation via Inter-Cluster Modeling.
CoRR, 2023

Bootstrapping Code-Text Pretrained Language Model to Detect Inconsistency Between Code and Comment.
CoRR, 2023

CodeTF: One-stop Transformer Library for State-of-the-art Code LLM.
CoRR, 2023

The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation.
CoRR, 2023

Class based Influence Functions for Error Detection.
CoRR, 2023

CodeT5+: Open Code Large Language Models for Code Understanding and Generation.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Better Language Models of Code through Self-Improvement.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Class based Influence Functions for Error Detection.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2023

2022
Detect-Localize-Repair: A Unified Framework for Learning to Debug with CodeT5.
CoRR, 2022

Learning to Represent Programs with Code Hierarchies.
CoRR, 2022

Towards Using Data-Centric Approach for Better Code Representation Learning.
CoRR, 2022

Towards Using Data-Influence Methods to Detect Noisy Samples in Source Code Corpora.
Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering, 2022

Towards Robust Models of Code via Energy-Based Learning on Auxiliary Datasets.
Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering, 2022

2021
On the generalizability of Neural Program Models with respect to semantic-preserving program transformations.
Inf. Softw. Technol., 2021

Self-Supervised Contrastive Learning for Code Retrieval and Summarization via Semantic-Preserving Transformations.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

InferCode: Self-Supervised Learning of Code Representations by Predicting Subtrees.
Proceedings of the 43rd IEEE/ACM International Conference on Software Engineering, 2021

TreeCaps: Tree-Based Capsule Networks for Source Code Processing.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Efficient Framework for Learning Code Representations through Semantic-Preserving Program Transformations.
CoRR, 2020

On the Generalizability of Neural Program Analyzers with respect to Semantic-Preserving Program Transformations.
CoRR, 2020

2019
TreeCaps: Tree-Structured Capsule Networks for Program Source Code Processing.
CoRR, 2019

Bilateral Dependency Neural Networks for Cross-Language Algorithm Classification.
Proceedings of the 26th IEEE International Conference on Software Analysis, 2019

SAR: learning cross-language API mappings with little knowledge.
Proceedings of the ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2019

AutoFocus: Interpreting Attention-Based Neural Networks by Code Perturbation.
Proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering, 2019

Towards zero knowledge learning for cross language API mappings.
Proceedings of the 41st International Conference on Software Engineering: Companion Proceedings, 2019

2018
Hierarchical learning of cross-language mappings through distributed vector representations for code.
Proceedings of the 40th International Conference on Software Engineering: New Ideas and Emerging Results, 2018

Cross-Language Learning for Program Classification Using Bilateral Tree-Based Convolutional Neural Networks.
Proceedings of the Workshops of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Cross-Language Learning for Program Classification using Bilateral Tree-Based Convolutional Neural Networks.
CoRR, 2017

