Explaining Software Bugs Leveraging Code Structures in Neural Machine Translation

May 03, 2024

Parvez Mahbub, Ohiduzzaman Shuvo, and M. Masudur Rahman. Explaining Software Bugs Leveraging Code Structures in Neural Machine Translation. In Proceedings of the 45th IEEE/ACM International Conference on Software Engineering (ICSE 2023), pp. 12, Melbourne, Australia, May 2023.

Masud Rahman

Transcript

Bugsplainer: Explaining Software Bugs Leveraging Code Structures in Neural Machine Translation
Parvez Mahbub, Ohiduzzaman Shuvo, and Mohammad Masudur Rahman (Dalhousie University)

Explaining Software Bugs 🪲: Why should we care?
► Developers spend ≈ 50% of their time comprehending the code during software maintenance [1]
► Numerous approaches to automatically find the location of a bug
► Identify certain parts of the code as buggy without offering any meaningful explanation
[1] T. Roehm, R. Tiarks, R. Koschke, and W. Maalej, “How do professional developers comprehend software?” ICSE 2012

We present Bugsplainer, a novel transformer-based generative model
► Can leverage code structures
► Trained using both buggy and fixed source code

How does it work?
► Training Bugsplainer: In: Bug-fix Commit → Generate diffSBT → Discriminatory Pre-train (diffSBT of Bug-free Nodes, diffSBT of Buggy Nodes) → Fine-tune (Commit Message) → Out: Fine-tuned Model
► Explanation Generation: In: Buggy Code and Line Numbers to Explain → Generate diffSBT → Fine-tuned Model → Out: Generated Explanation

diffSBT
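
The transcript names diffSBT without defining it. For orientation only, here is a minimal sketch of plain structure-based traversal (SBT), which linearizes a syntax tree into a parenthesized token sequence. It is an assumption that diffSBT builds on this idea; the diff-specific part is not shown. The function name `sbt_tokens`, the use of Python's `ast` module, and the sample statement are all hypothetical choices for illustration.

```python
import ast


def sbt_tokens(node):
    """Linearize an AST node as: ( Label <children...> ) Label.

    Illustrative SBT only; this is not the authors' diffSBT.
    """
    label = type(node).__name__
    tokens = ["(", label]
    # Recurse into child nodes in the order the parser exposes them.
    for child in ast.iter_child_nodes(node):
        tokens.extend(sbt_tokens(child))
    # Close the subtree and repeat the label so the sequence stays unambiguous.
    tokens.extend([")", label])
    return tokens


if __name__ == "__main__":
    # Hypothetical one-line snippet, loosely themed on the lyrics example.
    tree = ast.parse("lyrics = unescape(html)")
    print(" ".join(sbt_tokens(tree)))
```

Repeating the node label after the closing parenthesis lets the flat sequence be mapped back to a unique tree, which is why SBT-style inputs can carry code structure into a sequence-to-sequence model.
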
Experiments: Automatic Metrics & Human Evaluation

Experimental Design
Dataset
► 10,000 repositories
► 150,000 bug-fix commits
► 110,000 commits for training
Model
► RoBERTa tokenizer
► T5 architecture
► 60M parameters

Evaluation using Metrics

| Model | BLEU | Semantic Similarity | Exact Match |
|---|---|---|---|
| pyflakes | 0.49 | 5.68 | 0.00 |
| CommitGen | 9.94 | 35.39 | 1.04 |
| NNGen | 24.16 | 47.33 | 14.17 |
| Fine-tuned CodeT5 | 26.19 | 54.52 | 8.85 |
| Bugsplainer | 32.90 | 55.22 | 18.14 |

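
To make two of the reported metrics concrete, the sketch below computes an exact-match rate and a clipped n-gram precision (the building block of BLEU) for generated explanations against references. The normalization (lowercasing, whitespace collapsing), the 0-100 scaling, and the function names are assumptions; the slides do not say how the scores were computed, and this is not a full BLEU implementation.

```python
from collections import Counter


def _normalize(text):
    # Assumed normalization: lowercase and collapse runs of whitespace.
    return " ".join(text.lower().split())


def exact_match(predictions, references):
    """Percentage (0-100) of predictions identical to their reference."""
    hits = sum(_normalize(p) == _normalize(r)
               for p, r in zip(predictions, references))
    return 100.0 * hits / len(references)


def clipped_precision(prediction, reference, n=1):
    """Clipped n-gram precision of one prediction against one reference."""
    def ngrams(text):
        tokens = _normalize(text).split()
        return Counter(tuple(tokens[i:i + n])
                       for i in range(len(tokens) - n + 1))

    pred, ref = ngrams(prediction), ngrams(reference)
    total = sum(pred.values())
    if total == 0:
        return 0.0
    # Each predicted n-gram counts at most as often as it occurs in the reference.
    matched = sum(min(count, ref[gram]) for gram, count in pred.items())
    return matched / total


if __name__ == "__main__":
    reference = ("Fix a bug where the lyricswiki fetcher would try to "
                 "unescape an empty (None) response and crash")
    prediction = "fix crash when lyrics not found"
    print(exact_match([prediction], [reference]))
    print(clipped_precision(prediction, reference, n=1))
```

Exact match is strict: a prediction that conveys the same meaning in different words scores zero, which is one reason the evaluation also reports BLEU and semantic similarity.
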
Human Evaluation

| # Developers | # Countries | Programming Experience | Bugfix Experience |
|---|---|---|---|
| 20 | 6 | 1-10 years | 1-7 years |

What does it look like?

| Technique | Generated Explanation |
|---|---|
| Ground Truth | Fix a bug where the lyricswiki fetcher would try to unescape an empty (None) response and crash |
| CommitGen | Small bug fix for error handling |
| NNGen | fix UnicodeDecodeError with non-ASCII text |
| Fine-tuned CodeT5 | Don’t try to get lyrics if we are licensed |
| pyflakes | no error found |
| Bugsplainer | fix crash when lyrics not found |

Bugsplainer Meets ChatGPT
Ground Truth: Fix a bug where the lyricswiki fetcher would try to unescape an empty (None) response and crash

Take-Home Messages
► Software bugs not only claim precious development time but also cost billions every year
► We propose Bugsplainer, a novel technique that generates explanations for buggy code segments
► Bugsplainer outperforms the baselines.
► This work was supported by Mitacs Accelerate International Program and our industry partner, Metabob Inc.

Thank You! Questions? PARVEZMROBIN.COM