Explaining Software Bugs Leveraging Code Structures in Neural Machine Translation

Masud Rahman
May 03, 2024

Parvez Mahbub, Ohiduzzaman Shuvo, and M. Masudur Rahman. Explaining Software Bugs Leveraging Code Structures in Neural Machine Translation. In Proceedings of the 45th IEEE/ACM International Conference on Software Engineering (ICSE 2023), pp. 12, Melbourne, Australia, May 2023

Transcript

  1. Bugsplainer: Explaining Software Bugs Leveraging Code Structures in Neural Machine Translation

    Parvez Mahbub, Ohiduzzaman Shuvo, and Mohammad Masudur Rahman (Dalhousie University)
  2. Explaining Software Bugs 🪲: Why should we care?

    ► Developers spend ≈ 50% of their time comprehending the code during software maintenance [1]
    ► Numerous approaches exist to automatically find the location of a bug
    ► They identify certain parts of the code as buggy without offering any meaningful explanation

    [1] T. Roehm, R. Tiarks, R. Koschke, and W. Maalej, “How do professional developers comprehend software?” ICSE 2012
  3. We present Bugsplainer, a novel transformer-based generative model

    ► Can leverage code structures
    ► Trained using both buggy and fixed source code
  4. How does it work?

    Training Bugsplainer: In: Bug-fix Commit → Generate diffSBT → Discriminatory Pre-train (diffSBT of Bug-free Nodes, diffSBT of Buggy Nodes) → Fine-tune (diffSBT, Commit Message) → Out: Fine-tuned Model

    Explanation Generation: In: Buggy Code, In: Line Numbers to Explain → Generate diffSBT → Fine-tuned Model → Out: Generated Explanation
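
    The slides do not show how a diffSBT sequence is built. As a rough illustration of the underlying idea only, here is a minimal sketch of a plain structure-based traversal (SBT) over a Python syntax tree. It is a simplified assumption, not the authors' diffSBT: it emits node types only, ignores identifier values, and has no notion of buggy versus fixed lines. The function names `sbt` and `sbt_of_source` are hypothetical.

```python
from __future__ import annotations

import ast


def sbt(node: ast.AST) -> list[str]:
    """Flatten an AST into a bracketed token sequence.

    Each node is emitted as "(" + type, then its children, then ")" + type,
    so the bracket pairs keep the tree structure recoverable from the flat
    sequence. Only node types are emitted (a simplification).
    """
    label = type(node).__name__
    tokens = ["(", label]
    for child in ast.iter_child_nodes(node):
        tokens.extend(sbt(child))
    tokens.extend([")", label])
    return tokens


def sbt_of_source(source: str) -> str:
    """Parse Python source and return its SBT sequence as one string."""
    return " ".join(sbt(ast.parse(source)))


if __name__ == "__main__":
    print(sbt_of_source("lyrics = fetch(url)"))
```

    A sequence like this can be fed to a sequence-to-sequence model as ordinary tokens, which is presumably why a flat, bracketed form is convenient for a translation-style setup.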
  5. Experimental Design

    Dataset
    ► 10,000 repositories
    ► 150,000 bug-fix commits
    ► 110,000 commits for training

    Model
    ► RoBERTa tokenizer
    ► T5 architecture
    ► 60M parameters
  6. Evaluation using Metrics

    Model              BLEU   Semantic Similarity  Exact Match
    pyflakes            0.49   5.68                 0.00
    CommitGen           9.94  35.39                 1.04
    NNGen              24.16  47.33                14.17
    Fine-tuned CodeT5  26.19  54.52                 8.85
    Bugsplainer        32.90  55.22                18.14
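
    To make two of these metrics concrete, the sketch below computes an exact-match rate and a clipped unigram precision, which is the 1-gram building block of BLEU rather than full BLEU. The slides do not say how the scores were computed, so the lowercase and whitespace normalization, the 0 to 100 scale for exact match, and the function names are all assumptions made for illustration.

```python
from collections import Counter


def _normalize(text):
    """Lowercase and collapse whitespace (an assumed normalization)."""
    return " ".join(text.lower().split())


def exact_match_rate(predictions, references):
    """Percentage of predictions identical to their reference after normalization."""
    if not references:
        return 0.0
    hits = sum(_normalize(p) == _normalize(r)
               for p, r in zip(predictions, references))
    return 100.0 * hits / len(references)


def clipped_unigram_precision(prediction, reference):
    """Fraction of predicted tokens found in the reference.

    Counts are clipped so a token repeated in the prediction is credited
    at most as many times as it occurs in the reference.
    """
    pred = Counter(_normalize(prediction).split())
    ref = Counter(_normalize(reference).split())
    total = sum(pred.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref[token]) for token, count in pred.items())
    return overlap / total


if __name__ == "__main__":
    truth = ("Fix a bug where the lyricswiki fetcher would try to "
             "unescape an empty (None) response and crash")
    generated = "fix crash when lyrics not found"
    print(exact_match_rate([generated], [truth]))
    print(clipped_unigram_precision(generated, truth))
```

    Word overlap alone undervalues a short explanation that paraphrases the reference, which is one reason to report a semantic similarity score alongside BLEU and exact match.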
  7. What does it look like?

    Technique          Generated Explanation
    Ground Truth       Fix a bug where the lyricswiki fetcher would try to unescape an empty (None) response and crash
    CommitGen          Small bug fix for error handling
    NNGen              fix UnicodeDecodeError with non-ASCII text
    Fine-tuned CodeT5  Don’t try to get lyrics if we are licensed
    pyflakes           no error found
    Bugsplainer        fix crash when lyrics not found
  8. Bugsplainer Meets ChatGPT

    Ground Truth: Fix a bug where the lyricswiki fetcher would try to unescape an empty (None) response and crash
  9. Take-Home Messages

    ► Software bugs not only claim precious development time but also cost billions every year
    ► We propose Bugsplainer, a novel technique that generates explanations for buggy code segments
    ► Bugsplainer outperforms the baselines
    ► This work was supported by Mitacs Accelerate International Program and our industry partner, Metabob Inc.