
In this project, we seek to examine the reported symptoms of exceptions and assess their impact on community support in resolving failures. This study uses seven popular ML libraries on Stack Overflow.


mooselab/MLExceptionSymptoms


On the Ambiguity of Machine Learning Exception Symptoms: Insights from Stack Overflow Posts

Software stack traces are a primary source of information for developers who aim to correct erroneous software behavior. Although Machine Learning (ML) applications also produce stack traces, these have been the subject of few studies. In this paper, we examine the reported symptoms of exceptions and assess their impact on community support in resolving failures. Our study covers seven popular ML libraries on Stack Overflow (SO). Our focus extends beyond call stacks to include two additional aspects of exceptions: the exception type and the error message. The study reveals the distribution of ML exception symptoms on SO: approximately half of the questions that include stack traces involve all three symptoms, while the others feature specific combinations. Our exploration of symptom ambiguity indicates that individual error message templates or exception types do not definitively identify ML exception-related questions. Notably, questions featuring only call stack symptoms are less likely to receive prompt community support, whereas including exception types and error messages alongside call stacks increases the likelihood and speed of obtaining an accepted answer.
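To make the three symptoms concrete, the sketch below splits a Python traceback pasted into a post into its call stack, exception type, and error message. It is a minimal illustration, not the extraction pipeline used in the notebook: the names `Symptoms` and `extract_symptoms` are hypothetical, and treating only names ending in `Error` or `Exception` as exception types is a simplifying assumption that misses exceptions such as `KeyboardInterrupt`.

```python
import re
from typing import List, NamedTuple, Optional


class Symptoms(NamedTuple):
    """Hypothetical container for the three exception symptoms."""
    call_stack: List[str]
    exception_type: Optional[str]
    error_message: Optional[str]


# A standard CPython frame line: File "path", line N, in func
FRAME_RE = re.compile(r'^\s*File "(.+?)", line (\d+)(?:, in (.+))?$')
# The closing line of a traceback: ExceptionType or ExceptionType: message
FINAL_RE = re.compile(r'^(?P<type>[A-Za-z_][\w.]*)(?::\s*(?P<msg>.*))?$')


def extract_symptoms(text: str) -> Symptoms:
    """Heuristically pull call stack, exception type, and message from text."""
    frames: List[str] = []
    exc_type: Optional[str] = None
    message: Optional[str] = None
    for line in text.splitlines():
        if FRAME_RE.match(line):
            frames.append(line.strip())
            continue
        # Only unindented lines can be the final "Type: message" line.
        if not line or line[0].isspace():
            continue
        m = FINAL_RE.match(line)
        # Assumption: exception type names end in Error or Exception.
        if m and m.group("type").endswith(("Error", "Exception")):
            exc_type = m.group("type")
            message = m.group("msg") or None
    return Symptoms(frames, exc_type, message)


example = '''Traceback (most recent call last):
  File "train.py", line 12, in <module>
    model.fit(X, y)
  File "/site-packages/sklearn/base.py", line 88, in fit
    check(X)
ValueError: Found input variables with inconsistent numbers of samples: [10, 8]
'''
symptoms = extract_symptoms(example)
```

A post that contains only the final line (for example `KeyError: 'label'`) yields an exception type and an error message but an empty call stack, which corresponds to one of the symptom combinations discussed above.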

Overview

An overview of the study design

image

Project File Structure

├── MLExceptionSymptoms-Main.ipynb   <~~~~ Main notebook
├── Drain3-master.zip                <~~~~ All files needed to run Drain3
├── Input_data/                      <~~~~ Add from the dependency files link (see Project Dump)
├── Pickle_data/                     <~~~~ Add from the dependency files link (see Project Dump)
├── Result/                          <~~~~ All generated figures
├── readme_figures/                  <~~~~ JPG figures used in README.md
├── conda_requirements.txt           <~~~~ Requirements file generated by Conda
├── requirements.txt                 <~~~~ Regular requirements file
└── README.md

Project Dump

The code and all the necessary data are available on Zenodo.

Installation

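Either command below installs all the necessary packages. The first installs them into the currently active conda environment; the second creates a new environment named myenv.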

  conda install --file conda_requirements.txt

or

  conda create --name myenv --file conda_requirements.txt

Appendix: Results

A truncated example of a SO post (ID=60798712)

image

The trend of questions with at least one EM and without any CS

image

Covered Python ML library tags

image

The cumulative frequency distribution of ML-related questions vs the percentage of Exception types

image

The cumulative frequency distribution of ML-related questions vs the percentage of Error Message Templates

image

Top 30 most frequent Error Message Templates

image

Top 20 most frequent Exception Types

image

The level of community support for questions with different combinations of exception symptoms

image

The status of community support among questions with and without similarity between the Error Message and the title of exception questions. Numbers on the box plots illustrate their medians.

image

Distribution of Error types

image

Acknowledgements

Citing & Contacts

Please cite our work if you find it helpful to your research.

Paper information

@article{article,
year = {2024},
month = {},
pages = {},
title = {},
volume = {},
journal = {},
doi = {}
}

License

MIT License
