Learning whether a method should be logged based on code metrics.
- Bash environment to run scripts
- Java >= 8.0
- Python >= 3.6 with pip available
The subject selection step in our experiments relies on an R script with the dplyr package available. This is only required if you want to replicate our study or try different selection criteria.
We highly recommend creating a Python virtual environment in your working directory before starting:
python3 -m venv .venv && source .venv/bin/activate
"Where is the dataset?"
- Raw data is not provided for practical reasons; however, the process to generate and analyze data is fully automated.
- Source files and data related to our industry partner are confidential and unavailable.
Getting started is as easy as 1, 2, 3:
| Step | What | How |
|---|---|---|
| 1 | Install the Python dependencies | `pip3 install -r requirements.txt` |
| 2 | Build the project components | `./gradlew deploy-aux-tools` |
| 3 | Get the selected list of Apache projects (~3.5 GB) | `./gradlew fetch-projects-paper` |
Some Python scripts depend on the packages installed in Step 1 (e.g., Pandas and NumPy) to analyze intermediate data during experimentation.
Step 3 will download 29 Apache projects listed in a CSV file into an apache-downloads directory. It will also generate an apache-projects directory with scripts that export the absolute path and revision of a given project.
If you want to download the initial list of all 69 Apache projects, run ./gradlew fetch-apache-projects instead.
Keep in mind that the full list amounts to ~7 GB.
By now, you should be able to have some fun. Try processing the project Apache Commons BeanUtils:
> ./run-single.sh apache-projects/commons-beanutils.sh
What happened?
- Scripts classified Java source files
- Scripts analyzed the presence of log statements in those files
- Scripts counted log statements, recording where they were placed and how they are distributed
- Scripts created a copy of the analyzed files with log statements removed, for feature extraction
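As an illustration only (the actual log-identifier component is Java-based and its pattern is more elaborate), detecting typical logger calls with a simplified regex might look like this sketch:

```python
import re

# Simplified pattern for common Java logging calls such as log.info(...)
# or LOGGER.debug(...). This is an assumption for illustration, not the
# regex the tool actually ships with.
LOG_CALL = re.compile(
    r"\b(?:log|logger)\s*\.\s*(?:trace|debug|info|warn|error|fatal)\s*\(",
    re.IGNORECASE,
)

source = """
public void save(User u) {
    logger.debug("saving user {}", u.getId());
    repository.persist(u);
}
"""

# Keep only the lines that contain a log call
logged_lines = [line for line in source.splitlines() if LOG_CALL.search(line)]
print(logged_lines)
```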
This is the expected output:
> find out -d 2 -type d
out/analysis/commons-beanutils # Analysis of source files
out/dataset/commons-beanutils # Contains the final dataset for machine learning experiments
out/log-removal/commons-beanutils # Contains a copy of source files without log statements
out/codemetrics/commons-beanutils # Contains code metrics of the analyzed files
With the generated CSV, you can reuse our machine learning package in an interactive environment:
>>> import logpred_method
>>> X, y = logpred_method.load_dataset("out/dataset/commons-beanutils/dataset_full.csv")
>>> from sklearn.model_selection import train_test_split
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, test_size=0.2, random_state=0)
>>> out = logpred_method.run("rf", X_train, X_test, y_train, y_test, output_to="output.log", tuning_enabled=False)
For details about replicating our study, see docs/Paper Evaluation - Tuning Enabled.pdf.
Feel free to use the generated dataset with your preferred ML library. We encourage you to explore your own ML training process and compare it with our results.
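For example, a plain scikit-learn run could look like the sketch below. Since the raw data is generated locally, it uses a synthetic stand-in with hypothetical metric columns (loc, cyclomatic_complexity, num_method_calls, logged); swap in your generated dataset_full.csv and its real columns.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for out/dataset/<project>/dataset_full.csv;
# column names here are hypothetical.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "loc": rng.integers(1, 200, 500),
    "cyclomatic_complexity": rng.integers(1, 20, 500),
    "num_method_calls": rng.integers(0, 50, 500),
    "logged": rng.integers(0, 2, 500),  # target: 1 if the method has a log statement
})

X, y = df.drop(columns="logged"), df["logged"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.2, random_state=0
)

# Train a random forest and report balanced accuracy (useful when
# logged/unlogged methods are imbalanced, as in real projects).
clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)
print(balanced_accuracy_score(y_test, clf.predict(X_test)))
```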
log-identifier
: Cross-project utility that identifies log statements based on regex

log-placement-analyzer
: Analyzes the placement of log statements in a project given a list of source files

log-remover
: Removes log statements from a project

log-prediction
: ML component for experimentation

java-token-extractor
: Utility for token and method-call extraction
Feel free to post a question in the Q&A Section in the Discussions tab.
Please keep the Issues tab for bugs, code-related concerns, etc.