- Introduction
- Import Libraries and Load Data
- Data Preprocessing
- Define Independent and Dependent Features
- Train-Test Split
- Feature Selection Based on Correlation
- Feature Scaling
- Linear Regression Model
- Lasso Regression
- Hyperparameter Tuning for Lasso Regression
- Ridge Regression
- Hyperparameter Tuning for Ridge Regression
- Elastic Net
- Hyperparameter Tuning for Elastic Net
- Save Models and Scaler
- Results
- Conclusion
- Future Work
This project predicts the Fire Weather Index (FWI) using several regression models: linear regression, Lasso, Ridge, and ElasticNet, each with and without cross-validated hyperparameter tuning. The dataset is the Algerian Forest Fires dataset, which has already been cleaned and preprocessed.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
df = pd.read_csv(r'N:\Personal_Projects\Machine-Learning\Algerianforestfire\Algerian_forest_fires_cleaned_dataset.csv')
df.head()
df.tail()
df.info()
df.dtypes
df.describe()
df.columns
df.drop(['day', 'month', 'year'], axis=1, inplace=True)  # drop date columns not used as predictors
df.head()
df['Classes'] = np.where(df['Classes'].str.contains("not fire"), 0, 1)  # encode Classes: 0 = not fire, 1 = fire
df.head()
df['Classes'].value_counts()
X = df.drop('FWI', axis=1)
y = df['FWI']
X.head()
y
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)
X_train.shape, X_test.shape
X_train.corr()
plt.figure(figsize=(12,10))
corr = X_train.corr()
sns.heatmap(corr, annot=True)
def correlation(dataset, threshold):
    # Collect the names of features whose absolute correlation with an
    # earlier column exceeds the threshold, so one of each pair can be dropped.
    col_corr = set()
    corr_matrix = dataset.corr()
    for i in range(len(corr_matrix.columns)):
        for j in range(i):
            if abs(corr_matrix.iloc[i, j]) > threshold:
                colname = corr_matrix.columns[i]
                col_corr.add(colname)
    return col_corr
corr_features = correlation(X_train, 0.85)  # features with |correlation| > 0.85 to another feature
X_train.drop(corr_features, axis=1, inplace=True)
X_test.drop(corr_features, axis=1, inplace=True)
X_train.shape, X_test.shape
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
X_train_scaled
X_test_scaled
plt.subplots(figsize=(15, 5))
plt.subplot(1, 2, 1)
sns.boxplot(data=X_train)
plt.title('X_train Before Scaling')
plt.subplot(1, 2, 2)
sns.boxplot(data=X_train_scaled)
plt.title('X_train After Scaling')
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
linearreg = LinearRegression()
linearreg.fit(X_train_scaled, y_train)
y_pred = linearreg.predict(X_test_scaled)
mae = mean_absolute_error(y_test, y_pred)
score = r2_score(y_test, y_pred)
print("Mean absolute error", mae)
print("R2 score", score)
plt.scatter(y_test, y_pred)
from sklearn.linear_model import Lasso
lasso = Lasso()
lasso.fit(X_train_scaled, y_train)
y_pred = lasso.predict(X_test_scaled)
mae = mean_absolute_error(y_test, y_pred)
score = r2_score(y_test, y_pred)
print("Mean absolute error", mae)
print("R2 Score", score)
plt.scatter(y_test, y_pred)
from sklearn.linear_model import LassoCV
lass = LassoCV(cv=5)
lass.fit(X_train_scaled, y_train)
y_pred = lass.predict(X_test_scaled)
plt.scatter(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
score = r2_score(y_test, y_pred)
print("Mean absolute error", mae)
print("R2 Score", score)
from sklearn.linear_model import Ridge
rid = Ridge()
rid.fit(X_train_scaled, y_train)
y_pred = rid.predict(X_test_scaled)
mae = mean_absolute_error(y_test, y_pred)
score = r2_score(y_test, y_pred)
print("Mean absolute error", mae)
print("R2 Score", score)
plt.scatter(y_test, y_pred)
from sklearn.linear_model import RidgeCV
ridgecv = RidgeCV(cv=5)
ridgecv.fit(X_train_scaled, y_train)
y_pred = ridgecv.predict(X_test_scaled)
plt.scatter(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
score = r2_score(y_test, y_pred)
print("Mean absolute error", mae)
print("R2 Score", score)
from sklearn.linear_model import ElasticNet
elnet = ElasticNet()
elnet.fit(X_train_scaled, y_train)
y_pred = elnet.predict(X_test_scaled)
mae = mean_absolute_error(y_test, y_pred)
score = r2_score(y_test, y_pred)
print("Mean absolute error", mae)
print("R2 Score", score)
plt.scatter(y_test, y_pred)
from sklearn.linear_model import ElasticNetCV
elasticcv = ElasticNetCV(cv=5)
elasticcv.fit(X_train_scaled, y_train)
y_pred = elasticcv.predict(X_test_scaled)
plt.scatter(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
score = r2_score(y_test, y_pred)
print("Mean absolute error", mae)
print("R2 Score", score)
import pickle
pickle.dump(scaler, open('scaler.pkl', 'wb'))
pickle.dump(rid, open('rid.pkl', 'wb'))
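To reuse the pipeline outside this notebook, the pickled scaler and Ridge model can be loaded back and applied to new rows with the same columns as X_train. A minimal sketch, using the first test row as a stand-in for a new observation:

loaded_scaler = pickle.load(open('scaler.pkl', 'rb'))
loaded_model = pickle.load(open('rid.pkl', 'rb'))
sample = X_test.iloc[[0]]  # stand-in for a new observation with the training columns
print(loaded_model.predict(loaded_scaler.transform(sample)))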
- Linear Regression: Mean Absolute Error = 0.5468236465249978, R2 Score = 0.9847657384266951
- Lasso Regression: Mean Absolute Error = 1.1331759949144085, R2 Score = 0.9492020263112388
- LassoCV (tuned Lasso): Mean Absolute Error = 0.6199701158263433, R2 Score = 0.9820946715928275
- Ridge Regression: Mean Absolute Error = 0.5642305340105693, R2 Score = 0.9842993364555513
- RidgeCV (tuned Ridge): Mean Absolute Error = 0.5642305340105693, R2 Score = 0.9842993364555513
- ElasticNet: Mean Absolute Error = 1.8822353634896, R2 Score = 0.8753460589519703
- ElasticNetCV (tuned ElasticNet): Mean Absolute Error = 0.6575946731430904, R2 Score = 0.9814217587854941
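The same comparison can be regenerated in a single pass over the estimators fitted above (a minimal sketch that reuses the variables defined in this notebook):

models = [("Linear Regression", linearreg), ("Lasso", lasso), ("LassoCV", lass),
          ("Ridge", rid), ("RidgeCV", ridgecv),
          ("ElasticNet", elnet), ("ElasticNetCV", elasticcv)]
for name, model in models:
    pred = model.predict(X_test_scaled)
    print(f"{name}: MAE = {mean_absolute_error(y_test, pred):.4f}, R2 = {r2_score(y_test, pred):.4f}")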
This project demonstrates several linear regression variants for predicting the Fire Weather Index (FWI), evaluated with Mean Absolute Error and R2 Score. Cross-validated hyperparameter tuning markedly improved the default Lasso and ElasticNet models, while plain linear regression and Ridge performed well out of the box. The Ridge model and the fitted scaler were pickled for reuse.
- Explore more advanced models such as Random Forest, Gradient Boosting, or Neural Networks.
- Perform feature engineering to create new features that might improve model performance.
- Conduct a more thorough hyperparameter search using GridSearchCV or RandomizedSearchCV (a sketch is given below).
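As an illustration of the last point, a GridSearchCV run for Ridge might look like the following. This is a minimal sketch, not part of the original notebook; the alpha grid and the scoring choice are assumptions.

from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import Ridge

param_grid = {'alpha': [0.01, 0.1, 1.0, 10.0, 100.0]}  # assumed search space
grid = GridSearchCV(Ridge(), param_grid, cv=5, scoring='neg_mean_absolute_error')
grid.fit(X_train_scaled, y_train)
print("Best params:", grid.best_params_)
print("Best CV MAE:", -grid.best_score_)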