ElmiraOn/ML-Penguin-Dataset

Differentiate Penguin Species

In this project we perform exploratory data analysis and create machine learning models using the following algorithms:

  • Logistic regression
  • Support vector machine (SVM)
  • Multilayer perceptron (MLP)
  • Random forest (RF)
  • Gradient Boosting

to predict the species of a given penguin.
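As a rough illustration of how the five algorithms listed above could be set up side by side, here is a minimal scikit-learn sketch. It is not taken from the project notebooks: the `build_models` helper, the hyperparameters, and the 75/25 split are assumptions, and synthetic three-class data from `make_classification` stands in for the penguin measurements so the snippet runs on its own.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def build_models(seed=0):
    """Return one untrained classifier per algorithm (hypothetical helper)."""
    return {
        # Scale features first for the models that are sensitive to feature ranges.
        "Logistic regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "SVM": make_pipeline(StandardScaler(), SVC()),
        "MLP": make_pipeline(
            StandardScaler(), MLPClassifier(max_iter=1000, random_state=seed)
        ),
        # Tree ensembles do not need feature scaling.
        "RF": RandomForestClassifier(random_state=seed),
        "Gradient Boosting": GradientBoostingClassifier(random_state=seed),
    }


# Synthetic stand-in data (NOT the penguin dataset): 4 numeric features, 3 classes.
X, y = make_classification(
    n_samples=300,
    n_features=4,
    n_informative=3,
    n_redundant=0,
    n_classes=3,
    n_clusters_per_class=1,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Fit every model on the training split and record its test accuracy.
scores = {
    name: model.fit(X_train, y_train).score(X_test, y_test)
    for name, model in build_models().items()
}
for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.3f}")
```

With the real data, `X` and `y` would instead come from the penguin attributes and the species column; the accuracies printed here say nothing about how the models perform on penguins.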

Dataset

The Penguin dataset contains the following attributes:

  • species: penguin species (Chinstrap, Adélie, or Gentoo)
  • culmen_length_mm: culmen length (mm)
  • culmen_depth_mm: culmen depth (mm)
  • flipper_length_mm: flipper length (mm)
  • body_mass_g: body mass (g)
  • island: island name (Dream, Torgersen, or Biscoe) in the Palmer Archipelago (Antarctica)
  • sex: penguin sex

The target attribute: species
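One possible way to turn these attributes into model inputs is sketched below with pandas. The `split_features_target` helper and the decision to drop incomplete rows and one-hot encode `island` and `sex` are assumptions for illustration, not necessarily what the notebooks do. The four rows are made-up placeholders that only mirror the column names, not real measurements.

```python
import pandas as pd

TARGET = "species"
CATEGORICAL = ["island", "sex"]


def split_features_target(df: pd.DataFrame):
    """Drop incomplete rows, one-hot encode categoricals, split off the target.

    Hypothetical helper; the preprocessing choices here are assumptions.
    """
    clean = df.dropna()
    y = clean[TARGET]
    X = pd.get_dummies(clean.drop(columns=[TARGET]), columns=CATEGORICAL)
    return X, y


# Made-up placeholder rows (not real data); the last row has a missing value.
sample = pd.DataFrame(
    {
        "species": ["Adelie", "Gentoo", "Chinstrap", "Adelie"],
        "culmen_length_mm": [39.0, 47.5, 49.0, None],
        "culmen_depth_mm": [18.5, 15.0, 18.0, 17.0],
        "flipper_length_mm": [181.0, 217.0, 196.0, 190.0],
        "body_mass_g": [3750.0, 5000.0, 3700.0, 3600.0],
        "island": ["Torgersen", "Biscoe", "Dream", "Torgersen"],
        "sex": ["MALE", "FEMALE", "MALE", "FEMALE"],
    }
)

X, y = split_features_target(sample)
print(X.columns.tolist())
print(y.tolist())
```

In practice the DataFrame would be read from the dataset file rather than built inline.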

Installation

This project was developed with Jupyter Notebook and Anaconda. You can install Anaconda from here. Then open the files with either JupyterLab or the Jupyter Notebook extension. You can also view the project on Kaggle here.

Credits

Palmer Archipelago (Antarctica) penguin data:

Data were collected and made available by Dr. Kristen Gorman and the Palmer Station, Antarctica LTER, a member of the Long Term Ecological Research Network.

License & citation

  • Data are available by CC-0 license in accordance with the Palmer Station LTER Data Policy and the LTER Data Access Policy for Type I data.
  • Please cite this data using: Gorman KB, Williams TD, Fraser WR (2014) Ecological Sexual Dimorphism and Environmental Variability within a Community of Antarctic Penguins (Genus Pygoscelis). PLoS ONE 9(3): e90081. doi:10.1371/journal.pone.0090081

License

CC-0 license
