Skip to content

Latest commit

 

History

History
20 lines (14 loc) · 1.6 KB

README.md

File metadata and controls

20 lines (14 loc) · 1.6 KB

TxMM-PawpularityContest

This repository contains the following files:

  • Part I: TxMM_TabularFeatures.ipynb
  • Part II: TxMM_VisionAPI.ipynb
  • Part III: TxMM_Analysis.ipynb
  • The created dataset: pawpularityVision.csv

As illustration of this research, we analysed a dataset of 9912 cat and dog images taken from PetFinder.my's Kaggle competition. Each photo had a popularity score derived by user-clicks on their online adoption platform. We use this score to determine which features make cats and dogs more popular or less popular. In the plots below, one can see that dogs were in general more popular, because they relatively had higher scores than cats.

cats plot dogs plot

After splitting the data in separate cats and dogs data, we investigated which breeds were more popular than others. The plots below show the most and least popular cat and dog breed (which were recognized by the Google Cloud Vision API). The heights of the bars are the normalised sums (how often the breed was found in the bin divided by the number of photos in the bin). This normalisation makes comparing different bins and breeds possible. At the bottom of the bars, the absolute number of labels found is shown. Recall the imbalanced distribution, so these numbers are not comparable, but give an indication of size. See the notebooks for more information such as the correlation coefficients.

most popular cat breed least popular cat breed most popular dog breed least popular dog breed