GitHub - john-zhang-uoft/hotel_price_prediction

Hotel Price Prediction

By Fernando Assad, John Zhang, and Nathan Henry

Table of Contents

About The Project
- Tools and Technologies
Getting Started
- Prerequisites
- Installation
Usage
Acknowledgments

About The Project

Our Hotel Price Prediction project predicts the price of a hotel from its listing. The model takes images, text reviews and descriptions, ratings in several categories, and location as inputs, and outputs a price prediction. The goal is a tool that quickly combines the information spread across many reviews, numerical fields, and images to tell users what a hotel's price should be, so they can judge whether a hotel is overpriced, priced fairly, or a good deal. We trained the model with supervised learning: a fine-tuned pretrained text transformer encodes the text data, a fine-tuned pretrained CNN encodes the image data, and the embeddings from both networks are fed, along with the numerical data, into a fully connected network.
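The fusion step described above can be sketched in PyTorch as follows. This is a minimal illustration, not the code from the notebooks: the class name `LateFusionRegressor`, the embedding sizes (768 for text, 512 for images), the number of numerical features, and the hidden size are all assumptions, and random tensors stand in for the outputs of the fine-tuned text and image encoders.

```python
import torch
from torch import nn


class LateFusionRegressor(nn.Module):
    """Concatenate per-modality embeddings with numeric features, then regress a price."""

    def __init__(self, text_dim=768, image_dim=512, num_numeric=8, hidden_dim=256):
        super().__init__()
        # Fully connected head over the concatenated feature vector.
        self.head = nn.Sequential(
            nn.Linear(text_dim + image_dim + num_numeric, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, text_emb, image_emb, numeric):
        fused = torch.cat([text_emb, image_emb, numeric], dim=1)
        return self.head(fused).squeeze(-1)  # shape: (batch,)


# Placeholder inputs: in practice these would come from the text and image encoders.
batch = 4
text_emb = torch.randn(batch, 768)
image_emb = torch.randn(batch, 512)
numeric = torch.randn(batch, 8)
target_price = torch.rand(batch) * 300  # made-up prices, for illustration only

model = LateFusionRegressor()
pred = model(text_emb, image_emb, numeric)

# Supervised regression step: mean squared error against the listed price.
loss = nn.functional.mse_loss(pred, target_price)
loss.backward()
print(pred.shape, loss.item())
```

Concatenating embeddings before a shared head keeps each encoder independent, so a text or image backbone can be swapped without changing the rest of the network.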

In addition to the price prediction model, we explored training a causal model to learn the latent variable "quality", the intrinsic value of a hotel independent of everything else. We explored an autoencoder-like architecture and analyzed different approaches to extracting a latent quality variable that is not directly expressed by any of the features found in hotel listings.
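To make the idea of an autoencoder-like bottleneck concrete, here is a generic sketch in which an encoder compresses listing features into a single scalar and a decoder tries to rebuild the features from it. It is not the architecture from causal_model.ipynb: the class name `QualityAutoencoder`, the feature count, and the layer sizes are hypothetical, and a plain reconstruction bottleneck like this does not by itself give the latent a causal interpretation.

```python
import torch
from torch import nn


class QualityAutoencoder(nn.Module):
    """Encoder squeezes listing features into one scalar; decoder rebuilds them."""

    def __init__(self, num_features=16, hidden_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(num_features, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, 1)
        )
        self.decoder = nn.Sequential(
            nn.Linear(1, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, num_features)
        )

    def forward(self, x):
        quality = self.encoder(x)               # shape: (batch, 1)
        reconstruction = self.decoder(quality)  # shape: (batch, num_features)
        return quality, reconstruction


features = torch.randn(8, 16)  # placeholder for normalized listing features
ae = QualityAutoencoder()
quality, reconstruction = ae(features)

# Training would minimize how much information the 1-D bottleneck loses.
recon_loss = nn.functional.mse_loss(reconstruction, features)
print(quality.shape, reconstruction.shape, recon_loss.item())
```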

Tools and Technologies

Getting Started

To get a local copy up and running, follow these steps.

Prerequisites

Required packages

pip

pip install torch torchvision torchaudio transformers datasets pandas numpy matplotlib seaborn scikit-learn

Installation

Clone the repo

git clone https://github.com/john-zhang-uoft/hotel_price_prediction

Install Python packages

pip install torch torchvision torchaudio transformers datasets pandas numpy matplotlib seaborn scikit-learn
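After installing, a quick check like the one below can confirm that every package is importable before opening the notebooks. It is a convenience sketch using only the standard library; the helper name `missing_packages` is hypothetical. Note that scikit-learn is installed under that name but imported as `sklearn`.

```python
from importlib.util import find_spec

# pip distribution name -> import name (they differ for scikit-learn).
REQUIRED = {
    "torch": "torch",
    "torchvision": "torchvision",
    "torchaudio": "torchaudio",
    "transformers": "transformers",
    "datasets": "datasets",
    "pandas": "pandas",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "scikit-learn": "sklearn",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [pip_name for pip_name, module in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: pip install " + " ".join(missing))
    else:
        print("All required packages are installed.")
```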

Usage

causal_model.ipynb trains the quality-encoding model.

multi-modal_network.ipynb trains the final multi-modal price prediction network.

extract_json.ipynb unpacks the JSON data.

final_dataset.ipynb constructs the final dataset after filtering images.

image_heuristics.ipynb computes the predictive heuristics we used for images to simplify the task of regressing on images.
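Once a trained network produces a predicted price, the comparison motivated in About The Project (overpriced, priced fairly, or a good deal) could be made with a small helper like this one. The function name `price_verdict` and the 10% tolerance band are assumptions for illustration; the repository does not specify a threshold.

```python
def price_verdict(listed_price, predicted_price, tolerance=0.10):
    """Compare a listed price with the model's prediction.

    `tolerance` is the relative gap (assumed 10% here) inside which the
    listing counts as fairly priced.
    """
    if predicted_price <= 0:
        raise ValueError("predicted_price must be positive")
    relative_gap = (listed_price - predicted_price) / predicted_price
    if relative_gap > tolerance:
        return "overpriced"
    if relative_gap < -tolerance:
        return "good deal"
    return "priced fairly"


# Example with made-up numbers: the model predicts 120 per night.
for listed in (150, 125, 100):
    print(listed, price_verdict(listed, predicted_price=120))
```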

Acknowledgments

We would like to thank Professor Michael Guerzhoy and our TA Parsa Farinneya for their excellent teaching and help this semester.

