TimMolleman/funda-api

Funda Housing Price Prediction Project (API)

Practice project for setting up a data infrastructure that gathers data, transforms it, trains a model, and makes that model available via an API. The primary tooling is Apache Airflow for orchestrating several AWS services (mainly Lambdas), and AWS Lambda with Python and FastAPI for the API endpoints. The FastAPI service is hosted via AWS API Gateway and Lambda.

For the project, basic housing information from the Dutch real-estate website Funda is scraped and saved to an Amazon S3 bucket. After this, some transformations are done and a model is trained, all using AWS Lambda functions (see repository). The trained model is also saved to S3 and is then exposed for predictions via the API (see repository). Apache Airflow is used to schedule all Lambdas and to perform a number of other transformations (see repository).

To manage the AWS infrastructure reliably and ensure reusability, Terraform is used (see repository).

Description

This repository contains the code for the Lambda functions that are invoked via Apache Airflow. Specifically, it contains four modules, each with an accompanying Dockerfile and requirements.txt for building the Lambda image:

  • link_scraper: Scrapes and stores the information from Funda.
  • link_cleaner: Cleans up links. Checks that links are not already in the historic data to avoid duplicates, and updates the historic S3 file to keep an accurate record of links.
  • history_link_cleaner: Periodically drops links older than N days from the historic file.
  • model_trainer: Lambda for training a model on the house data. The model is a simple regression model.
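
The duplicate check in link_cleaner and the age-based pruning in history_link_cleaner could look roughly like the sketch below. This is an illustration only, not the repository's code: the function names, the in-memory dict mapping each link to the date it was first seen, and the `max_age_days` parameter are all assumptions. The real modules read and write the historic file on S3, which is left out here.

```python
from datetime import date, timedelta
from typing import Dict, List


def filter_new_links(scraped: List[str], history: Dict[str, date], today: date) -> List[str]:
    """Return the scraped links that are not yet in history, and record them.

    `history` is a hypothetical stand-in for the historic S3 file.
    """
    new_links = []
    for link in scraped:
        if link not in history:
            history[link] = today  # remember when the link was first seen
            new_links.append(link)
    return new_links


def drop_old_links(history: Dict[str, date], today: date, max_age_days: int) -> Dict[str, date]:
    """Return a copy of history without links first seen more than max_age_days ago."""
    cutoff = today - timedelta(days=max_age_days)
    return {link: seen for link, seen in history.items() if seen >= cutoff}
```

Because `filter_new_links` updates `history` while it iterates, a link that appears twice in a single scrape is only returned once. `drop_old_links` returns a new dict rather than mutating its input, so the caller decides when to overwrite the stored history.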

Getting Started

Dependencies

The recommended Python version for running this project is 3.8. The API can be tested locally. To do so, it is recommended to create a virtual environment and install the packages listed in requirements.txt into it.

Executing program

To run the API locally, run:

uvicorn api_handler:app --reload

Run this command from within the 'app' directory.

Authors

Tim Molleman
