Skip to content

melekmsakni/RealStateDataEngineering

Repository files navigation

🏠🇹🇳 Tunisia Real Estate Data Pipeline

Welcome to our cutting-edge Tunisian Real Estate Data Pipeline! This project seamlessly integrates the power of Apache Kafka, Spark, Airflow, and Superset to process and visualize real estate data in real-time. By combining data from diverse sources, we offer a holistic view of the Tunisian property market, enabling data-driven decision-making for investors, agents, and policymakers alike. 🚀 Our pipeline is designed to:

  • Aggregate data from multiple Tunisian real estate sources
  • Process and analyze information in near real-time
  • Generate comprehensive insights into market trends
  • Provide interactive visualizations for intuitive data exploration

Whether you're tracking property prices, monitoring market fluctuations, or identifying emerging hotspots, our Tunisia Real Estate Data Pipeline is your go-to solution for staying ahead in the dynamic world of Tunisian real estate.

Made with Apache Kafka Powered by Apache Spark Orchestrated with Apache Airflow Visualized with Apache Superset Azure Cloud Services Stored in Apache Cassandra

📚 Table of Contents

🌟 Introduction

Dive into the dynamic world of Tunisian real estate analytics with our comprehensive data pipeline! This project harnesses data from multiple prominent sources including Tecnocasa, Remxx, and other key players in the Tunisian real estate market. Our robust pipeline ensures real-time analytics and delivers actionable insights into the ever-evolving property landscape. From data ingestion to insightful visualizations, we've got you covered! 📊🏘️

🏗️ Architecture

Here's a high-level overview of our pipeline architecture:

data_archi

📊 Visualization Dashboard

Our interactive visualization dashboard, powered by Apache Superset, provides real-time insights into the Tunisian real estate market.

dashborad

🛠️ Prerequisites

Before embarking on this data journey, make sure you have:

  • Git 🐙

  • Docker 🐳 (Install Docker)

  • Docker Compose 🐋 (For Linux users):

    sudo curl -L "https://github.com/docker/compose/releases/download/v2.20.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
    sudo chmod +x /usr/local/bin/docker-compose
    docker-compose --version

🚀 Installation

  1. Clone this repository:

    git clone https://github.com/yourusername/real-estate-pipeline.git
    cd real-estate-pipeline
  2. Install Python requirements (for local development):

    pip install -r requirements.txt

🔧 Project Setup

  1. Initialize Airflow:

    docker-compose up airflow-init
  2. Launch the pipeline:

    docker-compose up --build -d
  3. Access the services:

  4. Stop the pipeline:

    docker-compose down

🌈 Environment Variables

Customize your pipeline by tweaking the .env file. It's like choosing the perfect paint color for your house! 🎨

💾 Data Persistence

We use Docker volumes to keep your data safe and sound, even when containers take a nap. 😴

🐛 Troubleshooting

If things go sideways, check the logs:

docker-compose logs -f <service_name>

For Airflow logs:

docker-compose logs -f airflow

🤝 Contributing

Got ideas? We love them! Fork the repo, make your changes, and send us a pull request. Let's build something amazing together! 🤜🤛

📄 License

This project is licensed under the MIT License. Check out the LICENSE file for the fine print.


Built with ❤️ by Melek Msakni

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published