Welcome to the Paradime dbt™ Data Modeling Challenge - Movie Edition!
- Submit Your Application: Fill out the registration form.
- Verification by Paradime: We'll review your application against the entry requirements.
After verification, you'll receive two confirmation emails from Paradime:
- Snowflake Account Credentials: Contains your Snowflake account details. Search for an email with subject line "Start Your Movie Data Modeling Challenge – Your Snowflake Credentials."
- Paradime Platform Invitation: An invitation to access the Paradime Platform. Search for an email with the subject line "[Paradime] Activate your account."
Using the information outlined in the confirmation emails, set up the following accounts:
- Paradime: Join the Paradime workspace using the invite email.
- Snowflake: Within Paradime, add your Snowflake credentials (Username, Password, Role, Database, Warehouse).
- Lightdash: Register for a new Lightdash account and follow the instructions in this step-by-step tutorial.
Note: A step-by-step tutorial for setting up Paradime and Snowflake is available in your Snowflake credentials email, "Start Your Movie Data Modeling Challenge – Your Snowflake Credentials".
- Create a New Branch: Open the Paradime Editor and create a new branch. Your branch name should follow this format: "movie-<your_email>". For guidance, see this step-by-step tutorial
- Start Developing: Begin crafting SQL queries, developing dbt™ models, and generating insights!
Need additional help? Join the #movie-competition channel on Slack and contact the team.
Now that you're set up, you have until May 26th, 2024, to complete and submit your project!
- Paradime:
- Dive into the Paradime Editor with this step-by-step, interactive guide. It's designed to familiarize you with the core functionality of the editor and with the project. You can also watch our YouTube videos:
- All features in our intuitive IDE Apps Panel
- AI-enabled IDE for dbt™ development | DinoAI | Paradime.io
- Paradime Help Docs: For a comprehensive understanding of all the features and how to make the most of Paradime for your project, explore the Paradime Help Docs.
- Snowflake Data Warehouse: Learn about the data warehouse and the pre-loaded data in this step-by-step, interactive guide.
- Lightdash: Discover how to set up and use Lightdash in this YouTube video.
Paradime has pre-loaded your Snowflake account with 3 Movie datasets. These datasets contain roughly 1,700,000 rows of detailed Movie and TV Show data. Please understand that these datasets are not entirely accurate; they're simply a starting point. You will need to bring in your own datasets to truly excel in this challenge.
- In Snowflake: Directly explore the datasets in Snowflake for hands-on analysis.
- GitHub Repository Resources:
- Staging Files: These files provide a preliminary view and structure of the datasets available in this repository.
- schema.yml File: This file contains schema definitions, helping you understand the data models and their relationships.
- Paradime Catalog UI: Use the Paradime Catalog UI for an interactive exploration of the datasets, featuring intuitive search and navigation.
Your primary goal is to construct dbt™ models that unearth compelling insights for Movie fans. These three datasets are your starting point, and as you bring in additional data, the possibilities for discovery are virtually limitless. This is your playground to innovate and explore the depths of Movie and TV data.
Before diving in, ensure you're familiar with the Judging Criteria so you have a chance to win the $500-$1500 Amazon gift cards!
Once you've generated insights, you're required to use Lightdash for data visualizations. See Lightdash's YouTube video and documentation for best practices.
Submission Deadline: May 26th, 2024.
Once your project is complete, please submit the following materials to Parker Rogers (parker@paradime.io) with the subject line "<your_name> - Movie Data Modeling Challenge Submission":
- GitHub Branch: Send the link to your GitHub branch containing your dbt™ models.
- README.md: Include a README file that narrates your project's story, methodology, and insights. Check out this example README from our previous NBA Data Modeling challenge.
- Data Visualizations and Insights: Showcase your findings, ideally within your README.md. For inspiration, refer to these example visualizations from our previous NBA Data Modeling challenge.
- Blog - Winning Strategies for Paradime's Movie Data Modeling Challenge
- Explore top submissions from Paradime's recent NBA Data Modeling Challenge:
- First Place - Spence Perry's Submission
- Second Place - Chris Hughes' Submission
- Third Place - István Mózes' Submission
Additionally, here are some questions you might consider answering:
- Highest grossing films of all time:
- Data Required: omdb_movies and/or tmdb_movies.
- You might also consider bringing in third-party data to understand highest grossing films by country.
- Highest/lowest ROI films of all time:
- Data Required: omdb_movies and/or tmdb_movies. See columns "budget", "revenue", and "box office".
- Actors who appear in most films:
- Data Required: omdb_movies. See column "actors"
- Highest grossing directors and writers:
- Data Required: omdb_movies. See columns "director" and "writer"
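As a rough illustration of the ROI and actor-count questions above, here is a minimal Python sketch. It is not part of the challenge materials: the sample rows are made-up placeholders, the helper names (`roi`, `rank_by_roi`, `actor_film_counts`) are hypothetical, and it assumes that "budget" and "revenue" are numeric and that "actors" is a comma-separated string. Check the actual column types in Snowflake before adapting the logic into a dbt™ model.

```python
from collections import Counter

# Placeholder rows for illustration only; they are not taken from the
# challenge datasets, and the real tables have many more columns.
movies = [
    {"title": "Film A", "budget": 100.0, "revenue": 350.0, "actors": "Actor X, Actor Y"},
    {"title": "Film B", "budget": 200.0, "revenue": 150.0, "actors": "Actor Y, Actor Z"},
    {"title": "Film C", "budget": 0.0, "revenue": 50.0, "actors": "Actor X"},
    {"title": "Film D", "budget": 50.0, "revenue": None, "actors": None},
]


def roi(budget, revenue):
    """Return (revenue - budget) / budget, or None when it cannot be computed."""
    # A missing value or a non-positive budget makes the ratio meaningless.
    if budget is None or revenue is None or budget <= 0:
        return None
    return (revenue - budget) / budget


def rank_by_roi(rows):
    """Return (title, roi) pairs sorted from highest to lowest ROI."""
    scored = [(r["title"], roi(r["budget"], r["revenue"])) for r in rows]
    # Drop films whose ROI could not be computed.
    scored = [(title, value) for title, value in scored if value is not None]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def actor_film_counts(rows):
    """Count how many films each actor appears in."""
    counts = Counter()
    for r in rows:
        # Assumption: "actors" is a comma-separated string; may be missing.
        names = {n.strip() for n in (r["actors"] or "").split(",") if n.strip()}
        counts.update(names)  # a set, so each actor counts once per film
    return counts


roi_ranking = rank_by_roi(movies)
actor_counts = actor_film_counts(movies)
print(roi_ranking)
print(actor_counts.most_common(3))
```

Films with a missing or zero budget are excluded instead of being treated as infinite ROI, which is one reasonable choice given that the datasets are not entirely accurate. The same per-group counting approach could be applied to the "director" and "writer" columns.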
If you're having issues submitting your project, watch this interactive tutorial.
We look forward to seeing your creative and insightful analyses!
Here's an example project that fulfills all requirements and would be eligible for cash prizes. Feel free to use this template for your submission. We also recommend diving into the winners' submissions from our recent NBA Data Modeling Challenge for inspiration.
A simple intro. Example - "Explore my project for the dbt™ data modeling challenge - Movie Edition, hosted by Paradime! This project dives into the analysis and visualization of Movie and TV data!"
My analysis leverages four key data sets:
- data set name #1
- data set name #2
- data set name #3
- data set name #4
- Copy and paste your data lineage image here. Watch this YouTube Tutorial to learn how.
- Paradime for SQL, dbt™.
- Snowflake for data storage and computing.
- Lightdash for data visualization.
- Other tool(s) used and why.
[Image]
Share a clear and concise conclusion of your findings!