DataHacks 2020

Welcome to DataHacks 2020! Out of hundreds of applicants, you’ve been selected because you display true potential for solving complex problems and exude a passion for comprehending and transforming data. Let’s begin the Hackathon!

READ BEFORE HACKING STARTS

Useful tools and websites

Anaconda: Python 3.7, Graphical Installer recommended but not required
Slack: desktop app for Mac or Windows
Live site: Schedule is listed here.
Slack Channel: Our main tool of communication.
Devpost: The website to submit your report
Visual Studio Code: a text editor (one option among many)
GitHub Desktop: If you don’t know GitHub or don’t have a GitHub account, please look at this post first

Rules

Each team consists of up to THREE people (≤ 3)
For the Beginner Track, ONLY beginners (students who have not taken DSC 80 or any CSE/COGS/DSC upper-division class) can form a team

Competition format

Each team picks a track to work on
You have 24 hours until Sunday noon to work on your dataset
Follow the prompt/README file for each track
Prepare a report of reasonable length that covers all of your findings
Zip your report (PDF) and code and submit as a group to Devpost (link above; come up with an appropriate team name!)
Judges will read through your reports and pick the top three teams from each track
The nine selected teams will go on stage and present their findings (maximum five minutes per team)
Judges will announce one winner per track based on the presentations
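
The zip step above can be scripted. The following is a minimal sketch, not an official submission tool: the function name `build_submission`, the archive layout (report at the top level, code under `code/`), and all file names are assumptions chosen for illustration. Check the track README for any required layout.

```python
import zipfile
from pathlib import Path


def build_submission(zip_path, report_pdf, code_dir):
    """Bundle the report PDF and every file under code_dir into one zip.

    Hypothetical helper: the archive layout used here is an assumption,
    not a Devpost or DataHacks requirement.
    """
    zip_path = Path(zip_path)
    report_pdf = Path(report_pdf)
    code_dir = Path(code_dir)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # The report goes at the top level of the archive.
        zf.write(report_pdf, arcname=report_pdf.name)
        # Every file under code_dir goes under a "code/" prefix.
        for path in sorted(code_dir.rglob("*")):
            if path.is_file():
                relative = path.relative_to(code_dir).as_posix()
                zf.write(path, arcname="code/" + relative)
    return zip_path
```

For example, `build_submission("team_name.zip", "report.pdf", "src")` would write `team_name.zip` in the current directory. Keep the output zip outside the code directory so the archive does not try to include itself.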

Track information

Beginner Track

This is a beginner-friendly track! We will give you a dataset that contains San Diego housing information from the 1970s. You may work on problems such as the relationship between housing quality and ethnic groups/genders, predictions of housing prices based on given conditions, etc. Don't worry if you don't have any prior knowledge of data science (including Python, pandas, EDA, machine learning...). We'll have a series of workshops to help you build your project!

Find more information here.

Science Track

In this challenge, we’re interested in using data visualization and NLP (Natural Language Processing) to analyze chronic illnesses through accumulated survey data. The data was retrieved from the CDC website. It is real-world data and can be messy, so preprocessing may be required to extract trends and patterns. The end goal is a report that includes at least 3 data visualizations and uses NLP to send an important message about a specific chronic illness to an audience. To follow good data science practice, make sure the report also describes what you did (cleaning, processing, any modeling, etc.) and that it is submitted as well. This prompt is relatively open-ended: you may incorporate other data as you see fit, and the message you decide to convey is up to you (however, do make sure to back it up with evidence and visuals).
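
As one possible starting point for the NLP part, the sketch below counts the most frequent terms in a list of free-text strings using only the Python standard library. It is an illustration under stated assumptions, not the track's required method: the function name `top_terms`, the tiny stopword list, and the sample strings are all made up, and the actual dataset's text fields may need different cleaning.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption); a real project would
# likely use a fuller list from an NLP library.
STOPWORDS = {"the", "and", "with", "among", "for", "who", "are", "have", "has"}


def top_terms(texts, n=10):
    """Return the n most common non-stopword terms across all texts."""
    counts = Counter()
    for text in texts:
        # Lowercase the text, then keep runs of letters and apostrophes.
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return counts.most_common(n)


# Made-up example strings, not taken from the CDC data.
sample = ["Diabetes among adults", "Adults with diabetes and asthma"]
print(top_terms(sample, n=2))
```

Term counts like these can feed a bar chart or word cloud, which would count toward the visualization requirement.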

Find more information here.

Business Track

Over the past decade, the transportation industry has become one of the most promising areas for careers in Data Science and/or Data Engineering. At UBER, the world’s most popular ride-share service, data scientists have access to billions of rows of data and are expected to showcase mastery over processing, visualizing, and analyzing the company’s data. In this track, you will have the opportunity to work with real-world UBER time-series data from the San Francisco area, spanning the first and second quarters of 2019. The time-series data centers on travel times for UBER trips across the San Francisco area.
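
A common first step with travel-time data is to aggregate by hour of day. The sketch below shows one way to do that with the standard library. It assumes each record is a `(timestamp, travel_time_seconds)` pair with an ISO-formatted timestamp; that record shape, the function name `mean_travel_time_by_hour`, and the sample values are hypothetical and may not match the provided dataset.

```python
from collections import defaultdict
from datetime import datetime


def mean_travel_time_by_hour(rows):
    """Average travel time (seconds) per hour of day.

    rows: iterable of (iso_timestamp, seconds) pairs. This record shape
    is an assumption made for illustration.
    """
    totals = defaultdict(lambda: [0.0, 0])  # hour -> [sum of seconds, count]
    for timestamp, seconds in rows:
        hour = datetime.fromisoformat(timestamp).hour
        totals[hour][0] += seconds
        totals[hour][1] += 1
    return {hour: total / count for hour, (total, count) in sorted(totals.items())}


# Made-up sample records, not real UBER data.
rows = [
    ("2019-01-07 08:15:00", 600),
    ("2019-01-08 08:45:00", 900),
    ("2019-01-07 17:30:00", 1200),
]
print(mean_travel_time_by_hour(rows))
```

The same grouping idea extends to day of week or month, which can help when comparing the two quarters.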

Find more information here.
