
kampungkat/GA-DataSci-2023-2024-Cohort

 
 


Data Science Part-Time 10 Week Cohort

This cohort is meant as an expedited ramp-up on the skills for Data Science. The intent is to help someone break into Data Science. The material is covered in a crash-course fashion and is by no means comprehensive; it is a "best of".

Data Science has been evolving over the last decade and can be defined differently even within one company. Some define it as AI and ML, others as Data Engineering with Predictive Modeling, and others as a grab bag of Technical Project Management and Data Wrangling.

The content in this class is meant to give bootcamp-style foundational knowledge and application to begin a journey into the field of Data Science.

The goal everyone should have in this class is to move the needle at least 1 or 2 points on the self-evaluation below. Teaching and mentoring are proven ways to grow or reinforce knowledge. Study groups are highly encouraged, as is volunteering to share on a subject of your interest that is related or complementary to the course content. Please reach out if you are interested in sharing. If you feel you are at a 4 or 5, volunteered presentations are welcome as a way to fortify your own knowledge while sharing.

Self Evaluation

  • 0 - Do not know
  • 1 - Acquiring knowledge (studied or attempted in last 30 days)
  • 2 - Can and have applied ~3x with use of reference material (notes, Google, Stack Overflow) <80% of the time
  • 3 - Can and have applied with little use of reference, <20% of the time
  • 4 - Can teach or mentor
  • 5 - Design, Optimize, Code Review, Improve

Class times (PST): Tuesday 4-7, Thursday 4-7

PROJECT DATES

The 4 units will be accented with work-style meetings.

  • 12/21 Chipotle Obtain and Understand data update

  • 1/16 Chipotle EDA with Python update & Final Project + Dataset proposal

  • 2/1 Project EDA Brief, Linear Regression Lunch and Learn (demo and knowledge share)

  • 2/22 Proof of concept with tech team and executive sponsors.

  • Update: Updating your team on what you understand about the data.

  • Proposal: Making a use case and seeking feedback from your team before moving forward.

  • Brief: A write-up with stats about progress on the project, blockers, challenges, and refined scope.

  • Lunch and Learn: An informal setting to share what you know, hear something in a new way, or actively learn as a teacher.

  • Technical report: The final brief, with more detail on how the project was built, where there are issues with either data or code, what is still needed, why the approach was taken, what other options were considered, and next steps. It will include stats about the data that was excluded and a profile of what was included.

  • Executive Presentation: A document of 1 page or less, in common terms, that tells a narrative answering the business questions, allows for detailed business-acumen questions, and offers next steps to the relevant audience. It answers the question, "So what?"
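The Technical report item above asks for stats about excluded data and a profile of what was included. A minimal sketch of that bookkeeping follows, using only the Python standard library. The `orders` records, the exclusion rules, and the helper names `exclusion_reason` and `summarize` are all hypothetical and chosen for illustration; a real project would define its own rules.

```python
from statistics import mean, median

# Hypothetical order records (not course data); None marks a missing price.
orders = [
    {"item": "Chicken Bowl", "quantity": 2, "price": 16.98},
    {"item": "Chips", "quantity": 1, "price": 2.39},
    {"item": "Steak Burrito", "quantity": 1, "price": None},
    {"item": "Canned Soda", "quantity": 0, "price": 1.09},
    {"item": "Veggie Bowl", "quantity": 1, "price": 11.25},
]


def exclusion_reason(row):
    """Return why a row is excluded, or None if the row is kept."""
    if row["price"] is None:
        return "missing price"
    if row["quantity"] <= 0:
        return "non-positive quantity"
    return None


def summarize(rows):
    """Count excluded rows by reason and profile the prices of kept rows."""
    included = [r for r in rows if exclusion_reason(r) is None]
    excluded_by_reason = {}
    for r in rows:
        reason = exclusion_reason(r)
        if reason is not None:
            excluded_by_reason[reason] = excluded_by_reason.get(reason, 0) + 1
    prices = [r["price"] for r in included]
    return {
        "total_rows": len(rows),
        "included_rows": len(included),
        "excluded_by_reason": excluded_by_reason,
        "price_mean": round(mean(prices), 2),
        "price_median": median(prices),
        "price_min": min(prices),
        "price_max": max(prices),
    }


report = summarize(orders)
for key, value in report.items():
    print(f"{key}: {value}")
```

Recording a reason for every excluded row makes it easier to answer "what was left out, and why?" when the report is reviewed.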

Office Hours

Slack

Email

Course outline may adjust depending on time. There is a lot of content to cover in a short period of time.

| DATE | CLASS | DATE | CLASS |
| --- | --- | --- | --- |
| 01 | Orientation and Review (home, slides) | 02 | Development Environment (home, slides) |
| 03 | Jupyter Numpy Pandas (home, slides) | 04 | Lab Presentations, Catchup (slides) |
| 05 | Intro Exploratory Data Analysis (EDA) in Pandas | 06 | Statistics in Python |
| 07 | More EDA, Data Visualization in Python | 08 | Experiments and Hypothesis Testing |
| 09 | Presentations | 10 | KNN / Classification |
| 11 | Train-Test Split & Bias Variance | 12 | Linear Regression |
| 13 | Logistic Regression | 14 | Presentations |
| 15 | Working with Data APIs | 16 | Intro to Natural Language Processing (NLP) |
| 17 | Intro to Time Series | 18 | Flex subject and Class time |
| 19 | Flex day, review, catchup, workshop | 20 | PRESENTATION |

Python Resources

Download Anaconda with Python 3.6 or 3.7, plus PyCharm or another code editor, for the exercises.

| Name | Description |
| --- | --- |
| Learn Python the Hard Way walk through | This is a great way to dig deep into basic syntax with a guide! |
| Learn Python the Hard Way | Remember that walkthrough video? Try it without the video; it gets a bit more real after about exercise 15. |
| Codecademy | Repeat, repeat, repeat: just another avenue to reinforce everything you're learning |
| Automate the Boring Stuff | Review and then a lot more |
| Python Language Reference | Good as reference |
| Python Standard Library | Library reference |
| Python Tutorial Point | Good navigation + additional links |
| W3 Schools | Good tutorial and reference |
| LearningPython | Review of the above, but then begins to progress into NumPy and Pandas |

Pandas

| Name | Description |
| --- | --- |
| 10 minutes to Pandas | Excellent starter into Pandas |
| Pandas tutorial Data Frames in Python | Data Frames explained |
| Pandas getting started | Fundamentals at a deeper level |
| Data Munging in Python with Pandas | SQL of Python |
| How to clean data with Pandas | Bottom of page: how to clean data with Pandas |
| Cast object to specified Pandas datatype | Good code examples |
| Pandas Top 10 | Useful and hard-to-find features |
| Essential Basics | Build fluency and understanding |
| Summarizing, Aggregating, Grouping in Pandas | Nice write-up on the subject |
| Missing data | Good for troubleshooting |
| Official Pandas Tutorials | Wes & Company's selection of tutorials and lectures |
| Julia Evans Pandas Cookbook | Great resource with examples from weather, bikes and 311 calls |
| Learn Pandas Tutorials | A great series of Pandas tutorials from Dave Rojas |
| Research Computing Python Data PYNBs | A super awesome set of Python notebooks from a meetup-based course exclusively devoted to Pandas |
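Several of the resources above cover casting datatypes, handling missing data, and grouping. As a rough orientation before diving into them, here is a small sketch that touches all three. The DataFrame contents and column names are made up for illustration and do not come from any course dataset; it assumes pandas is installed.

```python
import pandas as pd

# Hypothetical data: numbers arrive as strings and one price is missing.
df = pd.DataFrame({
    "item": ["Bowl", "Bowl", "Chips", "Chips", "Soda"],
    "quantity": ["2", "1", "1", "3", "1"],
    "price": ["$16.98", "$8.49", "$2.39", None, "$1.09"],
})

# Cast: string quantities to integers, "$"-prefixed prices to floats.
df["quantity"] = df["quantity"].astype(int)
df["price"] = pd.to_numeric(df["price"].str.replace("$", "", regex=False))

# Missing data: count missing values in each column.
missing_per_column = df.isna().sum()

# Grouping: drop rows without a price, then aggregate per item.
clean = df.dropna(subset=["price"])
summary = clean.groupby("item").agg(
    orders=("quantity", "size"),
    total_quantity=("quantity", "sum"),
    mean_price=("price", "mean"),
)

print(missing_per_column)
print(summary)
```

Dropping rows with a missing price is only one option; the "Missing data" resource discusses alternatives such as filling values.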

more resources

MORE DATA

Public Data Sources

About

My personal fork from Matt's original repo.

