generated from hackforla/.github-hackforla-base-repo-template
Open
Labels
- complexity: medium
- documentation (Improvements or additions to documentation)
- feature: guide (All issues related to guide)
- milestone: missing
- role: Data Engineer
- size: 1pt (Can be done in 6 hours or less)
Description
Overview
We need to have a Data Ops Tutorial page that will answer any q
Action Items
- Create a Google Doc in the folder provided under Resources/Instructions
- Draft an introductory paragraph explaining what the tutorial resources cover and why a new data scientist would use them when working with data at Hack for LA
- Identify resources with vetted tutorials that cover important skills within the tutorial area, and add them to the draft
- Write Clear Learning Objectives
  - State what the learner will be able to do by the end of the tutorial (e.g., “Set up a basic DataOps workflow with GitHub and Google Sheets”).
- Write Prerequisites
  - Required skills (e.g., basic Python, Git/GitHub familiarity).
  - Required tools/accounts (e.g., GitHub, Google Drive access, Looker viewer permissions).
- Write Setup Instructions
  - Links to repos, sample datasets, or starter scripts.
  - Environment setup (e.g., clone repo, install dependencies).
- Write Step-by-Step Walkthrough
  - Guided example showing one DataOps pipeline end-to-end.
  - Screenshots or code snippets illustrating key steps.
- Write Hands-On Exercise
  - A small project/task learners can replicate (e.g., build a lightweight validation script, update a dashboard, or run a data ingestion pipeline).
- Write Common Pitfalls & Troubleshooting
  - Document frequent mistakes (e.g., schema mismatch, GitHub permissions).
  - Tips for debugging errors in automation or dashboards.
- Write Discussion / Reflection
  - Prompts for learners to connect the tutorial to civic tech use cases.
  - Questions like: “How would you adapt this pipeline for a different dataset?”
- Write Next Steps / Advanced Topics
  - Suggested follow-ups (e.g., CI/CD for ETL pipelines, adding alerting, scaling beyond Google Sheets).
- Write Glossary / Key Concepts
  - Definitions of terms like ETL, schema governance, human-in-the-loop.
- Add additional content to References & Resources
  - Link to relevant Hack for LA repos or dashboards (where possible).
- Review the draft with the Data Science CoP
- Add to the wiki page
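For whoever picks this up: the Hands-On Exercise and Common Pitfalls items mention a "lightweight validation script" and "schema mismatch". One possible shape for such a script is sketched below in Python, using only the standard library. The column names, the `EXPECTED_SCHEMA` mapping, and the `validate_rows` helper are all hypothetical placeholders for illustration and are not taken from any Hack for LA dataset or repo.

```python
import csv
import io

# Hypothetical expected schema: column name -> converter that raises
# ValueError (or TypeError) when a value does not fit the column.
EXPECTED_SCHEMA = {
    "id": int,
    "neighborhood": str,
    "request_count": int,
}


def validate_rows(csv_text, schema=EXPECTED_SCHEMA):
    """Return a list of human-readable problems found in csv_text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    problems = []

    # Schema mismatch check: compare the header against the expected columns.
    missing = [col for col in schema if col not in header]
    extra = [col for col in header if col not in schema]
    if missing:
        problems.append(f"missing columns: {missing}")
    if extra:
        problems.append(f"unexpected columns: {extra}")
    if missing:
        return problems  # rows cannot be type-checked without required columns

    # Type check: try to convert every value with its column's converter.
    for line_no, row in enumerate(reader, start=2):  # the header is line 1
        for col, convert in schema.items():
            try:
                convert(row[col])
            except (TypeError, ValueError):
                problems.append(
                    f"line {line_no}: {col}={row[col]!r} is not {convert.__name__}"
                )
    return problems


if __name__ == "__main__":
    sample = "id,neighborhood,request_count\n1,Venice,12\n2,Echo Park,n/a\n"
    for problem in validate_rows(sample):
        print(problem)
```

A learner could point the same function at a CSV export from a shared sheet and extend the schema, which would tie the exercise back to the troubleshooting section.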
Resources/Instructions
Wiki page
Location for any files you might need to upload (drafts, images, etc.)
Core tools that should be mentioned:
- EC2
- Lambda
- RDS
- Athena/Hive
- Flask
Examples of resources that would be useful to include:
- Web how-tos, tutorials, and walk-throughs
- YouTube playlists or videos demonstrating tools
- Links to blogs or platforms with subject matter experts
Metadata
Projects
Status
In progress (actively working)
Status
Currently Recruiting