CoP: Data Science: Create Data Ops Tutorial #154

@akhaleghi

Overview

We need to have a Data Ops Tutorial page that will answer any q

Action Items

  • Create a Google Doc in the folder provided under resources
  • Draft an introductory paragraph explaining what the tutorial resources cover and why a new data scientist would use them when working with data at Hack for LA
  • Identify resources with vetted tutorials covering important skills within the tutorial area, and add them to the draft
  • Write Clear Learning Objectives
    • State what the learner will be able to do by the end of the tutorial (e.g., “Set up a basic DataOps workflow with GitHub and Google Sheets”).
  • Write Prerequisites
    • Required skills (e.g., basic Python, Git/GitHub familiarity).
    • Required tools/accounts (e.g., GitHub, Google Drive access, Looker viewer permissions).
  • Write Setup Instructions
    • Links to repos, sample datasets, or starter scripts.
    • Environment setup (e.g., clone repo, install dependencies).
  • Write Step-by-Step Walkthrough
    • Guided example showing one DataOps pipeline end-to-end.
    • Screenshots or code snippets illustrating key steps.
  • Write Hands-On Exercise
    • A small project/task learners can replicate (e.g., build a lightweight validation script, update a dashboard, or run a data ingestion pipeline).
  • Write Common Pitfalls & Troubleshooting
    • Document frequent mistakes (e.g., schema mismatch, GitHub permissions).
    • Tips for debugging errors in automation or dashboards.
  • Write Discussion / Reflection
    • Prompts for learners to connect the tutorial to civic tech use cases.
    • Questions like: “How would you adapt this pipeline for a different dataset?”
  • Write Next Steps / Advanced Topics
    • Suggested follow-ups (e.g., CI/CD for ETL pipelines, adding alerting, scaling beyond Google Sheets).
  • Write Glossary / Key Concepts
    • Definitions of terms like ETL, schema governance, human-in-the-loop.
  • Add additional content to References & Resources
    • Link to relevant Hack for LA repos or dashboards (where possible).
  • Review the draft with the Data Science CoP
  • Add to the wiki page
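
As a starting point for the Hands-On Exercise and the schema mismatch pitfall listed above, a lightweight validation script could look roughly like the sketch below. It is only an illustration: the column names, the EXPECTED_SCHEMA mapping, the validate_rows helper, and the sample data are all hypothetical and not taken from any Hack for LA dataset. It uses only the Python standard library.

```python
import csv
import io

# Hypothetical expected schema: column name -> converter used to check the type.
EXPECTED_SCHEMA = {"id": int, "neighborhood": str, "request_count": int}


def validate_rows(csv_text, schema=EXPECTED_SCHEMA):
    """Return a list of human-readable problems found in csv_text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    problems = []

    # Schema mismatch: columns missing from, or unexpected in, the header.
    missing = [col for col in schema if col not in header]
    extra = [col for col in header if col not in schema]
    if missing:
        problems.append(f"missing columns: {missing}")
    if extra:
        problems.append(f"unexpected columns: {extra}")
    if missing:
        return problems  # Row checks are not meaningful without all columns.

    # Type check: every value must be accepted by its column's converter.
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        for col, convert in schema.items():
            try:
                convert(row[col])
            except (TypeError, ValueError):
                problems.append(
                    f"line {line_no}: {col}={row[col]!r} is not {convert.__name__}"
                )
    return problems


# Made-up sample data with one bad value in the last row.
sample = "id,neighborhood,request_count\n1,Venice,12\n2,Echo Park,many\n"
for problem in validate_rows(sample):
    print(problem)
```

The same check could be pointed at a CSV export of a Google Sheet before it feeds a dashboard; how the data is actually fetched is left open here.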

Resources/Instructions

Wiki page

Data Ops Tutorial

Location for any files you might need to upload (drafts, images, etc.)

Core tools that should be mentioned:

  • EC2
  • Lambda
  • RDS
  • Athena/Hive
  • Flask

Examples of resources that would be useful to include:

  • Web how-to/tutorial/walk-throughs
  • YouTube playlists or videos demonstrating tools
  • Links to blogs or platforms with subject matter experts

Metadata

  • Project status: In progress (actively working)
  • Project status: Currently Recruiting