e-dzia/large-scale-data-processing-course

Large Scale Data Processing Course tasks

l1

  1. Linux - bash, ssh, scp, tmux, htop, kill, killall, pipe operator, ls, sed, vim, cat
  2. Docker - Dockerfile, docker-compose, containers in general
  3. Python - pip, virtualenv, requirements, tox
  4. Parallelize computation in Python
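
As an illustration of the last l1 topic, here is a minimal sketch of parallelizing a CPU-bound computation with the standard-library `concurrent.futures` module. It is not taken from the course materials: the names `square` and `parallel_sum_of_squares`, the default of 4 workers, and the `executor_cls` parameter are assumptions made for this example.

```python
from concurrent.futures import ProcessPoolExecutor


def square(n):
    """Stand-in for a CPU-bound task (hypothetical example workload).

    Defined at module top level so that worker processes can pickle it.
    """
    return n * n


def parallel_sum_of_squares(numbers, workers=4, executor_cls=ProcessPoolExecutor):
    """Square each number in a pool of workers and sum the results.

    executor_cls is an illustrative knob: pass ThreadPoolExecutor to run the
    same code with threads instead of processes.
    """
    with executor_cls(max_workers=workers) as pool:
        # map() yields results in input order; chunksize batches items to cut
        # inter-process overhead (thread pools accept it but ignore it).
        return sum(pool.map(square, numbers, chunksize=64))
```

When using the default process pool from a script, call `parallel_sum_of_squares(range(1000))` under an `if __name__ == "__main__":` guard, since platforms that spawn worker processes re-import the main module. Processes avoid the global interpreter lock for CPU-bound work; for I/O-bound work a thread pool is usually enough.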

l2

  1. Docker - Dockerfile, docker-compose, containers in general
  2. Python - pip, requirements
  3. Celery
  4. Task queue

l3

  1. Text embedding
  2. Data persistence (MongoDB)
  3. Data analysis (Redash)

l4

  1. PySpark
  2. Linear regression
  3. Binary classification
  4. Multi-class classification

l5

  1. Kubernetes
  2. K3s
  3. Helm
  4. Docker
  5. Application deployment
