Open Data Science for All: Scalable Data Processing

Scalable Data Processing contains modules on reasoning about performance, on parallel and distributed processing, and on big data computations on graphs, such as PageRank.

Directory Contents

  • Reasoning about performance:

    • EFFICIENT-DATA-PROCESSING-architecture-algorithms-intermediate slides
  • Parallel and distributed processing:

    • CLUSTER-DATA-PROCESSING-parallelism-partitioning-advanced slides
    • CLUSTER-GRAPH-PROCESSING-centrality slides
  • Big data:

    • GRAPHS-adjacency-matrices slides
    • GRAPHS-PAGERANK-centrality slides
  • Lecture notebook

  • Homework:

    • EFFICIENT-DATA-PROCESSING-Homework notebook
    • CLUSTER-DATA-PROCESSING-Homework-Local notebook (run locally)
    • CLUSTER-DATA-PROCESSING-Homework-Cloud notebook (run in the cloud)

Release History

  • Initial release, Susan Davidson and Zachary Ives, University of Pennsylvania, February 2020.