marion59000/datascience

 
 


Data Science

Data Science is a field whose purpose is to extract knowledge from large-scale data. It draws on techniques from various domains such as data mining, machine learning, artificial intelligence, visualization, and optimization. These techniques are adapted to large-scale datasets through parallel data processing, distributed systems, and suitable databases.

These techniques are applied in various domains such as:

  • Computer security: spam filtering, network monitoring, anomaly detection, intrusion detection, etc.
  • Social network analysis: community detection, trend analysis and prediction, etc.
  • Marketing: targeted advertising, recommender systems, etc.
  • Epidemiology and public health: determining risk factors, drug response prediction, etc.

Outline

Using the Python programming language, this course addresses the following topics:

  • Data acquisition, visualisation, and analysis
  • Machine learning: supervised learning (classification, regression), unsupervised learning (clustering, decomposition)
  • Network analysis: PageRank, mining social-network graphs
  • Recommendation systems
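
As an illustration of the network analysis topic, here is a minimal sketch of PageRank computed by power iteration. It is not taken from the course materials: the function name `pagerank`, the damping factor of 0.85, the stopping tolerance, and the toy graph are all assumptions chosen for this example.

```python
def pagerank(links, damping=0.85, tol=1e-8, max_iter=200):
    """Power-iteration PageRank (illustrative sketch, hypothetical helper).

    links maps each node to the list of nodes it points to.
    Returns a dict mapping each node to its rank; ranks sum to 1.
    """
    # Collect every node that appears as a source or as a target.
    nodes = sorted(set(links) | {v for targets in links.values() for v in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(max_iter):
        # Rank held by dangling nodes (no out-links) is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not links.get(u))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}

        # Each node shares its rank equally among its out-links.
        for u in nodes:
            targets = links.get(u, [])
            for v in targets:
                new_rank[v] += damping * rank[u] / len(targets)

        # Stop once the ranks change by less than tol (L1 distance).
        delta = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Toy graph (assumed for illustration): C is linked to by A, B, and D.
toy_graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_graph)
for node, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{node}: {score:.4f}")
```

On this toy graph, node C ends up with the highest rank because every other node links to it, while D, which has no incoming links, ends up with the lowest. Real social-network graphs are far larger, so a sparse-matrix or distributed formulation would be needed in practice.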

Outcome

  • Understand key algorithms and techniques of data science
  • Implement these techniques in Python
  • Understand their limitations
  • Select appropriate techniques for a particular problem
  • Apply these techniques to model and analyse large-scale datasets

References

Lecture and lab materials are based on the following resources:

  • Boston University CS591 "Tools and Techniques for Data Mining and Applications" course
  • Mining of Massive Datasets, by Jure Leskovec, Anand Rajaraman, Jeffrey David Ullman, Cambridge University Press, 2014
  • The Elements of Statistical Learning: Data Mining, Inference, and Prediction, by Trevor Hastie, Robert Tibshirani, and Jerome Friedman, Springer, 2009
  • Data Science from Scratch, by Joel Grus, O'Reilly, 2015
  • Dhar, V., Data Science and Prediction, Communications of the ACM, Vol. 56 No. 12, December 2013.
  • https://cloud.google.com/bigquery/public-data/
