Check out my Data Engineering Academy and personal Coaching at LearnDataEngineering.com
Visit learndataengineering.com:
- Learn Data Engineering with our online Academy
- Perfect for becoming a Data Engineer or adding Data Engineering to your skillset
- Proven process based on years of experience and hundreds of hours of personal coaching
- Prepared courses on the most important fundamentals, tools and platforms, plus our Associate Data Engineer Certification
- Private Slack workgroup with over 500 members
- Amazon: Buy whatever you like from Amazon using this link* (also check out my complete podcast gear and books)
- Introduction
- Basic Engineering Skills
- Advanced Engineering Skills
- Hands On Course
- Case Studies
- Best Practices Cloud Platforms
- 130+ Data Sources Data Science
- 1001 Interview Questions
- Recommended Books and Courses
- What is this Cookbook
- Data Engineer vs Data Scientist
- My Data Science Platform Blueprint
- Who Companies Need
- Learn To Code
- Get Familiar With Git
- Agile Development
- Software Engineering Culture
- Learn how a Computer Works
- Data Network Transmission
- Security and Privacy
- Linux
- Docker
- The Cloud
- Security Zone Design
- Data Science Platform
- Hadoop Platforms
- Connect
- Buffer
- Processing Frameworks
- Lambda and Kappa Architecture
- Batch Processing
- Stream Processing
- Should You do Stream or Batch Processing
- Is ETL still relevant for Analytics?
- MapReduce
- Apache Spark
- What is the Difference to MapReduce?
- How Spark Fits to Hadoop
- Spark vs Hadoop
- Spark and Hadoop a Perfect Fit
- Spark on YARN
- My Simple Rule of Thumb
- Available Languages
- Spark Driver Executor and SparkContext
- Spark Batch vs Stream processing
- How Spark uses Data From Hadoop
- What are RDDs and How to Use Them
- SparkSQL How and Why to Use It
- What are Dataframes and How to Use Them
- Machine Learning on Spark (TensorFlow)
- MLlib
- Spark Setup
- Spark Resource Management
- AWS Lambda
- Apache Flink
- Elasticsearch
- Apache Drill
- StreamSets
- Store
- Visualize
- Machine Learning
- How to do Machine Learning in production
- Why machine learning in production is harder than you think
- Models Do Not Work Forever
- Where are The Platforms That Support Machine Learning
- Training Parameter Management
- How to Convince People That Machine Learning Works
- No Rules No Physical Models
- You Have The Data. Use It!
- Data is Stronger Than Opinions
- AWS Sagemaker
- What We Want To Do
- Thoughts On Choosing A Development Environment
- A Look Into the Twitter API
- Ingesting Tweets with Apache Nifi
- Writing from Nifi to Apache Kafka
- Apache Zeppelin Data Processing
- Switch Processing from Zeppelin to Spark
- Data Science @Airbnb
- Data Science @Amazon
- Data Science @Baidu
- Data Science @Blackrock
- Data Science @BMW
- Data Science @Booking.com
- Data Science @CERN
- Data Science @Disney
- Data Science @DLR
- Data Science @Drivetribe
- Data Science @Dropbox
- Data Science @Ebay
- Data Science @Expedia
- Data Science @Facebook
- Data Science @Google
- Data Science @Grammarly
- Data Science @ING Fraud
- Data Science @Instagram
- Data Science @LinkedIn
- Data Science @Lyft
- Data Science @NASA
- Data Science @Netflix
- Data Science @OLX
- Data Science @OTTO
- Data Science @Paypal
- Data Science @Pinterest
- Data Science @Salesforce
- Data Science @Siemens Mindsphere
- Data Science @Slack
- Data Science @Spotify
- Data Science @Symantec
- Data Science @Tinder
- Data Science @Twitter
- Data Science @Uber
- Data Science @Upwork
- Data Science @Woot
- Data Science @Zalando
- General And Academic
- Content Marketing
- Crime
- Drugs
- Education
- Entertainment
- Environmental And Weather Data
- Financial And Economic Data
- Government And World
- Health
- Human Rights
- Labor And Employment Data
- Politics
- Retail
- Social
- Travel And Transportation
- Various Portals
- Source Articles and Blog Posts
- Free Data Sources Data Science
If you have some cool links or topics for the cookbook, please become a contributor.
Simply pull the repo, add your ideas and create a pull request. You can also open an issue and put your thoughts there.
Please use the "Issues" function for comments.
Everything is free, but please support what you like! Join my Patreon and become a plumber yourself: Link to my Patreon
Or support me through Paypal.me and send a message that I will read on the next livestream: Link to my Paypal.me/feedthestream
Subscribe to my Plumbers of Data Science YouTube channel for regular updates: Link to YouTube
Check out my blog and get updated via mail by joining my mailing list: andreaskretz.com
I have a Medium publication where you can publish your data engineer articles to reach more people: Medium publication
*(As an Amazon Associate I earn from qualifying purchases from Amazon. This is free of charge for you, but super helpful for supporting this channel.)