Add blog on apache spark and its core. #71
base: main
✅ Deploy Preview for infraspec ready!
@@ -0,0 +1,224 @@
---
title: "Apache Spark: Unleashing Big Data with Rdds, DataFrames and Beyond."
Spelling issue: Rdds
Suggestion: RDDs
weight: 1
---

Have you ever wondered how companies like Netflix recommend your favourite movies or how e-commerce platforms handle vast amounts of data to personalize your shopping experience 🤔?
Spelling issue: favourite
Suggestion: favorite

`Computing Engine`: It focuses on computation rather than storage, allowing it to work with various storage systems like Hadoop, Amazon S3, and Apache Cassandra. This flexibility makes Spark suitable for diverse environments, including cloud and streaming applications.

`Libraries`: It provides a unified API for common data analysis tasks. It supports both standard libraries that ship with the engine as well as external libraries published as third-party packages by the open-source communities. The standard libraries includes libraries for SQL (Spark SQL), machine learning (MLlib), stream processing (Structured Streaming), and graph analytics (GraphX).
Grammar issue: includes
Suggestion: include
# Collect the RDD data
rdd_data = rdd.collect()
```
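
For readers of this snippet, here is a minimal plain-Python sketch of what `collect()` does conceptually. It is a toy model, not Spark's API: the helper names `split_into_partitions` and `collect` are hypothetical, and the contiguous chunking is an assumption made for illustration. The point is that `collect()` gathers every partition's elements into one list on the driver, so it is best kept to small datasets.

```python
from typing import List, TypeVar

T = TypeVar("T")


def split_into_partitions(data: List[T], num_partitions: int) -> List[List[T]]:
    """Hypothetical helper: cut a local list into contiguous chunks,
    standing in for the partitions of a distributed dataset."""
    size, extra = divmod(len(data), num_partitions)
    partitions: List[List[T]] = []
    start = 0
    for i in range(num_partitions):
        # The first `extra` partitions take one additional element.
        end = start + size + (1 if i < extra else 0)
        partitions.append(data[start:end])
        start = end
    return partitions


def collect(partitions: List[List[T]]) -> List[T]:
    """Toy stand-in for collect(): concatenate all partitions, in
    partition order, into a single list held in one place."""
    return [item for part in partitions for item in part]


partitions = split_into_partitions([1, 2, 3, 4, 5], 2)
rdd_data = collect(partitions)
print(partitions)  # [[1, 2, 3], [4, 5]]
print(rdd_data)    # [1, 2, 3, 4, 5]
```

Because the whole result ends up in a single list, the memory cost grows with the full dataset size rather than with the size of one partition.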
Spelling issue: Sparks’s
Suggestion: Spark’s
This blog provides an overview of Apache Spark, covering its core components and architecture