This repository has been archived by the owner on Jun 18, 2020. It is now read-only.

Fix broken headings in Markdown files #71

Open · wants to merge 1 commit into base: master
16 changes: 8 additions & 8 deletions README.md
@@ -9,7 +9,7 @@ AWS Data Pipeline is a web service that you can use to automate the movement and


# Running the samples
##Setup
## Setup
1. Get the samples by cloning this repository.
```sh
$> git clone https://github.com/awslabs/data-pipeline-samples.git
@@ -39,11 +39,11 @@ When you are finished experimenting with the examples, deactivate the virtual en
$> aws datapipeline create-default-roles
```
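The earlier setup steps that create the virtual environment are collapsed in this diff; the sketch below shows what they typically look like, assuming a plain Python venv and a pip-installed AWS CLI (neither detail is taken from the repository):

```sh
# Create and activate a Python virtual environment for running the AWS CLI (assumed layout)
$> python -m venv .venv
$> source .venv/bin/activate
$> pip install awscli

# ...run the samples from inside the environment...

# Deactivate the virtual environment when you are finished experimenting
$> deactivate
```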

##Run the Hello World sample
## Run the Hello World sample

The hello world sample demonstrates a pipeline that creates an EC2 instance and runs `echo Hello World!`. It can be used as a reference template for executing arbitrary shell commands.

###Step 1
### Step 1
Create the pipelineId by calling the *aws datapipeline create-pipeline* command. We'll use this pipelineId to host the pipeline definition document and ultimately to run and monitor the pipeline. The commands in this section should be called from within the virtual environment that you created above.

```sh
@@ -59,7 +59,7 @@ You will receive a pipelineId like this.
# +-------------+--------------------------+
```
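The actual create-pipeline invocation is collapsed in this hunk; a minimal sketch, assuming illustrative values for the name and unique id:

```sh
# Create an empty pipeline; the command prints the pipelineId used in the next steps
$> aws datapipeline create-pipeline --name hello-world --unique-id hello-world-token
```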

###Step 2
### Step 2
Upload the helloworld.json sample pipeline definition by calling the *aws datapipeline put-pipeline-definition* command. This will upload and validate your pipeline definition.

```sh
@@ -76,7 +76,7 @@ You will receive validation messages like this
# | errored | False |
# +-----------+---------+
```
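The full put-pipeline-definition call is likewise collapsed; a sketch, assuming the placeholder pipeline id df-XXXXXXXXXXXX stands in for the one returned in Step 1:

```sh
# Upload and validate the Hello World definition against the pipeline created in Step 1
$> aws datapipeline put-pipeline-definition \
     --pipeline-id df-XXXXXXXXXXXX \
     --pipeline-definition file://samples/helloworld/helloworld.json
```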
###Step 3
### Step 3
Activate the pipeline by calling the *aws datapipeline activate-pipeline* command. This will cause the pipeline to start running on its defined schedule.

```sh
@@ -100,7 +100,7 @@ You will receive status information on the pipeline.
# @ShellCommandActivity_HelloWorld_2015-07-19T22:48: 2015-07-19T22:48:34

```
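The activation and monitoring commands are collapsed in this hunk; a sketch of the usual sequence, again with a placeholder pipeline id:

```sh
# Start the pipeline on its defined schedule
$> aws datapipeline activate-pipeline --pipeline-id df-XXXXXXXXXXXX

# Poll execution status; the @ShellCommandActivity_HelloWorld entries above come from this kind of call
$> aws datapipeline list-runs --pipeline-id df-XXXXXXXXXXXX
```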
##Examine the contents of the sample pipeline definition
## Examine the contents of the sample pipeline definition
Let's look at the Hello world example pipeline located at samples/helloworld/helloworld.json.

```json
@@ -175,13 +175,13 @@ Let's look at the Hello world example pipeline located at samples/helloworld/hel
}
```
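If you prefer to inspect the definition as the service stores it rather than reading the local file, you can fetch it back with the CLI (the pipeline id is a placeholder):

```sh
# Retrieve the registered pipeline definition as JSON
$> aws datapipeline get-pipeline-definition --pipeline-id df-XXXXXXXXXXXX
```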

##Check out the other samples
## Check out the other samples
This repository contains a collection of Data Pipeline templates that should help you get started quickly. Browse the contents of the /samples folder to discover what samples exist. Also, feel free to submit samples as pull requests.




##Disclaimer
## Disclaimer
The samples in this repository are meant to help users get started with Data Pipeline. They may not be sufficient for production environments. Users should carefully inspect samples before running them.

_Use at your own risk._
6 changes: 3 additions & 3 deletions samples/DynamoDBExport/readme.md
@@ -1,9 +1,9 @@
#DynamoDB to CSV export
# DynamoDB to CSV export

##About the sample
## About the sample
The pipeline definition is used for exporting DynamoDB data to a CSV format.

##Running the pipeline
## Running the pipeline

Example DynamoDB table with keys: customer_id, income, demographics, financial
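The rest of the run instructions are collapsed in this diff; a hedged sketch of how a definition like this is typically registered and run with the CLI (the definition file path below is an assumption, not the repository's actual file name):

```sh
# Create the pipeline, upload the export definition, then activate it
$> aws datapipeline create-pipeline --name ddb-to-csv-export --unique-id ddb-to-csv-export
$> aws datapipeline put-pipeline-definition \
     --pipeline-id df-XXXXXXXXXXXX \
     --pipeline-definition file://samples/DynamoDBExport/pipeline.json   # assumed path
$> aws datapipeline activate-pipeline --pipeline-id df-XXXXXXXXXXXX
```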

4 changes: 2 additions & 2 deletions samples/DynamoDBImport/readme.md
@@ -1,6 +1,6 @@
#XML to DynamoDB Import
# XML to DynamoDB Import

##Running the sample pipeline
## Running the sample pipeline
The JSON definition can either be imported directly in the Console (Create Pipeline) or used with the aws datapipeline CLI.<br/>
The pipeline definition copies an example XML file from s3://data-pipeline-samples/dynamodbxml/input/serde.xml to local storage. This step is required for creating a temporary XML table using Hive. The Hive script is configured to run against a DynamoDB table with the keys "customer_id, financial, income, demographics". Finally, it performs an import from the temporary XML table into DynamoDB.<br/>
The data from the XML file is parsed using the Hive XML SerDe. The parsing functionality is similar to XPath.<br/>
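To get a feel for the input before running the pipeline, you can stream the sample XML document straight from S3 (the object path is taken from the description above; this assumes the bucket is publicly readable):

```sh
# Print the sample XML file that the pipeline copies locally and parses with the Hive XML SerDe
$> aws s3 cp s3://data-pipeline-samples/dynamodbxml/input/serde.xml -
```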
6 changes: 3 additions & 3 deletions samples/DynamoDBImportCSV/README.md
@@ -1,9 +1,9 @@
#DynamoDB to CSV import
# DynamoDB to CSV import

##About the sample
## About the sample
The pipeline definition is used to import DynamoDB data to a CSV format.

##Running the pipeline
## Running the pipeline

Example DynamoDB table with keys: id

10 changes: 5 additions & 5 deletions samples/DynamoDBToRedshiftConvertDataUsingHive/README.md
@@ -1,4 +1,4 @@
#DynamoDBToRedshiftConvertDataUsingHive Sample
# DynamoDBToRedshiftConvertDataUsingHive Sample

This sample demonstrates how you can use Data Pipeline's HiveActivity and RedshiftCopyActivity to copy data from a DynamoDB table to a Redshift table while performing data conversion using Hive (for data transformation) and S3 (for staging). This sample was motivated by a use case where one wishes to convert the data type of one column to another data type. In this sample, we will be converting a column from binary to a base64 string. To make this sample work, you must ensure you have the following:

@@ -14,7 +14,7 @@ We will use the [Handling Binary Type Attributes Using the AWS SDK for Java Docu

The column mappings used in this sample are meant to match the table definition used in the above example.

##Hive queries
## Hive queries
The following queries will be used to convert the ExtendedMessage column from binary to a base64 string.
```sql
# tempHiveTable will receive the data from DynamoDB as-is
@@ -38,7 +38,7 @@ INSERT OVERWRITE TABLE s3TempTable SELECT Id,ReplyDateTime,Message,base64(Extend

You will need to provide the above information in the "put-pipeline-definition" command below.

##Before running the sample
## Before running the sample
To simplify the example, the pipeline uses the following EMR cluster configuration:
* Release label: emr-4.4.0
* Master instance type: m3.xlarge
@@ -47,7 +47,7 @@ To simplify the example, the pipeline uses the following EMR cluster configurati

Please feel free to modify this configuration to suit your needs.

##Running this sample
## Running this sample

```sh
$> aws datapipeline create-pipeline --name data_conversion_using_hive --unique-id data_conversion_using_hive
@@ -111,7 +111,7 @@ $> aws datapipeline list-runs --pipeline-id df-0554887H4KXKTY59MRJ
# @TableBackupActivity_2016-03-31T23:38:34 2016-03-31T23:38:38
```
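The put-pipeline-definition and activate-pipeline steps between the two commands shown above are collapsed in this diff; a sketch of what they usually look like for this sample, where the definition path and parameter names are assumptions rather than the sample's actual values:

```sh
# Upload the definition (supplying the table and cluster details mentioned above), then activate it
$> aws datapipeline put-pipeline-definition \
     --pipeline-id df-0554887H4KXKTY59MRJ \
     --pipeline-definition file://samples/DynamoDBToRedshiftConvertDataUsingHive/definition.json \
     --parameter-values myDDBTableName=Reply myRedshiftTableName=reply   # hypothetical names
$> aws datapipeline activate-pipeline --pipeline-id df-0554887H4KXKTY59MRJ
```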

##Related documentation
## Related documentation
* [HiveActivity](http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-object-hiveactivity.html)
* [RedshiftCopyActivity](https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-object-redshiftcopyactivity.html)

2 changes: 1 addition & 1 deletion samples/EFSBackup/README.md
@@ -1,6 +1,6 @@
# EFSBackup

#####A collection of AWS Data Pipeline templates and scripts used to backup & restore Amazon EFS file systems
##### A collection of AWS Data Pipeline templates and scripts used to backup & restore Amazon EFS file systems

If you need to be able to recover from unintended changes or deletions in your Amazon EFS file systems, you'll need to implement a backup solution. One such backup solution is presented in the EFS documentation and can be found here: http://docs.aws.amazon.com/efs/latest/ug/efs-backup.html.

6 changes: 3 additions & 3 deletions samples/LoadTsvFilesInS3ToRedshift/README.md
@@ -1,9 +1,9 @@
#Data Pipeline Load Tab Separated Files in S3 to Redshift
# Data Pipeline Load Tab Separated Files in S3 to Redshift

##About the sample
## About the sample
When imported, this pipeline definition instructs Redshift to load TSV files under the specified S3 path into the specified Redshift table. The table insert mode is OVERWRITE_EXISTING.

##Running this sample
## Running this sample
The pipeline requires the following user inputs:

1. The S3 folder where the input TSV files are located.