data-catalog-ddl-generator

This repository is to help engineers to generate table ddl from

A specific AWS Glue Database & Table name
A specific AWS Glue Database
All Tables in AWS Glue Data Catalog

To run this on your local machine, you will need to have to configure aws credential on your local computer.

Dependency

Please setup AWS CLI Configuration on your local machine.

Start

python generate_specific_table_ddl.py

Information Input

> Enter Database Name temp
> Enter Table Name temp
> Enter Query Output Bucket Name Athena Query Output S3 Bucket

Results

Execution ID: bla-bla-bla
QUEUED
SUCCEEDED
Query "SHOW CREATE TABLE temp.temp;" finished.
CREATE EXTERNAL TABLE `temp.temp`(
  `temp0` string COMMENT 'temp0',
  `temp1` bigint COMMENT 'temp1',
  `temp2` string COMMENT 'temp2')
PARTITIONED BY (
  `dt` string COMMENT 'temp partition')
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
  's3://temp/temp'
TBLPROPERTIES (
  'classification'='parquet',
  'has_encrypted_data'='false',
  'parquet.compress'='SNAPPY')

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

data-catalog-ddl-generator

Dependency

Start

Information Input

Results

Files

README.md

Latest commit

History

README.md

File metadata and controls

data-catalog-ddl-generator

Dependency

Start

Information Input

Results