zachadda/Exasol-db_Parquet_FTP_Parallel_Importer

A Python script that lets a customer load Parquet files from a directory structure, in parallel, into multiple Exasol tables.

Overview

If you have Parquet files stored on an external Windows or Linux server, this Python script can be run locally to load them into an Exasol database.

Description

  • The script scans all subdirectories beneath the root directory you specify and finds every .parquet file.

  • It creates schemas based on the schema-level directory names.

  • It creates tables based on the subdirectories of the schema directories.

  • Finally, it loads the corresponding Parquet files.

  • The script parses the date from the level-2 directory name. It expects a YYYYMMDD token to appear somewhere in that name, e.g. export_US_f_20211017_50 (see the sketch after this list).

  • You can choose how many days back files should be loaded; the script uses the parsed directory date to make that determination.

  • Loaded files are recorded in a log table in Exasol, so duplicate files are not loaded again. If a record with the same key already exists, it is updated with the latest value.
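
As a rough illustration of the date handling described above, here is a minimal sketch of pulling a YYYYMMDD token out of a level-2 directory name and checking it against a days-back cutoff. The function names are hypothetical, not the script's actual API.

    import re
    from datetime import datetime, timedelta
    from typing import Optional

    DATE_RE = re.compile(r"(\d{8})")  # first YYYYMMDD-looking token in the name

    def parse_dir_date(dir_name: str) -> Optional[datetime]:
        """Extract a YYYYMMDD date from a name such as 'export_US_f_20211017_50'."""
        match = DATE_RE.search(dir_name)
        if not match:
            return None
        try:
            return datetime.strptime(match.group(1), "%Y%m%d")
        except ValueError:
            return None  # eight digits that do not form a valid date

    def within_days_back(dir_name: str, days_back: int) -> bool:
        """True if the directory's date falls within the last `days_back` days."""
        dir_date = parse_dir_date(dir_name)
        if dir_date is None:
            return False
        return dir_date >= datetime.now() - timedelta(days=days_back)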


Expected directory structure
1. YOUR ROOT DIRECTORY/
    ├── 2. DIRECTORY WITH NAME CONTAINING DATE (e.g. YYYYMMDD)/
        ├── 3. SCHEMA_NAME/
            ├── 4. TABLE_NAME/
                └── 5. <file_name>.parquet
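
Given this layout, the scan and load could look roughly like the sketch below. It assumes pathlib plus pandas/pyarrow for reading Parquet and pyexasol's import_from_pandas for the load; load_tree is a hypothetical name, and the actual script's helpers and driver calls may differ.

    from pathlib import Path

    import pandas as pd   # assumes pandas + pyarrow are available for Parquet reading
    import pyexasol       # assumes the pyexasol driver; the script may use another client

    def load_tree(root: str, conn: pyexasol.ExaConnection) -> None:
        """Walk ROOT/<date_dir>/<schema>/<table>/*.parquet and load each file."""
        for parquet_file in sorted(Path(root).glob("*/*/*/*.parquet")):
            table_dir = parquet_file.parent    # level 4: TABLE_NAME
            schema_dir = table_dir.parent      # level 3: SCHEMA_NAME
            schema, table = schema_dir.name, table_dir.name

            conn.execute(f'CREATE SCHEMA IF NOT EXISTS "{schema}"')
            df = pd.read_parquet(parquet_file)
            # Assumes the target table already exists; the real script also
            # creates tables from the directory names before loading.
            conn.import_from_pandas(df, (schema, table))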
                  

Note: Parallelization will NOT work when running this code in a Jupyter notebook
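
The restriction exists because process-based parallelism needs to re-import the worker function from a real __main__ module in each child process, and a notebook kernel does not provide one. A minimal sketch of the usual guard, with hypothetical names throughout:

    from concurrent.futures import ProcessPoolExecutor

    def load_one_table(job):
        """Hypothetical per-table loader: job = (schema, table, parquet_paths)."""
        schema, table, paths = job
        print(f"loading {len(paths)} file(s) into {schema}.{table}")

    if __name__ == "__main__":  # required so spawned workers can import this module
        jobs = [("RETAIL", "SALES", ["/data/sales_1.parquet"])]  # hypothetical jobs
        with ProcessPoolExecutor() as pool:
            list(pool.map(load_one_table, jobs))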

Potential to-dos:

  • If days back = 0, then load all files.
  • Change the connection to fetch_dict = True for the create_delta_list() function, so the same connection string can be used (see the sketch after this list).
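
For the second to-do: pyexasol lets fetch_dict be set once when the connection is created, so create_delta_list() could share the loaders' connection instead of opening its own. A minimal sketch, with a placeholder DSN and credentials and a hypothetical log-table name:

    import pyexasol

    # fetch_dict=True makes every fetch return rows as dicts keyed by column
    # name, so one connection can serve create_delta_list() and the loaders.
    conn = pyexasol.connect(
        dsn="exasol-host:8563",  # placeholder DSN
        user="sys",              # placeholder credentials
        password="***",
        fetch_dict=True,
    )

    rows = conn.execute("SELECT * FROM LOG_SCHEMA.LOADED_FILES").fetchall()  # hypothetical log table
    # each row is now a dict, e.g. row["FILE_NAME"]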
