Testing framework for the SparkPipelineFramework library: provide input files to set up the input views before the transformer runs, and output files to verify the transformer's output against.
- Create a folder structure that mirrors the folder structure of your library in SparkPipelineFramework (this is how the Testing Framework finds the Transformer to run); see the example layout after this list
- Create an input folder and put in files that represent the input views. These files can be CSV, JSON, or Parquet
- (Optional) Create an input_schema folder and put in any schemas you want applied to the above views. These follow Spark's JSON schema format; a sketch of generating one appears below
- (Optional) Create an output folder and put in files that represent the output views you expect. These files can be CSV, JSON, or Parquet
- (Optional) Create an output_schema folder and put in any schemas you want applied to the output views
- Copy the following test code and put it in a test file in this folder:
```python
from pathlib import Path

from pyspark.sql import SparkSession

from spark_pipeline_framework_testing.test_runner import SparkPipelineFrameworkTestRunner


def test_folder(spark_session: SparkSession) -> None:
    data_dir: Path = Path(__file__).parent.joinpath('./')
    SparkPipelineFrameworkTestRunner.run_tests(
        spark_session=spark_session, folder_path=data_dir
    )
```
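The spark_session argument is a pytest fixture. If your project does not already provide one, here is a minimal sketch of a conftest.py that supplies it (the builder settings are assumptions, not requirements of the framework):

```python
# conftest.py -- a minimal sketch of a pytest fixture named spark_session.
# The fixture name must match the test function's parameter; the master and
# appName settings below are illustrative assumptions.
from typing import Generator

import pytest
from pyspark.sql import SparkSession


@pytest.fixture(scope="session")
def spark_session() -> Generator[SparkSession, None, None]:
    session = (
        SparkSession.builder.master("local[*]")
        .appName("spark-pipeline-framework-tests")
        .getOrCreate()
    )
    yield session
    session.stop()
```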
- Now just run this test (e.g. with pytest).
Note: the test runner finds files in sub-folders too.
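Putting the steps above together, a test folder might look like this (a hypothetical layout; the file and view names are illustrative, not required by the framework):

```
tests/library/features/people/my_people_feature/
├── input/
│   └── patients.csv            # an input view (CSV, JSON, or Parquet)
├── input_schema/
│   └── patients.json           # optional Spark JSON schema for that view
├── output/
│   └── my_people_feature.csv   # an expected output view
├── output_schema/
│   └── my_people_feature.json  # optional schema for the output view
└── test_my_people_feature.py   # the test code shown above
```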
For example, for the transformer defined at https://github.com/imranq2/SparkPipelineFramework.Testing/tree/main/library/features/people/my_people_feature, you can find the corresponding test at https://github.com/imranq2/SparkPipelineFramework.Testing/tree/main/tests/library/features/people/my_people_feature.
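The schema files in input_schema and output_schema use the JSON representation of a Spark StructType. A minimal sketch of producing one, assuming hypothetical field and file names:

```python
# A sketch of writing a schema file in Spark's JSON schema format.
# The fields and the target file name are illustrative assumptions.
import json

from pyspark.sql.types import IntegerType, StringType, StructField, StructType

schema = StructType(
    [
        StructField("id", IntegerType(), True),
        StructField("name", StringType(), True),
    ]
)

# StructType.jsonValue() yields the same JSON structure Spark itself uses
# when serializing a schema (schema.json() returns it as a string).
with open("input_schema/patients.json", "w") as f:
    json.dump(schema.jsonValue(), f, indent=2)
```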
- Create a new release
- The GitHub Action should automatically kick in and publish the package
- You can see the status in the Actions tab