Implement schema presets #146

Merged on Apr 6, 2024 (35 commits)
c52a7af
Implement schema inheritance
SmetDenis Apr 5, 2024
9dccfa1
Refactor code for clearer architecture and alias validation
SmetDenis Apr 5, 2024
894090f
Refactored and validated alias inheritance in schemas
SmetDenis Apr 5, 2024
d2e941e
Remove inheritance functionality in src/SchemaDataPrep.php
SmetDenis Apr 5, 2024
ae3e85a
Replace 'inherit' with 'preset' in SchemaDataPrep and update tests
SmetDenis Apr 5, 2024
eb877dc
Refactor 'includes' to 'presets' across files
SmetDenis Apr 6, 2024
aadc22c
Refactor 'includes' to 'presets' across files
SmetDenis Apr 6, 2024
367d9c0
Add new sample and usage schemas and update ContainsNone rule
SmetDenis Apr 6, 2024
733d78d
Update preset_usage.yml with more descriptive names and explanations
SmetDenis Apr 6, 2024
650ce56
Refactor scripts to increase CSV schemas manageability
SmetDenis Apr 6, 2024
dcb15c5
Update preset regex and simplify Readme test
SmetDenis Apr 6, 2024
eaa839b
Update preset regex and simplify Readme test
SmetDenis Apr 6, 2024
18ed698
Refine schema examples, README, and readability of tests
SmetDenis Apr 6, 2024
d98fd84
Improve README and schema example clarity
SmetDenis Apr 6, 2024
95fb688
Improve README and schema example clarity
SmetDenis Apr 6, 2024
41bfee7
Refine preset usage in schema examples and update related instructions
SmetDenis Apr 6, 2024
1f425fc
Handle schema data preparation exceptions
SmetDenis Apr 6, 2024
c21112b
Add ability to dump final schema
SmetDenis Apr 6, 2024
cfd0b50
Highlight "Important notes," add troubleshooting section in README
SmetDenis Apr 6, 2024
0460bb7
Refactor schema validation and dump, update README
SmetDenis Apr 6, 2024
9a0b46a
Update preset feature settings in yml files and README
SmetDenis Apr 6, 2024
60bfd1d
Update preset feature settings in yml files and README
SmetDenis Apr 6, 2024
7c3180d
Add real example test and update README for preset usage
SmetDenis Apr 6, 2024
dcbaa7d
Add real example test and update README for preset usage
SmetDenis Apr 6, 2024
d9268f5
Add real example test and update README for preset usage
SmetDenis Apr 6, 2024
c4189ec
Add real example test and update README for preset usage
SmetDenis Apr 6, 2024
9ba8fd2
Add real example test and update README for preset usage
SmetDenis Apr 6, 2024
5a8c088
Add real example test and update README for preset usage
SmetDenis Apr 6, 2024
492a118
Add real example test and update README for preset usage
SmetDenis Apr 6, 2024
912fd8e
Add real example test and update README for preset usage
SmetDenis Apr 6, 2024
9da769c
Add real example test and update README for preset usage
SmetDenis Apr 6, 2024
7801328
Add real example test and update README for preset usage
SmetDenis Apr 6, 2024
c47d649
Update GitHub workflow with enhanced test steps
SmetDenis Apr 6, 2024
7e7550e
Update GitHub workflow with enhanced test steps
SmetDenis Apr 6, 2024
a0b7da8
Update GitHub workflow with enhanced test steps
SmetDenis Apr 6, 2024
588 changes: 547 additions & 41 deletions README.md

Large diffs are not rendered by default.

28 changes: 17 additions & 11 deletions action.yml
@@ -20,13 +20,16 @@ branding:

 inputs:
   csv:
-    description: >
-      Path(s) to validate. You can specify path in which CSV files will be searched.
-      Feel free to use glob pattrens. Usage examples:
-      /full/path/file.csv, p/file.csv, p/*.csv, p/**/*.csv, p/**/name-*.csv, **/*.csv, etc.
+    description: |
+      Specify the path(s) to the CSV files you want to validate.
+      This can include a direct path to a file or a directory to search with a maximum depth of 10 levels.
+      Examples: /full/path/name.csv; p/file.csv; p/*.csv; p/**/*.csv; p/**/name-*.csv; **/*.csv
     required: true
   schema:
-    description: 'Schema filepath. It can be a YAML, JSON or PHP. See examples on GitHub.'
+    description: |
+      Specify the path(s) to the schema file(s), supporting YAML, JSON, or PHP formats.
+      Similar to CSV paths, you can direct to specific files or search directories with glob patterns.
+      Examples: /full/path/name.yml; p/file.yml; p/*.yml; p/**/*.yml; p/**/name-*.yml; **/*.yml
     required: true
   report:
     description: 'Report format. Available options: text, table, github, gitlab, teamcity, junit.'
@@ -43,12 +46,15 @@ inputs:

   # Only for debugging and profiling
   extra:
-    description: >
-      ANSI output. You can disable ANSI colors if you want with `--no-ansi`.
-      Verbosity level: Available options: `-v`, `-vv`, `-vvv`
-      Add flag `--profile` if you want to see profiling info. Add details with `-vvv`.
-      Add flag `--debug` if you want to see more really deep details.
-    default: 'extra: --ansi'
+    description: |
+      Extra options for the CSV Blueprint. Only for debugging and profiling.
+      Available options:
+        ANSI output. You can disable ANSI colors if you want with `--no-ansi`.
+        Verbosity level: Available options: `-v`, `-vv`, `-vvv`.
+        Add flag `--profile` if you want to see profiling info. Add details with `-vvv`.
+        Add flag `--debug` if you want to see much deeper details.
+        Add flag `--dump-schema` if you want to see the final schema after all presets are applied.
+    default: 'options: --ansi'
 
 runs:
   using: 'docker'
11 changes: 10 additions & 1 deletion schema-examples/full.json
@@ -2,9 +2,14 @@
"name" : "CSV Blueprint Schema Example",
"description" : "This YAML file provides a detailed description and validation rules for CSV files\nto be processed by CSV Blueprint tool. It includes specifications for file name patterns,\nCSV formatting options, and extensive validation criteria for individual columns and their values,\nsupporting a wide range of data validation rules from basic type checks to complex regex validations.\nThis example serves as a comprehensive guide for creating robust CSV file validations.\n",

"presets" : {
"my-preset" : ".\/preset_users.yml"
},

"filename_pattern" : "\/demo(-\\d+)?\\.csv$\/i",

"csv" : {
"preset" : "my-preset",
"header" : true,
"delimiter" : ",",
"quote_char" : "\\",
@@ -14,18 +19,21 @@
},

"structural_rules" : {
"preset" : "my-preset",
"strict_column_order" : true,
"allow_extra_columns" : false
},

"columns" : [
{
"preset" : "my-preset/login",
"name" : "Column Name (header)",
"description" : "Lorem ipsum",
"example" : "Some example",
"required" : true,

"rules" : {
"preset" : "my-preset/login",
"not_empty" : true,
"exact_value" : "Some string",
"allow_values" : ["y", "n", ""],
@@ -147,9 +155,9 @@
"is_luhn" : true,

"phone" : "ALL",
"postal_code" : "US",
"is_iban" : true,
"is_bic" : true,
"postal_code" : "US",
"is_imei" : true,
"is_isbn" : true,

@@ -166,6 +174,7 @@
"credit_card" : "Any"
},
"aggregate_rules" : {
"preset" : "my-preset/login",
"is_unique" : true,
"sorted" : ["asc", "natural"],

12 changes: 11 additions & 1 deletion schema-examples/full.php
@@ -23,9 +23,14 @@
This example serves as a comprehensive guide for creating robust CSV file validations.
',

'presets' => [
'my-preset' => './preset_users.yml',
],

'filename_pattern' => '/demo(-\\d+)?\\.csv$/i',

'csv' => [
'preset' => 'my-preset',
'header' => true,
'delimiter' => ',',
'quote_char' => '\\',
@@ -35,18 +40,21 @@
],

'structural_rules' => [
'preset' => 'my-preset',
'strict_column_order' => true,
'allow_extra_columns' => false,
],

'columns' => [
[
'preset' => 'my-preset/login',
'name' => 'Column Name (header)',
'description' => 'Lorem ipsum',
'example' => 'Some example',
'required' => true,

'rules' => [
'preset' => 'my-preset/login',
'not_empty' => true,
'exact_value' => 'Some string',
'allow_values' => ['y', 'n', ''],
@@ -167,9 +175,9 @@
'is_luhn' => true,

'phone' => 'ALL',
'postal_code' => 'US',
'is_iban' => true,
'is_bic' => true,
'postal_code' => 'US',
'is_imei' => true,
'is_isbn' => true,

@@ -188,6 +196,8 @@
],

'aggregate_rules' => [
'preset' => 'my-preset/login',

'is_unique' => true,
'sorted' => ['asc', 'natural'],

15 changes: 13 additions & 2 deletions schema-examples/full.yml
@@ -22,15 +22,20 @@ description: | # Any description of the CSV file. Not u
supporting a wide range of data validation rules from basic type checks to complex regex validations.
This example serves as a comprehensive guide for creating robust CSV file validations.

presets: # Include another schema and define an alias for it.
my-preset: ./preset_users.yml # Define preset alias "my-preset". See README.md for details.

# Regular expression to match the file name. If not set, then no pattern check.
# This allows you to pre-validate the file name before processing its contents.
# Feel free to check parent directories as well.
# See: https://www.php.net/manual/en/reference.pcre.pattern.syntax.php
filename_pattern: /demo(-\d+)?\.csv$/i
# preset: my-preset # See README.md for details.

# Here are default values to parse CSV file.
# You can skip this section if you don't need to override the default values.
csv:
preset: my-preset # See README.md for details.
header: true # If the first row is a header. If true, name of each column is required.
delimiter: , # Delimiter character in CSV file.
quote_char: \ # Quote character in CSV file.
@@ -42,6 +47,7 @@ csv:
# They are not(!) related to the data in the columns.
# You can skip this section if you don't need to override the default values.
structural_rules: # Here are default values.
preset: my-preset # See README.md for details.
strict_column_order: true # Ensure columns in CSV follow the same order as defined in this YML schema. It works only if "csv.header" is true.
allow_extra_columns: false # Allow CSV files to have more columns than specified in this YML schema.

@@ -50,7 +56,8 @@ structural_rules: # Here are default values.
# This will not affect the validator, but will make it easier for you to navigate.
# For convenience, use the first line as a header (if possible).
columns:
- name: Column Name (header) # Any custom name of the column in the CSV file (first row). Required if "csv.header" is true.
- preset: my-preset/login # Add preset rules for the column. See README.md for details.
name: Column Name (header) # Any custom name of the column in the CSV file (first row). Required if "csv.header" is true.
description: Lorem ipsum # Description of the column. Not used in the validation process.
example: Some example # Example of the column value. Schema will also check this value on its own.

@@ -63,6 +70,8 @@ columns:
# Data validation for each(!) value in the column. Please, see notes in README.md
# Every rule is optional.
rules:
preset: my-preset/login # Add preset rules for the column. See README.md for details.

# General rules
not_empty: true # Value is not an empty string. Actually checks if the string length is not 0.
exact_value: Some string # Exact value for string in the column.
@@ -228,9 +237,9 @@ columns:

# Identifications
phone: ALL # Validates if the input is a phone number. Specify the country code to validate the phone number for a specific country. Example: "ALL", "US", "BR".
postal_code: US # Validate postal code by country code (alpha-2). Example: "02179". Extracted from https://www.geonames.org
is_iban: true # IBAN - International Bank Account Number. See: https://en.wikipedia.org/wiki/International_Bank_Account_Number
is_bic: true # Validates a Bank Identifier Code (BIC) according to ISO 9362 standards. See: https://en.wikipedia.org/wiki/ISO_9362
postal_code: US # Validate postal code by country code (alpha-2). Example: "02179". Extracted from https://www.geonames.org
is_imei: true # Validates an International Mobile Equipment Identity (IMEI). See: https://en.wikipedia.org/wiki/International_Mobile_Station_Equipment_Identity
is_isbn: true # Validates an International Standard Book Number (ISBN). See: https://www.isbn-international.org/content/what-isbn

@@ -271,6 +280,8 @@ columns:
# Data validation for the entire(!) column using different data aggregation methods.
# Every rule is optional.
aggregate_rules:
preset: my-preset/login # Add preset aggregate rules for the column. See README.md for details.

is_unique: true # All values in the column are unique.

# Check if the column is sorted in a specific order.
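The `filename_pattern` value in `full.yml` is a PCRE-style pattern (`/demo(-\d+)?\.csv$/i`). As a rough illustration of what it accepts, here is a minimal Python sketch. It is not the tool's own code: the translation to Python's `re` module (dropping the `/` delimiters, mapping the `i` flag to `re.IGNORECASE`) and the helper name `filename_matches` are assumptions made for this example.

```python
import re

# Assumed Python equivalent of the PCRE pattern /demo(-\d+)?\.csv$/i from the schema.
# The "/" delimiters are dropped and the trailing "i" flag becomes re.IGNORECASE.
FILENAME_PATTERN = re.compile(r"demo(-\d+)?\.csv$", re.IGNORECASE)


def filename_matches(path: str) -> bool:
    """Return True if the pattern is found in the path (hypothetical helper)."""
    return FILENAME_PATTERN.search(path) is not None


examples = [
    "data/demo.csv",
    "data/DEMO-42.CSV",
    "data/demo-.csv",
    "data/demo.csv.bak",
]
results = {path: filename_matches(path) for path in examples}
```

Because the pattern is searched against the whole path and is only anchored at the end, parent directories can take part in the match, which is what the schema comment about checking parent directories suggests.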
12 changes: 10 additions & 2 deletions schema-examples/full_clean.yml
@@ -21,9 +21,13 @@ description: |
supporting a wide range of data validation rules from basic type checks to complex regex validations.
This example serves as a comprehensive guide for creating robust CSV file validations.

presets:
my-preset: ./preset_users.yml

filename_pattern: '/demo(-\d+)?\.csv$/i'

csv:
preset: my-preset
header: true
delimiter: ','
quote_char: \
@@ -32,16 +36,19 @@ csv:
bom: false

structural_rules:
preset: my-preset
strict_column_order: true
allow_extra_columns: false

columns:
- name: 'Column Name (header)'
- preset: my-preset/login
name: 'Column Name (header)'
description: 'Lorem ipsum'
example: 'Some example'
required: true

rules:
preset: my-preset/login
not_empty: true
exact_value: 'Some string'
allow_values: [ 'y', 'n', '' ]
@@ -161,9 +168,9 @@ columns:
is_luhn: true

phone: ALL
postal_code: US
is_iban: true
is_bic: true
postal_code: US
is_imei: true
is_isbn: true

@@ -182,6 +189,7 @@ columns:
credit_card: Any

aggregate_rules:
preset: my-preset/login
is_unique: true
sorted: [ asc, natural ]
first_num_min: 1.0
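The schema examples above pair a `preset:` reference with explicit keys in the same section. The following Python sketch shows one plausible reading of that combination: keys written locally replace the keys taken from the preset. It is an illustration only, not the project's implementation. The function name `apply_preset` and the contents of the `login` column are hypothetical, since `preset_users.yml` is not shown in this diff.

```python
from typing import Any


def apply_preset(preset: dict[str, Any], local: dict[str, Any]) -> dict[str, Any]:
    """Merge a preset mapping with local keys; local keys win (assumed behaviour)."""
    merged = dict(preset)
    for key, value in local.items():
        if key == "preset":
            # The reference itself is not part of the resulting section.
            continue
        merged[key] = value  # An explicit value, even None, replaces the preset value.
    return merged


# Hypothetical content of the "login" column in the preset file.
preset_column = {"name": "login", "example": "user1", "required": True}

# Local column definition, shaped like the entries in the example schemas.
local_column = {
    "preset": "my-preset/login",
    "name": "Column Name (header)",
    "example": None,
}

merged_column = apply_preset(preset_column, local_column)
```

Under these assumptions the merged column keeps `required` from the preset, takes `name` from the local definition, and ends up with a null `example`.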
34 changes: 34 additions & 0 deletions schema-examples/preset_database.yml
@@ -0,0 +1,34 @@
#
# JBZoo Toolbox - Csv-Blueprint.
#
# This file is part of the JBZoo Toolbox project.
# For the full copyright and license information, please view the LICENSE
# file that was distributed with this source code.
#
# @license MIT
# @copyright Copyright (C) JBZoo.com, All rights reserved.
# @see https://github.com/JBZoo/Csv-Blueprint
#

name: Presets for database columns
description: This schema contains basic rules for database user data.

columns:
  - name: id
    description: Unique identifier, usually used to denote a primary key in databases.
    example: 12345
    rules:
      not_empty: true
      is_trimmed: true
      is_int: true
      num_min: 1
    aggregate_rules:
      is_unique: true
      sorted: [ asc, numeric ]

  - name: status
    description: Status in database
    example: active
    rules:
      not_empty: true
      allow_values: [ active, inactive, pending, deleted ]
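To make the intent of the `id` and `status` rules in `preset_database.yml` concrete, here is a small Python sketch that applies comparable checks to lists of raw cell values. It only approximates the rule names from the preset. The function names, the error message format, the integer regex, and the line numbering (data starts on line 2, after the header) are assumptions for illustration and do not reflect how CSV Blueprint reports errors.

```python
import re

ALLOWED_STATUSES = {"active", "inactive", "pending", "deleted"}


def check_id_column(values: list[str]) -> list[str]:
    """Approximate not_empty, is_trimmed, is_int, num_min, is_unique and sorted."""
    errors: list[str] = []
    numbers: list[int] = []
    for line, value in enumerate(values, start=2):  # Line 1 is assumed to be the header.
        if value == "":
            errors.append(f"line {line}: not_empty")
        elif value != value.strip():
            errors.append(f"line {line}: is_trimmed")
        elif not re.fullmatch(r"-?\d+", value):
            errors.append(f"line {line}: is_int")
        else:
            number = int(value)
            if number < 1:
                errors.append(f"line {line}: num_min")
            numbers.append(number)
    # Aggregate checks look at the whole column, not at single cells.
    if len(set(numbers)) != len(numbers):
        errors.append("column: is_unique")
    if numbers != sorted(numbers):
        errors.append("column: sorted asc")
    return errors


def check_status_column(values: list[str]) -> list[str]:
    """Approximate not_empty and allow_values for the status column."""
    errors: list[str] = []
    for line, value in enumerate(values, start=2):
        if value == "":
            errors.append(f"line {line}: not_empty")
        elif value not in ALLOWED_STATUSES:
            errors.append(f"line {line}: allow_values")
    return errors
```

The split between per-cell loops and whole-column checks mirrors the schema's separation of `rules` and `aggregate_rules`.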
68 changes: 68 additions & 0 deletions schema-examples/preset_features.yml
@@ -0,0 +1,68 @@
#
# JBZoo Toolbox - Csv-Blueprint.
#
# This file is part of the JBZoo Toolbox project.
# For the full copyright and license information, please view the LICENSE
# file that was distributed with this source code.
#
# @license MIT
# @copyright Copyright (C) JBZoo.com, All rights reserved.
# @see https://github.com/JBZoo/Csv-Blueprint
#

name: Complete list of preset features
description: This schema contains all the features of the presets.

presets:
  # The basepath for the preset is `.` (current directory of the current schema file).
  # Define alias "db" for schema in `./preset_database.yml`.
  db: preset_database.yml # Or `db: ./preset_database.yml`. It's up to you.

  # For example, you can use a relative path.
  users: ./../schema-examples/preset_users.yml

  # Or you can use an absolute path.
  # db: /full/path/preset_database.yml

filename_pattern: { preset: users } # Take the filename pattern from the preset.
structural_rules: { preset: users } # Take the global rules from the preset.
csv: { preset: users } # Take the CSV settings from the preset.

columns:
  # Use name of column from the preset.
  # "db" is alias. "id" is column `name` in `preset_database.yml`.
  - preset: 'db/id'

  # Use column index. "db" is alias. "0" is column index in `preset_database.yml`.
  - preset: 'db/0'
  - preset: 'db/0:'

  # Use column index and column name. It is useful if the column name is not unique.
  - preset: 'db/0:id'

  # Use only `rules` of "status" column from the preset.
  - name: My column
    rules:
      preset: 'db/status'

  # Override only `aggregate_rules` from the preset.
  # Use only `aggregate_rules` of "id" column from the preset.
  # We strictly take only the very first column (index = 0).
  - name: My column
    aggregate_rules:
      preset: 'db/0:id'

  # Combo!!! If you're a risk-taker or have a high level of inner zen. :)
  # Creating a column from three other columns.
  # In fact, it will merge all three at once with key replacement.
  - name: Crazy combo!
    description: > # Just some good advice.
      I like to take risks, too.
      Be careful. Use your power wisely.
    example: ~ # Ignore inherited "example" value. Set it to `null`.
    preset: 'users/login'
    rules:
      preset: 'users/email'
      not_empty: true # Override the rule value from the preset.
    aggregate_rules:
      preset: 'db/0'
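The reference forms used in `preset_features.yml` (`users`, `db/id`, `db/0`, `db/0:` and `db/0:id`) can be read as an alias followed by an optional column selector. The Python sketch below parses them into their parts. It is a hedged illustration rather than the regex the project uses: the names `PresetRef` and `parse_preset_ref` are hypothetical, and treating an all-digit selector as an index is an assumption drawn from the comments in the schema.

```python
import re
from typing import NamedTuple, Optional


class PresetRef(NamedTuple):
    alias: str            # Key declared under `presets:`
    index: Optional[int]  # Column index, if the selector starts with digits
    name: Optional[str]   # Column name, if one is given


_REF = re.compile(r"^(?P<alias>[^/]+)(?:/(?P<column>.+))?$")
_INDEXED = re.compile(r"(\d+)(?::(.*))?")


def parse_preset_ref(ref: str) -> PresetRef:
    """Split 'alias', 'alias/name', 'alias/0', 'alias/0:' or 'alias/0:name'."""
    match = _REF.match(ref)
    if match is None:
        raise ValueError(f"Invalid preset reference: {ref!r}")
    alias, column = match["alias"], match["column"]
    if column is None:
        return PresetRef(alias, None, None)
    indexed = _INDEXED.fullmatch(column)
    if indexed is not None:
        # An empty name after the colon ("0:") is treated as "no name given".
        return PresetRef(alias, int(indexed.group(1)), indexed.group(2) or None)
    return PresetRef(alias, None, column)
```

With this reading, `db/0` and `db/0:` resolve to the same selector, and `db/0:id` carries both parts so a lookup can disambiguate columns that share a name.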