Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor evaluation logic #286

Merged
merged 4 commits into from
Dec 20, 2021
Merged

Conversation

earmenda
Copy link
Contributor

@earmenda earmenda commented Dec 10, 2021

Closes #211
Closes #264

Code Changes

Steps to Confirm

  • New unit tests added for missing key validation
  • Existing unit tests test including evaluation

Pre-Merge Checklist

  • All CI Pipelines Succeeded
  • Documentation Updated

Description Of Changes

  • The reason that we can now remove the old missing validation and replace it with a much simpler one is that due to a recent change FidesKey is now reserved for top level models. This is how it was meant to be modeled to begin with.
  • Updates to evaluation logic so evaluation can output which fields triggered a failure

@earmenda earmenda added this to the fidesctl 1.2 milestone Dec 10, 2021
@earmenda earmenda self-assigned this Dec 10, 2021
@earmenda
Copy link
Contributor Author

@ThomasLaPiana

A couple notes on this:

  • I worked on trying to reduce the number of api calls from evaluate but unfortunately I couldn't think of a way of getting the resource type when we populate fides keys. I thought originally we could get it from the type in get_referenced_missing_keys but this finds nested keys which are not of that type.
  • I wasn't able to get profiling to work by using the fidesctl command, this seemed to work alot easier through code
  • Add actionable failure messaging to evaluate  #264 I can add to this, should be easy

@earmenda
Copy link
Contributor Author

Made code change to address #264

@ThomasLaPiana
Copy link
Contributor

@earmenda do we have documentation anywhere about how to auto-generate the migrations? I feel like we should add that in somewhere, just since it is definitely not an obvious thing

@earmenda
Copy link
Contributor Author

@ThomasLaPiana Not at the moment. I can add a new section to the development markdown page. I could see it working under Write your code we could add something like To learn how to migrate the database for schema changes, see the database migration guide

It would just be short at the moment since we just need to document something like this:
alembic revision --autogenerate -m "message"

I do think in the future there will be more things we will need to document around this though.

@earmenda
Copy link
Contributor Author

@ThomasLaPiana

After the latest changes this is what the output of a failing evaluation looks like:

root@8d4b52631402:/fides/fidesctl# fidesctl evaluate --message "Testing new evaluation output" demo_resources/
Loading resource manifests from: demo_resources/
Taxonomy successfully created.
----------
Processing policy resources...
CREATED 0 policy resources.
UPDATED 0 policy resources.
SKIPPED 1 policy resources.
----------
Processing registry resources...
CREATED 0 registry resources.
UPDATED 0 registry resources.
SKIPPED 1 registry resources.
----------
Processing system resources...
CREATED 0 system resources.
UPDATED 0 system resources.
SKIPPED 2 system resources.
----------
Processing dataset resources...
CREATED 0 dataset resources.
UPDATED 0 dataset resources.
SKIPPED 1 dataset resources.
----------
Loading resource manifests from: demo_resources/
Taxonomy successfully created.
Evaluating the following Policies:
- demo_privacy_policy
----------
Checking for missing resources...
Executing Policy evaluation(s)...
Sending the evaluation results to the server...
{
  "fides_key": "43551730_cb99_4dfa_888a_81c53ac9c72f",
  "status": "FAIL",
  "violations": [
    {
      "violating_attributes": {
        "data_category": "user.provided.identifiable.contact",
        "data_subject": "customer",
        "data_qualifier": "aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified",
        "data_use": "advertising"
      },
      "detail": "Declaration (Collect data for marketing) of System (demo_marketing_system) failed Rule (Reject Direct Marketing) from Policy (demo_privacy_policy). Violated usage of user.provided.identifiable.contact data qualified by aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified for data use advertising and subject customer"
    }
  ],
  "message": "Testing new evaluation output"
}

A few callouts on the current state though:

  • I did not create a new high level object for Violation. This might be something we need in the future though if we need to reference violations by an id. Let me know if you think we should just do this now.
  • My first stab at this was harder than I expected and it made the compare_rule_to_declaration a little more complicated than I was hoping. Now that we care about which keys fail, the old logic doesn't work as nicely. I can try to simplify it today if i find some time.

@earmenda earmenda marked this pull request as ready for review December 15, 2021 17:50
@ThomasLaPiana
Copy link
Contributor

@earmenda that output looks great! going to give the code another review and we'll get this merged, this definitely satisfies the requirements

Copy link
Contributor

@ThomasLaPiana ThomasLaPiana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Tested locally, all was working.

@ThomasLaPiana
Copy link
Contributor

@earmenda you're good to merge when ready!

@earmenda
Copy link
Contributor Author

@ThomasLaPiana

Thanks for the review. I just pushed a few other changes

  • Added a database_migration.md page to document how to migrate the database
  • Refactored compare_rule_to_declaration a bit so it's more straightforward . Also added some more comments
  • Improved tests for compare_rule_to_declaration to check values
  • ViolationAttributes now has a list of categories / uses / subjects
  • Touched up the detail string so it's a little more consistent

This is ready for merge so if you don't see anything you'd like to see changed then feel free to merge it in or approve again.

root@51540454edf5:/fides/fidesctl# fidesctl evaluate demo_resources/
Loading resource manifests from: demo_resources/
Taxonomy successfully created.
----------
Processing policy resources...
CREATED 0 policy resources.
UPDATED 0 policy resources.
SKIPPED 1 policy resources.
----------
Processing system resources...
CREATED 0 system resources.
UPDATED 0 system resources.
SKIPPED 2 system resources.
----------
Processing registry resources...
CREATED 0 registry resources.
UPDATED 0 registry resources.
SKIPPED 1 registry resources.
----------
Processing dataset resources...
CREATED 0 dataset resources.
UPDATED 0 dataset resources.
SKIPPED 1 dataset resources.
----------
Loading resource manifests from: demo_resources/
Taxonomy successfully created.
Evaluating the following policies:
- demo_privacy_policy
----------
Checking for missing resources...
Executing Policy evaluation(s)...
Sending the evaluation results to the server...
{
  "fides_key": "57118033_1eae_4973_bbce_df3cc0898d02",
  "status": "FAIL",
  "violations": [
    {
      "violating_attributes": {
        "data_categories": [
          "user.provided.identifiable.contact",
          "user.provided.identifiable"
        ],
        "data_subjects": [
          "customer"
        ],
        "data_uses": [
          "advertising"
        ],
        "data_qualifier": "aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified"
      },
      "detail": "Declaration (Collect data for marketing) of system (demo_marketing_system) failed rule (Reject Direct Marketing) from policy (demo_privacy_policy). Violated usage of data categories (user.provided.identifiable.contact,user.provided.identifiable) with qualifier (aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified) for data uses (advertising) and subjects (customer)"
    }
  ],
  "message": "Testing evaluation output"
}

@ThomasLaPiana ThomasLaPiana merged commit 273f12a into main Dec 20, 2021
@ThomasLaPiana ThomasLaPiana deleted the earmenda-evaluation-refactoring branch December 20, 2021 16:36
ThomasLaPiana pushed a commit that referenced this pull request Aug 17, 2022
* Implementations of offset, link, and cursor pagination

* Adding pagination to SaaS connector workflow
Updating documentation and Postman collection

* Fixing Pylint warning

* Updating unwrap postprocessor to accepts lists in addition to dicts
Accounting for the use case where the list of objects is at the root level of the response and does not need a data_path

* Adding missing test case

Co-authored-by: Adrian Galvan <adrian@ethyca.com>
ThomasLaPiana pushed a commit that referenced this pull request Sep 26, 2022
* Implementations of offset, link, and cursor pagination

* Adding pagination to SaaS connector workflow
Updating documentation and Postman collection

* Fixing Pylint warning

* Updating unwrap postprocessor to accepts lists in addition to dicts
Accounting for the use case where the list of objects is at the root level of the response and does not need a data_path

* Adding missing test case

Co-authored-by: Adrian Galvan <adrian@ethyca.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add actionable failure messaging to evaluate Refactor Evaluation logic
2 participants