Export existing fidesctl resource metadata to csv file #299

Closed
iamkelllly opened this issue Jan 5, 2022 · 0 comments · Fixed by #317
iamkelllly commented Jan 5, 2022

In an organization, it would be helpful to export the rich metadata that fidesctl already captures in Registry, System, and Dataset resources so that it can provide the foundation for a data map (otherwise known as a data inventory, or data flow map).

This issue covers the following target state experience:

  • A fidesctl developer user runs the fidesctl export command, e.g. fidesctl export path/to/folder/
  • A .csv file is written to the specified folder, containing Organization, System, Registry, and Dataset metadata, including:
    • organization.name, address, email, phone
    • organization.dponame, dpoaddress, dpoemail, dpophone
    • organization.representative, repaddress, repemail, repphone
    • organization.securitypolicy (url)
    • system, dataset: export only what we already capture (resource name, use, subject type, data categories, qualifier)

This will be complete when the exported CSV file looks similar to the example spreadsheet here: https://docs.google.com/spreadsheets/d/1AItjzt2DOvCyG9my2wDlUkiTFX8wr9w1gLpNG8VzwTE/edit#gid=0
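
As a rough illustration of the export described above, here is a minimal Python sketch that flattens resource metadata into one CSV row per resource. It is not the fidesctl implementation: the column names, the `write_export` and `export_to_string` helpers, and the choice to join multiple data categories with "; " are all assumptions made for the example, loosely based on the fields listed in this issue.

```python
import csv
import io
from typing import Dict, Iterable, List, TextIO

# Assumed column names, loosely derived from the fields listed in the issue.
# The real export may name, order, or extend them differently.
COLUMNS: List[str] = [
    "organization.name",
    "organization.securitypolicy",
    "resource_name",
    "use",
    "subject_type",
    "data_categories",
    "qualifier",
]


def write_export(rows: Iterable[Dict[str, object]], out: TextIO) -> None:
    """Write a header plus one CSV row per resource to `out`.

    Fields missing from a row become empty cells, and keys that are not
    in COLUMNS are ignored.
    """
    writer = csv.DictWriter(
        out,
        fieldnames=COLUMNS,
        restval="",
        extrasaction="ignore",
        lineterminator="\n",
    )
    writer.writeheader()
    for row in rows:
        flat = dict(row)
        # Join multi-valued fields so each resource stays on a single row.
        categories = flat.get("data_categories")
        if isinstance(categories, (list, tuple)):
            flat["data_categories"] = "; ".join(categories)
        writer.writerow(flat)


def export_to_string(rows: Iterable[Dict[str, object]]) -> str:
    """Convenience wrapper that returns the CSV text instead of writing a file."""
    buffer = io.StringIO()
    write_export(rows, buffer)
    return buffer.getvalue()
```

To write into a folder as the issue describes, the same helper can be pointed at a file, for example `with open("path/to/folder/export.csv", "w", newline="") as f: write_export(rows, f)`, where the file name `export.csv` is again only a placeholder.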

iamkelllly added this to the Backlog milestone Jan 5, 2022
iamkelllly modified the milestones: Backlog, fidesctl 1.3.0 Feb 10, 2022
ThomasLaPiana pushed a commit that referenced this issue Aug 17, 2022
* Allow incoming fields to be specified on a saas config as being dependent on each other, not treated as an independent list of values.

- Update graph_task.pre_process_input_data to be able to optionally separate independent fields from dependent fields when processing incoming data into a collection.

* Add a test for pre_process_input_data when group_dependent_fields is set to True.

- Fix bug where nesting of adding data to output is happening in the wrong place.

* Add validation that grouped_inputs must all reference fields from the same collection.

* Fix bug where empty dict was being added to array.

* Fix bad YAML nesting and the fact that some extra endpoints were added in the saas config test.

* Fix potential bug where collection name doesn't exist because it didn't pass validation.

* Add a test confirming if no grouped_input fields are specified, "fidesops_grouped_inputs" key just returns an empty list.

* Grouped_inputs fields may not exist.

* Allow grouped inputs to be reference or identity fields.

* Put building the dataset graphs within the try/except because if this fails, this will be swallowed and difficult to debug.

* Remove post-processor item that is being handled by separate PR.

* Responding to CR - when storing grouped_inputs on internal collections, use set representation.

* Set FIDESOPS_GROUPED_INPUTS key regardless.

* Add the fidesops_grouped_inputs keys - they are now included in all outputs.

- Switch the issubset.

* Change grouped_inputs list->set type where we merge collections for saas configs.

* Fix test after merge.
ThomasLaPiana pushed a commit that referenced this issue Sep 26, 2022