Update evaluation to populate taxonomy recursively and respect hierarchies (#170)

* Update populating taxonomy to be recursive
* Consider hierarchy of type in evaluations
* Fix hydration of taxonomy to find all keys in model
* Add check for policy rule action to only currently support REJECT for evaluation
* Fix typing in function definitions
* Remove qualifier lists from data set based models
* Fix pylint errors
* small documentation update
* Add evaluation of dataset references
* Add validation for missing evaluation resources
* Add validation for missing parent_keys and make eval output consistent
* Replace instances of identified_data and pseudonymized_data qualifiers with newer version
* Update evaluation output to be more consistent
* Add hierarchical tests for compare_rule_to_declaration
* Add tests for validate_fides_keys_exist_for_evaluation and get_dataset_by_fides_key
* Add tests for validate_supported_policy_rules and get_fides_key_parent_hierarchy
* Add tests for finding nested missing keys
* Fix evaluate validation to include policy rule keys and add tests for dataset evaluation
* Add tests for recursive use of populate_referenced_keys
* Squashed commit of the following:

commit b84e206
Author: CarpeFridiem <brenton@ethyca.com>
Date: Thu Oct 28 18:14:13 2021 -0400

    Brenton interactive generate (#176)

    * Basic Functionality in Place
    * Added initial UI
    * Added viz url getter
    * Amended mysql table filter
    * Update Visualize Endpoint
    * Combined graphs into one html page and api endpoint
    * Added associated testing
    * TODO: currently only visualize default taxonomy, need to update to read all taxonomy from server
    * Add Annotate CLI
    * Added commentary for annotate-all
    * Added annotate option to generate_dataset
    * Added new cli command annotate_dataset
    * Visualization for Default and Total Taxonomy
    * Added functionality to support full taxonomy for visualization
    * Added viz title change accordingly
    * FTL
    * Formatting
    * Type Hinting
    * Linting
    * Minor bug fixes
    * An attempt was made to satisfy Xenon but not sure that's going to happen
    * Xenon fix for visualize [ci-skip]
    * Xenon and Pylint Fixes
    * Added annotate to xenon ignore (no space between comma separated items, smh)
    * Reduced complexity of visualization
    * Fix Test
    * Fix Skipping Bug
    * refactor crud.py to use crud functions instead of embedding the logic directly into the endpoint function
    * clean up the visualize logic now that crud.py is cleaned up
    * make the generate_dataset function unaware of annotate_dataset function, move it to the CLI side instead
    * add fideskey validation to the data categories
    * pylint fixes
    * Remove Default Flag for Visualizer
    * Make Annotate Dataset Separate Command [ci skip]
    * User input validation
    * Add option to validate user input data categories for both format and taxonomy compliance
    * Fixed cli docstring to meet fastapi help expectations [ci skip]
    * updating cli documentation
    * Input Validation Minor Fixes and FTL
    * Added optional data category validation for annotating a dataset
    * Removed viz hover text for tidiness
    * Removed default taxonomy only option for resource viz
    * Format, Type Hint, Lint
    * Fix Tests
    * Check-in Multiple Datasets
    * Main functionality worked out
    * TODO: Test
    * Support Multiple Datasets
    * Refactored Testing
    * Fix output dataset yaml formatting
    * Formatting and Linting
    * update how validation works, known but in the writing manifest part
    * clean up the looping logic in the annotate-dataset function, fixes the bug around blank files getting written out
    * update the validation flag for `annotate-dataset`

    Co-authored-by: Thomas La Piana <tal103020@icloud.com>
    Co-authored-by: Kelly Huang <kelly@ethyca.com>

commit 318f243
Author: kelly <85575406+iamkelllly@users.noreply.github.com>
Date: Thu Oct 28 16:43:10 2021 -0400

    Delete x

    "x" slipped through the crax of the API CSS PR.

commit 26903b6
Author: Cillian <1268052+cilliankieran@users.noreply.github.com>
Date: Tue Oct 26 21:10:53 2021 +0100

    1026 ck doc updates (#184)

    * Updates to SVG renders in dark mode and CSS adjustments to fix code window css
    * Remove unused stylesheet.css
    * Update CLI CSS
    * Format CSS
    * Tidying pass of unnecessary css
    * Separate and tidy CSS
    * Include taxonomy CSS

    Co-authored-by: Neville Samuell <neville@ethyca.com>

commit a74fc12
Author: Neville Samuell <neville@ethyca.com>
Date: Tue Oct 26 12:27:52 2021 -0400

    Update API CSS file for docs site to be consistent with fidesops (and rescope global overrides) (#183)

    * Rename swagger_override.css to api.css for consistency
    * Include (re-scoped) API CSS from fidesops repo

commit f7fbdea
Author: dougfulton <lbj.kgb@gmail.com>
Date: Tue Oct 26 11:14:10 2021 -0400

    Get rid of 'try it out' and Visualize block (#181)

commit efabfd2
Author: dougfulton <lbj.kgb@gmail.com>
Date: Tue Oct 26 09:39:18 2021 -0400

    restored cli css (#182)

    Co-authored-by: douglas fulton <dfulton@ad1.systemadmin.com>

commit 255a9a9
Author: kelly <85575406+iamkelllly@users.noreply.github.com>
Date: Mon Oct 25 20:33:36 2021 -0400

    Update fides-logo.svg

commit e7768a8
Author: kelly <85575406+iamkelllly@users.noreply.github.com>
Date: Mon Oct 25 20:26:22 2021 -0400

    Standardizing css with fidesops

commit 7ded5af
Author: dougfulton <lbj.kgb@gmail.com>
Date: Mon Oct 25 18:07:37 2021 -0400

    API CSS (#178)

    * Added more extensive help doc to cli.py and options.py.
    * updating CLI docs
    * Update options.py
    * Update cli.py black didn't like the whitespace.
    * api and css
    * Update index.md

    Co-authored-by: douglas fulton <dfulton@ad1.systemadmin.com>
    Co-authored-by: Kelly Huang <kelly@ethyca.com>

commit 269459e
Author: dougfulton <lbj.kgb@gmail.com>
Date: Fri Oct 22 01:06:29 2021 -0400

    Added more extensive help doc to cli.py and options.py. (#175)

    * Added more extensive help doc to cli.py and options.py.
    * updating CLI docs
    * Update options.py
    * Update cli.py black didn't like the whitespace.

    Co-authored-by: douglas fulton <dfulton@ad1.systemadmin.com>
    Co-authored-by: Kelly Huang <kelly@ethyca.com>

commit d9dc784
Author: Adrian Galvan <adriang430@gmail.com>
Date: Thu Oct 21 19:57:06 2021 -0700

    Fixing stylesheet so dark mode headers can be a separate color from the default light theme (#177)

    Co-authored-by: Adrian Galvan <adrian@ethyca.com>

commit 8bdbf89
Author: dougfulton <lbj.kgb@gmail.com>
Date: Thu Oct 21 15:35:58 2021 -0400

    Test: cli directory with styled man pages (#157)

    * man
    * cli commands
    * removing dob property
    * more cli commands
    * more cli
    * updates
    * more
    * more
    * Added missing cli commands to the pretty cli doc. The only one that's left (that I know of) is generate-dataset
    * resolving merge conflicts

    Co-authored-by: douglas fulton <dfulton@ad1.systemadmin.com>
    Co-authored-by: Kelly Huang <kelly@ethyca.com>

commit a452378
Author: kelly <85575406+iamkelllly@users.noreply.github.com>
Date: Tue Oct 19 09:45:15 2021 -0400

    Update fides-logo.svg

* Add tests for evaluating dataset/dataset collection/dataset field

Co-authored-by: Eduardo Armendariz <eduardo@ethyca.com>
Co-authored-by: Thomas La Piana <tal103020@icloud.com>
ethyca · Oct 29, 2021 · acd4f1a
1 parent 81e1524
commit acd4f1a
Showing 23 changed files with 1,458 additions and 468 deletions.
diff --git a/README.md b/README.md
@@ -71,7 +71,7 @@ If you're looking for a more detailed introduction to Fides, we recommend follow
             data_use: improve.system
             data_subjects:
               - customer
-            data_qualifier: identified_data
+            data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
             dataset_references:
               - demo_users_dataset
 
@@ -87,7 +87,7 @@ If you're looking for a more detailed introduction to Fides, we recommend follow
             data_use: advertising
             data_subjects:
               - customer
-            data_qualifier: identified_data
+            data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
     ```
 
 1. Run `fidesctl evaluate demo_resources`. This will parse all the resource files, sync them to the `fidesctl` server, and then evaluate the defined policy rules to ensure all the systems are compliant:
@@ -153,7 +153,7 @@ If you're looking for a more detailed introduction to Fides, we recommend follow
               inclusion: ANY
               values:
                 - customer
-            data_qualifier: identified_data
+            data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
             action: REJECT
     ```
 

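The README hunks above swap the flat `identified_data` qualifier for a dotted key. As a minimal sketch of how such a key could be expanded into its ancestors for hierarchical evaluation, the hypothetical helper `key_hierarchy` below lists a key followed by each of its parents. It assumes ancestry is encoded purely by the dot-separated segments; the commit's own `get_fides_key_parent_hierarchy` may instead follow `parent_key` references in the taxonomy.

```python
from typing import List


def key_hierarchy(fides_key: str) -> List[str]:
    """Return the key followed by each ancestor, most specific first.

    Hypothetical helper: assumes the dotted segments of a key encode
    its position in the hierarchy.
    """
    parts = fides_key.split(".")
    return [".".join(parts[:i]) for i in range(len(parts), 0, -1)]


qualifier = "aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified"
for key in key_hierarchy(qualifier):
    print(key)
```

Under that assumption, a rule written against a shorter key such as `aggregated.anonymized` would also apply to any declaration whose qualifier sits beneath it.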
diff --git a/docs/fides/docs/language/resources.md b/docs/fides/docs/language/resources.md
@@ -215,7 +215,7 @@ system:
         data_use: improve.system
         data_subjects:
           - customer
-        data_qualifier: identified_data
+        data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
         dataset_references:
           - demo_users_dataset
     system_dependencies: []
@@ -346,7 +346,7 @@ policy:
           inclusion: ANY
           values:
             - customer
-        data_qualifier: identified_data
+        data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
         action: REJECT
 ```
 
@@ -401,7 +401,7 @@ Fides uses a matching algorithm to determine whether or not each Privacy Declara
     inclusion: ANY
     values:
       - customer
-  data_qualifier: identified_data
+  data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
   action: REJECT
 
 # Example Privacy Declaration:
@@ -412,7 +412,7 @@ Fides uses a matching algorithm to determine whether or not each Privacy Declara
   data_use: advertising
   data_subjects:
     - customer
-  data_qualifier: identified_data
+  data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
 
 # Example Evaluation Logic:
 
@@ -448,7 +448,7 @@ There is a match, and the Privacy Declaration evaluates to REJECT!
     inclusion: ANY
     values:
       - customer
-  data_qualifier: identified_data
+  data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
   action: REJECT
 
 # Example Privacy Declaration:
@@ -458,7 +458,7 @@ There is a match, and the Privacy Declaration evaluates to REJECT!
   data_use: advertising
   data_subjects:
     - customer
-  data_qualifier: identified_data
+  data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
 
 # Example Evaluation Logic:
 

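The matching examples in `resources.md` compare a rule's values against a declaration's values with `inclusion: ANY`. A rough sketch of what a hierarchy-aware version of that comparison might look like follows. The function names `ancestors` and `matches_any` are hypothetical, only the `ANY` inclusion shown in the docs is handled, and hierarchy is assumed to follow the dotted key segments rather than a taxonomy lookup.

```python
from typing import Iterable, List


def ancestors(key: str) -> List[str]:
    """Return the key and every dotted-prefix ancestor (assumed hierarchy)."""
    parts = key.split(".")
    return [".".join(parts[:i]) for i in range(len(parts), 0, -1)]


def matches_any(rule_values: Iterable[str], declared: Iterable[str]) -> bool:
    """True if any declared key equals, or descends from, a rule value."""
    rule_set = set(rule_values)
    return any(a in rule_set for key in declared for a in ancestors(key))


# A rule on the parent use "improve" catches the child "improve.system".
print(matches_any(["improve"], ["improve.system"]))
# An unrelated use does not match.
print(matches_any(["advertising"], ["improve.system"]))
```

Splitting on `.` rather than using a plain string-prefix test keeps a rule value like `imp` from accidentally matching `improve.system`.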
diff --git a/docs/fides/docs/tutorial/policy.md b/docs/fides/docs/tutorial/policy.md
@@ -51,7 +51,7 @@ policy:
           inclusion: ANY
           values:
             - customer
-        data_qualifier: identified_data
+        data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
         action: REJECT
       - fides_key: allow_anon_analytics
         name: Use only anonymized data for analytics

diff --git a/docs/fides/docs/tutorial/system.md b/docs/fides/docs/tutorial/system.md
@@ -39,7 +39,7 @@ system:
         data_use: provide_product_or_service
         data_subjects:
           - customer
-        data_qualifier: identified_data
+        data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
         dataset_references:
           - appdb
 
@@ -55,7 +55,7 @@ system:
         data_use: improve_product_or_service
         data_subjects:
           - customer
-        data_qualifier: identified_data
+        data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
 ```
 
 As you can see, the system is comprised of Privacy Declarations. These can be read colloquially as "This system uses sensitive data types of `data_categories` for `data_subjects` with the purpose of `data_use` at a deidentification level of `data_qualifier`". 

diff --git a/fidesctl/demo_resources/demo_policy.yml b/fidesctl/demo_resources/demo_policy.yml
@@ -18,5 +18,5 @@ policy:
           inclusion: ANY
           values:
             - customer
-        data_qualifier: identified_data
+        data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
         action: REJECT
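The commit message notes that evaluation currently supports only the `REJECT` action. A minimal sketch of that kind of pre-evaluation check is below. The function `unsupported_rules`, the dictionary shape of a rule, and the `OTHER` action are illustrative placeholders, not the signature of the commit's `validate_supported_policy_rules`.

```python
from typing import Dict, List

# Per the commit message, only REJECT is supported for evaluation.
SUPPORTED_ACTIONS = {"REJECT"}


def unsupported_rules(rules: List[Dict[str, str]]) -> List[str]:
    """Return the fides_key of every rule whose action cannot be evaluated."""
    return [
        rule["fides_key"]
        for rule in rules
        if rule.get("action") not in SUPPORTED_ACTIONS
    ]


rules = [
    {"fides_key": "example_reject_rule", "action": "REJECT"},
    {"fides_key": "example_other_rule", "action": "OTHER"},  # placeholder action
]
print(unsupported_rules(rules))
```

Returning the offending keys, rather than raising on the first one, lets a caller report every unsupported rule in a single pass.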
diff --git a/fidesctl/demo_resources/demo_system.yml b/fidesctl/demo_resources/demo_system.yml
@@ -11,7 +11,7 @@ system:
         data_use: improve.system
         data_subjects:
           - customer
-        data_qualifier: identified_data
+        data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
         dataset_references:
           - demo_users_dataset
 
@@ -27,4 +27,4 @@ system:
         data_use: advertising
         data_subjects:
           - customer
-        data_qualifier: identified_data
+        data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
diff --git a/...tl/src/fidesapi/migrations/versions/45c7a349db68_remove_qualifier_lists_from_data_set_.py b/...tl/src/fidesapi/migrations/versions/45c7a349db68_remove_qualifier_lists_from_data_set_.py
@@ -0,0 +1,38 @@
+"""Remove qualifier lists from data set models
+
+Revision ID: 45c7a349db68
+Revises: 732105cd54e3
+Create Date: 2021-10-25 17:59:25.244689
+
+"""
+from alembic import op
+import sqlalchemy as sa
+from sqlalchemy.dialects import postgresql
+
+# revision identifiers, used by Alembic.
+revision = "45c7a349db68"
+down_revision = "732105cd54e3"
+branch_labels = None
+depends_on = None
+
+
+def upgrade():
+    # ### commands auto generated by Alembic - please adjust! ###
+    op.add_column("datasets", sa.Column("data_qualifier", sa.String(), nullable=True))
+    op.drop_column("datasets", "data_qualifiers")
+    # ### end Alembic commands ###
+
+
+def downgrade():
+    # ### commands auto generated by Alembic - please adjust! ###
+    op.add_column(
+        "datasets",
+        sa.Column(
+            "data_qualifiers",
+            postgresql.ARRAY(sa.VARCHAR()),
+            autoincrement=False,
+            nullable=True,
+        ),
+    )
+    op.drop_column("datasets", "data_qualifier")
+    # ### end Alembic commands ###
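As autogenerated, this migration adds the scalar `data_qualifier` column and drops the `data_qualifiers` array without copying anything across, so existing values are discarded in both directions. If a backfill were wanted, one value per row would have to be chosen. The helper below is a hypothetical sketch of one such choice (take the first entry); it is an assumption for illustration and is not part of the migration.

```python
from typing import List, Optional


def collapse_qualifiers(data_qualifiers: Optional[List[str]]) -> Optional[str]:
    """Pick a single value for the new scalar column from the old array.

    Assumed policy: keep the first entry, or None when the array is
    missing or empty.
    """
    if not data_qualifiers:
        return None
    return data_qualifiers[0]


print(collapse_qualifiers(["identified_data", "pseudonymized_data"]))
print(collapse_qualifiers(None))
```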
diff --git a/fidesctl/src/fidesapi/sql_models.py b/fidesctl/src/fidesapi/sql_models.py
@@ -79,7 +79,7 @@ class Dataset(SqlAlchemyBase, FidesBase):
 
     meta = Column(JSON)
     data_categories = Column(ARRAY(String))
-    data_qualifiers = Column(ARRAY(String))
+    data_qualifier = Column(String)
     collections = Column(JSON)