Issue/72 new api allow for full fledged processing of protection profiles #466

Draft: wants to merge 6 commits into main

adamjanovsky
Collaborator

Closes #72

@adamjanovsky adamjanovsky self-assigned this Jan 23, 2025
@adamjanovsky
Collaborator Author

@J08nY the first batch of commits, which refactors the Dataset classes, is in. Unfortunately, I had to merge a fresh main into this branch, so a lot of unrelated changes show up. It is better to look only at the 2165a66 commit to assess the design.

Some of my early design notes:

  • Dataset class will have an aux_handlers attribute that accepts a list of instances implementing the AuxiliaryDatasetHandler interface. The interface takes the form of an ABC base class. These days I'd opt for Protocol, but to stay coherent with the old implementation, let's stick with inheritance.
  • The AuxiliaryDatasetHandler interface defines process_dataset.
  • ProtectionProfile dataset can thus inherit from Dataset class and implement no handlers.
  • Each auxiliary dataset will come with its own handler. This enables code re-use between FIPSDataset and CCDataset classes. Any subclass of Dataset class will simply populate its handlers with the required logic.
  • Computation of individual heuristics is outsourced into functions (not part of any class).
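
The design notes above could be sketched roughly as follows. This is a minimal illustration, not the code from the PR: only `Dataset`, `aux_handlers`, `AuxiliaryDatasetHandler`, and `process_dataset` come from the notes, while `DummyHandler`, `process_auxiliary_datasets`, `ProtectionProfileDataset`, and the constructor signature are assumptions made for the example. Storing the handlers in a dictionary keyed by type is likewise an assumption, based on how they are looked up later in this thread.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class AuxiliaryDatasetHandler(ABC):
    """Interface that every auxiliary dataset handler implements."""

    @abstractmethod
    def process_dataset(self) -> None:
        """Download or build the auxiliary dataset."""


class DummyHandler(AuxiliaryDatasetHandler):
    """Hypothetical handler used only for this illustration."""

    def __init__(self) -> None:
        self.processed = False

    def process_dataset(self) -> None:
        self.processed = True


class Dataset:
    """Simplified stand-in for the real Dataset base class."""

    def __init__(self, aux_handlers: list[AuxiliaryDatasetHandler] | None = None) -> None:
        # Assumption: handlers are keyed by their type for lookup by class.
        self.aux_handlers = {type(h): h for h in (aux_handlers or [])}

    def process_auxiliary_datasets(self) -> None:
        # Hypothetical method name: delegate the work to each handler.
        for handler in self.aux_handlers.values():
            handler.process_dataset()


class ProtectionProfileDataset(Dataset):
    """A dataset that inherits from Dataset and registers no handlers."""


dset = Dataset(aux_handlers=[DummyHandler()])
dset.process_auxiliary_datasets()
pp_dset = ProtectionProfileDataset()
```

A subclass such as a CC or FIPS dataset would pass its own list of handlers, which is where the code re-use between the two comes from.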

Member

@J08nY J08nY left a comment

Looks OK. But still has conflicts with main. Is the merge commit a real merge commit?

Comment on lines -167 to -187
@property
def pp_dataset_path(self) -> Path:
    """
    Returns a path to the dataset of Protection Profiles
    """
    return self.auxiliary_datasets_dir / "pp_dataset.json"

@property
def mu_dataset_dir(self) -> Path:
    """
    Returns directory that holds dataset of maintenance updates
    """
    return self.auxiliary_datasets_dir / "maintenances"

@property
def mu_dataset_path(self) -> Path:
    """
    Returns a path to the dataset of maintenance updates
    """
    return self.mu_dataset_dir / "maintenance_updates.json"

Member

I need a way to get to these somehow, when handling the updates on the site. Can I do it with the new API? Iterate over aux_handlers and get their paths and detect based on type?

Collaborator Author

The handlers form a dictionary keyed by handler class, and each child of AuxiliaryDatasetHandler must implement:

  • root_dir attribute, where the dataset files are stored.
  • dset_path attribute, where the final json is stored.

For example, in the case of CPEDataset you can do:

cpe_dset = cc_dset.aux_handlers[CPEDatasetHandler].dset
cpe_dset_json_path = cc_dset.aux_handlers[CPEDatasetHandler].dset_path
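
To address the question about iterating over the handlers and detecting them by type, a loop along these lines might work. This is only a sketch with simplified stand-in classes: `ProtectionProfileDatasetHandler`, the constructor, the `collect_dset_paths` helper, and the directory and file names are hypothetical and not taken from the PR. Only `root_dir`, `dset_path`, `aux_handlers`, and `CPEDatasetHandler` appear in the discussion.

```python
from __future__ import annotations

from pathlib import Path


class AuxiliaryDatasetHandler:
    """Simplified stand-in; the real base class is an ABC with more members."""

    def __init__(self, root_dir: Path, json_name: str) -> None:
        self.root_dir = root_dir  # where the dataset files are stored
        self.dset_path = root_dir / json_name  # where the final json is stored


class CPEDatasetHandler(AuxiliaryDatasetHandler):
    pass


class ProtectionProfileDatasetHandler(AuxiliaryDatasetHandler):
    """Hypothetical handler name, used for illustration only."""


# Placeholder paths: the real locations are defined by the handlers themselves.
aux_dir = Path("auxiliary_datasets")
aux_handlers: dict[type, AuxiliaryDatasetHandler] = {
    CPEDatasetHandler: CPEDatasetHandler(aux_dir / "cpe", "cpe.json"),
    ProtectionProfileDatasetHandler: ProtectionProfileDatasetHandler(aux_dir / "pp", "pp.json"),
}


def collect_dset_paths(handlers: dict[type, AuxiliaryDatasetHandler]) -> dict[str, Path]:
    """Map each handler class name to the path of its final json."""
    return {cls.__name__: handler.dset_path for cls, handler in handlers.items()}


paths = collect_dset_paths(aux_handlers)

# Detecting a specific handler by type while iterating:
pp_paths = [
    h.dset_path
    for h in aux_handlers.values()
    if isinstance(h, ProtectionProfileDatasetHandler)
]
```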

Comment on lines -259 to -264
if self.auxiliary_datasets.pp_dset:
    self.auxiliary_datasets.pp_dset.json_path = self.pp_dataset_path

if self.auxiliary_datasets.mu_dset:
    self.auxiliary_datasets.mu_dset.root_dir = self.mu_dataset_dir

Member

Is this set properly in the new API?

Collaborator Author

Yes, see

def _set_local_paths(self) -> None:
    for handler in self.aux_handlers.values():
        handler.set_local_paths(self.auxiliary_datasets_dir)
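
For context, a handler-side `set_local_paths` might look something like the sketch below. It assumes the maintenance-update handler keeps the directory and file names from the removed properties (`maintenances` and `maintenance_updates.json`). The PR may do this differently, and the simplified `Dataset` constructor and the `auxiliary_datasets` directory name are assumptions made for the example.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from pathlib import Path


class AuxiliaryDatasetHandler(ABC):
    root_dir: Path
    dset_path: Path

    @abstractmethod
    def set_local_paths(self, aux_datasets_dir: Path) -> None:
        """Derive root_dir and dset_path from the parent's auxiliary directory."""


class CCMaintenanceUpdateDatasetHandler(AuxiliaryDatasetHandler):
    def set_local_paths(self, aux_datasets_dir: Path) -> None:
        # Assumption: same layout as the old mu_dataset_dir / mu_dataset_path.
        self.root_dir = aux_datasets_dir / "maintenances"
        self.dset_path = self.root_dir / "maintenance_updates.json"


class Dataset:
    """Simplified stand-in for the real Dataset class."""

    def __init__(self, root_dir: Path, handlers: list[AuxiliaryDatasetHandler]) -> None:
        self.root_dir = root_dir
        self.aux_handlers = {type(h): h for h in handlers}
        self._set_local_paths()

    @property
    def auxiliary_datasets_dir(self) -> Path:
        # Assumption: directory name chosen for the example.
        return self.root_dir / "auxiliary_datasets"

    def _set_local_paths(self) -> None:
        for handler in self.aux_handlers.values():
            handler.set_local_paths(self.auxiliary_datasets_dir)


dset = Dataset(Path("cc"), [CCMaintenanceUpdateDatasetHandler()])
mu_handler = dset.aux_handlers[CCMaintenanceUpdateDatasetHandler]
```

With this shape, the dataset never needs per-handler path properties: each handler owns its layout and only receives the common parent directory.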

Comment on lines +279 to +282
self.aux_handlers[CCMaintenanceUpdateDatasetHandler].certs_with_updates = [  # type: ignore
    x for x in self if x.maintenance_updates
]
self.aux_handlers[CCSchemeDatasetHandler].only_schemes = {x.scheme for x in self}  # type: ignore
Member

It seems a bit weird that these two handlers have special stuff here.

Collaborator Author

Not sure how to take care of this differently. The handlers cannot access the Dataset instance they're bound to. If they must be provided with some specific configuration (e.g., to avoid downloading data for all schemes when only some are in the dataset), this information must be injected into the handlers.
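
The injection described here can be illustrated with a stripped-down handler. This is a sketch under stated assumptions: `ALL_SCHEMES`, its sample values, and `schemes_to_process` are hypothetical, and the real `CCSchemeDatasetHandler` downloads scheme data rather than returning a set. Only the `only_schemes` attribute and the idea of setting it from the dataset come from the diff.

```python
from __future__ import annotations


class CCSchemeDatasetHandler:
    """Simplified stand-in for the real handler."""

    # Hypothetical sample of known schemes, for illustration only.
    ALL_SCHEMES = frozenset({"DE", "FR", "US", "JP"})

    def __init__(self, only_schemes: set[str] | None = None) -> None:
        # None means "no restriction": every known scheme is processed.
        self.only_schemes = only_schemes

    def schemes_to_process(self) -> set[str]:
        if self.only_schemes is None:
            return set(self.ALL_SCHEMES)
        return set(self.ALL_SCHEMES) & self.only_schemes


handler = CCSchemeDatasetHandler()

# The dataset knows which schemes its certificates use and injects them,
# since the handler has no reference back to the dataset.
cert_schemes = ["DE", "FR", "DE"]
handler.only_schemes = set(cert_schemes)
selected = handler.schemes_to_process()
```

The handler stays independent of the dataset, at the cost of the dataset knowing which handlers need which extra configuration.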

@adamjanovsky
Collaborator Author

> Looks OK. But still has conflicts with main. Is the merge commit a real merge commit?

Meh, something was left out, should be fixed by now.


codecov bot commented Jan 23, 2025

Codecov Report

Attention: Patch coverage is 63.61746% with 175 lines in your changes missing coverage. Please review.

Project coverage is 68.23%. Comparing base (7407773) to head (0610c07).
Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...rc/sec_certs/dataset/auxiliary_dataset_handling.py | 54.19% | 82 Missing ⚠️ |
| src/sec_certs/utils/label_studio_utils.py | 0.00% | 56 Missing ⚠️ |
| src/sec_certs/heuristics/common.py | 85.34% | 11 Missing ⚠️ |
| src/sec_certs/dataset/dataset.py | 54.55% | 10 Missing ⚠️ |
| src/sec_certs/heuristics/cc.py | 89.71% | 7 Missing ⚠️ |
| src/sec_certs/dataset/cc.py | 86.21% | 4 Missing ⚠️ |
| src/sec_certs/dataset/fips.py | 76.48% | 4 Missing ⚠️ |
| src/sec_certs/dataset/cpe.py | 75.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #466      +/-   ##
==========================================
- Coverage   68.55%   68.23%   -0.32%     
==========================================
  Files          62       67       +5     
  Lines        7934     7961      +27     
==========================================
- Hits         5438     5431       -7     
- Misses       2496     2530      +34     


Development

Successfully merging this pull request may close these issues.

New API: Allow for full-fledged processing of protection profiles
2 participants