-
Notifications
You must be signed in to change notification settings - Fork 137
Add support for new OO-style scanners to ./scan #232
Conversation
…er versions. This means a lot of code repetition, but we can hopefully cut that down over time.
scanners/noopabc.py
Outdated
# Run locally. | ||
logging.debug("Subclass (noopabc) __init__ method for %s." % domain) | ||
self.extra_environment = {"variable": domain} | ||
super().__init__(domain, handles, environment, options, extra) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is more than I'm comfortable seeing scanner writers have to remember to do. The original function signature is just init(environment, options)
. I'd prefer to see an __init__
method can go in the base class as a non-abstract method, and init
remain as init(self, environment, options)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why they have to remember any of this; the __init__
method is in the base class and they would only need to write this for their scanner if they have per-domain functionality. Having __init__
and init
methods seems very confusing; if we want to take the per-domain out of __init__
, we could move it to a method called per_domain
or something similar.
Also, I'm assuming we would have a commented template for scanners that writers would be able to start from, rather than expecting them to remember what the methods and signatures are.
scanners/noopabc.py
Outdated
# that use the network or are otherwise expensive would go. | ||
# | ||
# Runs locally or in the cloud (Lambda). | ||
domain = self.domain # noqa |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why assign this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a stray line from earlier testing, I'll remove it.
scanners/scannerabc.py
Outdated
pass | ||
|
||
@abstractmethod | ||
def to_rows(self, data): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this meant to have a declaration of the return type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'll add that in.
scanners/scannerabc.py
Outdated
# Some subclasses may define an ``extra_environment`` property in their | ||
# __init__ methods, and we want to fold its values into the | ||
# ``environment`` property. | ||
self.environment.update(getattr(self, "extra_environment", {})) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There hasn't been an extra
or extra_environment
value used in scanners. I'd prefer to keep the number of moving parts smaller here if possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current structure of scan
effectively passes around global variables that act as an extra environment. It's possible there's an elegant way to refactor it, but I've tried to change the structure of scan
only minimally in this first pass. I'll take another pass through it and try to simplify this.
scanners/scannerabc.py
Outdated
@classmethod | ||
@abstractmethod | ||
def initialize_environment(cls: Type[ScannerABCT], environment: dict, | ||
options: dict) -> dict: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to see the call signature for scanners stay as it's been. I would also like to avoid passing TypeVar
's into individual scanners, with the goal of keeping the task of writing a scanner as simple and approachable as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why this is a factor; I'm assuming that we'll have a template for scanners that writers will copy from, meaning they won't have to worry about this type declaration. As for calling this class method, it remains the same, i.e. scan_class.initialize_environment(scan_class, environment, options)
. Am I missing a consideration here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm assuming that we'll have a template for scanners that writers will copy from, meaning they won't have to worry about this type declaration.
I'm still assuming scanners will be written by hand, and would like that to be as accessible as possible. But even when using a template, I'd like it to be clear what's going on in the file, and what each function means and does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for calling this class method, it remains the same, i.e. scan_class.initialize_environment(scan_class, environment, options). Am I missing a consideration here?
I think so? There's never been an initialize_environment(scan_class, environment, options)
function. Only an init(environment, options)
function.
…aration for turning gather into its own project. Refactor new scan OO appraoch so that what was “overall scan: class/per-domain: instance” is now “overall scan: instance/per-domain: instance.scan()”.
…the only values in handles will be scanner details, and that we know that Python 3 will preserve dict order.
@tadhg-ohiggins Is this ready for merge yet? I'm comfortable moving forward with the PR as it is. |
'duration': utils.just_microseconds(duration), | ||
'start_time': scan_utils.utc_timestamp(start_time), | ||
'end_time': scan_utils.utc_timestamp(end_time), | ||
'duration': scan_utils.just_microseconds(duration), | ||
'durations': durations, | ||
'command': start_command, | ||
'scan_uuid': scan_uuid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assigning this got moved lower in the script. Is this still set by this time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean the variables durations
, start_command
, and scan_uuid
? If so, yes, they're still in scope at this point. (start_command
is still a global variable.)
…3.6; back it out and check for path existence specifically.
… Path.open instead in domains_from.
The old and the new
Sometimes have to co-exist.
Let fitness decide.
This adds support for a new style of object-oriented scanner but still supports the older style. This means a lot of code repetition, but we can hopefully cut that down over time.
Supporting both for a while also means that we can incrementally switch them over.
This initial checkin only added an abstract base class approach for
noop.py
in the form ofnoopabc.py
as a proof-of-concept.