Skip to content
This repository has been archived by the owner on Dec 17, 2021. It is now read-only.

Add support for new OO-style scanners to ./scan #232

Merged
merged 14 commits into from
Apr 4, 2018
Merged

Conversation

tadhg-ohiggins
Copy link
Contributor

The old and the new
Sometimes have to co-exist.
Let fitness decide.

This adds support for a new style of object-oriented scanner but still supports the older style. This means a lot of code repetition, but we can hopefully cut that down over time.

Supporting both for a while also means that we can incrementally switch them over.

This initial checkin only added an abstract base class approach for noop.py in the form of noopabc.py as a proof-of-concept.

…er versions. This means a lot of code repetition, but we can hopefully cut that down over time.
# Run locally.
logging.debug("Subclass (noopabc) __init__ method for %s." % domain)
self.extra_environment = {"variable": domain}
super().__init__(domain, handles, environment, options, extra)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more than I'm comfortable seeing scanner writers have to remember to do. The original function signature is just init(environment, options). I'd prefer to see an __init__ method can go in the base class as a non-abstract method, and init remain as init(self, environment, options).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why they have to remember any of this; the __init__ method is in the base class and they would only need to write this for their scanner if they have per-domain functionality. Having __init__ and init methods seems very confusing; if we want to take the per-domain out of __init__, we could move it to a method called per_domain or something similar.
Also, I'm assuming we would have a commented template for scanners that writers would be able to start from, rather than expecting them to remember what the methods and signatures are.

# that use the network or are otherwise expensive would go.
#
# Runs locally or in the cloud (Lambda).
domain = self.domain # noqa
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why assign this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a stray line from earlier testing, I'll remove it.

pass

@abstractmethod
def to_rows(self, data):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this meant to have a declaration of the return type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'll add that in.

# Some subclasses may define an ``extra_environment`` property in their
# __init__ methods, and we want to fold its values into the
# ``environment`` property.
self.environment.update(getattr(self, "extra_environment", {}))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There hasn't been an extra or extra_environment value used in scanners. I'd prefer to keep the number of moving parts smaller here if possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current structure of scan effectively passes around global variables that act as an extra environment. It's possible there's an elegant way to refactor it, but I've tried to change the structure of scan only minimally in this first pass. I'll take another pass through it and try to simplify this.

@classmethod
@abstractmethod
def initialize_environment(cls: Type[ScannerABCT], environment: dict,
options: dict) -> dict:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to see the call signature for scanners stay as it's been. I would also like to avoid passing TypeVar's into individual scanners, with the goal of keeping the task of writing a scanner as simple and approachable as possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why this is a factor; I'm assuming that we'll have a template for scanners that writers will copy from, meaning they won't have to worry about this type declaration. As for calling this class method, it remains the same, i.e. scan_class.initialize_environment(scan_class, environment, options). Am I missing a consideration here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming that we'll have a template for scanners that writers will copy from, meaning they won't have to worry about this type declaration.

I'm still assuming scanners will be written by hand, and would like that to be as accessible as possible. But even when using a template, I'd like it to be clear what's going on in the file, and what each function means and does.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for calling this class method, it remains the same, i.e. scan_class.initialize_environment(scan_class, environment, options). Am I missing a consideration here?

I think so? There's never been an initialize_environment(scan_class, environment, options) function. Only an init(environment, options) function.

tadhg-ohiggins and others added 4 commits April 2, 2018 17:07
…aration for turning gather into its own project. Refactor new scan OO appraoch so that what was “overall scan: class/per-domain: instance” is now “overall scan: instance/per-domain: instance.scan()”.
…the only values in handles will be scanner details, and that we know that Python 3 will preserve dict order.
@konklone
Copy link
Contributor

konklone commented Apr 3, 2018

@tadhg-ohiggins Is this ready for merge yet? I'm comfortable moving forward with the PR as it is.

'duration': utils.just_microseconds(duration),
'start_time': scan_utils.utc_timestamp(start_time),
'end_time': scan_utils.utc_timestamp(end_time),
'duration': scan_utils.just_microseconds(duration),
'durations': durations,
'command': start_command,
'scan_uuid': scan_uuid
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assigning this got moved lower in the script. Is this still set by this time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean the variables durations, start_command, and scan_uuid? If so, yes, they're still in scope at this point. (start_command is still a global variable.)

@tadhg-ohiggins tadhg-ohiggins merged commit eaedcfc into master Apr 4, 2018
@konklone konklone deleted the scanner-abc branch April 16, 2018 19:21
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants