-
Notifications
You must be signed in to change notification settings - Fork 297
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Caching, advanced mapping and separating events for MISP Feed output bot #2509
base: develop
Are you sure you want to change the base?
Conversation
b4f2e68
to
0e0b949
Compare
Yeah, finally green 🎉 |
Are you using it in production? |
We started using it on staging, found some pain points, and now I'll test it & hopefully promote to prod in a few days - so, not yet, but soon ;) |
There's not much I can check here without reading the MISP code and docs and setting up an instance. Maybe @Rafiot, can you have a glimpse? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor changes requested pls.
@@ -4595,6 +4595,44 @@ The PyMISP library >= 2.4.119.1 is required, see | |||
() The output bot creates one event per each interval, all data in this time frame is part of this event. Default "1 | |||
hour", string. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happens when set to 0 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've looked in the code, and it looks like in such case, this would be a ValueError. However, you can use "0 hours" what should result in creating a new MISP event for every IntelMQ event.
docs/user/bots.md
Outdated
|
||
(optional, string): If set to a field name from IntelMQ event, the bot will group incoming messages | ||
in separated MISP events, based on the value of this field. The `interval_event` parameter acts | ||
for all grouping events together. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not 100% clear for me yet:
will it collect events until (... until when?...) and then sort and group by this viel?
I think rephrasing this slightly makes it clearer in the documentaiton.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about the current version?
will it collect events until (... until when?...) and then sort and group by this viel?
This field is responsible only for separating using the field. Collecting & holding is a different setting, that can be used together, but you don't have to. Let's have a look at the new description
output_dir: str = "/opt/intelmq/var/lib/bots/mispfeed-output" # TODO: should be path | ||
output_dir: str = ( | ||
"/opt/intelmq/var/lib/bots/mispfeed-output" # TODO: should be path | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why the tuple?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not a tuple ;) It's a Python way to put the string into next line (or split into multiple strings) to avoid line-length limit - although it's my fault. The rules we enforce in the repository are very weak, so weak that my almost every time wants tot reformat a lot, and sometimes I forget reverting changes not related to my code.
BTW, I think more and more that we would profit from enforcing black/ruff instead of weak pycodestyle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am also using black more and more...
else: | ||
self._custom_mapping(obj, message) | ||
|
||
def _default_mapping(self, obj: "MISPObject", message: dict): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docstring?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disagree it's needed here.
|
||
feed_output = self.current_event.to_feed(with_meta=False) | ||
def _custom_mapping(self, obj: "MISPObject", message: dict): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disagree it's needed here.
|
||
with self.current_file.open('w') as f: | ||
json.dump(feed_output, f) | ||
def _generate_feed(self, message: dict = None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disagree it's needed here.
My main request is:
|
Thanks for the review! re 1: It's a good idea, I'll look into it. I was concentrated on what we actually need, and I do think there are more things we can improve in the bot. |
Generating MISP feed on every incoming message slows down processing. The new config option let us decide to save them in batches. Cached events are stored in a cache list in Redis. In addition, a code related to Python 3.6 was removed as we do not support this version any more.
The bot can now construct an event much more alligned to custom needs, allowing setting comments and selecting just a subset of fields to export
With `event_separator` parameter, user can decide to create more than one MISP event in the output bot and group incomming messages based on given field. In additon, the message library was fixed not to modify the parameter directly.
A lot of tests depend on that, so it looks currently risky to change.
246e181
to
05f4aef
Compare
Could you have a look again? I have implemented tagging as well as rewritten the documentation, added config examples, implemented validation in the I do admit that the configuration of the bot is complex. I did my best to marriage flexibility and readability, but I think in the future we may eventually redesign this configuration, based on feedback. I'm also open for any suggestions. |
@aaronkaplan you requested some changes to this PR and @kamil-certat implemented various changes based on your feedback. Can you please re-check? |
MISP Feed output bot got new features:
The bot is fully backward-compatible. By default, the previous behaviour is kept.
In addition, code related to Python 3.6 was removed and the message library was fixed not to modify the original dict instance.
This PR replaces on PR #2505.