Max messages could be higher (or not exist at all) #8

Open
predictiple opened this issue Nov 26, 2019 · 5 comments
Comments

predictiple commented Nov 26, 2019

extract_windows.go#L176 causes extraction to skip some potentially useful DLLs.

Without the restriction, the extraction seems to max out at 65535 (see the logs in my latest PR), which seems pretty reasonable and preferable to ignoring such DLLs completely.

Is the limit just there to keep the extraction results down to a convenient size or is it a safeguard against some more serious condition?

Contributor

scudette commented Nov 26, 2019 via email

@predictiple
Author

OK, and yes, I have seen such DLLs. It's unlikely that any developer would craft more than 10k individual messages, so the 10k limit is probably reasonable. I just get nervous when I see things being excluded. For DLLs that just repeat the same values, the uniqueness constraint on the SQLite db tables will deduplicate them, so it's really just about wasted CPU cycles.
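To illustrate the deduplication point, here is a minimal Python sketch. The `messages` table, its columns, and the `store_messages` helper are hypothetical and are not taken from the project's actual schema; the sketch only assumes that uniqueness is enforced over the whole row.

```python
import sqlite3


def store_messages(conn, provider, messages):
    """Insert (event_id, message) pairs and return the resulting row count.

    Hypothetical schema: uniqueness is assumed to cover the whole row.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        " provider TEXT, event_id INTEGER, message TEXT,"
        " UNIQUE (provider, event_id, message))"
    )
    # INSERT OR IGNORE silently skips rows that violate the UNIQUE constraint,
    # so repeated values cost CPU time but never add rows.
    conn.executemany(
        "INSERT OR IGNORE INTO messages VALUES (?, ?, ?)",
        [(provider, event_id, msg) for event_id, msg in messages],
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0]
```

With a schema like this, a DLL that yields the same entry thousands of times would still end up as a single row.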

Not sure if you've noticed the second PR that I've submitted for the evtx-data repo? Those db files were all extracted without the 10k limit, and there are extraction logs for each where you can see which DLLs are hitting 65535.

@predictiple
Author

That just reminded me to mention that there's a typo in the logging: "Openning".

Contributor

scudette commented Nov 27, 2019 via email

@predictiple
Author

It doesn't look like we store the channel. I thought event IDs are unique (in theory) per provider, so the combination of eventID and provider name implies the channel. In my Python script that merges the dbs, I apply a uniqueness constraint to the rows of each table, so it does deduplicate.
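For readers following along, a merge of that kind might look roughly like the sketch below. This is not the actual script: the `messages` table, its columns, and the `merge_dbs` function are all hypothetical names chosen for illustration, and the real databases may be laid out differently.

```python
import sqlite3

# Hypothetical schema: one table whose rows must be unique as a whole.
SCHEMA = (
    "CREATE TABLE IF NOT EXISTS messages ("
    " provider TEXT, event_id INTEGER, message TEXT,"
    " UNIQUE (provider, event_id, message))"
)


def merge_dbs(dest_path, source_paths):
    """Copy rows from each source db into dest and return the final row count."""
    dest = sqlite3.connect(dest_path)
    dest.execute(SCHEMA)
    for path in source_paths:
        # ATTACH exposes the source file to the same connection as "src".
        dest.execute("ATTACH DATABASE ? AS src", (path,))
        # Rows already present in dest violate the UNIQUE constraint and are skipped.
        dest.execute(
            "INSERT OR IGNORE INTO messages"
            " SELECT provider, event_id, message FROM src.messages"
        )
        dest.commit()  # end the transaction before detaching
        dest.execute("DETACH DATABASE src")
    count = dest.execute("SELECT COUNT(*) FROM messages").fetchone()[0]
    dest.close()
    return count
```

Because the constraint lives on the destination table, the same row arriving from several source files is stored only once.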
