Adding custom threat lists to database #48

Open
TheFrozenFire opened this issue Jul 29, 2019 · 3 comments
@TheFrozenFire

I am integrating gglsbl as a backend for checking URLs that are sent in our customers' messaging. This library has been extremely helpful in doing so - props for that.

A question I have is whether anyone has attempted to integrate data from other sources into their database. For instance, we will be checking URLs against feeds such as OpenPhish, and industry-specific feeds we've gained access to.

It's not obvious from the code whether there are functions for converting a URL into an entry that can be added to the database. It looks like it might be possible by combining the function that gets the hashes of a URL with the functions that add threat entries to the database.

One area of confusion for me is how I might compute "threat prefixes". Is that just intended to be the first four hash values of a URL's hash list?

@afilipovich
Owner

This library is compatible only with the Google Safe Browsing API as a data source.
URLs are transformed into hashes and hash prefixes as described in this spec:
https://developers.google.com/safe-browsing/v4/urls-hashing

Conceptually, it is possible to add support for other URL blacklist providers, but it would require a major overhaul. In fact, it would be easier to make a separate library for other feeds, since they list URLs in clear text, while Google provides only irreversible hashes, which require extra transformations and lookups.
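
For reference, the hashing scheme from the spec linked above can be sketched roughly as follows. This is a standalone approximation, not the library's code: the function names (`host_suffixes`, `path_prefixes`, `expressions`, `full_hashes`, `hash_prefix`) are made up for illustration, and the sketch assumes the URL is already canonicalized and that the host is a hostname rather than an IP address. Per the spec, a hash prefix is the leading 4 to 32 bytes of the SHA-256 full hash of one host-suffix/path-prefix expression, not the first four entries of a URL's hash list.

```python
import hashlib
from urllib.parse import urlsplit


def host_suffixes(host):
    """Exact host plus up to four suffixes taken from the last five components."""
    parts = host.split(".")
    candidates = [host]
    # Assumption: host is not an IP address (the spec skips suffixes for IPs).
    start = max(1, len(parts) - 5)
    for i in range(start, len(parts) - 1):  # never reduce to the bare TLD
        candidates.append(".".join(parts[i:]))
    return candidates


def path_prefixes(path, query):
    """Exact path with and without query, plus up to four directory prefixes."""
    candidates = []
    if query:
        candidates.append(f"{path}?{query}")
    candidates.append(path)
    directories = path.split("/")[:-1]  # drop the final component
    current = ""
    for part in directories[:4]:
        current += part + "/"
        candidates.append(current)
    return candidates


def expressions(url):
    """Host-suffix/path-prefix combinations for an already canonicalized URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    path = parts.path or "/"
    result = []
    for h in host_suffixes(host):
        for p in path_prefixes(path, parts.query):
            if h + p not in result:  # de-duplicate, keep order
                result.append(h + p)
    return result


def full_hashes(url):
    """SHA-256 digest (32 bytes) of every expression."""
    return [hashlib.sha256(e.encode("utf-8")).digest() for e in expressions(url)]


def hash_prefix(full_hash, length=4):
    """Leading bytes of a full hash; the spec allows 4 to 32 bytes."""
    if not 4 <= length <= 32:
        raise ValueError("prefix length must be between 4 and 32 bytes")
    return full_hash[:length]
```

For the spec's own example URL, `expressions("http://a.b.c/1/2.html?param=1")` yields eight expressions (four for `a.b.c`, four for `b.c`), and each of them gets its own full hash and prefix.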

@TheFrozenFire
Author

My intention would be to generate the same sort of hashes that Google does, and then add those to the database. I'm not looking to add support for other feed providers to the library itself, but rather to add some helper functionality for generating the same hash format as Google does, and for associating those hashes with threat lists.

@afilipovich
Owner

Gotcha. You can use this class to translate a URL into hashes: https://github.com/afilipovich/gglsbl/blob/master/gglsbl/protocol.py#L168
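
One way to associate such hashes with custom threat lists, without touching the library's own tables, might look like the sketch below. Everything here is an assumption made for illustration: the `custom_threat` table, its columns, and the `open_store`, `add_entry` and `lookup` helpers are hypothetical and are not part of gglsbl or its database schema. The expression strings are expected to be host-suffix/path-prefix expressions in the form the Safe Browsing spec describes.

```python
import hashlib
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a separate store for custom feed entries (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS custom_threat ("
        " prefix BLOB NOT NULL,"
        " full_hash BLOB NOT NULL,"
        " threat_list TEXT NOT NULL,"
        " PRIMARY KEY (full_hash, threat_list))"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_custom_prefix ON custom_threat (prefix)"
    )
    return conn


def add_entry(conn, expression, threat_list):
    """Hash one expression and file it under a named threat list."""
    full = hashlib.sha256(expression.encode("utf-8")).digest()
    conn.execute(
        "INSERT OR IGNORE INTO custom_threat VALUES (?, ?, ?)",
        (full[:4], full, threat_list),  # 4-byte prefix plus the full hash
    )
    conn.commit()


def lookup(conn, expression):
    """Return the names of the custom threat lists that contain this expression."""
    full = hashlib.sha256(expression.encode("utf-8")).digest()
    rows = conn.execute(
        "SELECT threat_list FROM custom_threat"
        " WHERE prefix = ? AND full_hash = ? ORDER BY threat_list",
        (full[:4], full),
    ).fetchall()
    return [row[0] for row in rows]


conn = open_store()
add_entry(conn, "phish.example/login", "OPENPHISH")
print(lookup(conn, "phish.example/login"))
```

Because clear-text feeds provide the whole URL, the full hash can be stored alongside the prefix, so a match can be confirmed locally instead of through a remote full-hash lookup.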
