IN4253ET-HackingLab

About

Why

Cooooookiiies

How

This project focuses only on websites owned by the Dutch government. The domains used for this project were fetched from the Dutch government website. The script in this repository fetches all the domain names automatically.

TODO: add universities, hospitals, police & banks to the list

Setup

This project uses Python 3. Running this repository requires the dependencies of the Pythia library. Run the following command to install all dependencies:

$ pip3 install --upgrade ipwhois tldextract wordsegment selenium bs4 dnspython intervaltree netaddr nltk psutil

Google Chrome and the corresponding ChromeDriver must also be installed on the OS. Make sure to download the correct driver version: it should match the version of the installed Google Chrome browser. The ChromeDriver should be located in the Pythia folder, and the Chrome path should be the following: C:\Program Files\Google\Chrome\Application\chrome.exe
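A quick way to check the version requirement is to compare the major version numbers reported by the browser and the driver. The following is a minimal sketch and not part of this repository: the function names are hypothetical, the example version strings are illustrative, and it assumes that agreement on the major version number is enough for the two to work together.

```python
import re


def major_version(version_output: str) -> int:
    """Return the major version from text such as 'ChromeDriver 120.0.6099.71'."""
    match = re.search(r"(\d+)\.\d+\.\d+(?:\.\d+)?", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return int(match.group(1))


def versions_match(chrome_output: str, driver_output: str) -> bool:
    """True when browser and driver report the same major version."""
    return major_version(chrome_output) == major_version(driver_output)


if __name__ == "__main__":
    # Illustrative strings only, not captured from a real installation.
    chrome = "Google Chrome 120.0.6099.109"
    driver = "ChromeDriver 120.0.6099.71 (abc123)"
    print(versions_match(chrome, driver))
```

The driver string can be obtained by running the ChromeDriver binary with --version; the browser version is shown on the chrome://version page.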

TODO:

Get some links on the front page to get the cookies

Check whether the user gets more cookies after clicking something

Compare front page vs. inner pages

Get all the links and take up to 20

Something with the https://

Decisions for tools, and why

What worked and what doesn't work

For the list of trackers we used these sources

Rank the websites by number of third parties
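Two of the items above (taking up to 20 links from the front page, and ranking websites by number of third parties) could be approached as in the sketch below. It is only an illustration under stated assumptions, not the crawler in this repository: all names are hypothetical, only the standard library is used, and a "site" is approximated by the last two labels of the hostname. That approximation is wrong for suffixes such as co.uk; the tldextract package from the dependency list would be the more robust choice.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

MAX_LINKS = 20  # "take up to 20" from the TODO list


def site_of(host):
    """Naive site name: the last two labels of a hostname (assumption)."""
    return ".".join((host or "").lower().split(".")[-2:])


class LinkCollector(HTMLParser):
    """Collects unique, absolute http(s) links that stay on the same site."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.base_site = site_of(urlparse(base_url).hostname)
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        absolute = urljoin(self.base_url, href)
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            return  # skip mailto:, javascript:, and similar
        if site_of(parsed.hostname) != self.base_site:
            return  # only inner pages of the same site
        if absolute not in self.links:
            self.links.append(absolute)


def front_page_links(html, base_url, limit=MAX_LINKS):
    """Return at most `limit` same-site links found in the front page HTML."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links[:limit]


def third_parties(page_url, request_urls):
    """Set of sites contacted by a page that differ from the page's own site."""
    first_party = site_of(urlparse(page_url).hostname)
    found = set()
    for url in request_urls:
        host = urlparse(url).hostname
        if host and site_of(host) != first_party:
            found.add(site_of(host))
    return found


def rank_by_third_parties(observations):
    """observations maps page URL -> list of requested URLs; most third parties first."""
    counts = {page: len(third_parties(page, reqs)) for page, reqs in observations.items()}
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)
```

Here `observations` stands for whatever the crawler records per page, for example the URLs of all requests made while loading it; how those are captured is left open.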
