Index presence of ads, trackers #34
Comments
Wow, I didn't know about FilterLists. It looks very useful, thanks! We could use some of those lists for better parsing, for instance to better remove the cookie notices that usually pollute the top of pages. => #35 :-) I definitely agree that junk content should be a negative ranking signal for websites. The question is where to draw the line (or which weights to give to each category). I'm pretty sure we want to outright drop websites containing malware, but what about the rest? Are there lists that differentiate between common trackers like Google Analytics and "less acceptable" ones? There is also a broader discussion to have about how many options we want to give users in a future "advanced search" feature. We need to balance the additional load these searches could put on the infrastructure (because they wouldn't be served from the "mainstream" caches) against the number of users and power users they could interest.
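To make the weighting question concrete, here is a minimal sketch of a per-category penalty. It is an illustration only, not Common Search code: the category names, the weights, and the `adjusted_score` helper are all hypothetical, and the right values are exactly the open question above.

```python
# Hypothetical category weights (assumptions, not taken from any real list).
CATEGORY_WEIGHTS = {
    "ads": 0.10,
    "trackers": 0.05,
    "affiliate": 0.15,
}

# Categories that cause a page to be dropped outright.
DROP_CATEGORIES = {"malware"}


def adjusted_score(base_score, categories):
    """Return None if the page should be dropped, else the penalized score."""
    found = set(categories)
    if found & DROP_CATEGORIES:
        return None
    # Sum the penalties of every matched category; unknown categories cost nothing.
    penalty = sum(CATEGORY_WEIGHTS.get(c, 0.0) for c in found)
    # Clamp so that a heavily penalized page never gets a negative score.
    return base_score * max(0.0, 1.0 - penalty)
```

A page matching both "ads" and "trackers" would keep about 85% of its base score under these made-up weights, while any page matching "malware" is removed regardless of its other categories.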
Copying information over from (dupe) #59: FilterLists is working on a 2.0 version, and I've requested that they include a machine-readable format we could parse.
I think we should just warn users, since these issues are typically transitory. All things being equal, a result that doesn't have tracking should be promoted above one that does.
hey, maintainer of FilterLists here. just discovered commonsearch via @indolering. looks like a great project! no promises on timely completion of a machine-readable format (non-monetized side-project), but it is on my radar to work on. will check back here with updates.
I just launched v2 of FilterLists, and the data is now in JSON format on GitHub over here. Feel free to use.
https://filterlists.com/ could help determine whether a page contains ads or trackers.
Allow users to filter on that indexed signal and/or boost results that lack ads and trackers.
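The two options above (filtering versus boosting) could look roughly like this. This is a hedged sketch, not a proposal for the actual ranking code: the `Result` fields, the `rank` function, and the `boost` factor of 1.2 are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    score: float
    has_ads: bool       # hypothetical flag set at index time
    has_trackers: bool  # hypothetical flag set at index time


def is_clean(result):
    """A result is 'clean' when neither ads nor trackers were detected."""
    return not (result.has_ads or result.has_trackers)


def rank(results, mode="boost", boost=1.2):
    """Order results, either filtering out or merely down-ranking unclean ones."""
    if mode == "filter":
        # Hard filter: only clean results survive.
        kept = [r for r in results if is_clean(r)]
        return sorted(kept, key=lambda r: r.score, reverse=True)
    if mode == "boost":
        # Soft preference: clean results get their score multiplied by `boost`.
        def effective(r):
            return r.score * boost if is_clean(r) else r.score
        return sorted(results, key=effective, reverse=True)
    raise ValueError(f"unknown mode: {mode!r}")
```

The filter mode is the opt-in "advanced search" behavior, while the boost mode changes the default ordering for everyone without hiding any result.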
Looking at https://about.commonsearch.org/values it seems such filters would be mainstream (more so than license filters) and possibly aligned with privacy, though as stated the value is only about what Common Search does with user data. But Common Search's independence could allow it to take stronger (or at least different) measures to protect searchers than Google does.
I'd love to be able to search the web sans ad-laden sites. Not to avoid the ads (for that I use an ad blocker) but to avoid the junk content. When searching for info on many consumer products on Google, one has to wade through ad/affiliate-driven reviews and stores to find neutral information, or even information provided by the manufacturer. Filtering out stores would be harder, so I didn't put it in the title of this issue.