Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revise the scope of the filter syntax converter #228

Closed
JustOff opened this issue Mar 6, 2020 · 55 comments
Closed

Revise the scope of the filter syntax converter #228

JustOff opened this issue Mar 6, 2020 · 55 comments
Labels

Comments

@JustOff
Copy link
Collaborator

JustOff commented Mar 6, 2020

Currently, filter syntax converter applies only to an explicitly specified limited set of filters, and also does not apply to "My filters".

There was a request that this behavior needs to be adjusted.

@JustOff
Copy link
Collaborator Author

JustOff commented Mar 6, 2020

is there any particular reason why the conversion is even limited to specified lists

This was done for performance reasons. This was discussed in more detail somewhere deep in General filter chit-chat, but its size does not allow me now to find exactly where it was. The current set of filters to proceed was created based on which of the filters uses the new syntax. We can expand it if I missed something or think of an additional advanced option to enable processing of all filters, but I would not want to enable it by default.

@THEtomaso
Copy link

THEtomaso commented Mar 6, 2020

I agree that all filters should be converted, including "user filters" (if you refer to "custom" ones, under uBO's "Filter lists" tab).
However, anything that exists under "My filters" should be left alone!

@JustOff
Copy link
Collaborator Author

JustOff commented Mar 6, 2020

This translation is not applied to "My filters"?

Yes, and this was done intentionally so as not to confuse users. Filters loaded from remote sources are displayed by uBO already converted and are displayed in the logs in the same way. With "My filters" this will not work, so they are not converted, and displayed and used as is. Theoretically, I can try to add an additional option that will activate the converter for "My filters", if one need this for A/B testing, but I'm afraid that you yourself will get confused.

@DandelionSprout
Copy link
Contributor

The current set of filters to proceed was created based on which of the filters uses the new syntax. We can expand it if I missed something (…)

That approach has the disadvantage that it doesn't automatically account for whenever a new list has moved from the old syntax to the new. For instance, Frellwit's list now has ≥100 entries with the new +js syntax, which I think happened after the inital creation of the conversion script.

@JustOff
Copy link
Collaborator Author

JustOff commented Mar 6, 2020

+js syntax is supported by uBO legacy core and don't need a converter.

@DandelionSprout
Copy link
Contributor

DandelionSprout commented Mar 6, 2020

Frellwit's list used the old syntax when the conversion began, but has since moved on to the new syntax.

For instance, in uBO 1.16.4.19 on Pale Moon 18.8.4 64-bit, testing on https://www.vasterastidning.se/, the following entries from lines 2774-2777 (preceded by some dozen domains):

##+js(set, youplay.helpers.getTrackUrl, noopFunc)
##+js(set, youplay.helpers.disableRightClick, noopFunc)
##+js(set, youplay.helpers.getTrackUrl, noopFunc)
##+js(set, youplay.helpers.trackUrls, noopFunc)

...are not shown in the uBO logger as having been applied, nor are any other +js entries shown as having been applied either.

@THEtomaso
Copy link

Yes, Frellwit's Swedish Filter does indeed use the new filter syntaxes, and so does many other filters, which the converter currently doesn't process!

@JustOff
Copy link
Collaborator Author

JustOff commented Mar 6, 2020

Ok, probably then it's worth adding all regional filters to the list for processing by default. However, turning processing on globally will have too much negative impact on performance due to monsters like EasyList or Malware Domains, which are also pointless to process, just like all AdGuard lists.

@THEtomaso
Copy link

THEtomaso commented Mar 6, 2020

What about the custom filters?
I'm currently using 12 myself, and I know some people that use even more!

--

turning processing on globally will have too much negative impact on performance

How much of a performance drop do you estimate that we're talking about exactly?

@JustOff
Copy link
Collaborator Author

JustOff commented Mar 6, 2020

What about the custom filters?

What about an option in the advanced settings to turn on global processing? Unfortunately, the best solution has not yet crossed my mind.

How much of a performance drop do you estimate

This can cause the browser to freeze for several seconds when updating filters.

@DandelionSprout
Copy link
Contributor

DandelionSprout commented Mar 6, 2020

Trying to rule out lists that quite clearly intended to target ABP only (e.g. Greek Adblock Filter, EasyList Italy, etc.), gave me these remaining results for lists that do intend to support uBO and its many syntaxes in whichever ways:

* All uAssets lists
* CJX's EasyList Lite
* EasyList Czech and Slovak
* Adblock List for Finland
* EasyList Hebrew
* Oficjalne Polskie Filtry do AdBlocka, uBlocka Origin i AdGuarda
* Frellwit's Swedish Filters
* EasyList Thailand

It's also possible that AdGuard, who are themselves already converting from their own #%#AG_ and/or #%#//scriptlet into uBO's old syntax, may choose to convert into the new syntax instead. If that day comes, all AdGuard lists would have to be whitelisted at the same time.

@THEtomaso
Copy link

This can cause the browser to freeze for several seconds when updating filters.

OK, an optional advanced setting sounds like a reasonable solution then.
This will enable processing for custom filters too, right?

@krystian3w

This comment has been minimized.

@THEtomaso
Copy link

What about an option in the advanced settings to turn on global processing?

Or something like this, perhaps?:

ublockorigin-possiblefunction

@JustOff
Copy link
Collaborator Author

JustOff commented Mar 14, 2020

Well, it seems that since the initial implementation of the filter syntax converter, my test results for its performance are somewhat outdated. I checked it again and it turned out that now it looks quite good at handling even large filters like EasyList or Malware Domains.

Here is a test version that has assetConvertAllRemote parameter in the advanced settings:

uBlock0_1.16.4.20b2.firefox-legacy.xpi.zip (rename to xpi)

Could you please check how uBO behaves when this option turned on in your case? Perhaps now we can just apply the converter for all filters (except My Filters)?

There is also assetConvertMyFilters parameter, you can play around with it too if you want.

@THEtomaso

This comment has been minimized.

@JustOff

This comment has been minimized.

@THEtomaso

This comment has been minimized.

@JustOff

This comment has been minimized.

@krystian3w

This comment has been minimized.

@THEtomaso

This comment has been minimized.

@krystian3w

This comment has been minimized.

@JustOff

This comment has been minimized.

@THEtomaso

This comment has been minimized.

@krystian3w

This comment has been minimized.

@JustOff

This comment has been minimized.

@THEtomaso

This comment has been minimized.

@THEtomaso

This comment has been minimized.

@JustOff
Copy link
Collaborator Author

JustOff commented Mar 15, 2020

In fact, AdGuard filters already use syntax that requires conversion and such filters are not excluded (AdGuard Base, AdGuard Mobile Ads, etc.). At the same time, AdGuard Tracking Protection and AdGuard Social Media, which are excluded, seem to use only simple syntax by the very nature of these filters. Though, if I'm wrong, this can always be corrected later.

@krystian3w
Copy link

AG no needed, They recently launched (a month ago?) xpath and nth-ancestor (I don't think they've started turning it into a upward() one yet).

In addition, when it comes to scriptlets, they have already made sure that long names are used instead of abbreviated ones.

@THEtomaso
Copy link

Well, let's just ask the source..

@BlazDT:
Can you confirm that there are no immediate plans to change filter syntaxes in 'AdGuard Social Media filter' and 'AdGuard Tracking Protection filter'?

@BlazDT
Copy link

BlazDT commented Mar 15, 2020

At least not that I am aware of. We rarely use javascript rules in Tracking and Social filter.

@ameshkov Or is anything planned I am probably not aware of (at least not found anything in filters chat)?

@ameshkov
Copy link

ameshkov commented Mar 16, 2020

@DandelionSprout

converting from their own #%#AG_ and/or #%#//scriptlet into uBO's old syntax

Wait-wait-wait, what have I missed, the scriptlets syntax was changed again?

Regarding the original question:

Really, the only problem I can foresee is if the AdGuard team suddenly decides to change their syntaxes too, like gorhill has done.

I wholeheartedly hate backward-incompatible syntax changes that appear out of the blue.

@JustOff
Copy link
Collaborator Author

JustOff commented Mar 16, 2020

uBO itself perfectly understands both the new and old syntax, but uBO-filters have been converted to the new shorthand one. That is why uBO-legacy now has filter syntax converter, and here we discuss which filters from other sources also require it to be applied.

@DandelionSprout
Copy link
Contributor

DandelionSprout commented Mar 16, 2020

The change occured 1~2 years ago, so that e.g. abort-on-property-read.js could now be written as aopr. But it was only gradually introduced to the uAssets lists this winter.

Double-checking with my AdGuard for Windows installation, aopr is already converted into #%#//scriptlet('ubo-aopr.js'; at least for those lists that are included in that extension such as my Nordic list.

@krystian3w
Copy link

krystian3w commented Mar 16, 2020

But it's not about "uBo-Firefox-legacy", how AG convert the short names in AG-extension / server side to works as an adguard scriplet filter.

If your record doesn't work:
#%#//scriptlet('ubo-aopr.js';

then report it converted to a long name:
#%#//scriptlet('ubo-abort-on-property-read.js';

or add anchors in the code (alias).

@JustOff JustOff changed the title Improve filter syntax converter Revise the scope of the filter syntax converter Mar 17, 2020
JustOff added a commit that referenced this issue Mar 17, 2020
…cluded

Also add an advanced option `assetConvertMyFilters` to make the converter process `My Filters` (disabled by default).

Related issue:
- #228
@JustOff JustOff closed this as completed Mar 17, 2020
@gwarser
Copy link
Contributor

gwarser commented Oct 19, 2020

Scriptlet for Twitch does not work because filter does not have .js ?

@JustOff
Copy link
Collaborator Author

JustOff commented Oct 19, 2020

@gwarser, sorry, but I didn't quite understand your question.

@krystian3w

This comment has been minimized.

@gwarser
Copy link
Contributor

gwarser commented Oct 20, 2020

One user on Reddit complained about ads on Twitch on Waterfox classic (or even older version). I tried adding custom scriptlet to uBO by advanced userResourcesLocation, but twitch.tv##+js(twitch-videoad) was not logged at all. Adding twitch.tv##+js(twitch-videoad.js) to My filters make it work.

@krystian3w
Copy link

"My filters" need use .js.

@gwarser
Copy link
Contributor

gwarser commented Oct 20, 2020

Twitch scriptlet is not translated internally:

mapRules: {
'=1x1.gif': '=1x1-transparent.gif',
'=2x2.png': '=2x2-transparent.png',
'=3x2.png': '=3x2-transparent.png',
'=32x32.png': '=32x32-transparent.png',
'=addthis_widget.js': '=addthis.com/addthis_widget.js',
'=ampproject_v0.js': '=ampproject.org/v0.js',
'=chartbeat.js': '=static.chartbeat.com/chartbeat.js',
'=amazon_ads.js': '=amazon-adsystem.com/aax2/amzn_ads.js',
'=disqus_embed.js': '=disqus.com/embed.js',
'=disqus_forums_embed.js': '=disqus.com/forums/*/embed.js',
'=doubleclick_instream_ad_status.js': '=doubleclick.net/instream/ad_status.js',
'=google-analytics_analytics.js': '=google-analytics.com/analytics.js',
'=google-analytics_cx_api.js': '=google-analytics.com/cx/api.js',
'=google-analytics_ga.js': '=google-analytics.com/ga.js',
'=google-analytics_inpage_linkid.js': '=google-analytics.com/inpage_linkid.js',
'=googlesyndication_adsbygoogle.js': '=googlesyndication.com/adsbygoogle.js',
'=googletagmanager_gtm.js': '=googletagmanager.com/gtm.js',
'=googletagservices_gpt.js': '=googletagservices.com/gpt.js',
'=ligatus_angular-tag.js': '=ligatus.com/*/angular-tag.js',
'=monkeybroker.js': '=d3pkae9owd2lcf.cloudfront.net/mb105.js',
'=noop-0.1s.mp3': '=noopmp3-0.1s',
'=noop-1s.mp4': '=noopmp4-1s',
'=noop.html': '=noopframe',
'=outbrain-widget.js': '=widgets.outbrain.com/outbrain.js',
'=scorecardresearch_beacon.js': '=scorecardresearch.com/beacon.js',
'=noeval-silent.js': '=silent-noeval.js',
'=silent-noeval': '=silent-noeval.js',
'=noop.js': '=noopjs',
'=noop.txt': '=nooptext',
'=popads.js': '=popads.net.js',
'(popads.js)': '(popads.net.js)',
'(popads)': '(popads.net.js)',
'(nobab)': '(bab-defuser.js)',
'(nofab)': '(fuckadblock.js-3.2.0)',
'(acis,': '(abort-current-inline-script.js,',
'(acis.js,': '(abort-current-inline-script.js,',
'(abort-current-inline-script,': '(abort-current-inline-script.js,',
'(aopr,': '(abort-on-property-read.js,',
'(aopr.js,': '(abort-on-property-read.js,',
'(abort-on-property-read,': '(abort-on-property-read.js,',
'(aopw,': '(abort-on-property-write.js,',
'(aopw.js,': '(abort-on-property-write.js,',
'(abort-on-property-write,': '(abort-on-property-write.js,',
'(aeld,': '(addEventListener-defuser.js,',
'(aeld)': '(addEventListener-defuser.js)',
'(addEventListener-defuser,': '(addEventListener-defuser.js,',
'(addEventListener-defuser)': '(addEventListener-defuser.js)',
'(aell,': '(addEventListener-logger.js,',
'(aell)': '(addEventListener-logger.js)',
'(addEventListener-logger,': '(addEventListener-logger.js,',
'(addEventListener-logger)': '(addEventListener-logger.js)',
'(nano-sib,': '(nano-setInterval-booster.js,',
'(nano-sib)': '(nano-setInterval-booster.js)',
'(nano-sib.js,': '(nano-setInterval-booster.js,',
'(nano-sib.js)': '(nano-setInterval-booster.js)',
'(nano-setInterval-booster,': '(nano-setInterval-booster.js,',
'(nano-setInterval-booster)': '(nano-setInterval-booster.js)',
'(nano-stb,': '(nano-setTimeout-booster.js,',
'(nano-stb)': '(nano-setTimeout-booster.js)',
'(nano-stb.js,': '(nano-setTimeout-booster.js,',
'(nano-stb.js)': '(nano-setTimeout-booster.js)',
'(nano-setTimeout-booster,': '(nano-setTimeout-booster.js,',
'(nano-setTimeout-booster)': '(nano-setTimeout-booster.js)',
'(ra,': '(remove-attr.js,',
'(rc,': '(remove-class.js,',
'(remove-attr,': '(remove-attr.js,',
'(sid,': '(setInterval-defuser.js,',
'(nosiif,': '(no-setInterval-if.js,',
'(nosiif)': '(no-setInterval-if.js)',
'(std,': '(setTimeout-defuser.js,',
'(setTimeout-defuser,': '(setTimeout-defuser.js,',
'(nostif,': '(no-setTimeout-if.js,',
'(nostif)': '(no-setTimeout-if.js)',
'(window.open-defuser,': '(window.open-defuser.js,',
'(window.open-defuser)': '(window.open-defuser.js)',
'(nowoif,': '(window.open-defuser.js,',
'(nowoif)': '(window.open-defuser.js)',
'(json-prune,': '(json-prune.js,',
'(json-prune)': '(json-prune.js)',
'(set,': '(set-constant.js,',
'(set-constant,': '(set-constant.js,',
'(cookie-remover,': '(cookie-remover.js,',
'(raf-if,': '(requestAnimationFrame-if.js,',
'(norafif,': '(no-requestAnimationFrame-if.js,',
'(noeval)': '(noeval.js)',
'(nowebrtc)': '(nowebrtc.js)'
},

@gwarser
Copy link
Contributor

gwarser commented Dec 21, 2020

twitch-videoad.js is no longer used in uBO. Even specialized add-on is now failing. They fight hard.

@JustOff
Copy link
Collaborator Author

JustOff commented Dec 21, 2020

That's sad, but thanks for the info, since I don't use twitch myself and don't really follow this topic.

@krystian3w
Copy link

Rather, the script will not survive in this form anyway.

Also, it is unlikely to be supported by parameters so as not to update it "every 2 days" at extension code level.

@THEtomaso
Copy link

Well, I've been considering adding ||twitch.tv^$all anyway..

@AroKol78
Copy link

https://www.chess.com/pl/tv/schedule
https://www.twitch.tv/dawidczerw

@krystian3w

This comment has been minimized.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

10 participants