[OPENJPA-2924] BlacklistClassResolver is improved #118

Merged (2 commits) on Aug 19, 2024

solomax
Contributor

@solomax solomax commented Aug 10, 2024

The list of improvements:

  • the white/black list can now be specified as *; this is especially useful for the blacklist, i.e. block all
  • an empty System property value is now treated as an empty list
  • primitives are now whitelisted
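
A minimal sketch of how the first two items might behave, assuming the property holds a comma-separated list of class-name prefixes. The class `ClassNameList` and its methods are hypothetical names for illustration and are not taken from the OpenJPA source:

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not OpenJPA's actual implementation.
final class ClassNameList {
    static final String MATCH_ANY = "*";

    private final boolean matchAny;
    private final Set<String> prefixes;

    // raw is the system property value; null or "" both yield an empty list.
    ClassNameList(String raw) {
        Set<String> parsed = new LinkedHashSet<>();
        if (raw != null) {
            for (String item : raw.split(",")) {
                String trimmed = item.trim();
                if (!trimmed.isEmpty()) {
                    parsed.add(trimmed);
                }
            }
        }
        this.matchAny = parsed.contains(MATCH_ANY);
        this.prefixes = Collections.unmodifiableSet(parsed);
    }

    boolean isEmpty() {
        return prefixes.isEmpty();
    }

    // "*" matches every class name; otherwise entries are treated as prefixes.
    boolean matches(String className) {
        if (matchAny) {
            return true;
        }
        for (String prefix : prefixes) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}
```

With this sketch, a blacklist of `*` blocks everything, while an empty value matches nothing.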

The code might be simplified if the Class itself is checked (not its name).

@rmannibucau you were the author of the original fix, could you please review these changes? :)

@solomax solomax self-assigned this Aug 10, 2024
@solomax
Contributor Author

solomax commented Aug 10, 2024

BTW IMO it would be useful to have some DEBUG/TRACE logging on what is currently being checked. I was unable to find much logging in the project. What is the preferred logging library?

@rmannibucau
Contributor

I assume all the constants can be hardcoded (no reflection needed), and match_any can be extracted so that, when present, it avoids any loop or the like and becomes a "return always true/false" impl? Otherwise OK for me. Just a small note on perf: it depends on the number of hits on the preferred types (primitives/arrays), because the user no longer controls the ordering, so there are cases where it can be slower - but I guess when this class is used it is not the most common concern :D
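
One way to read the match_any suggestion, as a sketch only: decide once, at construction time, whether the list contains the wildcard, and hand back a constant predicate in that case. `Matchers.compile` is a hypothetical name, not an existing OpenJPA method:

```java
import java.util.function.Predicate;

// Hypothetical factory: the wildcard check happens once, not on every lookup.
final class Matchers {
    static Predicate<String> compile(String... prefixes) {
        for (String prefix : prefixes) {
            if ("*".equals(prefix)) {
                return name -> true;   // wildcard: constant answer, no loop per check
            }
        }
        if (prefixes.length == 0) {
            return name -> false;      // empty list: constant answer as well
        }
        final String[] copy = prefixes.clone();
        return name -> {
            for (String prefix : copy) {
                if (name.startsWith(prefix)) {
                    return true;
                }
            }
            return false;
        };
    }
}
```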

@solomax
Contributor Author

solomax commented Aug 14, 2024

Thanks for the quick review @rmannibucau :)

Primitives are in the Set, so they should be checked in constant time. Arrays would be a problem :(
I can drop all the changes related to the primitives/arrays and improve match_any.

Should I? :)

@rmannibucau
Contributor

rmannibucau commented Aug 14, 2024

@solomax I'd prefer to drop that and update the old impl to use a Set instead of arrays: do a "contains" first and fall back on "starts with" if it misses. The overhead will be low, but I wouldn't hardcode any type, including primitives and arrays, except in the defaults.

PS: also check that the Set's constant-time lookup is actually faster than the array loop for a typical number of items; it is always constant but often slower.
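
The "contains, then starts with" idea could look roughly like the following. This is a sketch under the assumption that each configured entry may be either a full class name or a package prefix; `PrefixSet` is a hypothetical name:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: exact hit via the Set first, prefix scan only on a miss.
final class PrefixSet {
    private final Set<String> exact;
    private final String[] prefixes;

    PrefixSet(String... entries) {
        this.exact = new HashSet<>(Arrays.asList(entries));
        this.prefixes = entries.clone();
    }

    boolean matches(String className) {
        if (exact.contains(className)) {
            return true;               // fast path for exact names
        }
        for (String prefix : prefixes) {
            if (className.startsWith(prefix)) {
                return true;           // fallback: package or name prefix
            }
        }
        return false;
    }
}
```

Nothing is hardcoded here; primitives or arrays would only be matched if they appear in the configured (or default) entries.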

@solomax
Contributor Author

solomax commented Aug 14, 2024

@rmannibucau I have created this benchmark: https://github.com/solomax/contains-microbench

The results are:

Benchmark          Mode  Cnt  Score    Error  Units
App.testArrayList  avgt   10  0.005 ±  0.001  ms/op
App.testSet        avgt   10  0.003 ±  0.001  ms/op

This is my first benchmark, so I'd appreciate a review :)

@rmannibucau
Contributor

@solomax overall it looks good enough to give an idea, except for one small detail: it tests ArrayList, which is not used, instead of an array of strings (String[]; arrays, which do not use iterators, are faster in general).

I modified the bench to add this test and here is my raw result:

Benchmark          Mode  Cnt  Score   Error  Units
App.testArray      avgt    2  0.022          ms/op
App.testArrayList  avgt    2  0.030          ms/op
App.testSet        avgt    2  0.022          ms/op

(which is close to what you had for the overlapping part except you have a better computer ;)).

What is interesting is that the set is fast right away ("hot" by design), whereas the String array needs some cycles (JIT) to warm up - but I'm not sure it is critical, since normally the blacklist is small and the whitelist open.

What is certain is that ArrayList is a poor fit there - even if I'm not convinced it will be noticed where it is used (serialization).
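
For readers without the benchmark at hand, the three lookups being compared amount to something like the following. This is an illustrative sketch, not the code from the linked repository, and it deliberately contains no timing logic (the figures in this thread come from JMH):

```java
import java.util.List;
import java.util.Set;

// Hypothetical helpers showing the three membership checks under comparison.
final class Lookups {
    // Plain String[] scan with an indexed loop (no iterator involved).
    static boolean inArray(String[] items, String key) {
        for (int i = 0; i < items.length; i++) {
            if (items[i].equals(key)) {
                return true;
            }
        }
        return false;
    }

    // List.contains: a linear scan for ArrayList.
    static boolean inList(List<String> items, String key) {
        return items.contains(key);
    }

    // Set.contains: hash lookup for HashSet.
    static boolean inSet(Set<String> items, String key) {
        return items.contains(key);
    }
}
```

All three return the same answer for the same data; only the cost profile differs.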

Reran with all modes detailed - in particular throughput (ops/ms) instead of only the average - and here is what I'm getting:

Benchmark                    Mode       Cnt     Score   Error   Units
App.testArray               thrpt         2  1204.234          ops/ms
App.testArrayList           thrpt         2   855.284          ops/ms
App.testSet                 thrpt         2  1357.010          ops/ms
App.testArray                avgt         2     0.026           ms/op
App.testArrayList            avgt         2     0.036           ms/op
App.testSet                  avgt         2     0.024           ms/op
App.testArray              sample   9660334     0.030 ± 0.001   ms/op
App.testArray:p0.00        sample               0.006           ms/op
App.testArray:p0.50        sample               0.012           ms/op
App.testArray:p0.90        sample               0.012           ms/op
App.testArray:p0.95        sample               0.012           ms/op
App.testArray:p0.99        sample               0.012           ms/op
App.testArray:p0.999       sample               0.157           ms/op
App.testArray:p0.9999      sample              46.334           ms/op
App.testArray:p1.00        sample              87.032           ms/op
App.testArrayList          sample   7826638     0.038 ± 0.001   ms/op
App.testArrayList:p0.00    sample               0.007           ms/op
App.testArrayList:p0.50    sample               0.014           ms/op
App.testArrayList:p0.90    sample               0.014           ms/op
App.testArrayList:p0.95    sample               0.014           ms/op
App.testArrayList:p0.99    sample               0.016           ms/op
App.testArrayList:p0.999   sample               0.294           ms/op
App.testArrayList:p0.9999  sample              46.072           ms/op
App.testArrayList:p1.00    sample             118.358           ms/op
App.testSet                sample  12102463     0.023 ± 0.001   ms/op
App.testSet:p0.00          sample               0.008           ms/op
App.testSet:p0.50          sample               0.009           ms/op
App.testSet:p0.90          sample               0.010           ms/op
App.testSet:p0.95          sample               0.010           ms/op
App.testSet:p0.99          sample               0.010           ms/op
App.testSet:p0.999         sample               0.087           ms/op
App.testSet:p0.9999        sample              43.450           ms/op
App.testSet:p1.00          sample             115.737           ms/op
App.testArray                  ss         2     0.739           ms/op
App.testArrayList              ss         2     2.620           ms/op
App.testSet                    ss         2     0.120           ms/op

So overall I'm not sure about the improvement in terms of speed or usage. Judging it would need stats on the ratio between primitives and custom classes: we can probably assume the primitives are numerous (if we count String as a primitive, that is), but I'm less convinced about arrays. Keep in mind that even if we use a set for known types and add configuration to drop them from the default "known" set, we still need to iterate for other types, so the speed advantage of the set will be diluted a bit and the small remaining gap with a plain array will shrink. Overall I think a plain array is not a bad compromise between speed and code simplicity.

Wdyt?

@solomax
Contributor Author

solomax commented Aug 15, 2024

@rmannibucau makes a lot of sense :) PR is updated :)

Contributor

@rmannibucau rmannibucau left a comment

Ok for me. Just for the record, there is one case missing (but not worse than before): what happens when the two lists overlap, or when one of them is a wildcard.

There are a few options:

  1. if one list is a wildcard and the other is not -> test the other first so it overrides the wildcard (often desired)
  2. add another config option to say which list wins over the other (TomEE flavor, for example)
  3. use the longest match as the "winner" and respect its list's behavior (include/exclude)

but I guess it is not needed there (yet, at least ;))
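
To make option 3 concrete, here is a rough sketch of a "longest match wins" policy. It is not part of this PR; the class name, the treatment of `*` as a zero-length match, the deny-on-tie rule, and allowing names that no blacklist entry matches are all assumptions made for illustration:

```java
// Hypothetical policy sketch for overlapping whitelist/blacklist entries.
final class LongestMatchPolicy {
    // Length of the longest matching entry, 0 for "*", -1 if nothing matches.
    static int longestMatch(String[] prefixes, String name) {
        int best = -1;
        for (String prefix : prefixes) {
            if ("*".equals(prefix)) {
                best = Math.max(best, 0);
            } else if (name.startsWith(prefix)) {
                best = Math.max(best, prefix.length());
            }
        }
        return best;
    }

    static boolean isAllowed(String[] whitelist, String[] blacklist, String name) {
        int allow = longestMatch(whitelist, name);
        int deny = longestMatch(blacklist, name);
        if (deny < 0) {
            return true;       // assumption: not blacklisted means allowed
        }
        return allow > deny;   // assumption: ties go to the blacklist
    }
}
```

Because the wildcard counts as the shortest possible match, any explicit whitelist entry overrides a `*` blacklist, which also covers the behavior described in option 1.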

@solomax solomax changed the title BlacklistClassResolver is improved [OPENJPA-2924] BlacklistClassResolver is improved Aug 19, 2024
@solomax solomax merged commit ae8c759 into master Aug 19, 2024
2 checks passed
@solomax solomax deleted the blacklist-enhancement branch August 19, 2024 04:13