-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-26582 Prune use of Random and SecureRandom objects #4118
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems we have several different patterns in the UT, for example, ThreadLocalRandom and RandomUtils in commons-lang3.
For me I would like to make them align and I prefer we use what we have in JDK first. But anyway, it will not introduce big problems and this is really a big patch so if you do not want to spend more time on this, I'm also OK.
And please clean up the ThreadLocalRandom imports from netty.
Thanks @apurtell for the great and hard work.
private static int READ_TIMEOUT_MS = 2000; | ||
private static final Random RNG = new Random(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As said in the previous PR, let's just use ThreadLocalRandom.current()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. The caveat for replacing Random with TLR is where the existing test code calls setSeed() for whatever reason. TLR doesn't work for these cases because TLR doesn't support setSeed(). Some tests depend on setting the seed, HFile tests come to mind, and sometimes they don't. I can be more aggressive in switching to TLR where we don't really need to call setSeed. It's an option. YDYT @Apache9 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
setSeed could make the Random always give the same sequence, which could make the test more determined. So maybe for now let's just follow these rules:
- By default, we will replace Random with ThreadLocalRandom.
- If there is a setSeed call, then
a. If the seed is just System.currentTimeMillis, then just replace it with ThreadLocalRandom.
b. Otherwise, keep the old behavior, call setSeed on the Random object.
WDYT? Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes @Apache9 that's what I've done :-)
import org.apache.hadoop.hbase.HConstants; | ||
import org.apache.hbase.thirdparty.io.netty.util.internal.ThreadLocalRandom; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
java.util.concurrent.ThreadLocalRandom. Maybe we should ban this import in maven enforcer to avoid typo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh, thanks, will check each diff where this may have happened...
final BloomType[] bloomArray = BloomType.values(); | ||
final int bloomArraySize = bloomArray.length; | ||
|
||
getLogger().info("Performing action: Change bloom filter on all columns of table " + tableName); | ||
|
||
modifyAllTableColumns(tableName, (columnName, columnBuilder) -> { | ||
BloomType bloomType = bloomArray[random.nextInt(bloomArraySize)]; | ||
BloomType bloomType = bloomArray[RandomUtils.nextInt(0, bloomArraySize)]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could also use ThreadLocalRandom.current().nextInt here?
@@ -57,7 +56,7 @@ public void perform() throws IOException { | |||
// exception. | |||
Algorithm algo; | |||
do { | |||
algo = possibleAlgos[random.nextInt(possibleAlgos.length)]; | |||
algo = possibleAlgos[RandomUtils.nextInt(0, possibleAlgos.length)]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ThreadLocalRandom.current()? Let's follow the same pattern.
Right, RandomUtils is very common in hbase-it, and not common outside of that module. I preferred to keep to the current practice in the module. However...
Thanks for giving me an out here but this kind of cleanup work I think is the place to realign our practices to what we think are best. I agree we can prefer what is in the JDK to a third party dependency. That seems like a good policy to me. I will update the patch. |
Thanks for taking this work on, Andrew. Duo's suggestion makes sense to me. I know this is quite a burden to update all of this. |
4fee380
to
7c61eb1
Compare
Updated.
|
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one minor question about the Bytes change. Otherwise LGTM.
Thanks Andrew for the hard work!
@@ -2358,7 +2360,7 @@ public static void zero(byte[] b, int offset, int length) { | |||
Arrays.fill(b, offset, offset + length, (byte) 0); | |||
} | |||
|
|||
private static final SecureRandom RNG = new SecureRandom(); | |||
private static final Random RNG = new Random(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bytes is marked as IA.Public, so is it OK to change from SecureRandom to normal Random here?
I'm not saying we can not do this, as there is no change on the method signatures, just asking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My interpretation is that private fields are not a part of our IA.Public interface. I approve of this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out @Apache9 . I think becasue we did not advertise the security of the underlying RNG on Bytes#random (like in javadoc) we are safe here. There should not be an assumption. The new Bytes#secureRandom does imply the underlying RNG is SecureRandom. I think this is an improvement. I will also update the javadoc of these methods to clarify.
Updated javadoc as discussed, improved a couple of comments, rebased. Will merge soon unless objection. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
There are valid javadoc and checkstyle warnings and an error prone finding, addressing them shortly. |
Avoid the pattern where a Random object is allocated, used once or twice, and then left for GC. This pattern triggers warnings from some static analysis tools because this pattern leads to poor effective randomness. In a few cases we were legitimately suffering from this issue; in others a change is still good to reduce noise in analysis results. Use ThreadLocalRandom where there is no requirement to set the seed to gain good reuse. Where useful relax use of SecureRandom to simply Random or ThreadLocalRandom, which are unlikely to block if the system entropy pool is low, if we don't need crypographically strong randomness for the use case. The exception to this is normalization of use of Bytes#random to fill byte arrays with randomness. Because Bytes#random may be used to generate key material it must be backed by SecureRandom.
Updates to fix some issues I introduced in the last round. |
🎊 +1 overall
This message was automatically generated. |
Avoid the pattern where a Random object is allocated, used once or twice, and then left for GC. This pattern triggers warnings from some static analysis tools because this pattern leads to poor effective randomness. In a few cases we were legitimately suffering from this issue; in others a change is still good to reduce noise in analysis results. Use ThreadLocalRandom where there is no requirement to set the seed to gain good reuse. Where useful relax use of SecureRandom to simply Random or ThreadLocalRandom, which are unlikely to block if the system entropy pool is low, if we don't need crypographically strong randomness for the use case. The exception to this is normalization of use of Bytes#random to fill byte arrays with randomness. Because Bytes#random may be used to generate key material it must be backed by SecureRandom. Signed-off-by: Duo Zhang <zhangduo@apache.org>
Avoid the pattern where a Random object is allocated, used once or twice, and then left for GC. This pattern triggers warnings from some static analysis tools because this pattern leads to poor effective randomness. In a few cases we were legitimately suffering from this issue; in others a change is still good to reduce noise in analysis results. Use ThreadLocalRandom where there is no requirement to set the seed to gain good reuse. Where useful relax use of SecureRandom to simply Random or ThreadLocalRandom, which are unlikely to block if the system entropy pool is low, if we don't need crypographically strong randomness for the use case. The exception to this is normalization of use of Bytes#random to fill byte arrays with randomness. Because Bytes#random may be used to generate key material it must be backed by SecureRandom. Signed-off-by: Duo Zhang <zhangduo@apache.org>
Avoid the pattern where a Random object is allocated, used once or twice, and then left for GC. This pattern triggers warnings from some static analysis tools because this pattern leads to poor effective randomness. In a few cases we were legitimately suffering from this issue; in others a change is still good to reduce noise in analysis results. Use ThreadLocalRandom where there is no requirement to set the seed to gain good reuse. Where useful relax use of SecureRandom to simply Random or ThreadLocalRandom, which are unlikely to block if the system entropy pool is low, if we don't need crypographically strong randomness for the use case. The exception to this is normalization of use of Bytes#random to fill byte arrays with randomness. Because Bytes#random may be used to generate key material it must be backed by SecureRandom. Signed-off-by: Duo Zhang <zhangduo@apache.org>
Avoid the pattern where a Random object is allocated, used once or twice, and then left for GC. This pattern triggers warnings from some static analysis tools because this pattern leads to poor effective randomness. In a few cases we were legitimately suffering from this issue; in others a change is still good to reduce noise in analysis results. Use ThreadLocalRandom where there is no requirement to set the seed to gain good reuse. Where useful relax use of SecureRandom to simply Random or ThreadLocalRandom, which are unlikely to block if the system entropy pool is low, if we don't need crypographically strong randomness for the use case. The exception to this is normalization of use of Bytes#random to fill byte arrays with randomness. Because Bytes#random may be used to generate key material it must be backed by SecureRandom. Signed-off-by: Duo Zhang <zhangduo@apache.org> (cherry picked from commit 2105170) Change-Id: Iba9282f90a2bb0442db4def68c59e099db8a830a
[This change does not cause any test failures (when tested locally) but there are flaky tests failing on master branch, so expect precommit results to be unclean.]
Avoid the pattern where a Random object is allocated, used once or twice, and then left for GC. This pattern triggers warnings from some static analysis tools because this pattern leads to poor effective randomness. In a few cases we were legitimately suffering from this issue; in others a change is still good to reduce noise in analysis results.
Use ThreadLocalRandom where there is no requirement to set the seed to gain good reuse.
Where useful relax use of SecureRandom to simply Random or ThreadLocalRandom, which are unlikely to block if the system entropy pool is low, if we don't need crypographically strong randomness for the use case.
The exception to this is normalization of use of Bytes#random to fill byte arrays with randomness. Because Bytes#random may be used to generate key material it must be backed by SecureRandom.