ConcurrentModificationException problem and the related system design discussion #293
Java version: java version "1.8.0_301"
private static final ExecutorService EXECUTOR = new ThreadPoolExecutor(8, 16, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>(), new ThreadPoolExecutor.CallerRunsPolicy());
List<Future<?>> futures = new ArrayList<>();
for (List<RuleDTO> ruleList : ruleContainer) {
    // wrap each task with TtlRunnable before submitting it to the pool
    ruleList.forEach(rule -> futures.add(EXECUTOR.submit(Objects.requireNonNull(TtlRunnable.get(() -> ...)))));
}
3. The main thread collects the task results
futures.forEach(f -> {
    try {
        f.get();
    } catch (Exception e) {
        log.error("multithreading exception", e);
    }
}); |
Duplicate of #220. Please provide a reproducible demo, the runtime environment, and whether any other agent is loaded. |
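@wmq930626 In your business logic implementation, is it the case that in […]
these 2 business callbacks (business logic), adding or removing […] is triggered? @wmq930626 Please investigate and confirm, and share your analysis and the result. 💕 Although in the business callbacks ( […] but even so, […] @wuwen5 What do you think? :") |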
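@oldratlee
My current guess: could it be that multiple […] are newly created within a short time? |
When submitting to […] |
@oldratlee Could you take a look? Is this normal? From reading the code, there should be no […] |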
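@Markkkkks Can you confirm whether any other agent is loaded? |
This is the only agent in the startup arguments. |
Please ignore my question. The stack trace is fine; that frame comes from an inner-class call. |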
@Markkkkks
In […] If that is the case, I can say with confidence that it is safe. This usage is also very common in business code. |
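Based on the information provided above, I wrote a runnable verification demo (demo code attached below):
No problem occurred. @Markkkkks @wmq930626 @wuwen5 Of course, this demo does not prove that there is definitely no concurrent […]
import com.alibaba.ttl.TransmittableThreadLocal;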
import com.alibaba.ttl.TtlRunnable;
import java.util.List;
import java.util.concurrent.*;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
public class ConcurrentModificationExceptionDemo {
private static final ExecutorService EXECUTOR = new ThreadPoolExecutor(
16, 16, 60, TimeUnit.SECONDS,
new LinkedBlockingQueue<>(), Thread::new, new ThreadPoolExecutor.CallerRunsPolicy());
private static final TransmittableThreadLocal<String> context = new TransmittableThreadLocal<>();
public static void main(String[] args) {
context.set("set-in-main");
List<Future<?>> futures = IntStream.range(0, 10_000).mapToObj(num -> {
TtlRunnable task = TtlRunnable.get(() -> {
long ms = ThreadLocalRandom.current().nextLong(0, 7);
sleep(ms);
System.out.println("run in thread[" + Thread.currentThread().getName()
+ "] with context[" + context.get() + "], num " + num + " sleep " + ms);
// remove in submitted runnable
context.remove();
});
return EXECUTOR.submit(task);
}).collect(Collectors.toList());
futures.forEach(f -> {
try {
f.get();
} catch (Exception e) {
e.printStackTrace();
}
});
EXECUTOR.shutdown();
}
private static void sleep(long ms) {
try {
Thread.sleep(ms);
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
}
}
PS: source file […] |
@Markkkkks @wmq930626
Otherwise:
|
@oldratlee Yes, that is how we do it. Thanks for the reply and […]. Also, according to the stack trace, it happens in […]. May I ask […]
|
This looks like a […]
@Markkkkks You can use […] |
@Markkkkks
|
Fixed, on branch […]. @Markkkkks Install it locally and use […]:
git checkout expt/hotfix-293
mvn install -Dmaven.test.skip |
OK 😊, I'll give it a try. |
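Another possibility is […], which still needs further confirmation:
The situation above should not occur, otherwise […]. If it can occur, the hotfix implementation needs to be strengthened further. |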
Have you tested it? How did it go? 😄 @Markkkkks |
I wrote a test […]
Verification demo: https://github.com/oldratlee/HelloKt
How to run: ./gradlew execTestMain -P mainClass=playground.weakhashmap.WeakHashMapGcIterationKt
Demo code: […]
Output of one run:
Key(num=271828)
Preparing data...
[round 0] begin! keyListSize: 1000, weakHashMapSize: 1000
[round 0] KeyList: removed Key(num=692), keyListSize: 999, weakHashMapSize: 1000
[round 0] KeyList: removed Key(num=561), keyListSize: 998, weakHashMapSize: 1000
[round 0] KeyList: removed Key(num=45), keyListSize: 997, weakHashMapSize: 999
finalize Key(num=692)
[round 0] KeyList: removed Key(num=743), keyListSize: 996, weakHashMapSize: 998
finalize Key(num=45)
[round 0] KeyList: removed Key(num=263), keyListSize: 995, weakHashMapSize: 997
finalize Key(num=743)
[round 0] KeyList: removed Key(num=400), keyListSize: 994, weakHashMapSize: 996
finalize Key(num=263)
[round 0] KeyList: removed Key(num=503), keyListSize: 993, weakHashMapSize: 995
finalize Key(num=400)
finalize Key(num=561)
finalize Key(num=271828)
[round 0] KeyList: removed Key(num=347), keyListSize: 992, weakHashMapSize: 994
[round 0] KeyList: removed Key(num=456), keyListSize: 991, weakHashMapSize: 994
finalize Key(num=503)
finalize Key(num=347)
[round 0] KeyList: removed Key(num=710), keyListSize: 990, weakHashMapSize: 992
finalize Key(num=456)
[round 0] KeyList: removed Key(num=336), keyListSize: 989, weakHashMapSize: 991
finalize Key(num=710)
[round 0] KeyList: removed Key(num=92), keyListSize: 988, weakHashMapSize: 990
finalize Key(num=336)
......
[round 0] KeyList: removed Key(num=197), keyListSize: 950, weakHashMapSize: 952
finalize Key(num=481)
[round 0] KeyList: removed Key(num=79), keyListSize: 949, weakHashMapSize: 951
finalize Key(num=709)
[round 0] KeyList: removed Key(num=295), keyListSize: 948, weakHashMapSize: 950
finalize Key(num=197)
finalize Key(num=79)
[round 0] end! keyListSize: 1000 -> 948, weakHashMapSize: 1000 -> 949
[round 1] begin! keyListSize: 948, weakHashMapSize: 949
[round 1] KeyList: removed Key(num=985), keyListSize: 947, weakHashMapSize: 949
[round 1] KeyList: removed Key(num=268), keyListSize: 946, weakHashMapSize: 949
finalize Key(num=295)
[round 1] KeyList: removed Key(num=145), keyListSize: 945, weakHashMapSize: 947
finalize Key(num=985)
[round 1] KeyList: removed Key(num=78), keyListSize: 944, weakHashMapSize: 946
finalize Key(num=268)
[round 1] KeyList: removed Key(num=291), keyListSize: 943, weakHashMapSize: 945
finalize Key(num=145)
finalize Key(num=78)
[round 1] KeyList: removed Key(num=364), keyListSize: 942, weakHashMapSize: 944
[round 1] KeyList: removed Key(num=44), keyListSize: 941, weakHashMapSize: 943
finalize Key(num=291)
[round 1] KeyList: removed Key(num=712), keyListSize: 940, weakHashMapSize: 942
finalize Key(num=364)
[round 1] KeyList: removed Key(num=904), keyListSize: 939, weakHashMapSize: 941
finalize Key(num=44)
[round 1] KeyList: removed Key(num=951), keyListSize: 938, weakHashMapSize: 941
finalize Key(num=712)
...... |
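The run output above shows weakHashMapSize shrinking inside a round while the round keeps going. The linked demo is written in Kotlin and its code is not reproduced in this thread, so the following is only a minimal Java sketch of the same kind of experiment. The class and method names are hypothetical, and `System.gc()` is only a hint, so how many entries are actually cleared depends on the JVM.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.WeakHashMap;

// Hypothetical sketch (not the linked Kotlin demo): release strong references
// to keys while a WeakHashMap iteration is in progress.
final class WeakHashMapGcIterationSketch {

    /** Returns {sizeBefore, sizeAfter}; any exception during iteration propagates. */
    static int[] iterateWhileReleasingKeys(int keyCount) {
        List<Object> strongRefs = new ArrayList<>();
        WeakHashMap<Object, Boolean> map = new WeakHashMap<>();
        for (int i = 0; i < keyCount; i++) {
            Object key = new Object();
            strongRefs.add(key);            // keeps the key alive for now
            map.put(key, Boolean.TRUE);
        }
        int sizeBefore = map.size();

        int visited = 0;
        for (Object key : map.keySet()) {
            // Drop one strong reference per step so the collector may clear
            // entries while the iterator is still active.
            if (!strongRefs.isEmpty()) {
                strongRefs.remove(strongRefs.size() - 1);
            }
            if (++visited % 100 == 0) {
                System.gc();                // only a hint; the JVM may ignore it
                map.size();                 // lets the map expunge already-cleared entries
            }
        }
        return new int[] {sizeBefore, map.size()};
    }

    public static void main(String[] args) {
        int[] sizes = iterateWhileReleasingKeys(1000);
        System.out.println("weakHashMapSize: " + sizes[0] + " -> " + sizes[1]);
    }
}
```

The loop never calls `put` or `remove` on the map itself, so the only size changes come from garbage collection, which is the case the log above is probing.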
With the hotfix version of TTL the problem no longer reproduces, thanks a lot @oldratlee
In the business code, in the child thread […] |
Released the official version […]. Closing this issue for now; if the problem reproduces, feel free to reopen~ 💕 @Markkkkks @wmq930626 |
…biz lifecycle callbacks #293 biz lifecycle callbacks: - TransmittableThreadLocal#beforeExecute - TransmittableThreadLocal#afterExecute
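I ran into this error today as well, and in my case it reproduces reliably. Conclusion first: the business side implemented TTL's beforeExecute and afterExecute methods in a second-party package and triggered addValue() and removeValue() inside those methods, which ultimately caused the ConcurrentModificationException. My troubleshooting process follows: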
Exception in thread "HXXBizProcessor-DEFAULT-12-thread-104" java.util.ConcurrentModificationException
at java.util.WeakHashMap$HashIterator.nextEntry(WeakHashMap.java:806)
at java.util.WeakHashMap$EntryIterator.next(WeakHashMap.java:845)
at java.util.WeakHashMap$EntryIterator.next(WeakHashMap.java:843)
at com.alibaba.ttl.TransmittableThreadLocal.doExecuteCallback(TransmittableThreadLocal.java:164)
at com.alibaba.ttl.TransmittableThreadLocal.access$300(TransmittableThreadLocal.java:54)
at com.alibaba.ttl.TransmittableThreadLocal$Transmitter.replay(TransmittableThreadLocal.java:328)
at com.alibaba.ttl.TtlRunnable.run(TtlRunnable.java:49)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:874)
method=com.alibaba.ttl.TransmittableThreadLocal.beforeExecute location=AtExit
ts=2021-12-24 01:32:44; [cost=0.006412ms] result=@ArrayList[
@String[RxComputationThreadPool-7],
null,
]
method=com.alibaba.ttl.TransmittableThreadLocal.beforeExecute location=AtExit
ts=2021-12-24 01:32:44; [cost=0.004131ms] result=@ArrayList[
@String[RxComputationThreadPool-7],
null,
]
method=com.xx.xx.xx.service.xxxx.xxxxServiceImpl$1.beforeExecute location=AtExit
ts=2021-12-24 01:32:44; [cost=0.290056ms] result=@ArrayList[
@String[RxComputationThreadPool-7],
null,
]
The business-side implementation is as follows:
protected void beforeExecute() {
    super.beforeExecute();
    Object ctx = this.get();
    // copy this TTL's value into the RPC trace context if it differs
    if (ctx != null && ctx != Trace.getRpcContext()) {
        Trace.setRpcContext(ctx);
    }
}
protected void afterExecute() {
    super.afterExecute();
    Trace.clearRpcContext();
}
doExecuteCallback --> iterator.next() --> beforeExecute --> getRpcContext()
--> ttl.get() --> setRpcContext() --> ttl.set(null) & ttl.set(newCapture)
--> iterator.next() --> ConcurrentModificationException
|
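The call chain above can be mimicked with plain JDK classes. This is a minimal sketch, assuming only what the chain states: a `WeakHashMap` is being iterated, and the callback removes and then re-adds an entry (the `ttl.set(null)` and `ttl.set(newCapture)` steps). The class, method, and variable names are hypothetical stand-ins, not TTL's actual code.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.WeakHashMap;

// Hypothetical stand-in for the failing call chain; not TTL's real implementation.
final class CallbackMutationSketch {

    /** Returns true if iterating while the "callback" mutates the map throws. */
    static boolean iterationFails(int instanceCount) {
        List<Object> strongRefs = new ArrayList<>();   // keep keys alive so GC plays no part
        WeakHashMap<Object, Object> holder = new WeakHashMap<>();
        for (int i = 0; i < instanceCount; i++) {
            Object ttl = new Object();
            strongRefs.add(ttl);
            holder.put(ttl, null);
        }

        boolean failed = false;
        try {
            for (Object ttl : holder.keySet()) {
                // Simulated beforeExecute(): remove the instance, then add it back.
                // Both calls are structural modifications of the map being iterated.
                holder.remove(ttl);
                holder.put(ttl, null);
            }
        } catch (ConcurrentModificationException e) {
            failed = true;                             // thrown by the iterator's next()
        }
        strongRefs.clear();                            // keys were only needed until here
        return failed;
    }

    public static void main(String[] args) {
        System.out.println("iteration failed: " + iterationFails(3));
    }
}
```

With several entries in the map, the first callback runs normally and the failure surfaces on the following `next()` call, which matches the `WeakHashMap$HashIterator.nextEntry` frame at the top of the stack trace.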
Discussed a possible detection approach with @zavakid:
|
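@happyomg A very professional problem description, troubleshooting write-up, and design discussion~ 👍
@happyomg "Add some prominent hints to the docs": agreed! Places that are easy to get wrong are worth calling out.
@happyomg "Masking the user's incorrect implementation ( […] problems must not […]
In […] how exactly to make the change, e.g. […]
Let me think about it some more. |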
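For readers following the design question, one defensive option is to run the callbacks over a snapshot of the map, so that changes made inside a callback never touch the collection being iterated. The sketch below only illustrates that idea with plain JDK classes and hypothetical names; it is an assumption for discussion, not a statement of what the hotfix or any TTL release actually does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.WeakHashMap;

// Hypothetical sketch of "iterate a copy"; names are illustrative only.
final class SnapshotIterationSketch {

    /** Runs a mutating "callback" for every entry and returns how many callbacks ran. */
    static int runCallbacksOnSnapshot(int instanceCount) {
        List<Object> strongRefs = new ArrayList<>();   // keep keys alive so GC plays no part
        WeakHashMap<Object, Object> holder = new WeakHashMap<>();
        for (int i = 0; i < instanceCount; i++) {
            Object ttl = new Object();
            strongRefs.add(ttl);
            holder.put(ttl, null);
        }

        // Copy first: the callbacks below mutate `holder`, never `snapshot`.
        WeakHashMap<Object, Object> snapshot = new WeakHashMap<>(holder);
        int callbacksRun = 0;
        for (Object ttl : snapshot.keySet()) {
            // Simulated callback that removes and re-adds the instance.
            holder.remove(ttl);
            holder.put(ttl, null);
            callbacksRun++;
        }
        strongRefs.clear();                            // keys were only needed until here
        return callbacksRun;
    }

    public static void main(String[] args) {
        System.out.println("callbacks run: " + runCallbacksOnSnapshot(3));
    }
}
```

The trade-off is that each pass pays for a copy, and an instance added by a callback does not receive its own callback in the same pass, so whether this behavior is acceptable is part of the design decision discussed here.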
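Cool! Before making this decision, we do need to establish clearly whether TTL itself supports adding or removing TTL instances inside the callback methods (beforeExecute and afterExecute). Going a step further, it is worth analyzing carefully, from the user's point of view and in terms of user needs, whether this is actually required. I am also not yet sure what users originally expected when they did this in this scenario; that information is still missing. It would help if @wmq930626 @Markkkkks @happyomg could share the original intent behind their usage. That would not represent all users, but it would still provide input and help us reach a better design judgment together. I also think one more point deserves attention: TTL is generally used by low-level middleware or as a very basic cross-cutting component, so any change must take backward compatibility into account. Since […] |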
During use, an error is reported when tasks are started across multiple threads.