Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Improvement] Mysql 8.x 会话超时,因不再支持 autoReconnect 参数导致超时自动中断 #1512

Open
hjrowen opened this issue Jun 23, 2021 · 11 comments
Labels
bug Something isn't working help wanted Extra attention is needed improvement General improvement

Comments

@hjrowen
Copy link

hjrowen commented Jun 23, 2021

description

使用Mysql作为存储,当连接超过 wait_timeout 时,没法继续写入数据。在Mysq5中,不再支持autoReconnect=true。
当前hugegraph中,是否存在连接没有关闭的情况?

Environment ( 环境信息 - 必填 )

  • Server Version: 0.11.2
  • Backend: MySQL 8.0.16-X-Cluster-1.0.2
  • OS: Linux version 4.18.0-147.5.1.el8_1.x86_64 (mockbuild@kbuilder.bsys.centos.org) (gcc version 8.3.1 20190507 (Red Hat 8.3.1-4) (GCC))
  • Data Size: xx vertices, xx edges

Expected behavior ( 期望表现 )

日常正常使用下,数据能成功呢添加。

Actual behavior ( 实际表现 / 报错)

数据库连接丢失。

异常日志

2021-05-20 00:41:14 93889862 [grizzly-http-server-11] [ERROR] com.baidu.hugegraph.server.RestServer [] - Failed to commit
com.baidu.hugegraph.backend.BackendException: Failed to open transaction
at com.baidu.hugegraph.backend.store.mysql.MysqlStore.beginTx(MysqlStore.java:292) ~[hugegraph-mysql-0.11.2.jar:?]
at com.baidu.hugegraph.backend.tx.AbstractTransaction.commitMutation2Backend(AbstractTransaction.java:304) ~[hugegraph-core-0.11.2.jar:0.11.2.0]
at com.baidu.hugegraph.backend.cache.CachedGraphTransaction.commitMutation2Backend(CachedGraphTransaction.java:292) ~[hugegraph-core-0.11.2.jar:0.11.2.0]
at com.baidu.hugegraph.backend.tx.IndexableTransaction.commit2Backend(IndexableTransaction.java:57) ~[hugegraph-core-0.11.2.jar:0.11.2.0]
at com.baidu.hugegraph.backend.tx.AbstractTransaction.commit(AbstractTransaction.java:223) ~[hugegraph-core-0.11.2.jar:0.11.2.0]
at com.baidu.hugegraph.backend.tx.GraphTransaction.commit(GraphTransaction.java:493) ~[hugegraph-core-0.11.2.jar:0.11.2.0]
at com.baidu.hugegraph.StandardHugeGraph$Txs.commit(StandardHugeGraph.java:1237) ~[hugegraph-core-0.11.2.jar:0.11.2.0]
at com.baidu.hugegraph.StandardHugeGraph$TinkerPopTransaction.doCommit(StandardHugeGraph.java:1117) ~[hugegraph-core-0.11.2.jar:0.11.2.0]
at org.apache.tinkerpop.gremlin.structure.util.AbstractTransaction.commit(AbstractTransaction.java:104) ~[gremlin-core-3.4.3.jar:3.4.3]
at com.baidu.hugegraph.StandardHugeGraph$TinkerPopTransaction.commit(StandardHugeGraph.java:1083) ~[hugegraph-core-0.11.2.jar:0.11.2.0]
at com.baidu.hugegraph.api.API.commit(API.java:97) [hugegraph-api-0.11.2.jar:0.58.0.0]
at com.baidu.hugegraph.api.graph.VertexAPI.create(VertexAPI.java:94) [hugegraph-api-0.11.2.jar:0.58.0.0]
at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source) ~[?:?]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_275]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_275]
at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81) [jersey-server-2.25.1.jar:?]
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144) [jersey-server-2.25.1.jar:?]
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161) [jersey-server-2.25.1.jar:?]
at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:205) [jersey-server-2.25.1.jar:?]
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:99) [jersey-server-2.25.1.jar:?]
at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389) [jersey-server-2.25.1.jar:?]
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347) [jersey-server-2.25.1.jar:?]
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102) [jersey-server-2.25.1.jar:?]
at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:326) [jersey-server-2.25.1.jar:?]
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271) [jersey-common-2.25.1.jar:?]
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267) [jersey-common-2.25.1.jar:?]
at org.glassfish.jersey.internal.Errors.process(Errors.java:315) [jersey-common-2.25.1.jar:?]
at org.glassfish.jersey.internal.Errors.process(Errors.java:297) [jersey-common-2.25.1.jar:?]
at org.glassfish.jersey.internal.Errors.process(Errors.java:267) [jersey-common-2.25.1.jar:?]
at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317) [jersey-common-2.25.1.jar:?]
at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305) [jersey-server-2.25.1.jar:?]
at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154) [jersey-server-2.25.1.jar:?]
at org.glassfish.jersey.grizzly2.httpserver.GrizzlyHttpContainer.service(GrizzlyHttpContainer.java:384) [jersey-container-grizzly2-http-2.25.1.jar:?]
at org.glassfish.grizzly.http.server.HttpHandler$1.run(HttpHandler.java:200) [grizzly-http-server-2.4.4.jar:2.4.4]
at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:569) [grizzly-framework-2.4.4.jar:2.4.4]
at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:549) [grizzly-framework-2.4.4.jar:2.4.4]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_275]
Caused by: com.mysql.cj.jdbc.exceptions.CommunicationsException: The last packet successfully received from the server was 8,873,663 milliseconds ago. The last packet sent successfully to the server was 8,873,666 milliseconds ago. is longer than the server configured value of 'wait_timeout'. You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property 'autoReconnect=true' to avoid this problem.
at com.mysql.cj.jdbc.exceptions.SQLError.createCommunicationsException(SQLError.java:174) ~[mysql-connector-java-8.0.19.jar:8.0.19]
at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:64) ~[mysql-connector-java-8.0.19.jar:8.0.19]
at com.mysql.cj.jdbc.ConnectionImpl.setAutoCommit(ConnectionImpl.java:2056) ~[mysql-connector-java-8.0.19.jar:8.0.19]
at com.baidu.hugegraph.backend.store.mysql.MysqlSessions$Session.begin(MysqlSessions.java:381) ~[hugegraph-mysql-0.11.2.jar:?]
at com.baidu.hugegraph.backend.store.mysql.MysqlStore.beginTx(MysqlStore.java:290) ~[hugegraph-mysql-0.11.2.jar:?]
... 36 more

@hjrowen hjrowen added the bug Something isn't working label Jun 23, 2021
@imbajin
Copy link
Member

imbajin commented Jun 23, 2021

看到你使用的是 mysql8-cluster 的版本, 目前适配的应该只是 5.x, 后续的兼容性欢迎反馈.

不过你说的回话超时导致无法写入, 跟之前 issue 区有提到的自动超时没有重连是一个问题么 (参考 #561)? 是说长时间没有跟 mysql 发送请求后链接断开了还是其他意思. (store.connection_detect_interval 参数配置后可以解决么)

@hjrowen
Copy link
Author

hjrowen commented Jun 28, 2021

好的,我尝试添加store.connection_detect_interval。

@hjrowen
Copy link
Author

hjrowen commented Jun 28, 2021

Caused by: com.mysql.cj.jdbc.exceptions.CommunicationsException: The last packet successfully received from the server was 8,873,663 milliseconds ago. The last packet sent successfully to the server was 8,873,666 milliseconds ago. is longer than the server configured value of 'wait_timeout'. You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property 'autoReconnect=true' to avoid this problem.

具体报错如上,报错信息建议添加'autoReconnect=true' ,但MySQL5开始,不再支持该参数。

@imbajin imbajin added help wanted Extra attention is needed improvement General improvement and removed bug Something isn't working labels Jun 28, 2021
@imbajin
Copy link
Member

imbajin commented Jun 28, 2021

感谢反馈, 那你有没有去 MySQL 官方 issue 或者类似搜索过, autoRecoonect 参数被替换为什么, 或者是取消了类似的问题在高版本如何解决的呢?

我想官方取消参数一般会给出新的解决方案吧

@imbajin imbajin changed the title [Bug] Mysql会话超时,导致无法正常写入。 [Improvement] Mysql 8.x 会话超时,因不再支持 autoReconnect 参数导致超时自动中断 Jun 28, 2021
@hjrowen
Copy link
Author

hjrowen commented Jul 7, 2021

  1. 配置 store.connection_detect_interval=60,不能解决。
  2. Mysql5之后的版本,autoRecoonect 参数是无效了。目前我所知的解决办法有2种:
    1. 通过mysql的wait_timeout增长会话时间(治标不治本
    2. 后台服务调用MYSQL的时候,会话需要在一段时间后自动关闭(时间要少于wait_timeout

@imbajin
Copy link
Member

imbajin commented Jul 8, 2021

  1. 配置 store.connection_detect_interval=60,不能解决。

  2. Mysql5 之后的版本,autoRecoonect 参数是无效了。目前我所知的解决办法有 2 种:

    1. 通过 mysql 的 wait_timeout 增长会话时间(治标不治本
    2. 后台服务调用 MYSQL 的时候,会话需要在一段时间后自动关闭(时间要少于 wait_timeout

多谢反馈, 那高版自动重连失效之后, 官方也没有给相关的解决方案么, 你后面提到的两个感觉都是用户侧需要做的改动, 还可能带来其他的影响.

@hjrowen
Copy link
Author

hjrowen commented Jul 14, 2021

mysql官网给出的理由是:不建议使用autoReconnect选项,因为没有安全的方法重新连接到MySQL服务器,而不冒损坏连接状态或数据库状态信息的风险。相反,使用连接池,这将使您的应用程序能够使用池中的可用连接连接到MySQL服务器。所以autoReconnect功能已弃用。

@cleverpig
Copy link

"当前hugegraph中,是否存在连接没有关闭的情况?"——关闭有close方法,就是不知道在哪里调用。可以看一下mysql模块的源码,十个类很简洁。

我对这个问题的看法:没有实现真正的连接池管理,只是在线程与connection之间做了个所谓的session绑定。所以当请求挤压时,线程和连接都会暴增,直到内存爆破。

个人建议:官方团队应当考虑引入第三方成熟的connection pool,而没必要重新发明轮子。

@imbajin
Copy link
Member

imbajin commented Jul 20, 2021

"当前 hugegraph 中,是否存在连接没有关闭的情况?"—— 关闭有 close 方法,就是不知道在哪里调用。可以看一下 mysql 模块的源码,十个类很简洁。

我对这个问题的看法:没有实现真正的连接池管理,只是在线程与 connection 之间做了个所谓的 session 绑定。所以当请求挤压时,线程和连接都会暴增,直到内存爆破。

个人建议:官方团队应当考虑引入第三方成熟的 connection pool,而没必要重新发明轮子。

你说引入类似 c3p0 那种连接框架么? 多谢建议, 后续优化 mysql 会调整, 也欢迎贡献 PR

@vincentshu
Copy link
Contributor

vincentshu commented Feb 24, 2022

这个问题,我这边尝试增加一个自定义的参数,用来对应自己mysql的超时时间
基本思想就是,当判断这个session超时,就关闭当前这个超时的session(我现在是简单的置空了,好像会照成内存泄漏,需要请教怎么正确的关闭),然后重新新增一个session,改成这样就没有发现报上面的那个错误。
public static final ConfigOption CONNECTION_WAIT_TIMEOUT =
new ConfigOption<>(
"store.connection_wait_timeout",
"The interval in seconds for connects timeout, " +
"if the idle time of a connection exceeds this value, " +
"detect it and create new session before using, " +
"value 0 means detecting every time.",
rangeInt(0L, Long.MAX_VALUE),
28800L
);

然后改写了BackendSessionPool的几个方法
public final BackendSession getOrNewSession() {
BackendSession session = this.threadLocalSession.get();
if (session == null) {
return createNewSession();
} else {
session = createSessionIfWaitTimeout(session);
this.detectSession(session);
}
return session;
}

private BackendSession createNewSession(){
BackendSession session = this.newSession();
assert session != null;
this.threadLocalSession.set(session);
assert !this.sessions.containsKey(Thread.currentThread().getId());
this.sessions.put(Thread.currentThread().getId(), session);
int sessionCount = this.sessionCount.incrementAndGet();
LOG.debug("Now(after connect({})) session count is: {}",
this, sessionCount);
return session;
}

private BackendSession createSessionIfWaitTimeout(BackendSession session) {
// create new session if the session idle time exceed wait timeout specified value
long connectionWaitTimeout = TimeUnit.SECONDS.toMillis(this.connectionWaitTimeout);
long now = System.currentTimeMillis();
if(now - session.updated() > connectionWaitTimeout){
session = null;//这个地方这样处理,好像会照成内存泄漏,所以想请教如何正确的关闭之前那个已经超时的连接。
return createNewSession();
}
return session;
}

@javeme javeme added the bug Something isn't working label Feb 24, 2023
@javeme
Copy link
Contributor

javeme commented Feb 24, 2023

maybe related to: #1672

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working help wanted Extra attention is needed improvement General improvement
Projects
None yet
Development

No branches or pull requests

5 participants