-
Notifications
You must be signed in to change notification settings - Fork 103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix #918: release connection if its' reference count is zero. #983
Conversation
tempesta_fw/sock_srv.c
Outdated
@@ -446,7 +446,8 @@ tfw_sock_srv_disconnect(TfwConn *conn) | |||
* attached to the connection are released. If the connection is | |||
* closed already, then force the release of attached resources. | |||
*/ | |||
if (atomic_read(&conn->refcnt) != TFW_CONN_DEATHCNT) | |||
refcnt = atomic_read(&conn->refcnt); | |||
if (refcnt && refcnt != TFW_CONN_DEATHCNT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please describe the case when the reference count is zero? It's not obvious from the patch. A good comment would be helpful.
The patch introduces extra work in a potentially frequently executed code, while it appears that it's for a very limited case. Can it be done differently? Like, calling tfw_connection_put_to_death()
beforehand in that particular case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@keshonok Thank you for review.
Fix has been reworked and extended. Comment added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Patch is good ang can be merged if all related tests pass.
Connection reference counting is rather complicated thing with possibly hidden issues. I don't see any of them now, but its a mandatory to run the next tests before merging:
- flacky
- regression.test_reboot
tempesta_fw/connection.h
Outdated
@@ -185,7 +185,8 @@ enum { | |||
TFW_CONN_B_QFORWD, /* Need to forward requests in the queue. */ | |||
TFW_CONN_B_HASNIP, /* Has non-idempotent requests. */ | |||
|
|||
TFW_CONN_B_DEL /* Remove connection */ | |||
TFW_CONN_B_DEL, /* Remove connection */ | |||
TFW_CONN_B_ACTIVE /* Connection has been got at least once. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Connection has been got at least once.
is not actually true. The flag means that the connection itself got reference to server. And it's not about of connection's reference counter. Connection was in use or at least scheduled to be established.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected.
tempesta_fw/sock_srv.c
Outdated
* TFW_CONN_B_ACTIVE bit is checked here to see if the connection has | ||
* already got a server's reference; if so, then server's reference must | ||
* be put. If bit is not set, this means that connection didn't have | ||
* time to get server (due to some error in sock_srv start |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"didn't have time to get server" -> "didn't have time or was never scheduled to reach server"
Using of get
in this sentence is confusing since talk is about getting/putting reference counters.
@@ -483,7 +502,7 @@ tfw_sock_srv_connect_srv(TfwServer *srv) | |||
* that parallel execution can't happen with the same socket. | |||
*/ | |||
list_for_each_entry(srv_conn, &srv->conn_list, list) { | |||
tfw_server_get(srv); | |||
tfw_sock_srv_conn_activate(srv, srv_conn); | |||
tfw_sock_srv_connect_try_later(srv_conn); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like tfw_sock_srv_connect_try_later()
is always called after tfw_sock_srv_conn_activate()
. Let's move it there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is also tfw_sock_srv_connect_try_later()
call here
tempesta/tempesta_fw/sock_srv.c
Line 309 in f4ef380
tfw_sock_srv_connect_try_later(srv_conn); |
tempesta_fw/sock_srv.c
Outdated
@@ -2035,7 +2056,10 @@ tfw_cfgop_update_sg_srv_list(TfwCfgSrvGroup *sg_cfg) | |||
|
|||
TFW_DBG2("Update server list for group '%s'\n", sg_cfg->orig_sg->name); | |||
|
|||
__tfw_sg_for_each_srv(sg_cfg->orig_sg, __tfw_cfgop_update_sg_srv_list, sg_cfg); | |||
if ((r = __tfw_sg_for_each_srv(sg_cfg->orig_sg, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Line is too log, is was also warned a couple of time that the next formatting is better:
r = __tfw_sg_for_each_srv(sg_cfg->orig_sg, __tfw_cfgop_update_sg_srv_list,
sg_cfg);
if (r)
return r;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected.
tempesta_fw/sock_srv.c
Outdated
@@ -539,6 +558,7 @@ tfw_srv_conn_alloc(void) | |||
INIT_LIST_HEAD(&srv_conn->fwd_queue); | |||
INIT_LIST_HEAD(&srv_conn->nip_queue); | |||
spin_lock_init(&srv_conn->fwd_qlock); | |||
atomic_set(&srv_conn->refcnt, TFW_CONN_DEATHCNT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to comment why the refcounter is initialised with this value. Some thing like "Force releasing of taken server's reference counter on connection removing"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to merge. Please backport the fix to 0.5
No description provided.