Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #918: release connection if its' reference count is zero. #983

Merged
merged 4 commits into from
Mar 30, 2018

Conversation

aleksostapenko
Copy link
Contributor

No description provided.

@@ -446,7 +446,8 @@ tfw_sock_srv_disconnect(TfwConn *conn)
* attached to the connection are released. If the connection is
* closed already, then force the release of attached resources.
*/
if (atomic_read(&conn->refcnt) != TFW_CONN_DEATHCNT)
refcnt = atomic_read(&conn->refcnt);
if (refcnt && refcnt != TFW_CONN_DEATHCNT)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please describe the case when the reference count is zero? It's not obvious from the patch. A good comment would be helpful.
The patch introduces extra work in a potentially frequently executed code, while it appears that it's for a very limited case. Can it be done differently? Like, calling tfw_connection_put_to_death() beforehand in that particular case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@keshonok Thank you for review.
Fix has been reworked and extended. Comment added.

Copy link
Contributor

@vankoven vankoven left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Patch is good ang can be merged if all related tests pass.

Connection reference counting is rather complicated thing with possibly hidden issues. I don't see any of them now, but its a mandatory to run the next tests before merging:

  • flacky
  • regression.test_reboot

@@ -185,7 +185,8 @@ enum {
TFW_CONN_B_QFORWD, /* Need to forward requests in the queue. */
TFW_CONN_B_HASNIP, /* Has non-idempotent requests. */

TFW_CONN_B_DEL /* Remove connection */
TFW_CONN_B_DEL, /* Remove connection */
TFW_CONN_B_ACTIVE /* Connection has been got at least once. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Connection has been got at least once. is not actually true. The flag means that the connection itself got reference to server. And it's not about of connection's reference counter. Connection was in use or at least scheduled to be established.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected.

* TFW_CONN_B_ACTIVE bit is checked here to see if the connection has
* already got a server's reference; if so, then server's reference must
* be put. If bit is not set, this means that connection didn't have
* time to get server (due to some error in sock_srv start
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"didn't have time to get server" -> "didn't have time or was never scheduled to reach server"
Using of get in this sentence is confusing since talk is about getting/putting reference counters.

@@ -483,7 +502,7 @@ tfw_sock_srv_connect_srv(TfwServer *srv)
* that parallel execution can't happen with the same socket.
*/
list_for_each_entry(srv_conn, &srv->conn_list, list) {
tfw_server_get(srv);
tfw_sock_srv_conn_activate(srv, srv_conn);
tfw_sock_srv_connect_try_later(srv_conn);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like tfw_sock_srv_connect_try_later() is always called after tfw_sock_srv_conn_activate(). Let's move it there.

Copy link
Contributor Author

@aleksostapenko aleksostapenko Mar 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is also tfw_sock_srv_connect_try_later() call here

tfw_sock_srv_connect_try_later(srv_conn);
and it seems that we needn't increase server reference count in this path.

@@ -2035,7 +2056,10 @@ tfw_cfgop_update_sg_srv_list(TfwCfgSrvGroup *sg_cfg)

TFW_DBG2("Update server list for group '%s'\n", sg_cfg->orig_sg->name);

__tfw_sg_for_each_srv(sg_cfg->orig_sg, __tfw_cfgop_update_sg_srv_list, sg_cfg);
if ((r = __tfw_sg_for_each_srv(sg_cfg->orig_sg,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line is too log, is was also warned a couple of time that the next formatting is better:

r = __tfw_sg_for_each_srv(sg_cfg->orig_sg, __tfw_cfgop_update_sg_srv_list,
                          sg_cfg);
if (r)
    return r;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected.

@@ -539,6 +558,7 @@ tfw_srv_conn_alloc(void)
INIT_LIST_HEAD(&srv_conn->fwd_queue);
INIT_LIST_HEAD(&srv_conn->nip_queue);
spin_lock_init(&srv_conn->fwd_qlock);
atomic_set(&srv_conn->refcnt, TFW_CONN_DEATHCNT);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to comment why the refcounter is initialised with this value. Some thing like "Force releasing of taken server's reference counter on connection removing"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

@krizhanovsky krizhanovsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to merge. Please backport the fix to 0.5

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants