fix(postgres): close socket actively when timeout happens during query #11480
Conversation
@windmgc Good catch. I have seen this before in other cases. I think we should close the connection on any error.
Great catch. I agree with @bungle that we should close the connection on any error, not just a timeout. Alternatively, we could first perform a ping-like operation after taking the connection out of the pool to make sure it is still working (I'm not sure whether that is easy to implement in OpenResty; I've used it in other languages).
However, I found it quite hard to tell whether an error returned by pgmoon is a socket error or just an SQL error. Option 1 is to enumerate the common error strings returned by the lua-nginx module, but that seems dirty. Option 2 is to disconnect on any kind of error, even an SQL error where the socket could actually be reused. My take is that, since the DAO does not issue many arbitrary SQL queries that are likely to fail, it is probably fine to disconnect regardless of the error type.
I think it is fine to close the connection on any error.
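The policy agreed on above — keepalive only on success, close on any error — can be sketched as follows. This is a minimal Python illustration, not Kong's actual Lua/OpenResty code; `FakeConn` and `finish_query` are hypothetical names.

```python
class FakeConn:
    """Hypothetical stand-in for a pooled database connection."""
    def __init__(self):
        self.closed = False
        self.pooled = False

    def close(self):
        self.closed = True

    def set_keepalive(self):
        self.pooled = True


def finish_query(conn, result, err):
    """Decide the connection's fate after a query.

    On ANY error (timeout, socket error, even a plain SQL error) the
    socket is closed, so a buffer that may still hold a late reply can
    never be handed to the next query. Only a fully successful query
    returns the connection to the keepalive pool.
    """
    if err is not None:
        conn.close()
        return None, err
    conn.set_keepalive()
    return result, None
```

Closing on SQL errors occasionally discards a socket that could have been reused, but as noted above, DAO-driven queries rarely fail, so the simplicity seems worth it.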
The backport to `release/3.1.x` failed. To backport manually, run these commands in your terminal:

```shell
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-release/3.1.x release/3.1.x
# Navigate to the new working tree
cd .worktrees/backport-release/3.1.x
# Create a new branch
git switch --create backport-11480-to-release/3.1.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 d2da4dbb372db3687f1dfae33ba422c384b61024
# Push it to GitHub
git push --set-upstream origin backport-11480-to-release/3.1.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-release/3.1.x
```

Then, create a pull request where the base branch is `release/3.1.x` and the compare/head branch is `backport-11480-to-release/3.1.x`.
The backport to `release/2.8.x` failed. To backport manually, run these commands in your terminal:

```shell
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-release/2.8.x release/2.8.x
# Navigate to the new working tree
cd .worktrees/backport-release/2.8.x
# Create a new branch
git switch --create backport-11480-to-release/2.8.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 d2da4dbb372db3687f1dfae33ba422c384b61024
# Push it to GitHub
git push --set-upstream origin backport-11480-to-release/2.8.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-release/2.8.x
```

Then, create a pull request where the base branch is `release/2.8.x` and the compare/head branch is `backport-11480-to-release/2.8.x`.
(#11480) Currently, we set socket keepalive after every Postgres SQL query, based on the configured keepalive timeout or `lua_socket_keepalive_timeout` (default 60s). This can go wrong: when a query hits a read timeout while receiving data from a heavily loaded database, the query ends on Kong's side, but the result may still be sent back after the timeout and linger in the socket buffer. If that socket is then reused for a subsequent query, the subsequent query can read the stale result of the previous one. This PR checks the query result's error string and, if any error happened, actively closes the socket so that subsequent queries establish new, clean connections. Fix FTI-5322 (cherry picked from commit d2da4db)
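The failure mode this commit message describes can be reproduced with a small, self-contained simulation. This is purely illustrative Python; the class and method names are hypothetical, not Kong's or pgmoon's API (Kong's real code uses Lua cosockets).

```python
class PooledSocket:
    """Simulates a pooled connection whose receive buffer keeps any
    reply that arrived after the client stopped waiting for it."""
    def __init__(self):
        self.buffer = []  # replies sent by the server, not yet read

    def send_query(self, sql):
        # The server eventually answers every query, even if the
        # client's read timed out before the reply arrived.
        self.buffer.append(f"result-of:{sql}")

    def recv(self, timed_out=False):
        if timed_out:
            return None  # client gives up; the reply stays buffered
        return self.buffer.pop(0)  # reads the OLDEST buffered reply


sock = PooledSocket()
sock.send_query("SELECT a")              # this query times out client-side
assert sock.recv(timed_out=True) is None
sock.send_query("SELECT b")              # socket reused without closing
stale = sock.recv()                      # second query reads the first reply
assert stale == "result-of:SELECT a"     # poisoned, stale result
```

Closing the socket after the timeout, as the fix does, discards the buffer along with the socket, so the late reply can never be mistaken for the next query's result.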
### Summary

The PR #11480 introduced a bug that calls `store_connection` without passing `self`. This fixes that.

Signed-off-by: Aapo Talvensaari <aapo.talvensaari@gmail.com>
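The missing `self` fixed here is Lua's classic colon-versus-dot mistake: `obj:store_connection(...)` passes `obj` implicitly as `self`, while `obj.store_connection(...)` does not. A Python analogue of the same failure (the `Pool` class and its contents are hypothetical, for illustration only):

```python
class Pool:
    """Hypothetical analogue of the object whose method call lost self."""
    def __init__(self):
        self.stored = []

    def store_connection(self, conn):
        self.stored.append(conn)


pool = Pool()

# Buggy analogue of the Lua dot-call: the instance is never passed, so
# Python raises TypeError (Lua would instead silently misbind arguments).
try:
    Pool.store_connection("conn-1")  # missing self
except TypeError:
    pass

# Correct call, equivalent to the fix: the instance is bound as self.
pool.store_connection("conn-1")
assert pool.stored == ["conn-1"]
```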
### Summary

Currently, we set socket keepalive after every Postgres SQL query, based on the configured keepalive timeout or `lua_socket_keepalive_timeout` (default 60s). This can go wrong: when a query hits a read timeout while receiving data from a heavily loaded database, the query ends on Kong's side, but the result may still be sent back after the timeout and linger in the socket buffer. If that socket is then reused for a subsequent query, the subsequent query can read the incorrect, stale result of the previous one.

The PR checks the query result's error string and, if a timeout happens, actively closes the socket so that subsequent queries establish new, clean connections.

### Checklist

- [ ] A changelog file has been created under `CHANGELOG/unreleased/kong`, or the `skip-changelog` label added on the PR if unnecessary. README.md (Please ping @vm-001 if you need help)
- [ ] There is a user-facing docs PR against https://github.com/Kong/docs.konghq.com - PUT DOCS PR HERE

### Full changelog

### Issue reference

Fix FTI-5322