An e-commerce application, instana/robot-shop, is hosted on a managed Kubernetes cluster and stores all important user data in a MySQL server, which is also shared with several other operational and analytical microservices.
One day, Robot-shop started showing high page load times, and at the same time the Shipping service was unable to fetch data from the MySQL server, which severely degraded the system's performance.
To find the real cause of this outage:
- We noticed high API latencies from the Shipping service.
- We checked the application logs.
- We checked the status of the microservices in the Kiali dashboard for the Istio service mesh.
- We checked the MySQL DB connection metrics.
- We debugged the MySQL connections by logging into RDS.
- Other shared applications are using a high number of DB connections.
- First, we checked the database connections opened by the application.
- A bug in the application opens connections and never closes them.
- We checked the number of open connections with an SQL query:
SHOW STATUS WHERE Variable_name = 'Threads_connected';
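`Threads_connected` gives only the total number of open connections. To tell a shared application that holds many connections apart from one that opens connections and leaves them idle, it can help to break the count down per client. The following is a minimal sketch, not the tooling used during this incident. It assumes the rows come from `information_schema.PROCESSLIST` (newer MySQL versions also expose `performance_schema.processlist`). The helper name `summarize_connections`, the 300-second idle threshold, and the sample users and hosts are all hypothetical.

```python
from collections import Counter

# Query that would produce the rows consumed below; run it with any MySQL client.
PROCESSLIST_QUERY = """
SELECT USER, HOST, COMMAND, TIME
FROM information_schema.PROCESSLIST
"""


def summarize_connections(rows, idle_seconds=300):
    """Return {(user, client_host): (total, long_idle)} for processlist rows.

    A connection counts as long idle when its COMMAND is 'Sleep' and it has
    been in that state for at least idle_seconds (an assumed threshold).
    """
    total = Counter()
    idle = Counter()
    for user, host, command, time_s in rows:
        client = host.split(":")[0]  # HOST is usually "address:port"; drop the port
        key = (user, client)
        total[key] += 1
        if command == "Sleep" and time_s >= idle_seconds:
            idle[key] += 1
    return {key: (total[key], idle[key]) for key in total}


# Made-up rows for illustration only; real rows would come from PROCESSLIST_QUERY.
sample = [
    ("shipping", "10.0.1.5:41234", "Sleep", 900),
    ("shipping", "10.0.1.5:41236", "Query", 0),
    ("analytics", "10.0.2.9:50110", "Sleep", 1200),
    ("analytics", "10.0.2.9:50112", "Sleep", 30),
]

for (user, client), (n_total, n_idle) in sorted(summarize_connections(sample).items()):
    print(f"{user}@{client}: {n_total} connections, {n_idle} long idle")
```

A client with a large total points to a shared application consuming connections, while a large long-idle count for one client suggests connections that are opened and never closed.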