Fleet Rest API /agents/current_upgrades with very large numbers of agents #139404
Pinging @elastic/fleet (Team:Fleet)
The response in
I think the issue comes from the fact that the upgrade This is usually not a problem when the upgrade completes within the work window, but in case of large loads like this, it can happen.
@joshdover @kpollich what would be your call here?
As part of elastic/elastic-agent#778, we've been discussing having the agent ack expired actions as expired/failed in some way. If we do this, then I think we can have the Another thing to consider is that as part of elastic/elastic-agent#778 we will be acking failure attempts, so there will be potentially more than one ack per agent. I think the Anything else to add @michel-laterman?
@joshdover @kpollich As part of the new Agent activity feature, there is a new endpoint introduced that doesn't filter out expired actions.
Let's track down any external consumers of this API to be sure, but I'm in favor of removing this. We'll need to wait for the obs-robots folks to migrate off of the
@pjbertels could you move to the new
Closing it in favor of https://github.com/elastic/observability-perf/issues/245
Created #141894 to clean up the
Kibana version:
8.4
Elasticsearch version:
Server OS version:
Browser version:
Browser OS version:
Original install method (e.g. download page, yum, from source, etc.):
Describe the bug:
The way we check whether upgrades are complete (Fleet REST API /agents/current_upgrades) doesn't seem to work well with large numbers of agents. The issue seems to be a combination of batches of 10,000 finishing and the API reporting completion as soon as the last upgrades in the batch are scheduled, instead of after they complete.
Steps to reproduce:
We have automation to reproduce this issue. Get in touch with us via #fleet-scaling.
Expected behavior:
Until all agents are upgraded, the REST call should report that the upgrade is in progress.
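As an illustration of the expected behavior, here is a minimal sketch of a more conservative completion check that a polling client could apply. It assumes the client already has the list returned by /agents/current_upgrades and the `updating` count from the agent status call; the `UpgradeSnapshot` type and the `upgrade_still_running` helper are hypothetical names, not part of Fleet or of our automation.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class UpgradeSnapshot:
    """Hypothetical subset of one entry returned by /agents/current_upgrades."""
    action_id: str
    nb_agents: int
    nb_agents_ack: int
    complete: bool


def upgrade_still_running(
    action_id: str,
    current_upgrades: Sequence[UpgradeSnapshot],
    updating_agents: int,
) -> bool:
    """Return True while the rollout should be treated as in progress.

    The rollout counts as finished only when the action is no longer
    listed as incomplete AND the status call reports no agents updating.
    """
    listed = any(
        u.action_id == action_id and not u.complete for u in current_upgrades
    )
    return listed or updating_agents > 0
```

With this check, an empty current_upgrades response would not end the wait on its own while the status call still reports agents as updating.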
Screenshots (if relevant):
Errors in browser console (if relevant):
Provide logs and/or server output (if relevant):
These are logs of the automation polling current upgrades, which demonstrate the issue. Unfortunately, this set of logs doesn't show /agents_status reporting ~6000 agents still updating; that was discovered after a number of runs with debug code added to find the root cause of the issue.
[14:35:39] WARNING -------------------------------------------------------------------------------------------- harness.py:133
WARNING label=harness_0_step_6_iteration_1 description=test step 6: Upgrade drones message=executing harness.py:134
WARNING -------------------------------------------------------------------------------------------- harness.py:135
INFO Rollout duration is 600 test_perf02.py:206
[14:35:40] INFO FleetAgentStatus(total=75000, inactive=0, online=73532, error=0, offline=1468, updating=0, other=0, events=0, doc_id=None, run_id=None, test_perf02.py:210
timestamp=None, kuery='local_metadata.elastic.agent.version : 8.2.0 and local_metadata.elastic.agent.upgradeable : true', cluster_name=None)
[14:37:15] INFO [FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=2660, version='8.2.1', test_perf02.py:215
startTime='2022-08-23T18:36:53.824Z')]
[14:37:26] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=3176, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:37:37] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=3654, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:37:47] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=4011, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:37:58] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=4375, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:38:08] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=4584, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:38:19] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=4604, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:38:29] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=4680, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:38:40] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=4731, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:38:50] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=4812, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:39:01] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=5033, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:39:11] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=5163, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:39:22] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=5301, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:39:32] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=5403, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:39:43] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=5448, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:39:53] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=5470, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:40:04] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=5494, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:40:15] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=5559, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:40:25] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=5662, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:40:36] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=5831, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:40:46] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=6106, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:40:57] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=6447, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:41:07] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=6835, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:41:18] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=7163, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:41:28] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=7645, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:41:39] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=8197, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:41:50] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=8879, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:42:00] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=9758, version='8.2.1', perf_lib.py:146
startTime='2022-08-23T18:36:53.824Z')
[14:42:11] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=11289, perf_lib.py:146
version='8.2.1', startTime='2022-08-23T18:36:53.824Z')
[14:42:21] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=12868, perf_lib.py:146
version='8.2.1', startTime='2022-08-23T18:36:53.824Z')
[14:42:32] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=14161, perf_lib.py:146
version='8.2.1', startTime='2022-08-23T18:36:53.824Z')
[14:42:42] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=14904, perf_lib.py:146
version='8.2.1', startTime='2022-08-23T18:36:53.824Z')
[14:42:53] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=15670, perf_lib.py:146
version='8.2.1', startTime='2022-08-23T18:36:53.824Z')
[14:43:03] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=16881, perf_lib.py:146
version='8.2.1', startTime='2022-08-23T18:36:53.824Z')
[14:43:14] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=18622, perf_lib.py:146
version='8.2.1', startTime='2022-08-23T18:36:53.824Z')
[14:43:25] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=20518, perf_lib.py:146
version='8.2.1', startTime='2022-08-23T18:36:53.824Z')
[14:43:35] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=22351, perf_lib.py:146
version='8.2.1', startTime='2022-08-23T18:36:53.824Z')
[14:43:46] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=23510, perf_lib.py:146
version='8.2.1', startTime='2022-08-23T18:36:53.824Z')
[14:43:56] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=24645, perf_lib.py:146
version='8.2.1', startTime='2022-08-23T18:36:53.824Z')
[14:44:07] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=25650, perf_lib.py:146
version='8.2.1', startTime='2022-08-23T18:36:53.824Z')
[14:44:17] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=26573, perf_lib.py:146
version='8.2.1', startTime='2022-08-23T18:36:53.824Z')
[14:44:28] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=27542, perf_lib.py:146
version='8.2.1', startTime='2022-08-23T18:36:53.824Z')
[14:44:38] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=28521, perf_lib.py:146
version='8.2.1', startTime='2022-08-23T18:36:53.824Z')
[14:44:49] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=29752, perf_lib.py:146
version='8.2.1', startTime='2022-08-23T18:36:53.824Z')
[14:45:00] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=31189, perf_lib.py:146
version='8.2.1', startTime='2022-08-23T18:36:53.824Z')
[14:45:10] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=32334, perf_lib.py:146
version='8.2.1', startTime='2022-08-23T18:36:53.824Z')
[14:45:21] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=33283, perf_lib.py:146
version='8.2.1', startTime='2022-08-23T18:36:53.824Z')
[14:45:31] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=75000, nbAgentsAck=34050, perf_lib.py:146
version='8.2.1', startTime='2022-08-23T18:36:53.824Z')
[14:45:42] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=65000, nbAgentsAck=34595, perf_lib.py:146
version='8.2.1', startTime='2022-08-23T18:36:53.824Z')
[14:45:53] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=65000, nbAgentsAck=35164, perf_lib.py:146
version='8.2.1', startTime='2022-08-23T18:36:53.824Z')
[14:46:03] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=55000, nbAgentsAck=35757, perf_lib.py:146
version='8.2.1', startTime='2022-08-23T18:36:53.824Z')
[14:46:14] INFO Current upgrade FleetCurrentUpgrade(actionId='2fc0bb02-6b2c-4fed-83a3-8aa53188b2a6', complete=False, nbAgents=45000, nbAgentsAck=36384, perf_lib.py:146
version='8.2.1', startTime='2022-08-23T18:36:53.824Z')
[14:46:25] INFO Upgrade finished perf_lib.py:153
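In the last polls above, nbAgents shrinks from 75000 to 45000 while nbAgentsAck is still around 36000. A rough sketch for spotting this pattern in such logs follows; it assumes each poll prints `nbAgents=..., nbAgentsAck=...` on one line as shown above, and the helper names are hypothetical.

```python
import re
from typing import List, Sequence, Tuple

# Matches the two counters exactly as they appear in the log lines above.
SAMPLE_PATTERN = re.compile(r"nbAgents=(\d+), nbAgentsAck=(\d+)")


def extract_samples(log_text: str) -> List[Tuple[int, int]]:
    """Return (nbAgents, nbAgentsAck) for every poll found in the log text."""
    return [(int(t), int(a)) for t, a in SAMPLE_PATTERN.findall(log_text)]


def find_total_drops(
    samples: Sequence[Tuple[int, int]],
) -> List[Tuple[int, int, int]]:
    """Return (previous_total, new_total, acked) for each poll where nbAgents shrank."""
    drops = []
    for (prev_total, _), (total, acked) in zip(samples, samples[1:]):
        if total < prev_total:
            drops.append((prev_total, total, acked))
    return drops


# Counters copied from the final five polls in the log above.
excerpt = """
nbAgents=75000, nbAgentsAck=34050,
nbAgents=65000, nbAgentsAck=34595,
nbAgents=65000, nbAgentsAck=35164,
nbAgents=55000, nbAgentsAck=35757,
nbAgents=45000, nbAgentsAck=36384,
"""
print(find_total_drops(extract_samples(excerpt)))
# prints [(75000, 65000, 34595), (65000, 55000, 35757), (55000, 45000, 36384)]
```

Each drop is exactly 10,000 agents, which lines up with the batch size mentioned in the description.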
Any additional context: