Poor performance of version 7.1 compared to 3.4.4 #8458
Comments
@meisamhasani, how many silos do you have and how many copies of the dashboard are open? The dashboard can have some performance issues, but those shouldn't differ from 3.x to 7.x |
The |
No, I didn't use it at all, and as I said, the logic of the program is exactly the same for both versions, except for the issue of serialization. |
Only one dashboard was open. |
Do you have logs for the silo which restarted? |
Regardless of whether it is the same, I believe there is something going awry here with regards to scheduling - that will almost certainly be caused by application code. It is worth checking your code again |
System.TimeoutException: Response did not arrive on time in 00:00:30 for message: Request [S10.42.3.64:30003:44578723 symbol/6F6F86E97EC7E6]->[S10.42.0.102:30003:44578724 tse/TSE] GrainInterfaces.ITseGrainGrainInterfaces.ITseGrain.GetHistoryVersening() #1571307. Last known status is IsExecuting: False, IsWaiting: True, Diagnostics: [[Activation: S10.42.0.102:30003:44578724/tse/TSE@dfcf2658ed144d12b7dc28797753b340 #GrainType=Grains.TseGrain,BiFilter.Grains Placement=RandomPlacement State=Valid NonReentrancyQueueSize=296 NumRunning=1 IdlenessTimeSpan=00:00:25.7440000 CollectionAgeLimit=00:15:00], TaskScheduler status: WorkItemGroup:Name=[Activation: S10.42.0.102:30003:44578724/tse/TSE@dfcf2658ed144d12b7dc28797753b340#GrainType=Grains.TseGrain,BiFilter.Grains Placement=RandomPlacement State=Valid],WorkGroupStatus=Running. Currently QueuedWorkItems=1; Total Enqueued=163336; Total processed=163334; Executing Task Id=1280 Status=Running for 00:00:25.1680000.TaskRunner=ActivationTaskScheduler-404:Queued=1; Detailed context=<[Activation: S10.42.0.102:30003:44578724/tse/TSE@dfcf2658ed144d12b7dc28797753b340 #GrainType=Grains.TseGrain,BiFilter.Grains Placement=RandomPlacement State=Valid NonReentrancyQueueSize=296 NumRunning=1 IdlenessTimeSpan=00:00:25.7440000 CollectionAgeLimit=00:15:00 CurrentlyExecuting=Request [S10.42.0.102:30003:44578724 sys.client/208a741136e24164a88829bd09aad203]->[S10.42.0.102:30003:44578724 tse/TSE] GrainInterfaces.ITseGrain[(GrainInterfaces.ITseGrain)Grains.TseGrain].GetOptionReport() #90158]>, Message Request [S10.42.0.102:30003:44578724 sys.client/208a741136e24164a88829bd09aad203]->[S10.42.0.102:30003:44578724 tse/TSE] GrainInterfaces.ITseGrain[(GrainInterfaces.ITseGrain)Grains.TseGrain].GetOptionReport() #90158 was enqueued 00:00:25.1680000 ago and has now been executing for 00:00:25.1680000., Message Request [S10.42.3.64:30003:44578723 symbol/6F6F86E97EC7E6]->[S10.42.0.102:30003:44578724 tse/TSE] 
GrainInterfaces.ITseGrainGrainInterfaces.ITseGrain.GetHistoryVersening() #1571307 has been enqueued on the target grain for 00:00:23.2120000 and is currently position 134 in queue for processing.]. |
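The timeout message above already carries the numbers that matter: one request (GetOptionReport) has been executing for about 25 seconds on a non-reentrant activation while hundreds of other requests wait behind it. As a rough aid for reading such messages, here is a minimal Python sketch that pulls those fields out with regular expressions. The field names are copied from the log text; the function name `summarize_timeout`, the dictionary keys, and the abbreviated `sample` string are illustrative assumptions, not part of Orleans.

```python
import re

# Patterns are based only on the wording of the diagnostic message quoted above.
# The keys on the left are hypothetical names chosen for this sketch.
_PATTERNS = {
    "timeout": r"did not arrive on time in (\d\d:\d\d:\d\d)",
    "queue_size": r"NonReentrancyQueueSize=(\d+)",
    "num_running": r"NumRunning=(\d+)",
    "executing_for": r"has now been executing for (\d\d:\d\d:\d\d(?:\.\d+)?)",
    "queue_position": r"currently position (\d+) in queue",
}


def summarize_timeout(message: str) -> dict:
    """Return the first match for each pattern, or None when a field is absent."""
    summary = {}
    for key, pattern in _PATTERNS.items():
        match = re.search(pattern, message)
        summary[key] = match.group(1) if match else None
    return summary


# Abbreviated excerpt of the message above; "..." marks text left out here.
sample = (
    "Response did not arrive on time in 00:00:30 for message: ... "
    "NonReentrancyQueueSize=296 NumRunning=1 ... "
    "was enqueued 00:00:25.1680000 ago and has now been executing "
    "for 00:00:25.1680000., ... has been enqueued on the target grain "
    "for 00:00:23.2120000 and is currently position 134 in queue for processing."
)

if __name__ == "__main__":
    for field, value in summarize_timeout(sample).items():
        print(f"{field}: {value}")
```

Read together, a long `executing_for` value, `num_running` of 1, and a large `queue_size` point at a single slow call holding up everything queued behind it on that grain.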
In this image, you say ~"this is for the most current version, and in the worst case, but for 7.1, the latency is over 800ms" Does that mean that your latency is substantially reduced with v7.1.2? If so, that is likely due to this PR: #8394. The dashboard tends to be pretty heavy. From looking at the logs you have posted, I think there is a significant chance that there are some threading/async issues with your application code. You must fix those before you can hope to achieve good performance. |
Could it be because I keep the last state of each grain in a file? Some of these files are 11 megabytes, and the file must be read and deserialized before OnActivateAsync is called. |
Possibly. Is the file being read synchronously? Is that file for a single grain, or is it shared across many? |
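Why the synchronous-versus-asynchronous question matters can be shown with a small analogy. The sketch below is Python `asyncio`, not Orleans code, and every name in it is hypothetical. A background "ticker" stands in for other work that shares the same scheduler; a 0.2-second `time.sleep` stands in for reading and deserializing a large file. When the load blocks the scheduler thread, the ticker makes no progress at all; when the same load is offloaded and awaited, the ticker keeps running.

```python
import asyncio
import time


async def count_ticks_during(load) -> int:
    """Run `load` next to a background ticker and return the ticks seen meanwhile."""
    ticks = 0

    async def ticker():
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    task = asyncio.create_task(ticker())
    await asyncio.sleep(0)  # give the ticker a chance to start
    before = ticks
    await load()
    seen = ticks - before
    task.cancel()
    return seen


async def blocking_load():
    # Stand-in for a synchronous file read: it never yields to the event loop.
    time.sleep(0.2)


async def offloaded_load():
    # Same work, but run on a worker thread and awaited, so the loop stays free.
    await asyncio.to_thread(time.sleep, 0.2)


if __name__ == "__main__":
    print("ticks while blocking: ", asyncio.run(count_ticks_during(blocking_load)))
    print("ticks while offloaded:", asyncio.run(count_ticks_during(offloaded_load)))
```

The analogy is loose: it only illustrates that a blocking call inside an asynchronous method stalls everything else scheduled on the same thread, which is the kind of problem the question above is probing for.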
The image is related to version 3.4.4 |
What are you using for storage? I see Redis in your call stacks, is that what you are using? |
Each file belongs to exactly one grain, and it is read asynchronously. |
Are there many of these files or just one, is every grain reading one of these files? |
No, I did not use Redis for storage. |
There are as many files as there are grains, and the latest state is saved to the file periodically (every half hour, for example). |
The latencies on your application grains are good (the highest avg is 0.11ms). Is the only issue with ManagementGrain? For the silo which crashes, are you able to find out why it crashes? Was it being declared dead by the other silos? |
"Is the only issue with ManagementGrain?" Yes. |
It is very strange that this architecture and code, which definitely have flaws, work well on version 3.4.4 without any problems, yet I ran into this problem after upgrading. |
It's strange to me, too, and I'd like to understand why it is happening. When you were using v7.1, was it v7.1.0 or v7.1.2? Does |
7.1.2, the latest version. |
Discussed in #8457
Originally posted by meisamhasani June 1, 2023
Hello
I migrated from version 3.4 to version 7.1, and the serialization and configuration changes went well.
But in production, the system does not perform as well as it did on 3.4: the management grain shows high latency, and after 6 hours the silos are restarted.
There are more than 100 thousand grains.
Thank you for your answer
I am sending a sample of the error, but it may be caused by something else
System.InvalidOperationException: Attempt to access an invalid activation: [Activation: S10.42.3.64:30003:44578723/symbol/F9A71214BEF5F9@205c76e528a04dce9de7515597aa2b0b#Placement=RandomPlacement State=Invalid]
at Orleans.Runtime.GrainRuntime.g__ThrowInvalidActivation|20_1(ActivationData activationData) in /_/src/Orleans.Runtime/Core/GrainRuntime.cs:line 101
at Orleans.Runtime.GrainRuntime.CheckRuntimeContext(IGrainContext context) in /_/src/Orleans.Runtime/Core/GrainRuntime.cs:line 100
at Orleans.Core.StateStorageBridge`1.get_State() in /_/src/Orleans.Runtime/Storage/StateStorageBridge.cs:line 30
'System.InvalidOperationException:' is not recognized as an internal or external command,
operable program or batch file.
I would like to find out whether I made a mistake in the 7.1 configuration, because I don't want to stay on version 3.4 forever.
I used:
UseAdoNetClustering & AddMemoryGrainStorage
I did not use:
UseAdoNetReminderService or AddAdoNetGrainStorage