Reduce AddressLevelCache footprint in Avni-server #794

Open
mahalakshme opened this issue Sep 19, 2024 · 1 comment

mahalakshme commented Sep 19, 2024

Issue

In the production environment, the Avni server becomes unresponsive after 4-5 days. Repeated full GC cycles use up almost all the CPU time while trying to bring heap usage down from near the 5 GB limit.

Current issue

As a result, the app periodically becomes unresponsive and the server then has to be force-restarted, which is broken behaviour.

Quick change to reduce impact

  • Reducing 'avni.cache.max.weight' to 1000 might delay the issue for much longer, but SyncDetails performance might be reduced.

Analysis details

  • Heap dump analysis showed that a large share of heap memory, nearly 50%, is held by proxy class objects for VirtualCatchmentProjection.
  • 'avni.cache.max.weight' is currently set to 3000, which results in a total of 670K cached records instead of the expected 300K.

AC

Reduce AddressLevelCache footprint in Avni-server, so that overall app memory footprint remains within configured limits.

The following avenues for improvement could be explored:

  • Configure the Cache to have WeakKeys and SoftValues (https://www.baeldung.com/guava-cache)
  • Ensure that the entry count stays within bounds in the method 'getConcurrentMapCacheWithWeightedCapacityForAddressesConfig()', and validate this with unit tests
  • Experiment with the limits to find the optimal value for the prod server config
  • Add logging of cache size, miss, and hit stats to monitor the effectiveness of AddressLevelCache
  • Ensure that CachedObjects and their Proxy classes are also cleaned up during GC
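
To make the "entry count within bounds" and "stats logging" points concrete, here is a minimal sketch of a weight-bounded LRU cache with hit and miss counters. It uses only the JDK and is not the code behind 'getConcurrentMapCacheWithWeightedCapacityForAddressesConfig()'; the class name WeightedLruCache, its methods, and the weigher are all hypothetical names chosen for illustration.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToIntFunction;

/** Hypothetical weight-bounded LRU cache; an illustration, not Avni's implementation. */
class WeightedLruCache<K, V> {
    private final long maxWeight;
    private final ToIntFunction<V> weigher;
    // accessOrder = true: iteration starts at the least recently used entry.
    private final LinkedHashMap<K, V> entries = new LinkedHashMap<>(16, 0.75f, true);
    private long currentWeight;
    private long hits;
    private long misses;

    WeightedLruCache(long maxWeight, ToIntFunction<V> weigher) {
        this.maxWeight = maxWeight;
        this.weigher = weigher;
    }

    synchronized V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            misses++;
        } else {
            hits++;
        }
        return value;
    }

    synchronized void put(K key, V value) {
        V previous = entries.put(key, value);
        if (previous != null) {
            // Replacing an entry: give back the weight of the old value first.
            currentWeight -= weigher.applyAsInt(previous);
        }
        currentWeight += weigher.applyAsInt(value);
        evictUntilWithinBounds();
    }

    // Drops least recently used entries until the total weight fits the limit.
    private void evictUntilWithinBounds() {
        Iterator<Map.Entry<K, V>> it = entries.entrySet().iterator();
        while (currentWeight > maxWeight && it.hasNext()) {
            Map.Entry<K, V> eldest = it.next();
            currentWeight -= weigher.applyAsInt(eldest.getValue());
            it.remove();
        }
    }

    synchronized int entryCount() {
        return entries.size();
    }

    synchronized long totalWeight() {
        return currentWeight;
    }

    // One-line summary suitable for periodic logging.
    synchronized String stats() {
        return String.format("entries=%d weight=%d/%d hits=%d misses=%d",
                entries.size(), currentWeight, maxWeight, hits, misses);
    }
}
```

If each cached value is a list of records, a weigher such as List::size would bound the total number of records rather than the number of keys. Whether that matches how 'avni.cache.max.weight' is meant to be interpreted is an assumption to verify. A unit test can then assert on totalWeight() after a series of puts, and stats() can be logged on a schedule.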

mahalakshme commented

Commit made for this: avniproject/avni-infra@192860f

Projects
Status: Analysis Complete