Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize memory usage by Timeline chart with very large cases #1598

Closed
wladimirleite opened this issue Mar 21, 2023 · 38 comments · Fixed by #1669
Closed

Optimize memory usage by Timeline chart with very large cases #1598

wladimirleite opened this issue Mar 21, 2023 · 38 comments · Fixed by #1669
Assignees

Comments

@wladimirleite
Copy link
Member

I have a very large case here, processed with the master branch, with ~125 million items (many evidence files processed with "fastmode" profile).
I had similar cases processed with 3.18.x and 4.0.x and they opened relatively quickly (~2 minutes).
With the latest version, it is taking ~15 minutes, with a long wait in "Starting to load/build time cache of [[class org.jfree.data.time.Day]]", as can be seen in the log below.

After that, the analysis UI becomes visible, but unusable (any mouse/keyboard action makes the application "Not responding" in Windows). It seems all the JVM memory is consumed and the garbage collector is running continuously (screenshot of VisualVM below).
Disabling the load/build of the timeline cache (in the source code) makes the loading process fast again, and it takes a bit less than 8 GB to load everything.

Is there anything that can be done about this?
Although such large cases are not common, new IPED versions may create more items (depending on the processing options).
Maybe putting a limit in the maximum number of items that this cache building process will run. Or maybe an option to disable it, to be used when the user needs to open a very large case.

image

2023-03-20 21:06:32	[INFO]	[app.ui.AppMain]			Indexador e Processador de Evidências Digitais 4.2-snapshot
2023-03-20 21:06:32	[INFO]	[app.ui.AppMain]			   Java Version: 11.0.13
2023-03-20 21:06:32	[INFO]	[app.ui.AppMain]			   Java Home: C:\Users\Wladimir\.iped\jre-11.0.13
2023-03-20 21:06:32	[INFO]	[app.ui.AppMain]			   Java VM Name: OpenJDK 64-Bit Server VM
2023-03-20 21:06:32	[INFO]	[engine.config.Configuration]			Loading configuration from D:\****\iped
2023-03-20 21:06:35	[INFO]	[app.ui.App]			Starting...
2023-03-20 21:06:38	[INFO]	[iped.viewers.IcePDFViewer]			IcePDFViewer took 380ms to be initialized.
2023-03-20 21:06:41	[INFO]	[app.ui.App$1]			GUI created
2023-03-20 21:06:41	[INFO]	[application.connection.LocalOfficeConnection]			Opening local OpenOffice.org connection.
2023-03-20 21:06:41	[INFO]	[engine.datasource.SleuthkitReader]			Sleuthkit version 4.12.0-iped-patch detected.
2023-03-20 21:06:42	[INFO]	[iped.viewers.LibreOfficeViewer]			LibreOffice running.
2023-03-20 21:06:42	[INFO]	[iped.viewers.LibreOfficeViewer$1]			Constructing LibreOffice frame...
2023-03-20 21:06:42	[INFO]	[iped.viewers.LibreOfficeViewer$1]			LibreOffice frame ok
2023-03-20 21:06:49	[INFO]	[sleuthkit.datamodel.SleuthkitCase]			Time to initialize parent node cache: 7,013 ms
2023-03-20 21:06:49	[INFO]	[sleuthkit.datamodel.SleuthkitCase]			sqlite-jdbc version 3.34.0 loaded in native mode
2023-03-20 21:06:49	[INFO]	[engine.data.IPEDSource]			Opening index D:\****\iped\index
2023-03-20 21:06:49	[INFO]	[engine.lucene.ConfiguredFSDirectory]			Using MMapDirectory to open index...
2023-03-20 21:06:50	[INFO]	[engine.data.IPEDSource]			Index opened
2023-03-20 21:06:50	[INFO]	[engine.data.IPEDSource]			Creating LuceneId to ID mapping...
2023-03-20 21:07:23	[INFO]	[engine.data.Bookmarks]			Loading state from file D:\****\iped\bookmarks.iped
2023-03-20 21:07:23	[INFO]	[engine.data.IPEDMultiSource]			Opening MultiReader...
2023-03-20 21:07:23	[INFO]	[engine.data.IPEDMultiSource]			MultiReader opened
2023-03-20 21:08:41	[INFO]	[engine.data.IPEDMultiSource]			Loaded 1 cases.
2023-03-20 21:08:41	[INFO]	[app.ui.UICaseDataLoader]			Loading Columns
2023-03-20 21:08:41	[INFO]	[app.ui.ColumnsManager]			Loading default columns
2023-03-20 21:08:43	[INFO]	[timelinegraph.cache.IndexTimeStampCache]			Starting to load/build time cache of [[class org.jfree.data.time.Day]]...
2023-03-20 21:20:56	[INFO]	[timelinegraph.cache.IndexTimeStampCache]			Time to build time cache of [[class org.jfree.data.time.Day]]: 733829ms
2023-03-20 21:20:57	[INFO]	[timelinegraph.cache.IndexTimeStampCache]			Starting to load/build time cache of [[class org.jfree.data.time.Hour]]...
2023-03-20 21:20:57	[INFO]	[timelinegraph.cache.IndexTimeStampCache]			Starting to load/build time cache of [[class org.jfree.data.time.Year]]...
2023-03-20 21:20:57	[INFO]	[timelinegraph.cache.IndexTimeStampCache]			Starting to load/build time cache of [[class org.jfree.data.time.Month]]...
2023-03-20 21:20:57	[INFO]	[timelinegraph.cache.IndexTimeStampCache]			Starting to load/build time cache of [[class org.jfree.data.time.Second]]...
2023-03-20 21:20:57	[INFO]	[timelinegraph.cache.IndexTimeStampCache]			Starting to load/build time cache of [[class org.jfree.data.time.Week]]...
2023-03-20 21:20:57	[INFO]	[timelinegraph.cache.IndexTimeStampCache]			Starting to load/build time cache of [[class iped.jfextensions.model.Minute]]...
2023-03-20 21:20:57	[INFO]	[timelinegraph.cache.IndexTimeStampCache]			Starting to load/build time cache of [[class org.jfree.data.time.Quarter]]...
2023-03-20 21:20:58	[INFO]	[browsers.edge.EdgeWebCacheParser]			Libesedb library version: 20181229
2023-03-20 21:20:58	[INFO]	[parsers.util.EsedbManager]			Libesedb library version: 20181229
2023-03-20 21:20:59	[INFO]	[app.ui.UICaseDataLoader]			Listing all items
2023-03-20 21:20:59	[INFO]	[app.ui.FileProcessor]			Opening iped/help/Help_pt-BR.htm
2023-03-20 21:20:59	[INFO]	[app.ui.UICaseDataLoader]			Listing all items Finished
2023-03-20 21:20:59	[INFO]	[app.ui.UICaseSearcherFilter]			Searching for query *:*
2023-03-20 21:21:10	[INFO]	[engine.search.QueryBuilder]			Expanded query terms: []
@lfcnassif
Copy link
Member

lfcnassif commented Mar 21, 2023

Wow, that's a ton of memory... I just tested this feature with medium size cases of 2-3 million items, my fault...

IPED should be able to handle cases up to ~135 million items. Beyond that, the main table internal height exceeds the maximum value for an Integer (although that is another issue, we can simply warn the user to apply some filter, he probably doesn't want to analyze 135million items on the table at the same time)

Is there anything that can be done about this?

Not sure if cache building could be faster, but consuming all available memory for sure is not desired/expected.

Could you take a heap dump to help identify the memory hungry class?

and it takes a bit less than 8 GB to load everything.

Out of curiosity, what version has this heap usage? And the other? I expect 4.0.x to use less memory than 3.18.x because of Lucene 5.x to 9.x upgrade.

@lfcnassif lfcnassif added bug and removed question labels Mar 21, 2023
@lfcnassif
Copy link
Member

I think this is high priority, version 4.1 should be as stable as previous versions. @patrickdalla could you take a look at this?

@wladimirleite
Copy link
Member Author

and it takes a bit less than 8 GB to load everything.

Out of curiosity, what version has this heap usage? And the other? I expect 4.0.x to use less memory than 3.18.x because of Lucene 5.x to 9.x upgrade.

The 8 GB I mentioned is with 4.2-snapshot.
I don't have the cases processed with the older versions anymore, but I remember I had to use -Xmx with 3.18.x, going well beyond 32GB. So 4.x is using much less memory indeed to load the case.

@wladimirleite
Copy link
Member Author

Not sure if cache building could be faster, but consuming all available memory for sure is not expected/desired.

The first call seems to be synchronous. I have no idea if this is mandatory, but it seems a long wait comparing to other initialization steps.
Not sure if the time spent was already affected by the memory shortage. Maybe reducing the memory consumption will speed things up.

Have you taken a heap dump to help identifying the memory hungry class?

No, but I can take tomorrow.

@lfcnassif
Copy link
Member

The first call seems to be synchronous. I have no idea if this is mandatory

It was just done to try to speed up the Day cache, to don't share CPU resources with other caches building, as the Day scale is the default and rendered first, maybe that is not needed...

@patrickdalla
Copy link
Collaborator

I think this is high priority, version 4.1 should be as stable as previous versions. @patrickdalla could you take a look at this?

Right.

@patrickdalla
Copy link
Collaborator

@tc-wleite The first time a case is opened, this cache is built, but the next times it is loaded from disk. As the log says, it was successfully built. So, this excessive memory consumption occurs only the first time, or in every opening?
If the subsequent opening is acceptable, maybe the solution could be to find a way create this cache at index/processing time.
As the log states also, other time granularity caches starts, but does not end. So, the UI block could have been caused by these threads.

@patrickdalla
Copy link
Collaborator

In fact, what is being created is an time period index. I don't remember well why this was not made to be executed in processing time. I think that there was some constraint about timezone change.

@wladimirleite
Copy link
Member Author

As the log says, it was successfully built. So, this excessive memory consumption occurs only the first time, or in every opening?

I am not sure about the next executions after the first cache is built. As it was taking a while to open the case and I had to access it just to export some files, I disabled the cache creation. I will test and report back here.

@wladimirleite
Copy link
Member Author

This is the memory consumption when I opened the case again:
image

@lfcnassif
Copy link
Member

lfcnassif commented Mar 21, 2023

I don't remember well why this was not made to be executed in processing time

AFAIK it was to keep backwards compatibility with 4.0.x cases and also to work transparently with multicases.

So, the UI block could have been caused by these threads.

Usually when there isn't enough heap memory and high GC pressure, the UI starts to freeze for some while before an OutOfMemoryError is thrown.

@lfcnassif
Copy link
Member

The first call seems to be synchronous.

If all of them are changed to run sequentially, do they finish? Could you check what is the final heap usage?

@wladimirleite
Copy link
Member Author

The first call seems to be synchronous.

If all of them are changed to run sequentially, do they finish? Could you check what is the final heap usage?

I am going to verify.

@wladimirleite
Copy link
Member Author

wladimirleite commented Mar 22, 2023

If all of them are changed to run sequentially, do they finish? Could you check what is the final heap usage?

I changed the code to make all cache building steps run sequentially, but they didn't finish after a bit more than 2 hours.
There were a few out-of-memory exceptions in the log, and it doesn't seem that enough memory is released when the first steps were finished.

image
Execution log:

2023-03-21 18:44:13	[INFO]	[app.ui.ColumnsManager]			Loading default columns
2023-03-21 18:44:13	[INFO]	[timelinegraph.cache.IndexTimeStampCache]			Starting to load/build time cache of [[class org.jfree.data.time.Day]]...
2023-03-21 18:47:53	[INFO]	[timelinegraph.cache.IndexTimeStampCache]			Time to build time cache of [[class org.jfree.data.time.Day]]: 219272ms
2023-03-21 18:47:53	[INFO]	[timelinegraph.cache.IndexTimeStampCache]			Starting to load/build time cache of [[class org.jfree.data.time.Hour]]...
2023-03-21 18:56:11	[INFO]	[timelinegraph.cache.IndexTimeStampCache]			Time to build time cache of [[class org.jfree.data.time.Hour]]: 498703ms
2023-03-21 18:56:12	[INFO]	[timelinegraph.cache.IndexTimeStampCache]			Starting to load/build time cache of [[class org.jfree.data.time.Year]]...
Exception in thread "pool-14-thread-1" java.lang.OutOfMemoryError: Java heap space
	at java.base/java.util.Arrays.copyOf(Unknown Source)
	at java.base/java.util.ArrayList.grow(Unknown Source)
	at java.base/java.util.ArrayList.grow(Unknown Source)
	at java.base/java.util.ArrayList.add(Unknown Source)
	at java.base/java.util.ArrayList.add(Unknown Source)
	at iped.app.timelinegraph.cache.EventTimestampCache.run(EventTimestampCache.java:150)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)
Exception in thread "pool-14-thread-2" java.lang.OutOfMemoryError: Java heap space
	at java.base/java.util.Arrays.copyOf(Unknown Source)
	at java.base/java.util.ArrayList.grow(Unknown Source)
	at java.base/java.util.ArrayList.grow(Unknown Source)
	at java.base/java.util.ArrayList.add(Unknown Source)
	at java.base/java.util.ArrayList.add(Unknown Source)
	at iped.app.timelinegraph.cache.EventTimestampCache.run(EventTimestampCache.java:150)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)
2023-03-21 19:07:30	[INFO]	[timelinegraph.cache.IndexTimeStampCache]			Time to build time cache of [[class org.jfree.data.time.Year]]: 678652ms
2023-03-21 19:07:35	[INFO]	[timelinegraph.cache.IndexTimeStampCache]			Starting to load/build time cache of [[class org.jfree.data.time.Month]]...
Exception in thread "Thread-2" Exception in thread "D3D Screen Updater" java.lang.OutOfMemoryError: Java heap space
Exception in thread "JavaFX Application Thread" java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
Exception in thread "pool-15-thread-1" Exception in thread "Disposer" java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
Exception in thread "TimerQueue" java.lang.OutOfMemoryError: Java heap space
Exception in thread "pool-15-thread-2" java.lang.OutOfMemoryError: Java heap space
Exception in thread "pool-15-thread-3" java.lang.OutOfMemoryError: Java heap space
Exception in thread "pool-15-thread-4" java.lang.OutOfMemoryError: Java heap space
2023-03-21 20:12:49	[INFO]	[timelinegraph.cache.IndexTimeStampCache]			Time to build time cache of [[class org.jfree.data.time.Month]]: 3913660ms
2023-03-21 20:12:49	[INFO]	[timelinegraph.cache.IndexTimeStampCache]			Starting to load/build time cache of [[class org.jfree.data.time.Quarter]]...
Exception in thread "pool-16-thread-1" java.lang.OutOfMemoryError: Java heap space
Exception in thread "pool-16-thread-2" java.lang.OutOfMemoryError: Java heap space
Exception in thread "pool-16-thread-3" java.lang.OutOfMemoryError: Java heap space
Exception in thread "pool-16-thread-4" java.lang.OutOfMemoryError: Java heap space
2023-03-21 20:15:18	[INFO]	[timelinegraph.cache.IndexTimeStampCache]			Time to build time cache of [[class org.jfree.data.time.Quarter]]: 148867ms
2023-03-21 20:15:18	[INFO]	[timelinegraph.cache.IndexTimeStampCache]			Starting to load/build time cache of [[class org.jfree.data.time.Week]]...
Exception in thread "pool-17-thread-1" java.lang.OutOfMemoryError: Java heap space
Exception in thread "pool-17-thread-2" java.lang.OutOfMemoryError: Java heap space
Exception in thread "pool-17-thread-3" java.lang.OutOfMemoryError: Java heap space
Exception in thread "pool-17-thread-4" java.lang.OutOfMemoryError: Java heap space
2023-03-21 20:16:59	[INFO]	[timelinegraph.cache.IndexTimeStampCache]			Time to build time cache of [[class org.jfree.data.time.Week]]: 101224ms
2023-03-21 20:16:59	[INFO]	[timelinegraph.cache.IndexTimeStampCache]			Starting to load/build time cache of [[class iped.jfextensions.model.Minute]]...
Exception in thread "pool-18-thread-1" java.lang.OutOfMemoryError: Java heap space

@lfcnassif
Copy link
Member

Thank you for testing @tc-wleite!

and it doesn't seem that enough memory is released when the first steps were finished.

This seems bad, maybe the final (not the temporary) memory needed is too excessive...

@patrickdalla
Copy link
Collaborator

I'm working on this.

@patrickdalla
Copy link
Collaborator

The use of primitive could help, but the problem is that all data of timeperiods counting is loaded and maintained in memory, what should be solved.

@patrickdalla
Copy link
Collaborator

patrickdalla commented Mar 24, 2023

To use bitsets in this case, we should have one bitset per Timeperiod,event tuple. So, lots of bitsets (most of them small in size).
It seems that the use of bitsets can worsen the memory usage issue. See the memory profile below of this simple test. After the 5s pause the memory usage increases a lot, with the same values included in the array list.

    ArrayList<Object> list = new ArrayList<Object>();
    ArrayList<Object> list2 = new ArrayList<Object>();
    boolean bitseteval=false;

    for(int i=0; i<1000000;i++) {
        ArrayList<Integer> b = new ArrayList<Integer>();
        for(int j=0; j<Math.random()*100;j++) {
            b.add((int) Math.floor(Math.random()*1000000));
        }
        list.add(b);
    }

    try {
        Thread.sleep(5000);
    } catch (InterruptedException e) {
    }

    for(int i=0; i<1000000;i++) {
        SparseBitSet b = new SparseBitSet();
        List<Integer> l = (List<Integer>)list.get(i);
        for(int j=0; j<l.size();j++) {
            b.set(l.get(j));
        }
        list2.add(b);
    }
    
    int pause2 = list.size();

image

With the java original BitSet it was worse, leading to OOM.
image

@lfcnassif
Copy link
Member

The use of primitive could help, but the problem is that all data of timeperiods counting is loaded and maintained in memory, what should be solved.

Possibly you have already thought about this, but keeping just the current used TimePeriod scale (Day, Hour...) on memory maybe would be enough? Or too slow to load on demand?

I just ask you to try to be careful about the changes, since I plan to include them in 4.1.2, thank you!

@lfcnassif
Copy link
Member

One thing that could help is taking a heap dump and check what is really memory hungry, to avoid wasting time trying to optimize unrelated things.

@lfcnassif lfcnassif changed the title New Timeline tab and very large cases OutOfMemoryError thrown by new Timeline tab with very large cases Mar 25, 2023
@patrickdalla
Copy link
Collaborator

After some tests I noticed that ItemIds lists of every shown item were been kept in memory to be used when the user right click on a bar to select or mark its respective items.
After I have removed this items from memory, I could navigate in the timeline chart panel in a case with 600K items, restricting its maximum heap size to 512 (without this limit it seems JVM postpone GC). There were some minimum response time increase.
I think I can change the logic to select/highlight items on the selected bar to avoid the necessity to keep this ItemIds in memory. I think that with a specific lucene query based on the selected bar will be possible to do it.
This is not enough to solve all the issue, as it does not affect timeline index creation, but will be done.

image

@lfcnassif
Copy link
Member

Thank you @patrickdalla for investigating this! @tc-wleite if you could take a heap dump of your huge case and send it to me privately, I can take a look to try to confirm what is consuming all the heap in your case. I usually use the Eclipse Memory Analyzer plugin for it, it can handle huge dumps not handled by JVisualVM and it has very nice features to investigate issues like this.

@patrickdalla
Copy link
Collaborator

I found one another point where references to loaded entries were not released to GC. After solving it I could start IPED for this case with 1.9M items (>600K of them timestamped) limiting heap memory to 512MB, all indexes were successfully created, and I have make some chart navigations, leading to the following mem usage chart:
image

@patrickdalla
Copy link
Collaborator

patrickdalla commented Mar 28, 2023

I pushed all the commits to branch TimelineInteractionFixes. I think that some optimizations can still be made. But, at least, what could be classified as "bug" was solved.

@lfcnassif
Copy link
Member

Thank you very much @patrickdalla!

keeping just the current used TimePeriod scale (Day, Hour...) on memory maybe would be enough?

Have you tried this? Was it bad for UI responsiveness?

If @tc-wleite case still triggers OOME with 32GB heap, maybe we can expose a simple option in conf/AnalysisConfig.txt to disable the feature for huge cases like that.

@patrickdalla patrickdalla linked a pull request Apr 28, 2023 that will close this issue
lfcnassif added a commit that referenced this issue May 26, 2023
lfcnassif added a commit that referenced this issue May 27, 2023
lfcnassif added a commit that referenced this issue May 27, 2023
lfcnassif added a commit that referenced this issue Oct 18, 2023
lfcnassif added a commit that referenced this issue Oct 18, 2023
lfcnassif added a commit that referenced this issue Oct 19, 2023
creations leading to errors if tab visibility is changed multiple times
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants