ami search: 'OutOfMemoryError' #73

Closed
vaishaliarora277 opened this issue Jul 1, 2020 · 18 comments

Comments

@vaishaliarora277
Collaborator

Running ami search with a dictionary on a corpus of 950 articles fails with an OutOfMemoryError. The following error was shown:
....

Caused by: java.lang.OutOfMemoryError: Java heap space
544001 [main] ERROR org.contentmine.cproject.args.DefaultArgProcessor  - ERR! java.lang.RuntimeException: cannot run [runTransform] in --transform (OutOfMemoryError: Java heap space)
PMC7259790 java.lang.reflect.InvocationTargetException
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:564)
        at org.contentmine.cproject.args.DefaultArgProcessor.instantiateAndRunMethod(DefaultArgProcessor.java:1191)
        at org.contentmine.cproject.args.DefaultArgProcessor.runMethodsOfType(DefaultArgProcessor.java:1085)
        at org.contentmine.cproject.args.DefaultArgProcessor.runRunMethodsOnChosenArgOptions(DefaultArgProcessor.java:1064)
        at org.contentmine.cproject.args.DefaultArgProcessor.runAndOutput(DefaultArgProcessor.java:1263)
        at org.contentmine.norma.Norma.run(Norma.java:38)
        at org.contentmine.ami.plugins.CommandProcessor.runNormaIfNecessary(CommandProcessor.java:182)
        at org.contentmine.ami.tools.AbstractAMISearchTool.runLegacyCommandProcessor(AbstractAMISearchTool.java:229)
        at org.contentmine.ami.tools.AMISearchTool.runLegacyCommandProcessor(AMISearchTool.java:249)
        at org.contentmine.ami.tools.AMISearchTool.runProjectSearch(AMISearchTool.java:244)
        at org.contentmine.ami.tools.AMISearchTool.processProject(AMISearchTool.java:230)
        at org.contentmine.ami.tools.AbstractAMISearchTool.runSpecifics(AbstractAMISearchTool.java:182)
        at org.contentmine.ami.tools.AbstractAMITool.runCommands(AbstractAMITool.java:212)
        at org.contentmine.ami.tools.AbstractAMITool.call(AbstractAMITool.java:192)
        at org.contentmine.ami.tools.AbstractAMITool.call(AbstractAMITool.java:39)
        at picocli.CommandLine.executeUserObject(CommandLine.java:1853)
        at picocli.CommandLine.access$1100(CommandLine.java:145)
        at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2255)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2213)
        at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2080)
        at picocli.CommandLine.execute(CommandLine.java:1978)
        at org.contentmine.ami.tools.AMI.main(AMI.java:113)
Caused by: java.lang.OutOfMemoryError: Java heap space
552824 [main] DEBUG org.contentmine.cproject.args.DefaultArgProcessor  -  exception in option:  or --transform; (1,2147483647); parseTransform; STRING: null / []; nlm2html; [nlm2html]
java.lang.RuntimeException: cannot run [runTransform] in --transform (OutOfMemoryError: Java heap space)
        at org.contentmine.cproject.args.DefaultArgProcessor.runMethodsOfType(DefaultArgProcessor.java:1093)
        at org.contentmine.cproject.args.DefaultArgProcessor.runRunMethodsOnChosenArgOptions(DefaultArgProcessor.java:1064)
        at org.contentmine.cproject.args.DefaultArgProcessor.runAndOutput(DefaultArgProcessor.java:1263)
        at org.contentmine.norma.Norma.run(Norma.java:38)
        at org.contentmine.ami.plugins.CommandProcessor.runNormaIfNecessary(CommandProcessor.java:182)
        at org.contentmine.ami.tools.AbstractAMISearchTool.runLegacyCommandProcessor(AbstractAMISearchTool.java:229)
        at org.contentmine.ami.tools.AMISearchTool.runLegacyCommandProcessor(AMISearchTool.java:249)
        at org.contentmine.ami.tools.AMISearchTool.runProjectSearch(AMISearchTool.java:244)
        at org.contentmine.ami.tools.AMISearchTool.processProject(AMISearchTool.java:230)
        at org.contentmine.ami.tools.AbstractAMISearchTool.runSpecifics(AbstractAMISearchTool.java:182)
        at org.contentmine.ami.tools.AbstractAMITool.runCommands(AbstractAMITool.java:212)
        at org.contentmine.ami.tools.AbstractAMITool.call(AbstractAMITool.java:192)
        at org.contentmine.ami.tools.AbstractAMITool.call(AbstractAMITool.java:39)
        at picocli.CommandLine.executeUserObject(CommandLine.java:1853)
        at picocli.CommandLine.access$1100(CommandLine.java:145)
        at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2255)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2249)
        at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2080)
        at picocli.CommandLine.execute(CommandLine.java:1978)
        at org.contentmine.ami.tools.AMI.main(AMI.java:113)
Caused by: java.lang.RuntimeException: invoke runTransform fails
        at org.contentmine.cproject.args.DefaultArgProcessor.instantiateAndRunMethod(DefaultArgProcessor.java:1196)
        at org.contentmine.cproject.args.DefaultArgProcessor.runMethodsOfType(DefaultArgProcessor.java:1085)
        ... 20 more
Caused by: java.lang.reflect.InvocationTargetException
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:564)
        at org.contentmine.cproject.args.DefaultArgProcessor.instantiateAndRunMethod(DefaultArgProcessor.java:1191)
        ... 21 more
Caused by: java.lang.OutOfMemoryError: Java heap space
552871 [main] ERROR org.contentmine.cproject.args.DefaultArgProcessor  - ERR! java.lang.RuntimeException: cannot run [runTransform] in --transform (OutOfMemoryError: Java heap space)
PMC7261124

....
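A console dump like the one above is easier to triage once it is reduced to a short summary. The following is a minimal sketch, not part of ami: the function name `summarize_log` is hypothetical, and it assumes the console output has been saved to a text file. It only reports which PMC IDs appear and how many lines mention OutOfMemoryError; the output above does not make clear which document triggered each error, so the sketch does not try to pair them.

```python
import re

# PMC identifiers as they appear in the console output, e.g. PMC7259790
PMC_ID = re.compile(r"\bPMC\d+\b")


def summarize_log(text):
    """Return the number of lines mentioning OutOfMemoryError and the
    distinct PMC IDs seen, in order of first appearance."""
    oom_lines = 0
    pmc_ids = []
    for line in text.splitlines():
        if "OutOfMemoryError" in line:
            oom_lines += 1
        for pmc in PMC_ID.findall(line):
            if pmc not in pmc_ids:
                pmc_ids.append(pmc)
    return {"oom_lines": oom_lines, "pmc_ids": pmc_ids}
```

Usage would be along the lines of `summarize_log(open("console.txt").read())`, where `console.txt` is whatever file the output was redirected to.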
@Priya-Jk-15
Collaborator

Using the dictionary disease with ami search, I got the same error mentioned by @vaishaliarora277 on my corpus of 950 articles.

@petermr
Owner

petermr commented Jul 1, 2020

I am assuming that you are running something like:
ami -p myproject950 search disease ..
Do you get a list of successful documents (PMCddddddd)?

Try:
<cd to project directory - i.e. where the 950 files are>
ls -l . PMC*/scholarly.html | wc
This should tell you how many transformations worked.
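The `ls ... | wc` check relies on Unix tools that are not available in the Windows Command Prompt. As a cross-platform alternative, the sketch below counts the same files from Python. It assumes only what the command above implies: each article lives in a `PMC*` subdirectory of the project, and a successful transformation leaves a `scholarly.html` there. The function name `transformed_articles` is hypothetical.

```python
from pathlib import Path


def transformed_articles(project_dir):
    """Split the PMC* subdirectories of a project into those that
    contain scholarly.html and those that do not."""
    done, missing = [], []
    for article_dir in sorted(Path(project_dir).glob("PMC*")):
        if not article_dir.is_dir():
            continue  # skip stray files whose names start with PMC
        if (article_dir / "scholarly.html").is_file():
            done.append(article_dir.name)
        else:
            missing.append(article_dir.name)
    return done, missing
```

Calling `done, missing = transformed_articles(".")` from the project directory and printing `len(done)` gives the number of transformations that worked; `missing` lists the articles worth inspecting.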


@petermr
Owner

petermr commented Jul 1, 2020

The new logger from Remko may help on this.
Do you have a file logs/ami.log in your filestore (probably in your project directory)?

@petermr
Owner

petermr commented Jul 1, 2020

@vaishaliarora277
Collaborator Author

@petermr, I was running ami -p miniprojectfunders search --dictionary funders. I don't have a logs/ami.log file in my CProject directory named miniprojectfunders.
Yes, I did get a list of 952 items in my directory.
OS: Windows 10

C:\Users\me>miniprojectfunders 1s -1 . PMC*/scholarly.html | wc
'miniprojectfunders' is not recognized as an internal or external command,
operable program or batch file.

@petermr
Owner

petermr commented Jul 3, 2020 via email

@Priya-Jk-15
Collaborator

In command prompt, I gave the command cd mpc (mpc is the directory where my 950 files are) and then I tried the command
ls -l . PMC*/scholarly.html | wc
The output was
'ls' is not recognized as an internal or external command, operable program or batch file.

So I tried the same commands in Git Bash and got the following numbers as output:
[image: amisearch]

@petermr Is the above output correct?

I also tried setting the environment variable MAVEN_OPTS = -Xmx512m -XX:MaxPermSize=128m as per

also see https://cwiki.apache.org/confluence/display/MAVEN/OutOfMemoryError

and gave the command ami -p mpc search --dictionary disease. The output was

[...]
Caused by: java.lang.OutOfMemoryError: Java heap space
569564 [main] DEBUG org.contentmine.cproject.args.DefaultArgProcessor - exception in option: or --transform; (1,2147483647); parseTransform; STRING: null / []; nlm2html; [nlm2html]
569564 [main] DEBUG org.contentmine.cproject.args.DefaultArgProcessor - exception in option: or --transform; (1,2147483647); parseTransform; STRING: null / []; nlm2html; [nlm2html]
[...]

@petermr Is there any change I should make to the environment variable?

@vaishaliarora277
Collaborator Author

Thanks @petermr,
I first entered this in the Command Prompt: set MAVEN_OPTS=-Xmx512m -XX:MaxPermSize=128m
I then ran this again: C:\Users\me>ami -p miniprojectfunders search --dictionary funders
and got:

+++++++++++++++++++running: search; search([funders])[]
279807 [main] DEBUG org.contentmine.ami.plugins.CommandProcessor  -
+++++++++++++++++++running: search; search([funders])[]
..............................................
large document (1507) for PMC6824115 truncated to 500 sections
.......................................................................................................

I got no search tables for the dictionary funders, so next I deleted the large document PMC6824115 from the directory and ran the same command again:

C:\Users\me>ami -p miniprojectfunders search --dictionary funders

This time I got full data tables in my directory with complete search for dictionary funders.

https://photos.google.com/search/_tra_/photo/AF1QipPM1Mytn-__zViXjfugVKIslmzYWMYp9RPEHv-2
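Since removing one oversized document let the run complete, it can help to find the largest articles before searching. The sketch below ranks articles by the size of one input file per `PMC*` directory. The default file name `fulltext.xml` is an assumption about the project layout and is not confirmed in this thread; pass `scholarly.html` or another name if that matches your project better. The function name `largest_inputs` is hypothetical.

```python
from pathlib import Path


def largest_inputs(project_dir, filename="fulltext.xml", top=5):
    """Return (size_in_bytes, article_id) pairs for the `top` largest
    files named `filename` found directly inside PMC* subdirectories.
    NOTE: the default filename is an assumption about the CProject layout."""
    sizes = []
    for path in Path(project_dir).glob(f"PMC*/{filename}"):
        if path.is_file():
            sizes.append((path.stat().st_size, path.parent.name))
    sizes.sort(reverse=True)  # largest first
    return sizes[:top]
```

Articles at the top of this list are candidates for moving aside temporarily, rather than deleting, while checking whether they are what exhausts the heap.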

@petermr
Owner

petermr commented Jul 4, 2020 via email

@petermr
Owner

petermr commented Jul 4, 2020 via email

@vaishaliarora277
Collaborator Author

Thanks @petermr
Sure, I'll do that.

@Priya-Jk-15
Collaborator

@petermr I ran the command which ls in the Command Prompt and the output was:
'which' is not recognized as an internal or external command, operable program or batch file.
So I tried it in Git Bash and successfully got the path of ls.

But for using in command prompt, I found an equivalent command DIR for ls from https://skimfeed.com/blog/windows-command-prompt-ls-equivalent-dir/#:~:text=Answer%3A%20Type%20DIR%20to%20show,commands%20and%20their%20Windows%20equivalents.
I used DIR in the Command Prompt and obtained the following output for a small directory:
[image: lsdir]

For my corpus containing 950 articles, I gave the command DIR mpc in command prompt and the output was

[...]
01/07/2020   03:44 PM     <DIR>          PMC7310742
01/07/2020   03:44 PM     <DIR>          PMC7312578
01/07/2020   03:44 PM     <DIR>          PMC7314749
01/07/2020   03:44 PM     <DIR>          PMC7316228
[...]

Next, when I gave the command DIR PMC*/scholarly.html , the output was
Parameter format not correct - "scholarly.html".

Is there any change I could make?

@petermr
Owner

petermr commented Jul 4, 2020 via email

@Priya-Jk-15
Collaborator

For viewing the html files, first I gave the command cd mpc (mpc is the 950 articles directory) in command prompt and then I gave the command DIR PMC*\scholarly.html and the output was

The filename, directory name, or volume label syntax is incorrect.

@petermr I checked the filename and directory name. Is there anything I should do about the volume label?

@petermr
Owner

petermr commented Jul 5, 2020 via email

@Priya-Jk-15
Collaborator

@petermr I tried the command DIR PMC*\scholarly.html, in Windows PowerShell. It worked and showed the following output :

           Directory: C:\Users\Admin\Desktop\mpc\PMC5764404

Mode                         LastWriteTime                 Length Name
----                        --------------                 ------ -----
-a----                 04/07/2020 11:11 AM                 124852 scholarly.html
[...]

@petermr
Owner

petermr commented Jul 5, 2020 via email

petermr closed this as completed Sep 10, 2020
@petermr
Owner

petermr commented Sep 10, 2020

Closed as part of the learning process
