Skip to content
This repository has been archived by the owner on Dec 10, 2021. It is now read-only.

lifs-tools/jgoslin-v1.x

Repository files navigation

jgoslin Parser, Validator and Normalized for Shorthand Lipid Nomenclatures

Latest Release DOI Build Status Quality Gate

Note
This repository has been archived. Please use the new jg repository for jGoslin 2.0

This project is a parser, validator and normalizer implementation for shorthand lipid nomenclatures, base on the Grammar of Succinct Lipid Nomenclatures project.

Goslin defines multiple grammers compatible with ANTLRv4 for different sources of shorthand lipid nomenclature. This allows to generate parsers based on the defined grammars, which provide immediate feedback whether a processed lipid shorthand notation string is compliant with a particular grammar, or not.

jGoslin uses the Goslin grammars and the generated parser to support the following general tasks:

  1. Facilitate the parsing of shorthand lipid names dialects.

  2. Provide a structural representation of the shorthand lipid after parsing.

  3. Use the structural representation to generate normalized names.

The Maven site with JavaDoc is available here.

The web-based application is available here.

Building the project and generating client code from the command-line

In order to build the client code and run the unit tests, execute the following command from a terminal:

./mvnw install

This generates the necessary domain specific code for Java.

Running a validation with the command-line interface

The cli sub-project provides a command line interface for parsing of lipid names either from the command line or from a file with one lipid name per line.

After building the project as mentioned above with ./mvnw install, the cli/target folder will contain the jgoslin-cli-<version>-bin.zip file. Alternatively, you can download the latest cli zip file from Bintray: Search for latest jgoslin-cli-<VERSION>-bin.zip artefact and click to download.

In order to run the validator, unzip that file, change into the unzipped folder and run

java -jar jgoslin-cli-<VERSION>.jar

to see the available options.

To parse a single lipid name from the command line, run

java -jar jgoslin-cli-<VERSION>.jar -n "Cer(d31:1/20:1)"

To parse multiple lipid names from a file via the commmand line, run

java -jar jgoslin-cli-<VERSION>.jar -f examples/lipidnames.txt

To use a specific grammar, instead of trying all, run

java -jar jgoslin-cli-<VERSION>.jar -f examples/lipidNames.txt -g GOSLIN

To write output to the tab-separated output file 'goslin-out.tsv', run

Running the Web Application for Validation

The goslin web application is available at: https://apps.lifs-tools.org/goslin

Building the Docker Image

In order to build a Docker image of the command line interface application, run

./mvnw -Pdocker install

from your commandline (mvnw.bat on Windows). This will build and tag a Docker image lifs/jgoslin-cli with a corresponding version and make it available to your local Docker installation. To show the coordinates of the image, call

docker image ls | grep "jgoslin-cli"

Running the Docker Image

If you have not done so, please build the Docker image of the validator cli or pull it from the docker hub (see previous sections). Then, run the following command, replacing <VERSION> with the current version, e.g. 1.0.0) and <DATA_DIR> with the local directory containing your lipid name files:

docker run -v <YOUR_DATA_DIR>:/home/data:rw lifs/jgoslin-cli:<VERSION>

This will only invoke the default entrypoint of the container, which is a shell script wrapper calling the jgoslin-cli Jar. It passes all arguments to the validator, so that all arguments that you would pass normally will work in the same way (please replace <YOUR_FILE> with the actual file’s name in <YOUR_DATA_DIR>:

docker run -v <YOUR_DATA_DIR>:/home/data:rw lifs/jgoslin-cli:<VERSION> -f <YOUR_FILE>

You can also run the docker container without the -f <YOUR_FILE> argument to see a list of possible arguments.

Using the project code releases via Bintray

The library release artifacts are available from Bintray. If you want to use them, add the following lines to your own Maven pom file :

<profile>
  <id>lifs-repos</id>
  <repositories>
   <repository>
       <snapshots>
           <enabled>false</enabled>
       </snapshots>
       <id>bintray-lifs</id>
       <name>bintray-lifs</name>
       <url>https://dl.bintray.com/lifs/maven</url>
   </repository>
  </repositories>
</profile>

To compile jgoslin against the LIFS Bintray repository, please add the following entry to you ~/.m2/settings.xml file:

<activeProfiles>
  <activeProfile>lifs-repos</activeProfile>
</activeProfiles>

or use the -Plifs-repos command line switch when running Maven to enable the LIFS Bintray maven repositories for parent pom and artifact resolution.

To use the parser libraries (reading and validation) in your own Maven projects, use the following dependency:

<dependency>
  <groupId>de.isas.lipidomics</groupId>
  <artifactId>jgoslin-parsers</artifactId>
  <version>${jgoslin.version}</version>
</dependency>

where jgoslin.version is the version of jgoslin you wish to use, e.g. for a release version:

<properties>
  <jgoslin.version>1.0.0</jgoslin.version>
</properties>

as defined in the properties section of your pom file.

Using the API programmatically

Reading a Shorthand Lipid Name with a given Parser

The following snippet shows how to parse a shorthand lipid name with the different parsers:

import de.isas.lipidomics.domain.*; // contains Domain objects like LipidAdduct, LipidSpecies ...
import de.isas.lipidomics.palinom.*; // contains the parser implementations
...
String ref = "Cer(d18:1/20:2)";
try {
	// use the SwissLipids parser
	SwissLipidsVisitorParser slParser = new SwissLipidsVisitorParser();
	LipidAdduct sllipid = slParser.parse(ref);
	System.out.println(sllipid.getLipidString()); // to print the lipid name to the console
} catch (ParsingException pe) {
// catch this for any syntactical issues with the name during parsing with a particular parser
	pe.printStackTrace();
} catch (ParseTreeVisitorException ptve) {
// catch this for any structural issues with the name during parsing with a particular parser
	ptve.printStackTrace();
}
//alternatively, use the other parsers. Don't forget to place try catch blocks around the following lines, as for the SwissLipids parser example
// use the LipidMAPS parser
LipidMapsVisitorParser lmParser = new LipidMapsVisitorParser();
LipidAdduct lmlipid = lmParser.parse(ref);
// use the shorthand notation parser GOSLIN
GoslinVisitorParser goslinParser = new GoslinVisitorParser();
LipidAdduct golipid = goslinParser.parse(ref);
// use the shorthand notation parser with support for fragments GOSLIN_FRAGMENTS
GoslinFragmentsVisitorParser goslinFragmentsParser = new GoslinFragmentsVisitorParser();
LipidAdduct gflipid = goslinFragmentsParser.parse(ref);

To retrieve a parsed lipid name on a higher hierarchy of lipid level, simply define the level when requesting the lipid name:

System.out.println(sllipid.getLipidString(LipidLevel.CATEGORY));
System.out.println(sllipid.getLipidString(LipidLevel.CLASS));
System.out.println(sllipid.getLipidString(LipidLevel.SPECIES));
System.out.println(sllipid.getLipidString(LipidLevel.MOLECULAR_SUBSPECIES));
System.out.println(sllipid.getLipidString(LipidLevel.STRUCTURAL_SUBSPECIES));
System.out.println(sllipid.getLipidString(LipidLevel.ISOMERIC_SUBSPECIES)); // will throw a ConstraintViolationException since this lipid is only on structural subspecies level

This functionality allows easy computation of aggregate statistics of lipids on lipid class, category or arbitrary levels. Requesting a lipid name on a lower level than the provided will raise an exception.

For more examples how the API works, please consult the tests, especially in the parsers module.