Merge pull request #81 from dig-team/remote-kb
Remote knowledge base server
scanim authored May 23, 2024
2 parents f710f08 + ef764d8 commit 8c6ec84
Showing 117 changed files with 7,614 additions and 7,608 deletions.
5 changes: 5 additions & 0 deletions .gitignore
@@ -7,3 +7,8 @@
.idea/
bin/
kb/nb-configuration.xml
.log
query-logs/
*cache/
*.log
*.xml
69 changes: 48 additions & 21 deletions README.md
@@ -6,7 +6,7 @@ AMIE is a system to mine Horn rules on knowledge bases. A knowledge base is a co
AMIE can find rules in such knowledge bases, such as for example
> wasBornIn(x, y) & isLocatedIn(y, z) => hasNationality(x, z)
These rules are accompanied by various confidence scores. “AMIE” stands for “Association Rule Mining under Incomplete Evidence”. This repository contains the latest version of AMIE, called AMIE 3. The previous version of AMIE can be found [here](https://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/amie/).
These rules are accompanied by various confidence scores. “AMIE” stands for “Association Rule Mining under Incomplete Evidence”. This repository contains the latest version of AMIE, called AMIE 3.5. The versions of AMIE prior to 3.x can be found [here](https://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/amie/). The code of version 3.0 can be found [here](https://github.com/dig-team/amie/tree/v3.0).

## Input files

@@ -24,23 +24,48 @@ In the near future, AMIE will be able to parse the W3C Turtle format as well.

Make sure that you have the latest version of [Java](https://java.com/en/download/) installed. Download the jar file, and type:

```java -jar amie3.jar [TSV file]```
```java -jar amie3.5.jar [TSV file]```

In case of memory issues, try to increase the virtual machine's memory resources using the arguments `-XX:-UseGCOverheadLimit -Xmx [MAX_HEAP_SPACE]`, e.g:

```java -XX:-UseGCOverheadLimit -Xmx2G -jar amie3.jar [TSV file]```
```java -XX:-UseGCOverheadLimit -Xmx2G -jar amie3.5.jar [TSV file]```

`MAX_HEAP_SPACE` depends on your input size and the system's available memory. The package also contains the utilities to generate and evaluate predictions from the rules mined by AMIE. Without additional arguments AMIE thresholds using PCA confidence 0.1 and head coverage 0.01. You can change these default settings. Run `java -jar amie3.jar -h` (without an input file) to see a detailed description of the available options.
`MAX_HEAP_SPACE` depends on your input size and the system's available memory. The package also contains the utilities to generate and evaluate predictions from the rules mined by AMIE. Without additional arguments, AMIE applies a PCA confidence threshold of 0.1 and a head coverage threshold of 0.01. You can change these default settings. Run `java -jar amie3.5.jar -h` (without an input file) to see a detailed description of the available options.
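
To confirm that a `-Xmx` setting is actually picked up by the virtual machine, a small stand-alone check can help. The sketch below is not part of AMIE; the class name `HeapCheck` and the method `maxHeapMiB` are hypothetical and used only for illustration. It relies on the standard `Runtime.maxMemory()` call.

```java
// Hypothetical helper, not part of AMIE: reports the maximum heap
// the JVM is willing to use, which reflects the -Xmx argument.
public class HeapCheck {

    // Maximum heap size in mebibytes, as reported by the JVM.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

With Java 11 or later, this can be run directly from the source file, e.g. `java -Xmx2G HeapCheck.java`. The reported value is typically close to, but not always exactly, the requested size.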

### Reproducing our experiments
### Use with remote knowledge base server

The executables can be found in the milestone directory or in the "releases" GitHub tab. Option names and default options may differ from those of these milestones. To reproduce experiments, use by default:
Since loading and storing knowledge graphs can take a significant amount of memory and time, AMIE 3.5 makes it possible to run the mining routine against a remote knowledge base, splitting the architecture into two parts that communicate over the network.

```java -jar amie-milestone-intKB.jar -bias lazy -full -noHeuristics -ostd [TSV file]```
Below is a basic setup example to use AMIE with a remote knowledge base.

An experimental implementation of the GPro and GRank measures can be found in the "gpro" branch. After recompiling the sources of this branch, use:
#### Server-side

```java -jar amie3.jar -bias amie.mining.assistant.experimental.[GPro|GRank] [TSVFile]```
```java -jar amie3.5.jar -server [TSVFile] -port <Server Port (default: 9092)>```

This will load the data into the memory of the server.

#### Client-side

```java -jar amie3.5.jar -client -serverAddress <Server Address (default: localhost:9092)>```

In this case, the client will mine the rules on the server deployed at the provided address.

__NOTE__:
- Client and Server communicate using the WebSocket protocol.

#### Optional: Enabling cache

AMIE may run the same query more than once. Query caching can therefore be enabled on either the server or the client side with the ```-cache``` option, which is available only for remote mining. The cache is saved automatically upon shutdown. If a saved cache corresponding to the current run is found, it is loaded, unless `-invalidateCache` is passed as an argument.

The cache can improve performance significantly by reducing the number of queries sent over the network or executed by the KB.
Performance will vary depending on the knowledge graph and the user parameters.

The performance of the cache and the remote setting is sensitive to the data, as this defines the size of AMIE's search space as well as the number of queries and query answers that will be sent over the network.

__NOTE__:
- The cache uses a Least Recently Used (LRU) policy, which is the only cache policy implemented so far.
- Custom cache policies can be implemented in the `amie/data/remote/cachepolicies` package.
- The cache is saved locally in the cache directory using the knowledge graph file name and run options.
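
For readers unfamiliar with the LRU policy mentioned above, here is a minimal sketch of the idea in plain Java. It is not AMIE's cache implementation: the class name `LruQueryCache`, the string query keys, and the capacity of 2 are assumptions made purely for illustration. It builds on the standard `LinkedHashMap` access-order mode.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of an LRU policy; not AMIE's actual cache class.
public class LruQueryCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruQueryCache(int capacity) {
        // accessOrder = true: iteration order goes from least to most recently used.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each insertion; returning true
    // evicts the least recently used entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        // Keys stand in for queries, values for cached answers (here: counts).
        LruQueryCache<String, Long> cache = new LruQueryCache<>(2);
        cache.put("?x wasBornIn ?y", 120L);
        cache.put("?y isLocatedIn ?z", 45L);
        cache.get("?x wasBornIn ?y");           // marks this query as recently used
        cache.put("?x hasNationality ?z", 80L); // evicts "?y isLocatedIn ?z"
        System.out.println(cache.keySet());
        // prints [?x wasBornIn ?y, ?x hasNationality ?z]
    }
}
```

A query that is asked repeatedly stays in the cache, while one that has not been touched for the longest time is dropped first when the capacity is exceeded.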

## Deploying AMIE

@@ -58,7 +83,19 @@ AMIE is managed with [Maven](https://maven.apache.org/), therefore to deploy you
2. Import and compile the project
* It is usually done by executing the following command in the amie directory: `$ mvn install`
* IDEs such as Eclipse offer the option to create a project from an existing Maven project. The IDE will call Maven to compile the code.
3. Maven will generate an executable jar named amie3.jar in a new "bin/" directory. This executable accepts RDF files in TSV format [like this one](http://resources.mpi-inf.mpg.de/yago-naga/amie/data/yago2_sample/yago2core.10kseedsSample.compressed.notypes.tsv) as input, as well as the other formats described below. To run it, just write in your command line:
3. Maven will generate an executable jar named amie3.5.jar in a new "bin/" directory. This executable accepts RDF files in TSV format [like this one](http://resources.mpi-inf.mpg.de/yago-naga/amie/data/yago2_sample/yago2core.10kseedsSample.compressed.notypes.tsv) as input, as well as the other formats described below. To run it, just write in your command line:

### Reproducing our experiments (AMIE 3)

Our [2020 ESWC publication](https://luisgalarraga.de/docs/amie3.pdf) introduced a handful of algorithmic optimizations that gave birth to [AMIE3](https://github.com/dig-team/amie/tree/v3.0). Besides an extensive code refactoring, the latest version of AMIE includes novel features, optimizations, and several bug fixes. You might therefore not obtain the exact same runtime results as AMIE3.

If you nevertheless want to reproduce the experiments published in 2020, you can find the executables used for the experiments in the milestone directory or in the "releases" GitHub tab. Option names and default options may differ from those of these milestones. To reproduce the experiments, use by default:

```java -jar amie-milestone-intKB.jar -bias lazy -full -noHeuristics -ostd [TSV file]```

An experimental implementation of the GPro and GRank measures can be found in the "gpro" branch. After recompiling the sources of this branch, use:

```java -jar amie3.jar -bias amie.mining.assistant.experimental.[GPro|GRank] [TSVFile]```

## Publications

@@ -72,17 +109,7 @@ AMIE is managed with [Maven](https://maven.apache.org/), therefore to deploy you
> Luis Galárraga, Christina Teflioudi, Katja Hose, Fabian M. Suchanek:
> [“AMIE: Association Rule Mining under Incomplete Evidence in Ontological Knowledge Bases”](https://suchanek.name/work/publications/www2013.pdf)
> Full paper at the International World Wide Web Conference (WWW), 2013
### Determining Obligatory Attributes in Knowledge Bases

The present repository also contains the code for the following paper:

> Jonathan Lajus, Fabian M. Suchanek:
> [“Are All People Married? Determining Obligatory Attributes in Knowledge Bases”](https://suchanek.name/work/publications/www-2018.pdf)
> Full paper at the Web Conference (WWW) , 2018
The code resides in: typing/
> Full paper at the International World Wide Web Conference (WWW), 2013
## Licensing

38 changes: 37 additions & 1 deletion kb/pom.xml
@@ -26,9 +26,12 @@
<excludes>
<exclude>**/WikidataCleaner.java</exclude>
</excludes>
<source>11</source>
<target>11</target>
</configuration>
</plugin>
</plugins>

</build>

<dependencies>
@@ -59,7 +62,6 @@
<artifactId>log4j</artifactId>
<version>1.2.17</version>
</dependency>

<dependency>
<groupId>it.unimi.dsi</groupId>
<artifactId>fastutil</artifactId>
@@ -70,5 +72,39 @@
<artifactId>jena-tdb</artifactId>
<version>0.9.0-incubating</version>
</dependency>

<!-- Communication layer dependencies -->
<dependency>
<groupId>org.testng</groupId>
<artifactId>testng</artifactId>
<version>7.9.0</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.googlecode.json-simple</groupId>
<artifactId>json-simple</artifactId>
<version>1.1.1</version>
</dependency>
<dependency>
<groupId>com.google.code.gson</groupId>
<artifactId>gson</artifactId>
<version>2.10.1</version>
</dependency>
<!-- <dependency>-->
<!-- <groupId>com.sparkjava</groupId>-->
<!-- <artifactId>spark-core</artifactId>-->
<!-- <version>2.9.4</version>-->
<!-- </dependency>-->
<dependency>
<groupId>org.java-websocket</groupId>
<artifactId>Java-WebSocket</artifactId>
<version>1.5.6</version>
</dependency>
<dependency>
<groupId>commons-cli</groupId>
<artifactId>commons-cli</artifactId>
<version>1.4</version>
<scope>compile</scope>
</dependency>
</dependencies>
</project>