martaannaj/SchemaTreeBuilder

License: GPL v3

Installation

  1. Install the Go toolchain (optionally also VS Code with the Go tools)
  2. Run go get . in this folder to download all dependencies
  3. Run go build . in this folder to build the executable
  4. Run go install . to install the executable into $GOBIN (or $GOPATH/bin if $GOBIN is unset); that directory must be on your $PATH
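
Step 4 only makes the executable callable by name if the Go install directory is on the $PATH. The following is a minimal sketch for checking that, not part of this repository: the helper names `on_path` and `go_bin_dir` are made up for illustration, and it assumes that $GOPATH contains a single entry.

```shell
# Hypothetical helpers (not part of SchemaTreeBuilder).

# Returns 0 if directory $1 is listed in $PATH, 1 otherwise.
on_path() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Prints the directory that `go install` writes to:
# $GOBIN if set, otherwise $GOPATH/bin (assumes a single-entry GOPATH).
go_bin_dir() {
  dir="$(go env GOBIN)"
  [ -n "$dir" ] || dir="$(go env GOPATH)/bin"
  printf '%s\n' "$dir"
}

# Only run the check when the Go toolchain is available.
if command -v go >/dev/null 2>&1; then
  dir="$(go_bin_dir)"
  if on_path "$dir"; then
    echo "$dir is on PATH"
  else
    echo "add $dir to PATH, e.g. export PATH=\"\$PATH:$dir\""
  fi
fi
```

If the directory is missing from the $PATH, the example below still works as written, because it calls the locally built ./SchemaTreeBuilder from the top directory.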

Example

# This example assumes that you are in the repository's top-level directory.

# Download a dataset, for example the latest 32GB dataset from Wikidata
# curl https://dumps.wikimedia.org/wikidatawiki/entities/latest-truthy.nt.gz --output latest-truthy.nt.gz
# (this example assumes that a dataset called `./testdata/handcrafted.nt` exists)

# Split the dataset into Wikidata items and properties
# (TODO: The handcrafted dataset has to be improved with a better combination of entries)
./SchemaTreeBuilder split-dataset by-prefix ./testdata/handcrafted.nt

# Prepare the dataset and build the Schema Tree (typed variant).
# The sort step is only required for future 1-in-n splits.
./SchemaTreeBuilder filter-dataset for-schematree ./testdata/handcrafted-item.nt.gz
gzip -cd ./testdata/handcrafted-item-filtered.nt.gz | sort | gzip > ./testdata/handcrafted-item-filtered-sorted.nt.gz
./SchemaTreeBuilder build-tree-typed ./testdata/handcrafted-item-filtered-sorted.nt.gz
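
The sort step above can be wrapped in a function and checked with `sort -c`, which only verifies the order without writing any output. This is a sketch under assumptions: the function names `sort_nt_gz` and `is_sorted_nt_gz` are hypothetical, and `LC_ALL=C` is added here to make the order byte-wise and reproducible across machines. The original pipeline uses the current locale, and this README does not say whether the tool expects a particular collation.

```shell
# Hypothetical helpers (not part of SchemaTreeBuilder).

# Sort the gzipped N-Triples file $1 line by line and write it gzipped to $2.
# LC_ALL=C is an assumption made here for a reproducible, byte-wise order.
sort_nt_gz() {
  gzip -cd "$1" | LC_ALL=C sort | gzip > "$2"
}

# Returns 0 if the gzipped file $1 is already sorted under the same collation.
# `sort -c` only checks the order; its diagnostic message is discarded.
is_sorted_nt_gz() {
  gzip -cd "$1" | LC_ALL=C sort -c 2>/dev/null
}
```

With the file names from the example, this could be used as `sort_nt_gz ./testdata/handcrafted-item-filtered.nt.gz ./testdata/handcrafted-item-filtered-sorted.nt.gz`, followed by `is_sorted_nt_gz ./testdata/handcrafted-item-filtered-sorted.nt.gz` before building the tree.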
