This is a quickstart guide for graph-ops.
There are two environment setup methods, and we recommend using Docker.
Before starting, make sure you have installed Docker.
You will need to clone the repository first:
git clone https://github.com/skill-diver/graph-ops.git
cd graph-ops
tar -xzvf examples/quickstart/neo4j_env.tar.gz -C examples/quickstart/
Then you can build the image and start the container via:
docker-compose --file examples/quickstart/docker-compose.yml build
docker-compose --file examples/quickstart/docker-compose.yml up
Once you have launched the container, you can choose one of the following development options:
Option 1: Command-line development (recommended for Vim users)
docker exec -it graph-ops-quickstart /bin/bash
Option 2: Visual Studio Code development (recommended for VSCode users)
- Install the Remote Development extension in Visual Studio Code.
- Attach to the container graph-ops-quickstart from Remote Explorer > Dev Containers.
- Open path /graph-ops in the container from VSCode.
For Windows users, use the WSL2 backend as recommended when installing Docker.
- If you cannot pull the image (Error response from daemon: Get "https://registry-1.docker.io/v2/": net/http: request canceled (Client.Timeout exceeded while awaiting headers)), check whether a proxy setting is needed. If the registry itself is unreachable, try adding a mirror by setting "registry-mirrors": ["http://f1361db2.m.daocloud.io"] in daemon.json. See here for the config details. Another workaround is to try adding an access token as suggested by this post.
- If Docker uses too much memory, or your containers exit with code 137, consider limiting the resources of your WSL2 backend by following this guide. For example, in .wslconfig:
[wsl2]
memory=5GB
processors=4
The procedure for using graph-ops manually is as follows:
- Set up the infrastructure you want to use, e.g., Neo4j, Redis, etc.
- Define the graph features you want to use (i.e., main.rs), then deploy them (by executing main.rs).
- Enjoy all the features you defined and the subgraph sampler provided by graph-ops (i.e., example.py).
To run the quickstart, you need to install the following tools:
- Install Rust toolchain. We recommend using the latest stable version.
- Install Protobuf compiler. We used version 3.20 during development.
- Install Neo4j. We used the community version 4.4 during development.
- Install Etcd. We used version 3.5 during development.
- Install Redis. We used version 7.0 during development.
- Install Python 3.7 or above. We used version 3.9 during development. You can use miniconda to manage Python environments. Required Python packages are listed in pyproject.toml. Additional Python packages required for development are maturin and pytest.
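Before building, it can help to confirm that the tools listed above are actually on your PATH. The following is a minimal sketch, not part of graph-ops; the executable names (`cargo`, `protoc`, `neo4j`, `etcd`, `redis-server`, `maturin`, `pytest`) are assumptions based on each tool's usual install and may differ on your system.

```python
import shutil

# Executable names are assumptions based on each tool's usual install;
# adjust them if your distribution names the binaries differently.
REQUIRED_TOOLS = ["cargo", "protoc", "neo4j", "etcd", "redis-server", "maturin", "pytest"]


def find_missing(tools):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_TOOLS)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

This only checks that each executable exists; it does not verify versions.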
# cd to the root directory of the project
cargo build
maturin build
pip install target/wheels/graph_ops*.whl # wheel filenames use underscores; add --force-reinstall to update an installed package
- Start Neo4j, Etcd, Redis, and fill in their configurations (listening port and username/password if applicable) in examples/quickstart/graph-ops.toml and examples/quickstart/.env.
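To check that the three services are accepting connections before you fill in the configuration, a small TCP probe is enough. This is a sketch under assumptions: it uses the usual default ports (7687 for Neo4j Bolt, 2379 for the Etcd client API, 6379 for Redis) on localhost, so change `DEFAULT_SERVICES` if your setup listens elsewhere.

```python
import socket

# Assumed defaults: Neo4j Bolt 7687, Etcd client 2379, Redis 6379.
DEFAULT_SERVICES = {
    "neo4j": ("localhost", 7687),
    "etcd": ("localhost", 2379),
    "redis": ("localhost", 6379),
}


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    for name, (host, port) in DEFAULT_SERVICES.items():
        status = "up" if is_listening(host, port) else "not reachable"
        print(f"{name} at {host}:{port}: {status}")
```

A successful connection only shows that something is listening on the port; authentication problems will still surface later when graph-ops connects.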
Currently supported infra_type:
- neo4j
- redis
Each infra needs a name (for registration) and the connection info it requires. For example, to add a Neo4j instance, append the following to graph-ops.toml:
[[infra]]
name = "neo4j_1"
infra_type = "neo4j"
# Properties with the `env_` prefix take their value from the named environment variable (or the `.env` file).
env_uri = "NEO4J_1_URI"
# Properties without the `env_` prefix take their value directly from this config file.
username = "neo4j"
env_password = "NEO4J_1_PASSWORD"
Then provide the corresponding env_* environment variables in .env:
NEO4J_1_URI = 'bolt://localhost:7687'
NEO4J_1_PASSWORD = neo4j
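The `env_` convention described above can be illustrated with a short sketch. This is not graph-ops's actual config loader; `resolve_infra` is a hypothetical helper that takes one already-parsed `[[infra]]` entry as a dict and shows how an `env_`-prefixed key might be resolved into its plain counterpart.

```python
import os


def resolve_infra(entry, env=None):
    """Resolve one parsed [[infra]] entry.

    Keys prefixed with `env_` hold the NAME of an environment variable;
    the result stores that variable's value under the unprefixed key.
    All other keys are copied through unchanged.
    """
    env = os.environ if env is None else env
    resolved = {}
    for key, value in entry.items():
        if key.startswith("env_"):
            if value not in env:
                raise KeyError(f"environment variable {value!r} is not set")
            resolved[key[len("env_"):]] = env[value]
        else:
            resolved[key] = value
    return resolved


if __name__ == "__main__":
    # Mirrors the graph-ops.toml and .env examples above.
    entry = {
        "name": "neo4j_1",
        "infra_type": "neo4j",
        "env_uri": "NEO4J_1_URI",
        "username": "neo4j",
        "env_password": "NEO4J_1_PASSWORD",
    }
    env = {"NEO4J_1_URI": "bolt://localhost:7687", "NEO4J_1_PASSWORD": "neo4j"}
    print(resolve_infra(entry, env))
```

Keeping secrets such as passwords behind `env_` keys means graph-ops.toml can be committed without exposing credentials.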
The scripts for preparing the data are in the repo.
As shown in the example code in main.rs, the whole process of graph feature definition can be divided into these steps:
- Register the graph schema in the graph database. The concepts of Graph, Entity, Field, and so on can be found in ../../docs/concepts.md.
- Define graph feature logic. Refer to fn graph_feature_engineering() for details.
- Determine which features to serve. Refer to fn graph_feature_serving() for details.
- Trigger the feature engineering transformation execution and serving process. Refer to fn deploy() of FeatureStore for details.
# cd to the root directory of the project
RUST_LOG=info cargo run --example quickstart
We provide a Python script, examples/quickstart/example.py, to demonstrate how to consume graph features. You can run it with:
python example.py
In this example, we provide a DataLoader to load graph features from graph-ops. For graph topology, it will sample a subgraph from the graph database. For graph features, it will fetch the features defined in graph-ops.
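To give a feel for what subgraph sampling means here, the sketch below performs fanout-limited neighbor sampling over an in-memory adjacency list. It is an illustration only: `sample_subgraph` and its parameters are hypothetical and do not reflect the graph-ops DataLoader API, which samples from the graph database instead.

```python
import random


def sample_subgraph(adjacency, seeds, fanouts, rng=None):
    """Sample a multi-hop neighborhood around `seeds`.

    adjacency: dict mapping a node to a list of its out-neighbors.
    fanouts:   fanouts[i] is the maximum number of neighbors kept
               per node at hop i.
    Returns (nodes, edges) as sets.
    """
    if rng is None:
        rng = random.Random()
    nodes = set(seeds)
    edges = set()
    frontier = list(seeds)
    for fanout in fanouts:
        next_frontier = []
        for node in frontier:
            neighbors = adjacency.get(node, [])
            # Keep every neighbor if there are few enough, else sample.
            if len(neighbors) <= fanout:
                chosen = neighbors
            else:
                chosen = rng.sample(neighbors, fanout)
            for neighbor in chosen:
                edges.add((node, neighbor))
                if neighbor not in nodes:
                    nodes.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return nodes, edges


if __name__ == "__main__":
    adjacency = {"a": ["b", "c", "d"], "b": ["e"], "c": [], "d": ["a"]}
    nodes, edges = sample_subgraph(adjacency, ["a"], [2, 1], random.Random(0))
    print(sorted(nodes), sorted(edges))
```

In a real pipeline, the sampled node IDs would then be used to look up the served features for each node.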