TF-GNN: TensorFlow Graph Neural Networks

The TensorFlow GNN library makes it easy to build Graph Neural Networks, that is, neural networks on graph data (nodes and edges with arbitrary features). It provides TensorFlow code for building GNN models as well as tools for preparing their input data and running the training.

Throughout, TF-GNN supports heterogeneous graphs, that is, graphs consisting of multiple sets of nodes and multiple sets of edges, each with their own set of features. These come up naturally when modeling different types of objects (nodes) and their different types of relations (edges).
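To make the notion of a heterogeneous graph concrete, here is a minimal plain-Python sketch of the idea: several named node sets and edge sets, each with its own features. The class names `ToyNodeSet`, `ToyEdgeSet` and `ToyHeteroGraph` are hypothetical and exist only for illustration; they are not part of the TF-GNN API, whose actual representation is `tfgnn.GraphTensor`.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNodeSet:
    """One type of node; each feature holds one value per node."""
    size: int
    features: dict = field(default_factory=dict)


@dataclass
class ToyEdgeSet:
    """One type of edge from a source node set to a target node set."""
    source_set: str
    target_set: str
    source_ids: list
    target_ids: list
    features: dict = field(default_factory=dict)


@dataclass
class ToyHeteroGraph:
    node_sets: dict
    edge_sets: dict

    def validate(self) -> None:
        """Raises ValueError if feature lengths or edge endpoints are inconsistent."""
        for name, node_set in self.node_sets.items():
            for feat, values in node_set.features.items():
                if len(values) != node_set.size:
                    raise ValueError(f"{name}/{feat}: expected {node_set.size} values")
        for name, edge_set in self.edge_sets.items():
            if len(edge_set.source_ids) != len(edge_set.target_ids):
                raise ValueError(f"{name}: source and target id lists differ in length")
            endpoints = ((edge_set.source_set, edge_set.source_ids),
                         (edge_set.target_set, edge_set.target_ids))
            for set_name, ids in endpoints:
                if set_name not in self.node_sets:
                    raise ValueError(f"{name}: unknown node set {set_name!r}")
                limit = self.node_sets[set_name].size
                if any(i < 0 or i >= limit for i in ids):
                    raise ValueError(f"{name}: node index out of range for {set_name!r}")
            for feat, values in edge_set.features.items():
                if len(values) != len(edge_set.source_ids):
                    raise ValueError(f"{name}/{feat}: expected one value per edge")


# Two node types (papers, authors) and two relation types (writes, cites).
graph = ToyHeteroGraph(
    node_sets={
        "paper": ToyNodeSet(size=3, features={"year": [2019, 2020, 2021]}),
        "author": ToyNodeSet(size=2, features={"name": ["A", "B"]}),
    },
    edge_sets={
        "writes": ToyEdgeSet("author", "paper", [0, 0, 1], [0, 1, 2]),
        "cites": ToyEdgeSet("paper", "paper", [1, 2], [0, 1]),
    },
)
graph.validate()
```

Edges refer to nodes by their index within the named node set, which is why each edge set records which node sets it connects.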

User Documentation

Start with our introductory guides:

  • Introduction to Graph Neural Networks. This page introduces the concept of graph neural networks with a focus on their application at scale.

  • The GraphTensor type. This page introduces the tfgnn.GraphTensor class, which defines our representation of graph data in TensorFlow. We recommend that every user of our library understands its basic data model.

  • Describing your graph. This page explains how to declare the node sets and edge sets of your graph, including their respective features, with the GraphSchema protocol message. This defines the interface between data preparation (which creates such graphs) and the GNN model written in TensorFlow (which consumes these graphs as training data).

  • Data preparation and sampling. Training data for GNN models are graphs. This document describes their encoding as tf.Examples. Moreover, it introduces subgraph sampling for turning one huge graph into a stream of training inputs. TF-GNN offers two ways to run sampling:

    • The In-Memory Sampler lets you run graph sampling on a single machine from main memory. Start here for an easy demo.
    • The Beam Sampler lets you run distributed graph sampling, which scales far beyond in-memory sampling.
  • The TF-GNN Runner lets you train GNN models on the prepared input data for a variety of tasks (e.g., node prediction). We recommend using the Runner to get started quickly with a first model for the data at hand, then customizing it as needed.
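The idea of subgraph sampling mentioned above can be sketched in a few lines of plain Python. This is a conceptual illustration only, under simplifying assumptions (a homogeneous graph stored as an adjacency dict, uniform neighbor sampling with a fixed fan-out per hop); the function `sample_subgraph` is hypothetical and is not the API of the In-Memory Sampler or the Beam Sampler.

```python
import random


def sample_subgraph(adjacency, seed, fanouts, rng):
    """Samples a subgraph rooted at `seed`, one hop per entry of `fanouts`.

    adjacency: dict mapping a node id to the list of its neighbor ids.
    fanouts: maximum number of neighbors to sample per node at each hop.
    Returns (nodes, edges); nodes[0] is the seed, edges are (source, target).
    """
    nodes = [seed]
    seen = {seed}
    edges = []
    frontier = [seed]
    for fanout in fanouts:
        next_frontier = []
        for node in frontier:
            neighbors = adjacency.get(node, [])
            # Sample without replacement, capped by the number of neighbors.
            chosen = rng.sample(neighbors, min(fanout, len(neighbors)))
            for neighbor in chosen:
                edges.append((node, neighbor))
                if neighbor not in seen:
                    seen.add(neighbor)
                    nodes.append(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return nodes, edges


# One large graph becomes a stream of small training inputs, one per seed node.
adjacency = {0: [1, 2, 3], 1: [0, 4], 2: [0], 3: [0, 4], 4: [1, 3]}
rng = random.Random(0)
subgraphs = [sample_subgraph(adjacency, seed, fanouts=[2, 1], rng=rng)
             for seed in (0, 4)]
```

Each sampled subgraph is bounded in size by the fan-outs, no matter how large the full graph is, which is what makes training on one huge graph tractable.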

The following docs go deeper into particular topics.

  • The Input pipeline guide explains how to set up a tf.data.Dataset for bulk input of the training and validation datasets produced by the data preparation step. The TF-GNN Runner already takes care of this for its users.

  • TF-GNN modeling explains how to build a Graph Neural Network with TensorFlow and Keras, using the GraphTensor data from the previous steps. The TF-GNN library provides both a collection of standard models and a toolbox for writing your own. Users of the TF-GNN Runner are encouraged to consult this page to define custom models in the Runner.

  • The Model saving guide covers technical details of saving TF-GNN models. (Most users of TF/Keras 2.13+ should be fine calling tf.keras.Model.export() without looking here.)

  • The Keras version config guide explains how to install and use Keras v2, which TF-GNN requires, with TF 2.16 and above.
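As a rough sketch of what selecting Keras v2 can look like in practice: the snippet below assumes the mechanism described in the TensorFlow 2.16 release notes, namely the `tf-keras` package together with the `TF_USE_LEGACY_KERAS` environment variable. The helper name `select_legacy_keras` is hypothetical, and the Keras version config guide remains the authoritative source for the exact steps.

```python
import os


def select_legacy_keras() -> None:
    """Asks TensorFlow 2.16+ to use Keras v2 from the `tf-keras` package.

    Assumption: this must run before TensorFlow is first imported in the
    process, because the variable is read at import time.
    """
    os.environ["TF_USE_LEGACY_KERAS"] = "1"


select_legacy_keras()

# Only after this point, and assuming `pip install tf-keras` was done:
# import tensorflow as tf
# import tensorflow_gnn as tfgnn
```

Setting the variable in the shell or in the job configuration has the same effect and avoids any dependence on import order inside the program.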

Colab Tutorials

These Colab notebooks run complete examples of building and training a TF-GNN model on a Google server from within your browser.

  • Molecular Graph Classification trains a model for the MUTAG dataset (from the TUDatasets collection) that consists of 188 small, homogeneous graphs representing molecules. This is a good first read to get acquainted with GNNs.
  • Solving OGBN-MAG end-to-end trains a model on heterogeneous sampled subgraphs from the OGBN-MAG dataset (from Stanford's Open Graph Benchmark) that contains 1 million research papers, their authors, and other relations. This colab introduces the node classification task from sampled subgraphs as well as the nuts and bolts of training in parallel on multiple accelerators (GPU, TPU).
  • An in-depth OGBN-MAG tutorial solves OGBN-MAG again while demonstrating how users can exercise greater control over the GNN model definition and the training code.
  • Learning shortest paths with GraphNetworks demonstrates an Encoder/Process/Decoder architecture for predicting the edges of a shortest path, using a Graph Network with edge states. Take a look if you are interested in advanced modeling.

API Reference

TF-GNN comes with reference documentation for its API, extracted from the source code.

Developer Documentation

How to contribute to the TF-GNN library.

  • CONTRIBUTING.md describes the process for open-source contributions.
  • The Developer guide describes how to clone our GitHub repo and install the tools and libraries required to build and run TF-GNN code.

Papers

The following research paper describes the design of this library: