This repository has been archived by the owner on Aug 31, 2021. It is now read-only.

Documentation updates #94

Merged May 15, 2019 · 19 commits · changes shown from 8 commits
56 changes: 34 additions & 22 deletions README.md
@@ -19,9 +19,16 @@

## Background
The same data structures and encodings that make Ethereum an effective and trust-less distributed virtual machine
complicate data accessibility and usability for dApp developers. VulcanizeDB improves Ethereum data accessibility by
providing a suite of tools to ease the extraction and transformation of data into a more useful state, including
allowing for exposing aggregate data from a suite of smart contracts.

VulcanizeDB includes processes that sync, transform and expose data. Syncing involves
querying an Ethereum node and then persisting core data into a Postgres database. Transforming focuses on using previously synced data to
query for and transform log event and storage data for specifically configured smart contract addresses. Exposing data is a matter of getting
data from VulcanizeDB's underlying Postgres database and making it accessible.

![VulcanizeDB Overview Diagram](../documentation-updates/documentation/diagrams/vdb-overview.png)
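The sync → transform → expose flow above can be pictured as a simple pipeline. The sketch below is purely illustrative — the types and function names are stand-ins, not VulcanizeDB's actual API:

```go
package main

import "fmt"

// Header is a stand-in for the core chain data persisted during syncing.
type Header struct {
	BlockNumber int64
	Hash        string
}

// syncHeaders stands in for querying an Ethereum node and persisting
// core data into Postgres.
func syncHeaders() []Header {
	return []Header{{BlockNumber: 1, Hash: "0xabc"}} // pretend node response
}

// transform stands in for deriving contract-specific records from
// previously synced data.
func transform(headers []Header) []string {
	var events []string
	for _, h := range headers {
		events = append(events, fmt.Sprintf("event@block %d", h.BlockNumber))
	}
	return events
}

// expose stands in for making the transformed data accessible
// (e.g. via GraphQL endpoints over the database).
func expose(events []string) {
	for _, e := range events {
		fmt.Println("serving:", e)
	}
}

func main() {
	expose(transform(syncHeaders()))
}
```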

## Dependencies
- Go 1.11+
@@ -80,8 +87,9 @@
- The IPC file is called `geth.ipc`.
- The geth IPC file path is printed to the console when you start geth.
- The default location is:
- Mac: `<full home path>/Library/Ethereum/geth.ipc`
- Linux: `<full home path>/ethereum/geth.ipc`
- Note: the geth.ipc file may not exist until you've started the geth process

- For Parity:
- The IPC file is called `jsonrpc.ipc`.
@@ -98,25 +106,33 @@


## Usage
As mentioned above, VulcanizeDB's processes can be split into three categories: syncing, transforming and exposing data.

### Data syncing
To provide data for transformations, raw Ethereum data must first be synced into VulcanizeDB.
This is accomplished through the use of the `headerSync`, `sync`, or `coldImport` commands.
These commands are described in detail [here](../documentation-updates/documentation/sync.md).

### Data transformation
Data transformation uses the raw data that has been synced into Postgres to filter out and apply transformations to
specific data of interest. Since there are different types of data that may be useful for observing smart contracts, it
follows that there are different ways to transform this data. We've started by categorizing this into Generic and
Custom transformers:

- Generic Contract Transformer: Generic contract transformation can be done using a built-in command,
`contractWatcher`, which transforms contract events provided the contract's ABI is available. It also
provides some state variable coverage by automating polling of public methods, with some restrictions.
`contractWatcher` is described further [here](../documentation-updates/documentation/generic-transformer.md).

- Custom Transformers: In many cases custom transformers will need to be written to provide
more comprehensive coverage of contract data. In this case we have provided the `compose`, `execute`, and
`composeAndExecute` commands for running custom transformers from external repositories. Documentation on how to write,
build and run custom transformers as Go plugins can be found
[here](../documentation-updates/documentation/custom-transformers.md).

### Exposing the data
[Postgraphile](https://www.graphile.org/postgraphile/) is used to expose GraphQL endpoints for our database schemas; this is described in detail [here](../staging/documentation/postgraphile.md).


## Tests
- Replace the empty `ipcPath` in the `environments/infura.toml` with a path to a full node's eth_jsonrpc endpoint (e.g. local geth node ipc path or infura url)
@@ -126,15 +142,11 @@
- `make test` will run the unit tests and skip the integration tests
- `make integrationtest` will run just the integration tests



## Contributing
Contributions are welcome!

For more on this, please see [here](../documentation-updates/documentation/contributing.md).

## License
[AGPL-3.0](../staging/LICENSE) © Vulcanize Inc
4 changes: 2 additions & 2 deletions cmd/contractWatcher.go
```diff
@@ -26,7 +26,7 @@ import (
 	st "github.com/vulcanize/vulcanizedb/libraries/shared/transformer"
 	ft "github.com/vulcanize/vulcanizedb/pkg/contract_watcher/full/transformer"
-	lt "github.com/vulcanize/vulcanizedb/pkg/contract_watcher/header/transformer"
+	ht "github.com/vulcanize/vulcanizedb/pkg/contract_watcher/header/transformer"
 	"github.com/vulcanize/vulcanizedb/utils"
 )
@@ -99,7 +99,7 @@ func contractWatcher() {
 	con.PrepConfig()
 	switch mode {
 	case "header":
-		t = lt.NewTransformer(con, blockChain, &db)
+		t = ht.NewTransformer(con, blockChain, &db)
 	case "full":
 		t = ft.NewTransformer(con, blockChain, &db)
 	default:
```
16 changes: 14 additions & 2 deletions documentation/contributing.md
@@ -1,11 +1,23 @@
# Contribution guidelines

Contributions are welcome! Please open an Issue or Pull Request for any changes.

In addition to core contributions, developers are encouraged to build their own custom transformers which
can be run together with other custom transformers using the [composeAndExecute](../../staging/documentation/composeAndExecute.md) command.

## Pull Requests
- `go fmt` is run as part of `make test` and `make integrationtest`, please make sure to check in the format changes.
> **Collaborator:** gofmt
>
> **Contributor (author):** I think we're specifically using the go fmt package, as opposed to gofmt - I had no idea they were two different things! 🤯
>
> **Collaborator:** Oh! This is news to me also. Whoops, I probably should have noticed that in our Makefile by now 🙃

- Ensure that new code is well tested, including integration testing if applicable.
- Make sure the build is passing.
- Update the README or any [documentation files](./) as necessary. If editing the Readme, please
conform to the
[standard-readme specification](https://github.com/RichardLitt/standard-readme).
> **Contributor:** I don't know if we're really conforming to the standard readme spec?
>
> **Collaborator:** Yeah you are right we aren't. I think the Dependencies section is the only deviation from the standard, though.
>
> **Contributor (author, @elizabethengelman, May 10, 2019):** Good call, I think in the standard readme spec it includes Dependencies as a subsection of Install. It also doesn't include a Tests section - which I think could make sense as a subsection to either Install or Usage. I'd be happy to make those changes - what do you all think?
>
> Updated here: 69b4431
- You may merge a Pull Request once you have an approval from a core developer.
> **Contributor (author):** This is the general workflow that we've been following, but I wonder if we should rethink this in the future when additional contributors come onto the project so that we can continue to sanely add new features. Do folks think that we should require 2 approvals? Only allow "core developer" merging permissions?
>
> **Contributor (author):** I wonder if we should also mention to tag a few of us as Reviewers on any new Pull Request?
>
> **Contributor:** Definitely on board with requiring 2 approves and only allowing core members to merge PRs. I think (hope) the latter is already the case
>
> **Collaborator:** Also on board with requiring 2 approves and restricting merges to core.
>
> **Contributor (author):** 👍 cool, I'll update the language in this document, and change the repo settings to require 2 reviews before allowing a merge.

## Creating a new migration file
1. `make new_migration NAME=add_columnA_to_table1`
- This will create a new timestamped migration file in `db/migrations`
1. Write the migration code in the created file, under the respective `goose` pragma
- Goose automatically runs each migration in a transaction; don't add `BEGIN` and `COMMIT` statements.
1. Core migrations should be committed in their `goose fix`ed form. To do this, run `make version_migrations` which
converts timestamped migrations to migrations versioned by an incremented integer.
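For illustration, a hypothetical `add_columnA_to_table1` migration file might look like the sketch below (table and column names are made up). Note the `goose` pragmas and the absence of `BEGIN`/`COMMIT`, per the note above:

```sql
-- +goose Up
ALTER TABLE table1
    ADD COLUMN columnA TEXT;

-- +goose Down
ALTER TABLE table1
    DROP COLUMN columnA;
```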
@@ -1,43 +1,78 @@

# Custom Transformers
When the capabilities of the generic `contractWatcher` are not sufficient, custom transformers tailored to a specific
purpose can be leveraged.

Individual custom transformers can be composed together from any number of external repositories and executed as a
single process using the `compose` and `execute` commands or the `composeAndExecute` command. This is accomplished by
generating a Go plugin which allows the `vulcanizedb` binary to link to the external transformers, so long as they
abide by one of the standard [interfaces](../staging/libraries/shared/transformer).

## Writing custom transformers
For help with writing different types of custom transformers please see below:

Storage Transformers
* [Guide](../../staging/libraries/shared/factories/storage/README.md)
* [Example](../../staging/libraries/shared/factories/storage/EXAMPLE.md)

Event Transformers
* [Guide](../../staging/libraries/shared/factories/event/README.md)
* [Example 1](https://github.com/vulcanize/ens_transformers/tree/master/transformers/registar)
* [Example 2](https://github.com/vulcanize/ens_transformers/tree/master/transformers/registry)
* [Example 3](https://github.com/vulcanize/ens_transformers/tree/master/transformers/resolver)

Contract Transformers
> **Contributor:** Perhaps it would be useful to include a brief description of what each transformer does and how they differ from one another? Idk, maybe the links are sufficient, was just thinking something like:
>
> Storage Transformers - transform data derived from contract storage tries
> Event Transformers - transform data derived from Ethereum log events
> Contract Transformers - ???
>
> **Collaborator (@i-norden, May 9, 2019):** For the last one, maybe something like
>
> Contract Transformers - transform data derived from Ethereum log events and use it to poll public contract methods
>
> ?
>
> **Contributor (author, @elizabethengelman, May 13, 2019):** 👍 7ddf728

* [Example 1](https://github.com/vulcanize/account_transformers)
* [Example 2](https://github.com/vulcanize/ens_transformers/tree/master/transformers/domain_records)

## Preparing custom transformers to work as part of a plugin
To plug in an external transformer we need to:

1. Create a package that exports a variable `TransformerInitializer`, `StorageTransformerInitializer`, or `ContractTransformerInitializer` that is of type [TransformerInitializer](../staging/libraries/shared/transformer/event_transformer.go#L33),
[StorageTransformerInitializer](../../staging/libraries/shared/transformer/storage_transformer.go#L31),
or [ContractTransformerInitializer](../../staging/libraries/shared/transformer/contract_transformer.go#L31), respectively
2. Design the transformers to work in the context of their [event](../staging/libraries/shared/watcher/event_watcher.go#L83),
[storage](../../staging/libraries/shared/watcher/storage_watcher.go#L53),
or [contract](../../staging/libraries/shared/watcher/contract_watcher.go#L68) watcher execution modes
3. Create db migrations to run against vulcanizeDB so that we can store the transformer output
* Do not `goose fix` the transformer migrations; this is to ensure they are always run after the core vulcanizedb migrations, which are kept in their fixed form
* Specify migration locations for each transformer in the config with the `exporter.transformer.migrations` fields
* If the base vDB migrations occupy this path as well, they need to be in their `goose fix`ed form
as they are [here](../../staging/db/migrations)

To update a plugin repository with changes to the core vulcanizedb repository, run `dep ensure` to update its dependencies.
> **Contributor:** This section is super hard. I find it tricky to follow, but also don't know how to simplify without breaking out new docs. I guess I'm leaning toward just including a pretty general statement along the lines of "custom transformers can be developed as plugins and contained in separate repos. For details on building a plugin that's compatible with VulcanizeDB, see `docs/building-transformer-plugins.md`".
>
> And then I feel like in that other doc, it would be helpful to explain -
>
> - what does it mean to create a package that "exports a variable TransformerInitializer, StorageTransformerInitializer, or ContractTransformerInitializer that are of type TransformerInitializer or StorageTransformerInitializer, or ContractTransformerInitializer, respectively"?
> - what does it mean to design a transformer to work in the context of its watcher execution mode? Perhaps it'd be helpful to link to factories/transformer and talk about how the plugin developer only needs to build the dependencies that we inject there?
> - how does one organize the plugin config? this could probably be a separate doc, but it seems tricky to me that this text only mentions "specify migration locations for each transformer in the config with the exporter.transformer.migrations fields"
>
> **Collaborator:** Agree with this sentiment, and I can take the lead on cleaning this up since this is my doing! Although if we extract this elsewhere I am wondering if it might make sense to include this information in the guides for writing transformers? Points 1 and 2 would fit well in their transformer's respective guides. For the third point, the config organization is expanded on further down in the config section, but I agree it should probably be lifted into a separate doc and overall needs more clarity.
>
> **Contributor (author):** That would be great if you could take the lead on reworking this @i-norden, let me know if you need someone to bounce ideas off, I'd be happy to help!


## Building and Running Custom Transformers
### Commands
* The `compose`, `execute`, `composeAndExecute` commands require Go 1.11+ and use [Go plugins](https://golang.org/pkg/plugin/) which only work on Unix-based systems.

* There is an ongoing [conflict](https://github.com/golang/go/issues/20481) between Go plugins and the use vendored
dependencies which imposes certain limitations on how the plugins are built.

> **Contributor:** s/use vendored/use of vendored

* Separate `compose` and `execute` commands allow pre-building and linking to a pre-built .so file. So, if
these are run independently, instead of using `composeAndExecute`, a couple of things need to be considered:
  * It is necessary that the .so file was built with the same exact dependencies that are present in the execution
  environment, i.e. we need to `compose` and `execute` the plugin .so file with the same exact version of vulcanizeDB.
  * The plugin migrations are run during the plugin's composition. As such, if `execute` is used to run a prebuilt .so
  in a different environment than the one it was composed in, then the migrations for that plugin will first need to
  be manually run against that environment's Postgres database.

> **Contributor:** I think it'd be helpful to give a brief overview of composeAndExecute, or at least the general approach that these commands yield, before we dive into the nitty gritty. Something like "compose, execute, and composeAndExecute are commands that facilitate running vdb plugins. They compile plugin code, run associated database migrations, and execute plugin transformers to populate contract-specific data"
>
> **Contributor:** this is minor but I think you could also load the plugin's schema to get around this, and that might be a more simple process if you're really interested in running a prebuilt .so file
>
> **Contributor (author):** Updated it in b3019c9, let me know what you think!


* The `compose` and `composeAndExecute` commands assume you are in the vulcanizedb directory located at your system's
`$GOPATH`, and that all of the transformer repositories for building the plugin are present at their `$GOPATH` directories.

> **Contributor:** I wonder if "all of the transformer repositories for building the plugin are present" could be simplified to "the plugin is present"
>
> **Collaborator:** That seems a little confusing to me since I think of the "plugin" as the output .so file and it wouldn't be present beforehand, but that is just semantics. What about simplifying to "the plugin dependencies are present"?


* The `execute` command does not require the plugin transformer dependencies be located in their `$GOPATH` directories,
instead it expects a prebuilt .so file (of the name specified in the config file) to be in
`$GOPATH/src/github.com/vulcanize/vulcanizedb/plugins/` and, as noted above, also expects the plugin db migrations to
have already been run against the database.

> **Contributor:** really minor but "a prebuilt" seems unnecessary here

* Usage:
* compose: `./vulcanizedb compose --config=./environments/config_name.toml`

* execute: `./vulcanizedb execute --config=./environments/config_name.toml`

* composeAndExecute: `./vulcanizedb composeAndExecute --config=./environments/config_name.toml`
> **Contributor:** I don't think we need the . before /environments here


### Flags
The `compose` and `composeAndExecute` commands can be passed optional flags to specify the operation of the watchers:
> **Collaborator:** execute and composeAndExecute commands receive these flags, but not the compose

- `--recheck-headers`/`-r` - specifies whether to re-check headers for events after the header has already been queried for watched logs.

@@ -50,7 +85,7 @@
Defaults to `false`.
Argument is expected to be a duration (integer measured in nanoseconds): e.g. `-q=10m30s` (for 10 minute, 30 second intervals).
Defaults to `5m` (5 minutes).

### Configuration
A .toml config file is specified when executing the commands.
The config provides information for composing a set of transformers from external repositories:
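For orientation, such a config might look roughly like the hypothetical sketch below — every field, path and repository name is an illustrative assumption, not the authoritative schema:

```toml
# Hypothetical layout -- field names are illustrative, not authoritative.
[database]
name     = "vulcanize_public"
hostname = "localhost"
port     = 5432

[exporter]
name = "exampleTransformerExporter"  # basename of the plugin .so file to build

  [exporter.exampleTransformer]
  path       = "transformers/example/initializer"            # package exporting the initializer variable
  type       = "eth_event"                                   # which watcher execution mode it runs under
  repository = "github.com/example-org/example_transformers" # external repo to link against
  migrations = "db/migrations"                               # the exporter.transformer.migrations field noted above
```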

Binary file added documentation/diagrams/vdb-overview.png
@@ -1,4 +1,4 @@
# Generic Transformer
The `contractWatcher` command is a built-in generic contract watcher. It can watch any and all events for a given contract provided the contract's ABI is available.
It also provides some state variable coverage by automating polling of public methods, with some restrictions:
1. The method must have 2 or fewer arguments
27 changes: 27 additions & 0 deletions documentation/repository-maintenance.md
@@ -0,0 +1,27 @@
# Repository Maintenance

## Diagrams
- Diagrams were created with [draw.io](draw.io).
- To update a diagram:
1. Go to [draw.io](draw.io).
1. Click on *File > Open from* and choose the location of the diagram you want to update.
1. Once open in draw.io, you may update it.
1. Export the diagram to this repository's directory, then add and commit it.


## Generating the Changelog
We use [github-changelog-generator](https://github.com/github-changelog-generator/github-changelog-generator) to
generate release Changelogs. To be consistent with previous Changelogs, the following flags should be passed to the
command:

```
--user vulcanize
--project vulcanizedb
--token {YOUR_GITHUB_TOKEN}
--no-issues
--usernames-as-github-logins
--since-tag {PREVIOUS_RELEASE_TAG}
```

More information on why your github token is needed, and how to generate it, can be found here: https://github.com/github-changelog-generator/github-changelog-generator#github-token