Skip to content
This repository has been archived by the owner on Aug 31, 2021. It is now read-only.

Documentation updates #94

Merged
merged 19 commits into from
May 15, 2019
Merged

Documentation updates #94

merged 19 commits into from
May 15, 2019

Conversation

elizabethengelman
Copy link
Contributor

@elizabethengelman elizabethengelman commented May 8, 2019

A couple of things that I'm planning to add to the README/documentation, if they seem useful:

  • mention how reorgs are handled
  • clarify the relationship between which Ethereum client sync mode can be used with each command
  • rename sync to fullSync
  • include a config file example for sync/headersSync

Other questions/ideas for improvements to the code base (that I would love some input on!):

  1. is there a different name we want to discuss for the Watcher abstraction?
  2. allow for the validation window during a sync to be configurable UPDATE: added a story in Jira, VDB-591
  3. based on the workflows we currently have, I don't think we're using the coldImport command - should we mention this explicitly somehow? Perhaps we need some more clarity about what it's utility is.
  4. should we rename the sync command to fullSync since that is what the tables are called? I am worried about confusing this with geth's full sync though 🤔 UPDATE: added to checklist above
  5. As I was revisiting it, I found it a bit confusing that EventWatcher.execute is adding transactions to the database. I may be overthinking this, but I can imaging that this may be confusing for a new user, who though they did (and understood) a headerSync, and then after executing transformers realize that they now have transactions in their db. I may be overthinking it, and this is just something that we need to explicitly document in the right place, and the right way. So I'm totally up for suggestions!

@elizabethengelman elizabethengelman force-pushed the documentation-updates branch 11 times, most recently from 5771543 to 3a9860d Compare May 8, 2019 17:51
README.md Outdated
@@ -126,9 +138,6 @@ Documentation on how to build custom transformers to work with these commands ca
- `make test` will run the unit tests and skip the integration tests
- `make integrationtest` will run just the integration tests

## API
[Postgraphile](https://www.graphile.org/postgraphile/) is used to expose GraphQL endpoints for our database schemas, this is described in detail [here](../staging/postgraphile/README.md).


## Contributing
Contributions are welcome! For more on this, please see [here](../staging/documentation/contributing.md).
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would love to include a Code of Conduct in here, for when we receive community contributions. The Contributor Covenant is one that comes to mind, but I'm certainly happy to do some more research/work on finding one that fits Vulcanize.

Copy link
Contributor Author

@elizabethengelman elizabethengelman May 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added in 8dc9ddf

- Update the README or any documentation files as necessary. If editing the Readme, please
conform to the
[standard-readme specification](https://github.com/RichardLitt/standard-readme).
- You may merge a Pull Request once you have an approval from core developer.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the general workflow that we've been following, but I wonder if we should rethink this in the future when additional contributors come onto the project so that we can continue to sanely add new features. Do folks think that we should require 2 approvals? Only allow "core developer" merging permissions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should also mention to tag a few of us as Reviewers on any new Pull Request?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely on board with requiring 2 approves and only allowing core members to merge PRs. I think (hope) the latter is already the case

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also onboard with with requiring 2 approves and restricting merges to core.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 cool, I'll update the language in this document, and change the repo settings to require 2 reviews before allowing a merge.

@elizabethengelman elizabethengelman force-pushed the documentation-updates branch 2 times, most recently from 5275a0d to 3caae7c Compare May 8, 2019 18:40
@elizabethengelman elizabethengelman force-pushed the documentation-updates branch 3 times, most recently from bbebb2e to 3c048a7 Compare May 8, 2019 22:28
broken into generic and custom
Copy link
Contributor

@rmulhol rmulhol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for digging into this! The docs look a lot better.

I left a lot of comments, but I don’t think they all need to be addressed for us to get value out of merging. Mostly just wanted to expose my thinking and identify places where we might want to put in more work later.

Regarding your top-level questions -

  1. I’d definitely be up for re-naming the Watcher to something that better captures what it’s doing. Will think on concise ways to describe something that keeps a collection of transformers and delegates data to the appropriate one for a given chunk of data (and also think about whether that description really captures all of the responsibilities we’ve put onto the watcher 😄 ).
  2. Definitely 👍 to making the validation window configurable
  3. tbh I’m pretty tempted to just delete all the cold import code/maybe leave it on a branch somewhere. It’s a decent performance improver for sync but…
  4. I’m also pretty tempted to delete sync 😄 Would want to check with @AFDudley because I think there are potentially some use cases for sync right now, but I’d also be curious to see if we’re actually able to get that running as a working proof of concept that has acceptable performance - and if it doesn't need to be significantly updated anyway if the goal is to replace RPC queries to an eth node. Definitely on board with at least renaming it though in the interim though!
  5. I agree that adding transactions is a weird responsibility to put into the event watcher, and that was me who did that. My rationale was that it seems tricky to fetch only transactions for log events you care about (as opposed to all transactions in a block) and not duplicate transactions (for multiple log events in the same transaction) if we put it anywhere else. But there could also definitely be other ways of accomplishing those goals - I’m open to suggestions as well!

And big 👍 to including a code of conduct

@@ -1,22 +1,35 @@
# Syncing commands
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been wondering if we might want to rename this file. Seems a little off to me that we have cmd/sync and documentation/sync but the latter encompasses more than the former.

I don't have any great ideas here, but maybe something like step_one? just spitballing - but thinking we basically want to draw folks here as the first thing they should consider after cloning the repo and doing basic setup

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

step_one or maybe something generic like eth_syncing? step_one lends itself in my eyes to having docs named as subsequent steps which might be tricky since the next step would be a split choice between the custom transformers or the generic transformer.

Copy link
Contributor Author

@elizabethengelman elizabethengelman May 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I totally agree to a rename.

I agree that having it called step_one would make it feel like the rest would need to be named step_x. If we did make that change, then we could rename custom-transformers and generic-transformer to step-two-custom-transform and step-two-generic-transform, respectively?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed it data-syncing for now, though I'm not sure that's quite right either. 🤔

829a581

1. Start Ethereum node
Syncs block headers from a running Ethereum node into the VulcanizeDB table `headers`.
- Queries the Ethereum node using RPC calls.
- Validates headers from the last 15 blocks to ensure that data is up to date.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely on board with making this # configurable 👍

Copy link
Contributor Author

@elizabethengelman elizabethengelman May 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool, just added a story to the backlog: https://makerdao.atlassian.net/browse/VDB-591

- Useful when you want to maintain a broad cache of what's happening on the blockchain.

##### Usage
1. Start Ethereum node (**if fast syncing your Ethereum node, wait for initial sync to finish**).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if this might be more accurately captured as "specify ipc path to running ethereum node". also idk if maybe we should just make it a cli flag to the example command (e.g. --client-ipcPath <ipc-path>).

and then maybe it would make sense to have a gotchas doc for capturing things likes errors on initial state sync not finished

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 updated in 0cc93b7

@@ -48,8 +48,8 @@ func PopulateMissingHeaders(blockchain core.BlockChain, headerRepository datasto
return len(blockNumbers), nil
}

func RetrieveAndUpdateHeaders(chain core.BlockChain, headerRepository datastore.HeaderRepository, blockNumbers []int64) (int, error) {
headers, err := chain.GetHeaderByNumbers(blockNumbers)
func RetrieveAndUpdateHeaders(blockchain core.BlockChain, headerRepository datastore.HeaderRepository, blockNumbers []int64) (int, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is super minor but GoLand complains to me on the word blockchain, and seems to prefer blockChain or chain. I know we're worlds away from squashing all the code analysis issues there, but would love to keep chipping away

* [Example 2](https://github.com/vulcanize/ens_transformers/tree/master/transformers/registry)
* [Example 3](https://github.com/vulcanize/ens_transformers/tree/master/transformers/resolver)

Contract Transformers
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it would be useful to include brief description of what each transformer does and how they differ from one another? Idk, maybe the links are sufficient, was just thinking something like:

Storage Transformers - transform data derived from contract storage tries
Event Transformers - transform data derived from Ethereum log events
Contract Transformers - ???

Copy link
Collaborator

@i-norden i-norden May 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the last one, maybe something like

Contract Transformers - transform data derived from Ethereum log events and use it to poll public contract methods

?

Copy link
Contributor Author

@elizabethengelman elizabethengelman May 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 7ddf728

environment, i.e. we need to `compose` and `execute` the plugin .so file with the same exact version of vulcanizeDB.
* The plugin migrations are run during the plugin's composition. As such, if `execute` is used to run a prebuilt .so
in a different environment than the one it was composed in then the migrations for that plugin will first need to
be manually ran against that environment's Postgres database.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is minor but I think you could also load the plugin's schema to get around this, and that might be a more simple process if you're really interested in running a prebuilt .so file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated it in b3019c9, let me know what you think!

be manually ran against that environment's Postgres database.

* The `compose` and `composeAndExecute` commands assume you are in the vulcanizdb directory located at your system's
`$GOPATH`, and that all of the transformer repositories for building the plugin are present at their `$GOPATH` directories.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if "all of the transformer repositories for building the plugin are present" could be simplified to "the plugin is present"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems a little confusing to me since I think of the "plugin" as the output .so file and it wouldn't be present beforehand, but that is just semantics. What about simplifying to "the plugin dependencies are present"?

`$GOPATH`, and that all of the transformer repositories for building the plugin are present at their `$GOPATH` directories.

* The `execute` command does not require the plugin transformer dependencies be located in their `$GOPATH` directories,
instead it expects a prebuilt .so file (of the name specified in the config file) to be in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

really minor but "a prebuilt" seems unnecessary here


* execute: `./vulcanizedb execute --config=./environments/config_name.toml`

* composeAndExecute: `./vulcanizedb composeAndExecute --config=./environments/config_name.toml`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need the . before /environments here

- Make sure the build is passing.
- Update the README or any [documentation files](./) as necessary. If editing the Readme, please
conform to the
[standard-readme specification](https://github.com/RichardLitt/standard-readme).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if we're really conforming to the standard readme spec?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah you are right we aren't. I think the Dependencies section is the only deviation from the standard, though.

Copy link
Contributor Author

@elizabethengelman elizabethengelman May 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, I think in the standard readme spec it includes Dependencies as a subsection of Install. It also doesn't include a Tests section - which I think could make sense as a subsection to either Install or Usage. I'd be happy to make those changes - what do you all think?

updated here: 69b4431

Copy link
Collaborator

@i-norden i-norden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes!! Thank you for these updates 🙏🙏🙏

I am late to the party and mostly piggybacking on Rob's comments, but in regards to the questions:

  1. Nothing immediately comes to mind that is better than watcher but will keep thinking on this!
  2. Configureable validation window is a good idea. I suppose in that case we would want to impose some min/max limits, and leave the default where it is?
  3. I think moving the coldImport code out like Rob said is a good idea and
  4. even the full sync code too if that is appropriate. Having to explain 3 sync modes and when/why a user chooses one vs the other and then also having to explain how the transformers need to be built differently to work with one of them vs the other two adds a lot of complexity to these docs.
  5. I think with proper documentation this is fine, but then again I am guilty of introducing the extremely confusing contractWatcher type 🙃 If the issue is deduping transactions in the db, could a unique constraint enforce that?

- Make sure the build is passing.
- Update the README or any [documentation files](./) as necessary. If editing the Readme, please
conform to the
[standard-readme specification](https://github.com/RichardLitt/standard-readme).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah you are right we aren't. I think the Dependencies section is the only deviation from the standard, though.

- Update the README or any documentation files as necessary. If editing the Readme, please
conform to the
[standard-readme specification](https://github.com/RichardLitt/standard-readme).
- You may merge a Pull Request once you have an approval from core developer.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also onboard with with requiring 2 approves and restricting merges to core.

* [Example 2](https://github.com/vulcanize/ens_transformers/tree/master/transformers/registry)
* [Example 3](https://github.com/vulcanize/ens_transformers/tree/master/transformers/resolver)

Contract Transformers
Copy link
Collaborator

@i-norden i-norden May 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the last one, maybe something like

Contract Transformers - transform data derived from Ethereum log events and use it to poll public contract methods

?

* If the base vDB migrations occupy this path as well, they need to be in their `goose fix`ed form
as they are [here](../../staging/db/migrations)

To update a plugin repository with changes to the core vulcanizedb repository, run `dep ensure` to update its dependencies.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with this sentiment, and I can take the lead on cleaning this up since this my doing! Although if we extract this elsewhere I am wondering if it might make sense to include this information in the guides for writing transformers? Points 1 and 2 would fit well in their transformer's respective guides. For the third point, the config organization is expanded on further down in the config section, but I agree it should probably be lifted into a separate doc and overall needs more clarity.

be manually ran against that environment's Postgres database.

* The `compose` and `composeAndExecute` commands assume you are in the vulcanizdb directory located at your system's
`$GOPATH`, and that all of the transformer repositories for building the plugin are present at their `$GOPATH` directories.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems a little confusing to me since I think of the "plugin" as the output .so file and it wouldn't be present beforehand, but that is just semantics. What about simplifying to "the plugin dependencies are present"?

can be run together with other custom transformers using the [composeAndExeucte](../../staging/documentation/composeAndExecute.md) command.

## Pull Requests
- `go fmt` is run as part of `make test` and `make integrationtest`, please make sure to check in the format changes.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gofmt

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're specifically using the go fmt package, as opposed to gofmt - I had no idea they were two different things! 🤯

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! This is news to me also. Whoops, I probably should have noticed that in our Makefile now by now 🙃


* composeAndExecute: `./vulcanizedb composeAndExecute --config=./environments/config_name.toml`

### Flags
The `compose` and `composeAndExecute` commands can be passed optional flags to specify the operation of the watchers:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

execute and composeAndExecute commands receive these flags, but not the compose

@elizabethengelman elizabethengelman force-pushed the documentation-updates branch 2 times, most recently from 5479151 to 506103b Compare May 10, 2019 15:57
@elizabethengelman elizabethengelman changed the title [WIP] Documentation updates Documentation updates May 13, 2019
@elizabethengelman elizabethengelman force-pushed the documentation-updates branch 2 times, most recently from 16b6117 to 679ec45 Compare May 13, 2019 21:33
@elizabethengelman
Copy link
Contributor Author

Thanks so much for the feedback @rmulhol & @i-norden! I think I've addressed most of what you've mentioned. A couple of things that I didn't touch because I think we probably need to think on them as a team a bit more are:

  • potentially a new name for the Watcher abstraction
  • removing/moving coldImport and fullSync commands
  • Does adding transactions in EventWatcher.execute make sense? Where else could this go?

I also removed the check list item for adding a diagram for the db schema for now. At the moment it seems like blocks and headers are functionally the most interesting tables for core vulcanize, and a diagram relating those two tables didn't really seem necessary. I'd be happy to revisit this later though!

@elizabethengelman elizabethengelman merged commit 79765c7 into staging May 15, 2019
@elizabethengelman elizabethengelman deleted the documentation-updates branch May 15, 2019 18:22
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants