Merge pull request #14 from yongchand/main
Add Documentation
yongchand authored Nov 9, 2022
2 parents eb5fd78 + bd99af7 commit 0e8c3a1
Showing 16 changed files with 1,593 additions and 4 deletions.
6 changes: 2 additions & 4 deletions README.md
@@ -3,11 +3,9 @@
Klaytn ETL lets you convert Klaytn blockchain data into convenient formats like JSONs, CSVs and relational databases.
This is a fork of [Ethereum ETL](https://github.com/blockchain-etl/ethereum-etl).

***Notice: Klaytn ETL is still on the beta version. However, CLIs are all functional.***

***Documents about command reference will be soon updated.***
[Full documentation available here](http://klaytn-etl.readthedocs.io/).

***Please check issues to figure out current works.***
***Notice: Klaytn ETL is still on the beta version. However, CLIs are all functional.***

## Quickstart
Install Klaytn ETL:
240 changes: 240 additions & 0 deletions docs/commands.md
@@ -0,0 +1,240 @@
# Commands

All commands accept the `-h` parameter for help, e.g.:

```bash
> klaytnetl export_blocks_and_transactions -h

Usage: klaytnetl export_blocks_and_transactions [OPTIONS]

Options:
  -s, --start-block INTEGER   Start block  [default: 0]
  -e, --end-block INTEGER     End block  [required]
  -b, --batch-size INTEGER    The number of blocks to export at a time.
                              [default: 100]
  -p, --provider-uri TEXT     The URI of the web3 provider e.g.
                              file://$HOME/var/kend/data/klay.ipc or
                              https://cypress.fandom.finance/archive
                              [default:
                              https://cypress.fandom.finance/archive]
  -w, --max-workers INTEGER   The maximum number of workers.  [default: 5]
  --blocks-output TEXT        The output file for blocks. If not provided
                              blocks will not be exported. Use "-" for stdout
  --transactions-output TEXT  The output file for transactions. If not
                              provided transactions will not be exported. Use
                              "-" for stdout
  --network TEXT              Input either baobab or cypress to obtain public
                              provider.If not provided, the option will be
                              disabled.
  -h, --help                  Show this message and exit.
```

For the `--output` and `--*-output` parameters, the supported formats are CSV and JSON. The format is inferred from the output file name (for example, `blocks.csv` or `blocks.json`).
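
As a rough illustration of how a format can be inferred from the output file name, the sketch below maps a file name to a format by its extension. `infer_output_format` is a hypothetical helper written for this example, not a function from Klaytn ETL, and rejecting every other name (including `-` for stdout) is an assumption of the sketch.

```python
from pathlib import Path

# Formats named in the text above; anything else is rejected in this sketch.
SUPPORTED_FORMATS = {".csv": "csv", ".json": "json"}


def infer_output_format(file_name: str) -> str:
    """Return 'csv' or 'json' based on the file extension (hypothetical helper)."""
    suffix = Path(file_name).suffix.lower()
    if suffix not in SUPPORTED_FORMATS:
        raise ValueError(f"cannot infer output format from {file_name!r}")
    return SUPPORTED_FORMATS[suffix]


print(infer_output_format("blocks.csv"))         # csv
print(infer_output_format("transactions.json"))  # json
```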

#### export_blocks_and_transactions

```bash
> klaytnetl export_blocks_and_transactions --start-block 0 --end-block 500000 \
--provider-uri https://cypress.fandom.finance/archive \
--blocks-output blocks.csv --transactions-output transactions.csv
```

Omit the `--blocks-output` or `--transactions-output` option to export only transactions or only blocks, respectively.

You can tune `--batch-size` and `--max-workers` for performance.

You can select either `baobab` or `cypress` in `--network`.

[Blocks and transactions schema](schema.md#blockscsv).
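
To build intuition for `--batch-size`, the sketch below splits an inclusive block range into consecutive batches. It is a conceptual illustration only: `split_into_batches` is a hypothetical name, and Klaytn ETL's internal batching may differ.

```python
from typing import Iterator, Tuple


def split_into_batches(start_block: int, end_block: int, batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) block ranges of at most batch_size blocks."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for batch_start in range(start_block, end_block + 1, batch_size):
        batch_end = min(batch_start + batch_size - 1, end_block)
        yield batch_start, batch_end


# Blocks 0..250 with a batch size of 100 give three batches.
print(list(split_into_batches(0, 250, 100)))
# [(0, 99), (100, 199), (200, 250)]
```

In general, a larger batch means fewer requests to the node but more work per request, so the best value depends on the node you are querying.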

#### export_token_transfers

```bash
> klaytnetl export_token_transfers --start-block 0 --end-block 500000 \
--provider-uri https://cypress.fandom.finance/archive --batch-size 100 --output token_transfers.csv
```

Include `--tokens <token1> --tokens <token2>` to export transfers for specific tokens only, e.g.:

```bash
> klaytnetl export_token_transfers --start-block 42397700 --end-block 42397800 \
--provider-uri https://cypress.fandom.finance/archive --output token_transfers.csv \
--tokens 0xcee8faf64bb97a73bb51e115aa89c17ffa8dd167 --tokens 0x34d21b1e550d73cee41151c77f3c73359527a396
```

You can tune `--batch-size` and `--max-workers` for performance.

You can select either `baobab` or `cypress` in `--network`.

[Token transfers schema](schema.md#token_transferscsv).

#### export_receipts_and_logs

First export transactions with [export_blocks_and_transactions](#export_blocks_and_transactions).

Then export receipts and logs from the `transactions.csv` file:

```bash
> klaytnetl export_receipts_and_logs --transactions transactions.csv \
--provider-uri https://cypress.fandom.finance/archive --receipts-output receipts.csv --logs-output logs.csv
```

Omit the `--receipts-output` or `--logs-output` option to export only logs or only receipts, respectively.

You can tune `--batch-size` and `--max-workers` for performance.

You can select either `baobab` or `cypress` in `--network`.

[Receipts and logs schema](schema.md#receiptscsv).

#### extract_token_transfers

First export receipt logs with [export_receipts_and_logs](#export_receipts_and_logs).

Then extract transfers from the logs.csv file:

```bash
> klaytnetl extract_token_transfers --logs logs.csv --output token_transfers.csv
```

You can tune `--batch-size` and `--max-workers` for performance.

You can select either `baobab` or `cypress` in `--network`.

[Token transfers schema](schema.md#token_transferscsv).
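
For intuition, extracting a token transfer from a log roughly means matching the `Transfer(address,address,uint256)` event signature in the first topic and decoding the remaining fields. The sketch below does this for a single ERC20 log represented as a dict. The field names (`address`, `topics`, `data`) and the `decode_erc20_transfer` helper are assumptions for illustration, not Klaytn ETL's actual code or schema, and ERC721 logs (which carry the token ID as a fourth topic) are deliberately not handled.

```python
from typing import Optional

# keccak256("Transfer(address,address,uint256)"), the standard Transfer event topic.
TRANSFER_EVENT_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def topic_to_address(topic: str) -> str:
    """An indexed address is left-padded to 32 bytes; keep the last 20 bytes."""
    return "0x" + topic[-40:]


def decode_erc20_transfer(log: dict) -> Optional[dict]:
    """Return a transfer dict for an ERC20 Transfer log, or None for other logs."""
    topics = log["topics"]
    if len(topics) != 3 or topics[0].lower() != TRANSFER_EVENT_TOPIC:
        return None
    return {
        "token_address": log["address"],
        "from_address": topic_to_address(topics[1]),
        "to_address": topic_to_address(topics[2]),
        "value": int(log["data"], 16),  # unindexed uint256 amount
    }


# Made-up log for illustration: 1000 base units sent from 0x11..11 to 0x22..22.
example_log = {
    "address": "0xcee8faf64bb97a73bb51e115aa89c17ffa8dd167",
    "topics": [
        TRANSFER_EVENT_TOPIC,
        "0x" + "00" * 12 + "11" * 20,
        "0x" + "00" * 12 + "22" * 20,
    ],
    "data": "0x" + hex(1000)[2:].rjust(64, "0"),
}
print(decode_erc20_transfer(example_log))
```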

#### export_contracts

First export receipts with [export_receipts_and_logs](#export_receipts_and_logs).

Then export contracts:

```bash
> klaytnetl export_contracts --receipts receipts.csv \
--provider-uri https://cypress.fandom.finance/archive --output contracts.csv
```

You can tune `--batch-size` and `--max-workers` for performance.

You can select either `baobab` or `cypress` in `--network`.

[Contracts schema](schema.md#contractscsv).

#### export_tokens

First extract token addresses from `contracts.json`
(exported with [export_contracts](#export_contracts)):

```bash
> klaytnetl filter_items -i contracts.json -p "item['is_erc20'] or item['is_erc721'] or item['is_erc1155']" | \
klaytnetl extract_field -f address -o token_addresses.txt
```
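
The pipeline above keeps only the contracts flagged as tokens and then pulls out their addresses. A minimal Python sketch of the same logic follows, assuming `contracts.json` holds one JSON object per line with `address`, `is_erc20`, `is_erc721`, and `is_erc1155` fields (an assumption about the file layout, not something confirmed here). `extract_token_addresses` is a hypothetical helper.

```python
import json
from typing import Iterable, List


def extract_token_addresses(lines: Iterable[str]) -> List[str]:
    """Return addresses of contracts flagged as ERC20, ERC721, or ERC1155."""
    addresses = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        item = json.loads(line)
        # Same predicate as the filter_items call above.
        if item["is_erc20"] or item["is_erc721"] or item["is_erc1155"]:
            addresses.append(item["address"])
    return addresses


# Made-up sample records for illustration.
sample = [
    '{"address": "0xaaa", "is_erc20": true, "is_erc721": false, "is_erc1155": false}',
    '{"address": "0xbbb", "is_erc20": false, "is_erc721": false, "is_erc1155": false}',
]
print(extract_token_addresses(sample))  # ['0xaaa']
```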

Then export ERC20 / ERC721 tokens:

```bash
> klaytnetl export_tokens --token-addresses token_addresses.txt \
--provider-uri https://cypress.fandom.finance/archive --output tokens.csv
```

You can tune `--max-workers` for performance.

You can select either `baobab` or `cypress` in `--network`.

[Tokens schema](schema.md#tokenscsv).

#### export_traces

Traces are also called internal transactions.
Exporting traces re-executes each block, so the run time depends on the transactions the block contains and can be long.
Make sure your node is an archive node with at least 8GB of memory; otherwise you will face timeout errors.
See [this issue](https://github.com/blockchain-etl/ethereum-etl/issues/137).

```bash
> klaytnetl export_traces --start-block 0 --end-block 500000 \
--provider-uri https://cypress.fandom.finance/archive --batch-size 100 --output traces.csv
```

Add the `--enrich` flag to enrich output files with additional fields such as `block-timestamp`.

You can tune `--batch-size` and `--max-workers` for performance.

You can raise `--timeout` if requests to your node time out.

You can set `--file-format` to either `csv` or `json`, and adjust the output files with `--file-maxlines` and `--compress`.

You can export to cloud storage by adding the `--s3-bucket` flag.

You can select either `baobab` or `cypress` in `--network`.

[Traces schema](schema.md#tracescsv).

#### export_block_group

Exports a block group (blocks, transactions, receipts, logs, and token transfers) from a Klaytn node.

```bash
> klaytnetl export_block_group --start-block 0 --end-block 500000 \
--provider-uri https://cypress.fandom.finance/archive --batch-size 100 \
--blocks-output blocks.csv --transactions-output transactions.csv \
--receipts-output receipts.csv --logs-output logs.csv --token-transfers-output token_transfer.csv
```

Omit any of the `--blocks-output`/`--transactions-output`/`--receipts-output`/`--logs-output`/`--token-transfers-output` options
to skip the corresponding output and export only the rest.

Add the `--enrich` flag to enrich output files with additional fields such as `block-timestamp`.

You can tune `--batch-size` and `--max-workers` for performance.

You can raise `--timeout` if requests to your node time out.

You can set `--file-format` to either `csv` or `json`, and adjust the output files with `--file-maxlines` and `--compress`.

You can export to cloud storage by adding the `--s3-bucket` or `--gcs-bucket` flag.

You can select either `baobab` or `cypress` in `--network`.


#### export_trace_group

Exports a trace group (traces, contracts, and tokens) from a Klaytn node.
Exporting traces re-executes each block, so the run time depends on the transactions the block contains and can be long.
Make sure your node is an archive node with at least 8GB of memory; otherwise you will face timeout errors.

```bash
> klaytnetl export_trace_group --start-block 0 --end-block 500000 \
--provider-uri https://cypress.fandom.finance/archive --batch-size 100 \
--traces-output traces.csv --tokens-output tokens.csv --contracts-output contracts.csv
```

Omit any of the `--traces-output`/`--tokens-output`/`--contracts-output` options
to skip the corresponding output and export only the rest.

Add the `--enrich` flag to enrich output files with additional fields such as `block-timestamp`.

You can tune `--batch-size` and `--max-workers` for performance.

You can raise `--timeout` if requests to your node time out.

You can set `--file-format` to either `csv` or `json`, and adjust the output files with `--file-maxlines` and `--compress`.

You can export to cloud storage by adding the `--s3-bucket` or `--gcs-bucket` flag.

Use `--detailed-trace-log` and `--log-percentage-step` to log trace counts at the progress steps you want.

You can select either `baobab` or `cypress` in `--network`.

#### get_block_range_for_date

```bash
> klaytnetl get_block_range_for_date --provider-uri=https://cypress.fandom.finance/archive --date 2020-01-01
16369455,16455852
```
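
The command prints the block range for the date as `start,end`. The sketch below parses that output so it can be passed on as `--start-block` and `--end-block`; `parse_block_range` is a hypothetical helper. As a sanity check, the range for 2020-01-01 spans 86,398 blocks, close to the 86,400 seconds in a day, which is consistent with Klaytn's roughly one-second block time.

```python
from typing import Tuple


def parse_block_range(output: str) -> Tuple[int, int]:
    """Parse 'start,end' as printed by get_block_range_for_date."""
    start_text, end_text = output.strip().split(",")
    start_block, end_block = int(start_text), int(end_text)
    if start_block > end_block:
        raise ValueError("start block is after end block")
    return start_block, end_block


start_block, end_block = parse_block_range("16369455,16455852")
print(end_block - start_block + 1)  # 86398 blocks in the range
```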

#### get_keccak_hash

```bash
> klaytnetl get_keccak_hash -i "transfer(address,uint256)"
0xa9059cbb2ab09eb219583f4a59a5d0623ade346d962bcd4e46b11da047c9049b
```
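
The first four bytes of this hash form the function selector, which appears at the start of a transaction's input data when `transfer(address,uint256)` is called. A short sketch that slices the selector out of the hash above (`function_selector` is a hypothetical helper):

```python
def function_selector(keccak_hex: str) -> str:
    """Return the 4-byte selector (first 8 hex digits) of a 0x-prefixed Keccak-256 hash."""
    if not keccak_hex.startswith("0x") or len(keccak_hex) != 66:
        raise ValueError("expected a 0x-prefixed 32-byte hex string")
    return keccak_hex[:10]


signature_hash = "0xa9059cbb2ab09eb219583f4a59a5d0623ade346d962bcd4e46b11da047c9049b"
print(function_selector(signature_hash))  # 0xa9059cbb
```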
30 changes: 30 additions & 0 deletions docs/exporting-the-blockchain.md
@@ -0,0 +1,30 @@
## Exporting the Blockchain

1. Install Python 3.7.2+: [https://www.python.org/downloads/](https://www.python.org/downloads/)

1. [Launch an endpoint node](https://docs.klaytn.foundation/getting-started/quick-start/launch-an-en) or use a [pre-existing public endpoint](https://docs.klaytn.foundation/dapp/json-rpc/public-en)

1. Install Klaytn ETL: `> pip3 install klaytn-etl-cli`

1. Export all:

```bash
> klaytnetl export_all --help
> klaytnetl export_all -s 0 -e 5999999 -b 100000 -p https://cypress.fandom.finance/archive -o output
```

If the `klaytnetl` command is not available in your PATH, use `python3 -m klaytnetl` instead.

The result will be in the `output` subdirectory, partitioned in Hive style:
```bash
output/blocks/start_block=00000000/end_block=00099999/blocks_00000000_00099999.csv
output/blocks/start_block=00100000/end_block=00199999/blocks_00100000_00199999.csv
...
output/transactions/start_block=00000000/end_block=00099999/transactions_00000000_00099999.csv
...
output/token_transfers/start_block=00000000/end_block=00099999/token_transfers_00000000_00099999.csv
...
```
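
The paths above follow a regular pattern: eight-digit, zero-padded block numbers in both the directory names and the file name. The sketch below reproduces that pattern for one partition. `partition_path` is a hypothetical helper inferred from the listing, not a Klaytn ETL function.

```python
def partition_path(output_dir: str, entity: str, start_block: int, end_block: int) -> str:
    """Build a Hive-style path like the ones in the listing above."""
    start = f"{start_block:08d}"  # zero-pad to 8 digits
    end = f"{end_block:08d}"
    return (
        f"{output_dir}/{entity}/start_block={start}/end_block={end}/"
        f"{entity}_{start}_{end}.csv"
    )


print(partition_path("output", "blocks", 0, 99999))
# output/blocks/start_block=00000000/end_block=00099999/blocks_00000000_00099999.csv
```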

Klaytn ETL should work on Linux, macOS, and Windows.
Since `debug_traceBlockByNumber` takes a long time, please be cautious when running anything related to traces.
18 changes: 18 additions & 0 deletions docs/index.md
@@ -0,0 +1,18 @@
# Overview

Klaytn ETL lets you convert Klaytn blockchain data into convenient formats like JSONs, CSVs and relational databases.
This is a fork of [Ethereum ETL](https://github.com/blockchain-etl/ethereum-etl).

## Features

Easily export:

* Blocks
* Transactions
* ERC20 / ERC721 / ERC1155 tokens
* Token transfers
* Receipts
* Logs
* Contracts
* Traces (Internal transactions)

14 changes: 14 additions & 0 deletions docs/limitations.md
@@ -0,0 +1,14 @@
# Limitations

- If a contract is a proxy that forwards all calls to a delegate, interface detection doesn't work,
which means `is_erc20`/`is_erc721`/`is_erc1155` will always be false for proxy contracts and they will be missing from the `tokens`
table.
- The metadata methods (`symbol`, `name`, `decimals`, `total_supply`) for ERC20 are optional, so around 10% of the
contracts are missing this data.
- `token_transfers.value`, `tokens.decimals` and `tokens.total_supply` have type `STRING` in BigQuery tables,
because numeric types there can't handle 32-byte integers. You should use
`cast(value as FLOAT64)` (possible loss of precision) or
`safe_cast(value as NUMERIC)` (possible overflow) to convert to numbers.
- Contracts that don't implement the `decimals()` function but have a
[fallback function](https://solidity.readthedocs.io/en/v0.4.21/contracts.html#fallback-function) that returns a `boolean`
will have `0` or `1` in the `decimals` column in the CSVs.
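
To see why converting these string values to a 64-bit float can lose precision, the sketch below compares an exact integer conversion with a round trip through `float`, which has the same IEEE 754 double format as `FLOAT64`. The amount is made up for illustration.

```python
# Example token amount in base units, kept as text like the STRING columns above.
value_text = str(10**30 + 1)

exact = int(value_text)          # arbitrary-precision integer, no loss
approx = int(float(value_text))  # a double keeps only about 15 to 17 significant digits

print(exact == approx)  # False
print(exact - approx)   # size of the rounding error
```

When exact values matter, keeping the column as a string and converting it with an arbitrary-precision type in your own code avoids this rounding.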
35 changes: 35 additions & 0 deletions docs/quickstart.md
@@ -0,0 +1,35 @@
# Quickstart
Install Klaytn ETL:

```bash
pip3 install klaytn-etl-cli
```

Export blocks and transactions:

```bash
> klaytnetl export_blocks_and_transactions --start-block 0 --end-block 5000 \
--blocks-output blocks.json --transactions-output transactions.json
```

Export ERC20 and ERC721 transfers:

```bash
> klaytnetl export_token_transfers --start-block 0 --end-block 5000 \
--output token_transfers.json
```

Export traces:

```bash
> klaytnetl export_traces --start-block 0 --end-block 5000 \
--output traces.json
```

Find other commands [here](commands.md).

For the latest version, check out the repo and run:
```bash
> pip3 install -e .
> python3 klaytnetl.py
```