Merge pull request #8 from migalabs/feat/dockerize
Feat/dockerize
santi1234567 authored Feb 16, 2024
2 parents 32e4f28 + 43872fc commit c9a0b30
Showing 10 changed files with 270 additions and 13 deletions.
1 change: 1 addition & 0 deletions .dockerignore
@@ -0,0 +1 @@
app-data
10 changes: 10 additions & 0 deletions .env.example
@@ -0,0 +1,10 @@
LOG_LEVEL="info" # debug, info, warn, error
DB_URL="postgres://user:password@localhost:5432/dbName" # URL to connect to the postgres database
WORKER_NUM=15 # Number of workers to run concurrent alchemy/EL node requests
ALCHEMY_URL="https://eth-mainnet.g.alchemy.com/v2/KEY" # Alchemy API URL
EL_ENDPOINT="http://localhost:8545" # Ethereum Layer 1 endpoint, can also be alchemy or infura

DATABASE_NAME=name # Your database name
DATABASE_USERNAME=user # Your database username
DATABASE_PASSWORD=pass # Your database password
LOCAL_PORT=5439 # Port where you connect to the database container
2 changes: 2 additions & 0 deletions .gitignore
@@ -22,3 +22,5 @@

.env
.vscode

app-data
15 changes: 15 additions & 0 deletions Dockerfile
@@ -0,0 +1,15 @@
FROM golang:1.21-alpine as builder
RUN mkdir /app
WORKDIR /app
ADD . .

RUN go get
RUN go build -o ./build/eth_pokhar


FROM alpine:latest
WORKDIR /
COPY --from=builder /app/build/eth_pokhar ./
COPY --from=builder /app/db/migrations ./db/migrations

ENTRYPOINT ["sh", "-c"]
178 changes: 177 additions & 1 deletion README.md
@@ -1,6 +1,162 @@
# eth-pokhar

Tool to identify validators <> entities in the Ethereum consensus layer
Eth-pokhar is a Go tool that helps identify the pool or entity operating each validator on the Ethereum beacon chain.

Identifying staking entities is tricky because, in most cases, this information is not on-chain. Pools like Lido and Rocketpool are the exception: they create their validators through smart contracts, so the data is on-chain and their validators can be identified easily.

For the remaining entities, other methods are needed, such as observing patterns in depositor addresses (like the ones collected in [eth-deposits](https://github.com/alrevuelta/eth-deposits)) and using off-chain data from contacts and other data sources. To create a validator, an address must deposit 32 ETH into the beacon chain deposit contract. In most cases, an entity reuses the same depositor address across many validators. Knowing a few of these addresses, one can extrapolate and identify all the validators funded by them, and thus the entities behind them.
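
The extrapolation described above can be sketched in Go as follows. This is only an illustration under stated assumptions: the `deposit` type, the `tagByDepositor` helper, and the sample addresses are hypothetical and are not taken from the eth-pokhar code base; `others` is the label this README uses for unidentified validators.

```go
package main

import "fmt"

// deposit is a hypothetical, simplified view of one beacon chain deposit.
type deposit struct {
	Depositor       string // depositor address (lowercase, no 0x prefix)
	ValidatorPubkey string // public key of the validator that was funded
}

// tagByDepositor labels every validator funded by a known depositor with that
// depositor's pool name. Validators only seen with unknown depositors are
// labelled "others".
func tagByDepositor(deposits []deposit, knownDepositors map[string]string) map[string]string {
	tags := make(map[string]string, len(deposits))
	for _, d := range deposits {
		if pool, ok := knownDepositors[d.Depositor]; ok {
			tags[d.ValidatorPubkey] = pool
		} else if _, seen := tags[d.ValidatorPubkey]; !seen {
			// Do not overwrite a tag obtained from another deposit.
			tags[d.ValidatorPubkey] = "others"
		}
	}
	return tags
}

func main() {
	deposits := []deposit{
		{Depositor: "aaa1", ValidatorPubkey: "pk1"},
		{Depositor: "aaa1", ValidatorPubkey: "pk2"},
		{Depositor: "bbb2", ValidatorPubkey: "pk3"},
	}
	// Only one depositor address is known up front.
	known := map[string]string{"aaa1": "example-pool"}

	tags := tagByDepositor(deposits, known)
	fmt.Println(tags["pk1"], tags["pk2"], tags["pk3"])
}
```

Knowing a single depositor address is enough here to tag both `pk1` and `pk2`, while `pk3` stays unidentified.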

This tool is used for tagging validators in [ethseer.io](https://ethseer.io/?network=mainnet).

## Pre-requisites

To use the tool, the following requirements must be met:

- An Alchemy API key (the free tier is enough). See [Alchemy pricing](https://www.alchemy.com/pricing)
- Access to an Ethereum EL node

Expect the tool to make roughly the following number of requests to the Ethereum EL node on the first run:

- ~1.5m [eth_getTransactionReceipt](https://docs.alchemy.com/reference/eth-gettransactionreceipt) requests
- ~1.5m [eth_call](https://docs.alchemy.com/reference/eth-call) requests

And roughly the following number of requests to the Alchemy API on each run:

- ~300k [alchemy_getAssetTransfers](https://docs.alchemy.com/reference/alchemy-getassettransfers) requests

## Available commands

### `beacon_depositors_transactions`

Fetches the transactions of the addresses that deposited to the beacon chain deposit contract.

Available options (configurable in the `.env` file):

```
OPTIONS:
--el-endpoint value Execution node endpoint (default: http://localhost:8545) [$EL_ENDPOINT]
--db-url value Database where to store transactions (default: postgres://user:password@localhost:5432/dbName) [$DB_URL]
--log-level value Log level: debug, warn, info, error (default: info) [$LOG_LEVEL]
--workers-num value Number of workers to process API requests (default: 10) [$WORKER_NUM]
--alchemy-url value Alchemy url (default: https://eth-mainnet.g.alchemy.com/v2/KEY) [$ALCHEMY_URL]
--help, -h show help
```

### `identify`

Identifies the pool in which validators participate, or the entity that operates them.

Available options (configurable in the `.env` file):

```
OPTIONS:
--el-endpoint value Execution node endpoint (default: http://localhost:8545) [$EL_ENDPOINT]
--db-url value Database where to store transactions (default: postgres://user:password@localhost:5432/dbName) [$DB_URL]
--log-level value Log level: debug, warn, info, error (default: info) [$LOG_LEVEL]
--alchemy-url value Alchemy url (default: https://eth-mainnet.g.alchemy.com/v2/KEY) [$ALCHEMY_URL]
--workers-num value Number of workers to process API requests (default: 10) [$WORKER_NUM]
--recreate-table Recreate the t_identified_validators table, meant to be used when one of the methodologies of identification changes (default: false)
--help, -h show help
```

## Running with Docker (recommended)

To run the tool with Docker, follow these steps:

First, create a `.env` file on the root folder. You can use the `.env.example` file as a template.

Then, run the following command to build the tool:

```bash
docker-compose build
```

Finally, run the tool with the following command:

```bash
docker-compose up -d
```
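
When the bundled `db` service is used, `DB_URL` presumably has to agree with the `DATABASE_*` and `LOCAL_PORT` values, since the compose file publishes Postgres on `127.0.0.1:${LOCAL_PORT}` and runs the tool with host networking. A minimal Go sketch of how such a URL could be assembled (the `buildDBURL` helper is hypothetical and not part of eth-pokhar; the values are the placeholders from `.env.example`):

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// buildDBURL is a hypothetical helper that assembles a Postgres connection
// URL from the individual settings found in the .env file.
func buildDBURL(user, password, host, port, dbName string) string {
	u := url.URL{
		Scheme: "postgres",
		User:   url.UserPassword(user, password), // escapes special characters
		Host:   net.JoinHostPort(host, port),
		Path:   "/" + dbName,
	}
	return u.String()
}

func main() {
	// DATABASE_USERNAME, DATABASE_PASSWORD, LOCAL_PORT and DATABASE_NAME
	// as given in .env.example.
	fmt.Println(buildDBURL("user", "pass", "localhost", "5439", "name"))
}
```

With the example values this prints `postgres://user:pass@localhost:5439/name`, which is the shape `DB_URL` would take for the local container.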

## Output

The tool will create a database with the following tables:

### `t_beacon_deposits`

This table stores the deposits made to the beaconchain contract. It has the following columns:

- `f_block_num`: The block number in which the deposit was made.
- `f_depositor`: The address of the depositor.
- `f_tx_hash`: The transaction hash of the deposit.
- `f_validator_pubkey`: The public key of the validator.

### `t_beacon_depositors_transactions`

This table stores the incoming/outgoing transactions of the depositors of the beaconchain contract. It has the following columns:

- `f_block_num`: The block number in which the transaction was made.
- `f_value`: The value of the transaction.
- `f_from`: The address from which the transaction was made.
- `f_to`: The address to which the transaction was made.
- `f_tx_hash`: The transaction hash of the transaction.
- `f_depositor`: The address of the depositor to which the transaction is related.

### `t_depositors_insert`

This table stores the depositors that are used to identify the pool in which the validators are participating. See [Utilizing custom off-chain data](#utilizing-custom-off-chain-data) for more information. It has the following columns:

- `f_depositor`: The address of the depositor.
- `f_pool_name`: The name of the pool in which the validators are participating.

### `t_validators_insert`

This table stores the validators that are used to identify the pool in which the validators are participating. See [Utilizing custom off-chain data](#utilizing-custom-off-chain-data) for more information. It has the following columns:

- `f_validator_pubkey`: The public key of the validator.
- `f_pool_name`: The name of the pool in which the validators are participating.

### `t_lido`

This table stores the validators that are participating in the Lido pool. See [Lido operators](https://operatorportal.lido.fi/) for more information. It has the following columns:

- `f_validator_pubkey`: The public key of the validator.
- `f_operator`: The name of the operator of the validator.
- `f_operator_index`: The index of the operator in the Lido pool.

### `t_rocketpool`

This table stores the validators that are participating in the Rocketpool pool. It has the following columns:

- `f_validator_pubkey`: The public key of the validator.

### `t_identified_validators` (End result)

This table stores each validator together with the pool/entity that operates it. Unidentified validators will have an `f_pool_name` value of `others`. It has the following columns:

- `f_validator_pubkey`: The public key of the validator.
- `f_pool_name`: The name of the pool in which the validators are participating.

## Utilizing custom off-chain data

As mentioned before, the tool can be used to identify validators by using off-chain data. For this purpose, two tables are created in the database on the first run: `t_depositors_insert` and `t_validators_insert`.

**Important note**: addresses must be all lowercase and without the `0x` prefix.
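
A value can be brought into that form before it is inserted. The sketch below is a minimal Go example; `normalizeHex` is a hypothetical helper, not a function from this repository, and the sample address is made up.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeHex lowercases a hex string and strips a leading "0x" prefix,
// matching the format the note above asks for.
func normalizeHex(s string) string {
	s = strings.ToLower(strings.TrimSpace(s))
	return strings.TrimPrefix(s, "0x")
}

func main() {
	// Made-up, mixed-case address used purely for illustration.
	fmt.Println(normalizeHex("0xAbCdEf0123456789aBcDeF0123456789abcdef01"))
}
```

Lowercasing first means an upper-case `0X` prefix is stripped as well.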

### `t_depositors_insert`

This table has the columns `f_depositor` and `f_pool_name`. The `identify` command will use this table to identify the pool in which the validators are participating. The `f_depositor` column is the address of the depositor and the `f_pool_name` is the name of the pool in which the validators are participating. All validators that have the same depositor address will be tagged with the `f_pool_name` value.

### `t_validators_insert`

This table has the columns `f_validator_pubkey` and `f_pool_name`. The `identify` command will use this table to identify the pool in which the validators are participating. The `f_validator_pubkey` column is the public key of the validator and the `f_pool_name` is the name of the pool in which the validator is participating. These values will be used to tag the validators.

## Identification priority

Because `t_identified_validators` is the end result of the identification process, each validator's pool/entity tag is applied in the following order. If a validator is already tagged, a later step overrides the earlier tag, so the last tag applied is the one that is stored:

<p align="center">
<img src="repository-images/table_priorities.jpg" alt="table_priorities" />
</p>
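
The override rule can be illustrated with a short Go sketch. It makes no claim about the actual order of the steps, which is defined by the image above; the `tagSource` type, the `applyInOrder` helper, and the step and pool names are hypothetical.

```go
package main

import "fmt"

// tagSource is a hypothetical identification step: a name plus the tags it
// would assign.
type tagSource struct {
	Name string
	Tags map[string]string // validator pubkey -> pool name
}

// applyInOrder starts with every validator tagged "others" and then applies
// each source in sequence, so a later source overrides an earlier one.
func applyInOrder(validators []string, sources []tagSource) map[string]string {
	result := make(map[string]string, len(validators))
	for _, v := range validators {
		result[v] = "others"
	}
	for _, src := range sources {
		for pubkey, pool := range src.Tags {
			// Only tag validators that are actually tracked.
			if _, tracked := result[pubkey]; tracked {
				result[pubkey] = pool
			}
		}
	}
	return result
}

func main() {
	validators := []string{"pk1", "pk2", "pk3"}
	sources := []tagSource{
		{Name: "step-a", Tags: map[string]string{"pk1": "pool-a", "pk2": "pool-a"}},
		{Name: "step-b", Tags: map[string]string{"pk2": "pool-b"}},
	}
	tags := applyInOrder(validators, sources)
	fmt.Println(tags["pk1"], tags["pk2"], tags["pk3"])
}
```

Here `pk2` is tagged by both steps and ends up with the tag from the later one, while `pk3` is never tagged and stays `others`.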

## Database migrations

@@ -9,3 +165,23 @@ More specifically, one could clean the migrations by forcing the version with <b
`migrate -path / -database "postgresql://username:secretkey@localhost:5432/database_name?sslmode=disable" force <current_version>` <br>
If specific upgrades or downgrades need to be done manually, one could do this with <br>
`migrate -path database/migration/ -database "postgresql://username:secretkey@localhost:5432/database_name?sslmode=disable" -verbose up`

## Benchmarks

Using Alchemy as both the EL node and the API, with a local database, the tool takes approximately:

- `beacon_depositors_transactions`:
  - Fetching deposits on the first run: 1h 16m
  - Fetching depositors transactions: 15h 22m
  - Total time: 16h 38m
- `identify`:
  - Fetching Rocketpool validators on the first run: 38m
  - Fetching Lido validators on the first run: 4h 13m
  - Total time: 4h 51m
- Total time for the first run: 21h 29m

When running the tool on a weekly basis, expect each run to take around 16h, since the depositors' transactions are fetched on every run.
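
The totals above can be re-derived from the individual measurements. The following Go snippet is only a worked check of that arithmetic; the `hm` helper is hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// hm builds a duration from hours and minutes.
func hm(h, m int) time.Duration {
	return time.Duration(h)*time.Hour + time.Duration(m)*time.Minute
}

func main() {
	// Measurements as listed in the benchmarks.
	deposits := hm(1, 16)
	transactions := hm(15, 22)
	rocketpool := hm(0, 38)
	lido := hm(4, 13)

	fetch := deposits + transactions // beacon_depositors_transactions
	identify := rocketpool + lido    // identify
	fmt.Println(fetch, identify, fetch+identify)
}
```

This prints `16h38m0s 4h51m0s 21h29m0s`, matching the per-command totals and the first-run total.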

## Maintainers

@santi1234567
@@ -86,18 +86,18 @@ func (b *BeaconDepositorsTransactions) Run() {
         log.Info("Downloading new beacon deposits...")
         b.downloadBeaconDeposits()
         duration := time.Since(initTime).Seconds()
-        log.Info("Finished downloading new beacon deposits in ", duration)
+        log.Infof("Finished downloading new beacon deposits in %f seconds", duration)

     }
     if !b.stop {
         initTime := time.Now()
         log.Info("Updating depositors transactions...")
         b.updateDepositorsTransactions()
         duration := time.Since(initTime).Seconds()
-        log.Info("Finished updating depositors transactions in ", duration)
+        log.Infof("Finished updating depositors transactions in %f seconds", duration)
     }
     totalDuration := time.Since(totalInitTime).Seconds()
-    log.Info("Finished beacon_depositors_transactions in ", totalDuration)
+    log.Infof("Finished beacon_depositors_transactions in %f seconds", totalDuration)

     b.CloseConnections()
     log.Debug("Sending signal that beacon_depositors_transactions finished")
2 changes: 1 addition & 1 deletion beacon-depositors-transactions/routines.go
@@ -100,7 +100,7 @@ func (b *BeaconDepositorsTransactions) downloadBeaconDeposits() {
         if err != nil {
             log.Fatalf("Error parsing block number: %s", err.Error())
         }
-        log.Debugf("Downloaded 1000 more deposits on block %d", num)
+        log.Infof("Downloaded 1000 more deposits on block %d", num)
         params.PageKey = newPageKey
         firstCall = false
         err = b.processDepositTransfers(newTransfers, b.iConfig.Workers)
42 changes: 42 additions & 0 deletions docker-compose.yml
@@ -0,0 +1,42 @@
version: "3.7"

services:
  eth-pokhar:
    build:
      context: ./
      dockerfile: Dockerfile
    init: true
    command: >-
      "./eth_pokhar beacon_depositors_transactions
      --log-level=${LOG_LEVEL}
      --el-endpoint=${EL_ENDPOINT}
      --db-url=${DB_URL}
      --workers-num=${WORKER_NUM}
      --alchemy-url=${ALCHEMY_URL}
      &&
      ./eth_pokhar identify
      --log-level=${LOG_LEVEL}
      --el-endpoint=${EL_ENDPOINT}
      --db-url=${DB_URL}
      --workers-num=${WORKER_NUM}
      --alchemy-url=${ALCHEMY_URL}
      --recreate-table"
    network_mode: "host"
    depends_on:
      - db
    deploy:
      restart_policy:
        condition: on-failure
        max_attempts: 5

  db:
    image: postgres
    restart: always
    environment:
      POSTGRES_USER: ${DATABASE_USERNAME}
      POSTGRES_PASSWORD: ${DATABASE_PASSWORD}
      POSTGRES_DB: ${DATABASE_NAME}
    volumes:
      - ./app-data/:/var/lib/postgresql/data/
    ports:
      - "127.0.0.1:${LOCAL_PORT}:5432"
27 changes: 19 additions & 8 deletions identify/identify.go
@@ -70,47 +70,57 @@ func (i *Identify) Run() {
     log.Info("Starting Identify routine")

     if i.iConfig.RecreateTable && !i.stop {
+        startTime := time.Now()
         log.Info("Truncating identified validators table")
         err := i.dbClient.TruncateIdentifiedValidators()
         if err != nil {
             log.Fatalf("Error truncating identified validators table: %v", err)
         }
-        log.Info("Truncated identified validators")
+        endTime := time.Now()
+        log.Infof("Truncated identified validators table in %v", endTime.Sub(startTime))
     }

     if !i.stop {
+        startTime := time.Now()
         log.Info("Adding new validators to database")
         err := i.dbClient.AddNewValidators()
         if err != nil {
             log.Fatalf("Error adding new validators to database: %v", err)
         }
-        log.Info("Added new validators to database")
+        endTime := time.Now()
+        log.Infof("Added new validators to database in %v", endTime.Sub(startTime))
     }

     if !i.stop {
+        startTime := time.Now()
         log.Info("Applying validators insert")
         err := i.dbClient.ApplyValidatorsInsert()
         if err != nil {
             log.Fatalf("Error applying validators insert: %v", err)
         }
-        log.Info("Applied validators insert")
+        endTime := time.Now()
+        log.Infof("Applied validators insert in %v", endTime.Sub(startTime))
     }
     if !i.stop {
+        startTime := time.Now()
         log.Info("Applying depositors insert")
         err := i.dbClient.ApplyDepositorsInsert()
         if err != nil {
             log.Fatalf("Error applying depositors insert: %v", err)
         }
-        log.Info("Applied depositors insert")
+        endTime := time.Now()
+        log.Infof("Applied depositors insert in %v", endTime.Sub(startTime))
     }

     if !i.stop {
+        startTime := time.Now()
         log.Info("Identifying coinbase validators")
         err := i.dbClient.IdentifyCoinbaseValidators()
         if err != nil {
             log.Fatalf("Error identifying coinbase validators: %v", err)
         }
-        log.Info("Identified coinbase validators")
+        endTime := time.Now()
+        log.Infof("Identified coinbase validators in %v", endTime.Sub(startTime))
     }

     if !i.stop {
@@ -123,7 +133,7 @@ func (i *Identify) Run() {
         }
         log.WithFields(log.Fields{
             "NewDetectedKeys": len(newRocketpoolKeys),
-            "Duration": time.Since(startTime),
+            "Duration (s)": time.Since(startTime),
         }).Info("RocketPool Keys:")
         i.dbClient.CopyRocketpoolValidators(newRocketpoolKeys)
         err = i.dbClient.IdentifyRocketpoolValidators()
@@ -133,10 +143,11 @@ func (i *Identify) Run() {
         log.Info("Identified rocketpool validators")
     }
     if !i.stop {
+        startTime := time.Now()
         log.Info("Identifying lido validators")
         i.IdentifyLidoValidators()
-
-        log.Info("Identified lido validators")
+        endTime := time.Now()
+        log.Infof("Identified lido validators in %v", endTime.Sub(startTime))
     }

     endTime := time.Now()
Binary file added repository-images/table_priorities.jpg