Commit bc59817

fix: import BNS v1 data during event replay (#1301)
* fix: import v1 data during replay
* fix: import names first, subdomains last
* feat: obtain genesis block data from tsv
* fix: v1 import tests
* fix: import route
* fix: api test
* fix: move to for await of
* docs: update README to reflect new replay
1 parent 763d99e commit bc59817

11 files changed: +235 -157 lines changed

readme.md

Lines changed: 53 additions & 15 deletions
@@ -98,19 +98,51 @@ For running offline mode set an environment variable `STACKS_API_MODE=offline`
 
 ## Event Replay
 
-The stacks-node is only able to emit events live as they happen. This poses a problem in the scenario where the stacks-blockchain-api needs to
-be upgraded and its database cannot be migrated to a new schema. One way to handle this upgrade is to wipe the stacks-blockchain-api's database
-and stacks-node working directory, and re-sync from scratch.
+The stacks-node is only able to emit events live as they happen. This poses a problem in the
+scenario where the stacks-blockchain-api needs to be upgraded and its database cannot be migrated to
+a new schema. One way to handle this upgrade is to wipe the stacks-blockchain-api's database and
+stacks-node working directory, and re-sync from scratch.
 
-Alternatively, an event-replay feature is available where the API records the HTTP POST requests from the stacks-node event emitter, then streams
-these events back to itself. Essentially simulating a wipe & full re-sync, but much quicker.
+Alternatively, an event-replay feature is available where the API records the HTTP POST requests
+from the stacks-node event emitter, then streams these events back to itself. Essentially simulating
+a wipe & full re-sync, but much quicker.
 
-The feature can be used via program args. For example, if there are breaking changes in the API's sql schema, like adding a new column which requires
-events to be re-played, the following steps could be run:
+The feature can be used via program args. For example, if there are breaking changes in the API's
+sql schema, like adding a new column which requires events to be re-played, the following steps
+could be run:
 
 ### Event Replay Instructions
 
-1. Ensure the API process is not running. When stopping the API, let the process exit gracefully so that any in-progress SQL writes can finish.
+#### V1 BNS Data
+
+**Optional but recommended** - If you want the V1 BNS data, there are going to be a few extra steps:
+
+1. Download BNS data:
+   ```shell
+   curl -L https://storage.googleapis.com/blockstack-v1-migration-data/export-data.tar.gz -o /stacks-node/bns/export-data.tar.gz
+   ```
+1. Extract it:
+   ```shell
+   tar -xzvf ./bns/export-data.tar.gz -C /stacks-node/bns/
+   ```
+1. Each file in `./bns` will have a corresponding `sha256` value. To verify, run a script like the
+   following to check the sha256sum:
+
+   ```bash
+   for file in `ls /stacks-node/bns/* | grep -v sha256 | grep -v .tar.gz`; do
+     if [ $(sha256sum $file | awk {'print $1'}) == $(cat ${file}.sha256) ]; then
+       echo "sha256 Matched $file"
+     else
+       echo "sha256 Mismatch $file"
+     fi
+   done
+   ```
+1. Set the data's location as the value of `BNS_IMPORT_DIR` in your `.env` file.
+
+#### Export and Import
+
+1. Ensure the API process is not running. When stopping the API, let the process exit gracefully so
+   that any in-progress SQL writes can finish.
 1. Export event data to disk with the `export-events` command:
 
    ```shell
@@ -119,19 +151,25 @@ events to be re-played, the following steps could be run:
 1. Update to the new stacks-blockchain-api version.
 1. Perform the event playback using the `import-events` command:
 
-   **WARNING**: This will **drop _all_ tables** from the configured Postgres database, including any tables not automatically added by the API.
+   **WARNING**: This will **drop _all_ tables** from the configured Postgres database, including any
+   tables not automatically added by the API.
 
   ```shell
  node ./lib/index.js import-events --file /tmp/stacks-node-events.tsv --wipe-db --force
  ```
 
  This command has two modes of operation, specified by the `--mode` option:
-  * `archival` (default): The process will import and ingest *all* blockchain events that have happened since the first block.
-  * `pruned`: The import process will ignore some prunable events (mempool, microblocks) until the import block height has reached `chain tip - 256` blocks. This saves a considerable amount of time during import, but sacrifices some historical data. You can use this mode if you're mostly interested in running an API that prioritizes real time information.
-
-Alternatively, instead of performing the `export-events` command in step 1, an environmental variable can be set which enables events to be streamed to a file
-as they are received, while the application is running normally. To enable this feature, set the `STACKS_EXPORT_EVENTS_FILE` env var to the file path where
-events should be appended. Example:
+  * `archival` (default): The process will import and ingest *all* blockchain events that have
+    happened since the first block.
+  * `pruned`: The import process will ignore some prunable events (mempool, microblocks) until the
+    import block height has reached `chain tip - 256` blocks. This saves a considerable amount of
+    time during import, but sacrifices some historical data. You can use this mode if you're mostly
+    interested in running an API that prioritizes real time information.
+
+Alternatively, instead of performing the `export-events` command in step 1, an environment
+variable can be set which enables events to be streamed to a file as they are received, while the
+application is running normally. To enable this feature, set the `STACKS_EXPORT_EVENTS_FILE` env var
+to the file path where events should be appended. Example:
 ```
 STACKS_EXPORT_EVENTS_FILE=/tmp/stacks-node-events.tsv
 ```
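Putting the updated README steps together, here is a minimal sketch of one full replay cycle with V1 BNS data. The paths and the `--mode pruned` choice are illustrative, and it assumes `export-events` accepts the same `--file` flag shown above for `import-events`:

```shell
# 1. Stop the API gracefully, then export the recorded events to a TSV file.
node ./lib/index.js export-events --file /tmp/stacks-node-events.tsv

# 2. Upgrade the API, then replay with V1 BNS data enabled.
#    WARNING: --wipe-db drops all tables from the configured Postgres database.
BNS_IMPORT_DIR=/stacks-node/bns \
  node ./lib/index.js import-events --file /tmp/stacks-node-events.tsv \
  --wipe-db --force --mode pruned
```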

running_an_api.md

Lines changed: 1 addition & 29 deletions
@@ -78,34 +78,14 @@ Since we'll need to create some files/dirs for persistent data we'll first creat
 We'll be using:
 
 ```bash
-$ mkdir -p ./stacks-node/{persistent-data/postgres,persistent-data/stacks-blockchain,bns,config}
+$ mkdir -p ./stacks-node/{persistent-data/postgres,persistent-data/stacks-blockchain,config}
 $ docker pull blockstack/stacks-blockchain-api \
   && docker pull blockstack/stacks-blockchain \
   && docker pull postgres:alpine
 $ docker network create stacks-blockchain > /dev/null 2>&1
 $ cd ./stacks-node
 ```
 
-**Optional but recommended**: If you need the v1 BNS data, there are going to be a few extra steps.
-
-1. Download the BNS data:
-   `curl -L https://storage.googleapis.com/blockstack-v1-migration-data/export-data.tar.gz -o ./bns/export-data.tar.gz`
-2. Extract the data:
-   `tar -xzvf ./bns/export-data.tar.gz -C ./bns/`
-3. Each file in `./bns` will have a corresponding `sha256` value.
-
-   To Verify, run a script like the following to check the sha256sum:
-
-   ```bash
-   for file in `ls ./bns/* | grep -v sha256 | grep -v .tar.gz`; do
-     if [ $(sha256sum $file | awk {'print $1'}) == $(cat ${file}.sha256) ]; then
-       echo "sha256 Matched $file"
-     else
-       echo "sha256 Mismatch $file"
-     fi
-   done
-   ```
-
 ## Postgres
 
 The `postgres:alpine` image can be run with default settings, the only requirement is that a password Environment Variable is set for the `postgres` user: `POSTGRES_PASSWORD=postgres`
@@ -161,16 +141,9 @@ STACKS_BLOCKCHAIN_API_PORT=3999
 STACKS_BLOCKCHAIN_API_HOST=0.0.0.0
 STACKS_CORE_RPC_HOST=stacks-blockchain
 STACKS_CORE_RPC_PORT=20443
-BNS_IMPORT_DIR=/bns-data
 API_DOCS_URL=https://docs.hiro.so/api
 ```
 
-**Note** that here we are importing the bns data with the env var `BNS_IMPORT`.
-
-To Disable this import, simply comment the line: `#BNS_IMPORT_DIR=/bns-data`
-
-***If you leave this enabled***: please allow several minutes for the one-time import to complete before continuing.
-
 The other Environment Variables to pay attention to:
 
 - `PG_HOST`: Set this to your **postgres** instance. In this guide, we'll be using a container named `postgres`.
@@ -184,7 +157,6 @@ docker run -d --rm \
   --name stacks-blockchain-api \
   --net=stacks-blockchain \
   --env-file $(pwd)/.env \
-  -v $(pwd)/bns:/bns-data \
   -p 3700:3700 \
   -p 3999:3999 \
   blockstack/stacks-blockchain-api
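With the BNS volume mount removed, the container invocation from this guide reduces to the following sketch (same container name, network, and ports as above):

```bash
docker run -d --rm \
  --name stacks-blockchain-api \
  --net=stacks-blockchain \
  --env-file $(pwd)/.env \
  -p 3700:3700 \
  -p 3999:3999 \
  blockstack/stacks-blockchain-api
```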

running_api_from_source.md

Lines changed: 2 additions & 25 deletions
@@ -35,15 +35,15 @@ Since we'll need to create some files/dirs for persistent data,
 we'll first create a base directory structure and set some permissions:
 
 ```bash
-$ sudo mkdir -p /stacks-node/{persistent-data/stacks-blockchain,bns,config,binaries}
+$ sudo mkdir -p /stacks-node/{persistent-data/stacks-blockchain,config,binaries}
 $ sudo chown -R $(whoami) /stacks-node
 $ cd /stacks-node
 ```
 
 ## Install Requirements
 
 ```bash
-$ PG_VERSION=12 \
+$ PG_VERSION=14 \
   && NODE_VERSION=16 \
   && sudo apt-get update \
   && sudo apt-get install -y \
@@ -65,26 +65,6 @@ $ PG_VERSION=12 \
   nodejs
 ```
 
-**Optional but recommended** - If you want the V1 BNS data, there are going to be a few extra steps:
-
-1. Download the BNS data:
-   `curl -L https://storage.googleapis.com/blockstack-v1-migration-data/export-data.tar.gz -o /stacks-node/bns/export-data.tar.gz`
-2. Extract the data:
-   `tar -xzvf ./bns/export-data.tar.gz -C /stacks-node/bns/`
-3. Each file in `./bns` will have a corresponding `sha256` value.
-
-   To Verify, run a script like the following to check the sha256sum:
-
-   ```bash
-   for file in `ls /stacks-node/bns/* | grep -v sha256 | grep -v .tar.gz`; do
-     if [ $(sha256sum $file | awk {'print $1'}) == $(cat ${file}.sha256) ]; then
-       echo "sha256 Matched $file"
-     else
-       echo "sha256 Mismatch $file"
-     fi
-   done
-   ```
-
 ## postgres
 
 ### postgres permissions
@@ -127,8 +107,6 @@ $ git clone https://github.com/hirosystems/stacks-blockchain-api /stacks-node/st
 The stacks blockchain api requires several Environment Variables to be set in order to run properly.
 To reduce complexity, we're going to create a `.env` file that we'll use for these env vars.
 
-** Note: ** to enable BNS names, uncomment `BNS_IMPORT_DIR` in the below `.env` file.
-
 Create a new file: `/stacks-node/stacks-blockchain-api/.env` with the following content:
 
 ```bash
@@ -148,7 +126,6 @@ STACKS_BLOCKCHAIN_API_PORT=3999
 STACKS_BLOCKCHAIN_API_HOST=0.0.0.0
 STACKS_CORE_RPC_HOST=localhost
 STACKS_CORE_RPC_PORT=20443
-#BNS_IMPORT_DIR=/stacks-node/bns
 EOF
 $ cd /stacks-node/stacks-blockchain-api && nohup node ./lib/index.js &
 ```

src/datastore/common.ts

Lines changed: 2 additions & 2 deletions
@@ -439,7 +439,7 @@ export interface DataStoreAttachmentData {
   blockHeight: number;
 }
 
-export interface DataStoreSubdomainBlockData {
+export interface DataStoreBnsBlockData {
   index_block_hash: string;
   parent_index_block_hash: string;
   microblock_hash: string;
@@ -449,7 +449,7 @@ export interface DataStoreSubdomainBlockData {
 
 export interface DataStoreAttachmentSubdomainData {
   attachment?: DataStoreAttachmentData;
-  blockData?: DataStoreSubdomainBlockData;
+  blockData?: DataStoreBnsBlockData;
   subdomains?: DbBnsSubdomain[];
 }
 
src/datastore/postgres-store.ts

Lines changed: 4 additions & 4 deletions
@@ -101,7 +101,7 @@ import {
   DbAssetEventTypeId,
   DbTxGlobalStatus,
   DataStoreAttachmentData,
-  DataStoreSubdomainBlockData,
+  DataStoreBnsBlockData,
   DataStoreAttachmentSubdomainData,
 } from './common';
 import {
@@ -2175,7 +2175,7 @@ export class PgDataStore
     );
     let isCanonical = true;
     let txIndex = -1;
-    const blockData: DataStoreSubdomainBlockData = {
+    const blockData: DataStoreBnsBlockData = {
       index_block_hash: '',
       parent_index_block_hash: '',
       microblock_hash: '',
@@ -7222,7 +7222,7 @@ export class PgDataStore
     // The `names` and `zonefiles` tables only track latest zonefile changes. We need to check
     // `nft_custody` for the latest name owner, but only for names that were NOT imported from v1
     // since they did not generate an NFT event for us to track.
-    if (nameZonefile.rows[0].registered_at !== 0) {
+    if (nameZonefile.rows[0].registered_at !== 1) {
       let value: Buffer;
       try {
         value = bnsNameCV(name);
@@ -7427,7 +7427,7 @@ export class PgDataStore
         names
       WHERE
         address = $1
-        AND registered_at = 0
+        AND registered_at = 1
         AND canonical = TRUE
         AND microblock_canonical = TRUE
       `,
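These two hunks change the sentinel for v1-imported names: they are now stored at `registered_at = 1` (the genesis block found in the TSV) rather than `0`. A quick post-replay sanity check, as a sketch — the `psql` connection details are assumptions, while the `names` table and its columns come from the query above:

```shell
# Count canonical BNS names that were imported from v1 (registered_at = 1).
psql "$PG_CONNECTION_URI" -c \
  "SELECT COUNT(*) FROM names WHERE registered_at = 1 AND canonical = TRUE;"
```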

src/event-replay/event-replay.ts

Lines changed: 19 additions & 1 deletion
@@ -3,7 +3,8 @@ import * as fs from 'fs';
 import { cycleMigrations, dangerousDropAllTables, PgDataStore } from '../datastore/postgres-store';
 import { startEventServer } from '../event-stream/event-server';
 import { getApiConfiguredChainID, httpPostRequest, logger } from '../helpers';
-import { findTsvBlockHeight, getDbBlockHeight } from './helpers';
+import { findBnsGenesisBlockData, findTsvBlockHeight, getDbBlockHeight } from './helpers';
+import { importV1BnsNames, importV1BnsSubdomains, importV1TokenOfferingData } from '../import-v1';
 
 enum EventImportMode {
   /**
@@ -107,6 +108,8 @@ export async function importEventsFromTsv(
   if (eventImportMode === EventImportMode.pruned) {
     console.log(`Ignoring all prunable events before block height: ${prunedBlockHeight}`);
   }
+  // Look for the TSV's genesis block information for BNS import.
+  const tsvGenesisBlockData = await findBnsGenesisBlockData(resolvedFilePath);
 
   const db = await PgDataStore.connect({
     usageName: 'import-events',
@@ -122,6 +125,18 @@
     httpLogLevel: 'debug',
   });
 
+  await importV1TokenOfferingData(db);
+
+  // Import V1 BNS names first. Subdomains will be imported after TSV replay is finished in order to
+  // keep the size of the `subdomains` table small.
+  if (process.env.BNS_IMPORT_DIR) {
+    logger.info(`Using BNS export data from: ${process.env.BNS_IMPORT_DIR}`);
+    await importV1BnsNames(db, process.env.BNS_IMPORT_DIR, tsvGenesisBlockData);
+  } else {
+    logger.warn(`Notice: full BNS functionality requires 'BNS_IMPORT_DIR' to be set.`);
+  }
+
+  // Import TSV chain data
   const readStream = fs.createReadStream(resolvedFilePath);
   const rawEventsIterator = PgDataStore.getRawEventRequests(readStream, status => {
     console.log(status);
@@ -163,6 +178,9 @@
     }
   }
   await db.finishEventReplay();
+  if (process.env.BNS_IMPORT_DIR) {
+    await importV1BnsSubdomains(db, process.env.BNS_IMPORT_DIR, tsvGenesisBlockData);
+  }
   console.log(`Event import and playback successful.`);
   await eventServer.closeAsync();
   await db.close();
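The net effect of these hunks is that V1 BNS import is now driven by `import-events` itself, keyed off the `BNS_IMPORT_DIR` env var. A sketch of toggling it (data path illustrative):

```shell
# Token offering data is always imported. With BNS_IMPORT_DIR set, V1 names are
# imported before TSV playback and V1 subdomains after db.finishEventReplay().
BNS_IMPORT_DIR=/stacks-node/bns node ./lib/index.js import-events \
  --file /tmp/stacks-node-events.tsv --wipe-db --force

# Without it, the replay still completes, and the API warns:
#   "Notice: full BNS functionality requires 'BNS_IMPORT_DIR' to be set."
```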

src/event-replay/helpers.ts

Lines changed: 50 additions & 0 deletions
@@ -1,6 +1,15 @@
+import * as fs from 'fs';
+import * as readline from 'readline';
+import { decodeTransaction, TxPayloadTypeID } from 'stacks-encoding-native-js';
+import { DataStoreBnsBlockData } from '../datastore/common';
 import { PgDataStore } from '../datastore/postgres-store';
 import { ReverseFileStream } from './reverse-file-stream';
 
+export type BnsGenesisBlock = DataStoreBnsBlockData & {
+  tx_id: string;
+  tx_index: number;
+};
+
 /**
  * Traverse a TSV file in reverse to find the last received `/new_block` node message and return
  * the `block_height` reported by that event. Even though the block produced by that event might
@@ -26,6 +35,47 @@ export async function findTsvBlockHeight(filePath: string): Promise<number> {
   return blockHeight;
 }
 
+/**
+ * Traverse a TSV file to find the genesis block and extract its data so we can use it during V1 BNS
+ * import.
+ * @param filePath - TSV path
+ * @returns Genesis block data
+ */
+export async function findBnsGenesisBlockData(filePath: string): Promise<BnsGenesisBlock> {
+  const rl = readline.createInterface({
+    input: fs.createReadStream(filePath),
+    crlfDelay: Infinity,
+  });
+  for await (const line of rl) {
+    const columns = line.split('\t');
+    const eventName = columns[2];
+    if (eventName === '/new_block') {
+      const payload = JSON.parse(columns[3]);
+      // Look for block 1
+      if (payload.block_height === 1) {
+        for (const tx of payload.transactions) {
+          const decodedTx = decodeTransaction(tx.raw_tx);
+          // Look for the only token transfer transaction in the genesis block. This is the one
+          // that contains all the events, including all BNS name registrations.
+          if (decodedTx.payload.type_id === TxPayloadTypeID.TokenTransfer) {
+            rl.close();
+            return {
+              index_block_hash: payload.index_block_hash,
+              parent_index_block_hash: payload.parent_index_block_hash,
+              microblock_hash: payload.parent_microblock,
+              microblock_sequence: payload.parent_microblock_sequence,
+              microblock_canonical: true,
+              tx_id: decodedTx.tx_id,
+              tx_index: tx.tx_index,
+            };
+          }
+        }
+      }
+    }
+  }
+  throw new Error('BNS genesis block data not found');
+}
+
 /**
  * Get the current block height from the DB. We won't use the `getChainTip` method since that
  * adds some conversions from block hashes into strings that we're not interested in. We also can't
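For a quick look at the genesis `/new_block` entry this function parses — per the `line.split('\t')` indexing above, the third tab-separated column is the event path and the fourth is the JSON payload — here is a sketch assuming `jq` is installed and the TSV path from the README:

```shell
# Print the fields findBnsGenesisBlockData reads from the first /new_block
# event in the TSV, which it expects to be block 1.
awk -F '\t' '$3 == "/new_block" { print $4; exit }' /tmp/stacks-node-events.tsv \
  | jq '{ block_height, index_block_hash, parent_microblock }'
```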
