Save parsed events into database; Design Event schema #2

Closed
QianSwirlds opened this issue Jul 30, 2019 · 18 comments
Labels
enhancement Type: New feature P3 parser Area: File parsing

Comments
@QianSwirlds
Contributor

QianSwirlds commented Jul 30, 2019

t_events Table

| Type | Column Name | Description | Not Null |
| --- | --- | --- | --- |
| bigint | id | Primary key, auto-increment | NOT NULL |
| bigint | consensus_order | Order in history (0 first) | NOT NULL |
| bigint | creator_node_id | nodeID of this event's creator | NOT NULL |
| bigint | creator_seq | Sequence number for this event by its creator (0 is first) | NOT NULL |
| bigint | other_node_id | ID of the otherParent's creator | |
| bigint | other_seq | Sequence number for the otherParent event (by its creator) | |
| bytea | signature | Creator's signature for this event | NOT NULL |
| bytea | hash | Hash of this event | NOT NULL |
| bigint | self_parent_id | ID of the self parent | |
| bigint | other_parent_id | ID of the other parent | |
| bytea | self_parent_hash | Hash of the self parent | |
| bytea | other_parent_hash | Hash of the other parent | |
| bigint | self_parent_generation | Generation of the self parent | |
| bigint | other_parent_generation | Generation of the other parent | |
| bigint | generation | Generation (1 plus the max of the parents' generations) | NOT NULL |
| bigint | created_timestamp_ns | seconds * (10 ^ 9) + nanos of the creation time, as claimed by its creator | NOT NULL |
| bigint | consensus_timestamp_ns | seconds * (10 ^ 9) + nanos of the community's consensus timestamp for this event | NOT NULL |
| bigint | latency_ns | consensus_timestamp_ns - created_timestamp_ns | NOT NULL |
| integer | txs_bytes_count | Number of bytes of transactions in this event | NOT NULL |
| integer | platform_tx_count | Number of platform transactions in this event | NOT NULL |
| integer | app_tx_count | Number of application transactions in this event | NOT NULL |
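The table above, written out as a sketch of PostgreSQL DDL (column names and types are taken from the table; `BIGSERIAL` for the auto-increment id and the exact constraint placement are assumptions):

```sql
-- Sketch of the t_events schema above (BIGSERIAL and constraint
-- details are assumptions, not the final migration).
CREATE TABLE t_events (
    id                       BIGSERIAL PRIMARY KEY,
    consensus_order          BIGINT  NOT NULL,
    creator_node_id          BIGINT  NOT NULL,
    creator_seq              BIGINT  NOT NULL,
    other_node_id            BIGINT,
    other_seq                BIGINT,
    signature                BYTEA   NOT NULL,
    hash                     BYTEA   NOT NULL,
    self_parent_id           BIGINT,
    other_parent_id          BIGINT,
    self_parent_hash         BYTEA,
    other_parent_hash        BYTEA,
    self_parent_generation   BIGINT,
    other_parent_generation  BIGINT,
    generation               BIGINT  NOT NULL,
    created_timestamp_ns     BIGINT  NOT NULL,
    consensus_timestamp_ns   BIGINT  NOT NULL,
    latency_ns               BIGINT  NOT NULL,
    txs_bytes_count          INTEGER NOT NULL,
    platform_tx_count        INTEGER NOT NULL,
    app_tx_count             INTEGER NOT NULL
);
```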
@QianSwirlds
Contributor Author

Hi @gregscullard, I am working on creating the event and transaction tables, and inserting data into them while parsing eventStream files. I will send you a pull request once I finish. Thanks.

@gregscullard
Contributor

Can you call the tables t_events and t_event_transactions, please?

@QianSwirlds
Contributor Author

@gregscullard Sure. Will do. Thanks

QianSwirlds added a commit that referenced this issue Jul 31, 2019
QianSwirlds added a commit that referenced this issue Jul 31, 2019
QianSwirlds added a commit that referenced this issue Jul 31, 2019
…eam, accountBalance can use the same connection object; #2
@gregscullard
Contributor

@QianSwirlds,
- Overall design: shouldn't additional NOT NULL constraints be added on columns we expect to always contain data? This enforces consistency of the data, helps avoid bugs, etc.

- Overall design: should there be self-referencing foreign keys between "otherId" and the event it refers to (and any other such relationship)? Although this would require the NOT NULL constraint (above) to be removed so that the first event can be inserted. For example:

create table test (
  id serial primary key,
  parent integer, -- nullable, so the first row can be inserted
  foreign key (parent) references test(id)
);

- Is "creatorId" the ID of the parent event? If not, is the ID of the parent event missing?

- timeCreated and consensusTimestamp should not be specified WITH TIME ZONE.

- Timestamps in Postgres have a resolution of microseconds; if the timestamp includes nanoseconds, it may be wise to split timeCreated into timeCreatedSecs and timeCreatedNanos. Likewise for consensusTimestamp.
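A sketch of the split-column variant suggested here (column names are assumptions; later in the thread the pair is merged back into a single bigint nanosecond column):

```sql
-- Split-timestamp columns (names assumed), one pair per timestamp,
-- so nanosecond precision survives Postgres's microsecond timestamps:
created_timestamp_secs    BIGINT  NOT NULL,  -- whole seconds since epoch
created_timestamp_nanos   INTEGER NOT NULL,  -- 0..999999999
consensus_timestamp_secs  BIGINT  NOT NULL,
consensus_timestamp_nanos INTEGER NOT NULL,
```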

@QianSwirlds
Contributor Author

@gregscullard
I just added NOT NULL to the columns.

otherId is the ID of this event's other parent's creator, not the ID of an event. Currently, the IDs of the parent events are not serialized in the EventStream files, because we call the Event.writeEvent() method in the platform SDK to serialize events, and that method doesn't write the IDs of parent events.

I just split timeCreated into timeCreatedSecs and timeCreatedNanos, and did the same for consensusTimestamp. Please let me know if anything needs to be modified. Thanks!

@gregscullard
Contributor

OK, so the IDs are in fact node numbers/IDs (0, 1, 2, 3)? Should we name them accordingly?

@gregscullard
Contributor

Also, if hashes are what join events, shouldn't they be stored in a separate table for efficiency? A table of hashes with an ID, where the ID is used in the t_events table to refer to the hash in question.

@QianSwirlds
Contributor Author

Yes, the IDs are node IDs. Each node's ID is the sequence of the node in the AddressBook (starting from 0). I think it's OK to save the node ID, because when we load the node information from the file 0.0.102, the information we get is: nodeId, ipAddress, portno, memo (node AccountID), RSA_PubKey. If we want to search for events created by a node AccountID or public key, we can get its nodeID from a map and then search in the database.

message NodeAddress {
    bytes ipAddress = 1; // The ip address of the Node with separator & octets
    int32 portno = 2; // The port number of the grpc server for the node
    bytes memo = 3; // The memo field of the node
    string RSA_PubKey = 4; // The RSA public key of the node.
}
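The lookup described above could also live in the database. A hypothetical table mirroring the NodeAddress message (table and column names here are assumptions, not part of the agreed schema):

```sql
-- Hypothetical lookup table for AddressBook entries; node_id is the
-- entry's position in the AddressBook, starting from 0.
CREATE TABLE t_nodes (
    node_id     BIGINT PRIMARY KEY,
    ip_address  BYTEA,
    portno      INTEGER,
    memo        BYTEA,   -- node AccountID
    rsa_pub_key TEXT
);

-- Find events created by a given node account via its node_id.
SELECT e.*
FROM t_events e
JOIN t_nodes n ON n.node_id = e.creator_node_id
WHERE n.memo = '0.0.3'::bytea;
```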

@gregscullard
Contributor

Each hash is 48 bytes, I think; each hash appears twice (parent + self), so 96 bytes in total. If we stored them in a separate table, we'd have 48 bytes + 8 for the bigint ID, plus 2x 8 bytes for the foreign key IDs in the t_events table (72 bytes in total). Not a huge saving, but it all counts at the rate we're storing data.
It will have an impact on processing, making it slightly slower.
However, when we query (event -> other -> other -> other), matching bigints will be more efficient than matching whole byte arrays (if that's even possible).
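A sketch of the normalization being proposed (table and column names here are assumptions): store each 48-byte hash once, and reference it by a bigint ID from t_events instead of repeating the bytea.

```sql
-- Hash lookup table: each distinct hash stored once.
CREATE TABLE t_event_hashes (
    id    BIGSERIAL PRIMARY KEY,
    hash  BYTEA NOT NULL UNIQUE
);

-- t_events would then carry bigint foreign keys (e.g.
-- self_parent_hash_id, other_parent_hash_id), and parent
-- traversal joins on bigints rather than byte arrays:
SELECT p.*
FROM t_events e
JOIN t_events p ON p.id = e.other_parent_id;
```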

@gregscullard
Contributor

Can we call those nodeIds "creatorNodeId" and "otherNodeId" then? This will make more sense to someone who's not familiar with the hashgraph algorithm.

@QianSwirlds
Contributor Author

> Can we call those nodeIds "creatorNodeId" and "otherNodeId" then? This will make more sense to someone who's not familiar with the hashgraph algorithm.

Sure, the names were the same as the field names of Event. I will modify them as you suggest.

@QianSwirlds
Contributor Author

QianSwirlds commented Aug 1, 2019

> Each hash is 48 bytes I think, each hash appears twice (parent + self) so 96 bytes in total. If we stored in a separate table, we'd have 48 bytes + 8 for the bigint ID. And 2x 8 bytes for the foreign key ID in the t_events table (72 bytes). Not a huge saving, but it all counts at the speed we're storing data.
> It will have an impact on processing, making it slightly slower.
> However, when we query (event->other->other->other), matching bigints will be more efficient than matching whole byte arrays (if that's even possible).

Do you mean we add an EventHash table containing EventConsensusOrder and Hash columns, and when we are storing an event, we query its parent events' hashes in the EventHash table, get its two parents' consensusOrder (which is the identifier), and save those into the Event table as selfParentID and otherParentID instead of selfParentHash and otherParentHash?

But it is possible that two different events have the same hash, isn't it?

@gregscullard
Contributor

I guess a hash should be unique, but there is a small possibility of duplicates (maybe Leemon can comment). If the hashes are the same, we have the same issue regardless: we won't be able to follow the event history (rebuild the graph) in the future from the data, no?

@gregscullard
Contributor

gregscullard commented Aug 1, 2019

Looks good. One last thing: talking to Mike Burrage, he suggested storing seconds + nanos in a single bigint, which is large enough for a few years to come. This saves us 16 bytes per row.
Apologies for asking you to split them from the timestamp earlier, although that was also a good thing to do.

@QianSwirlds
Contributor Author

> Looks good, one last thing, talking to Mike Burrage, he suggested storing seconds+nanos in a single bigInt which is large enough for a few years to come. This saves us 16 bytes per row.
> Apologies for asking you to split them from timestamp earlier, although this was also a good thing to do.

No worries, I agree that splitting them was a good thing to do.
If we store seconds * (10 ^ 9) + nanos in a single bigint, the maximum bigint value represents the timestamp 2262-04-11T23:47:16.854775807Z. I think that is sufficient. I have modified the schema. Thanks.
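That upper bound can be sanity-checked directly in Postgres. Note that `to_timestamp` takes seconds and Postgres timestamps have only microsecond resolution, so the fractional part is rounded relative to the exact ...854775807 nanoseconds:

```sql
-- Largest signed 64-bit value, interpreted as nanoseconds since the
-- epoch; divide by 1e9 to get (fractional) seconds.
SELECT to_timestamp(9223372036854775807 / 1e9);
-- ≈ 2262-04-11 23:47:16.854776 UTC (display depends on session timezone)
```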

@gregscullard
Contributor

I'm ok with the schema. Suggest we close this issue.

QianSwirlds added a commit that referenced this issue Aug 2, 2019
QianSwirlds added a commit that referenced this issue Aug 6, 2019
@steven-sheehy steven-sheehy added this to the 0.1.0 milestone Aug 27, 2019
steven-sheehy added a commit that referenced this issue Sep 4, 2019
Signed-off-by: Steven Sheehy <steven.sheehy@hedera.com>
steven-sheehy added a commit that referenced this issue Nov 12, 2019
Signed-off-by: Steven Sheehy <steven.sheehy@hedera.com>
steven-sheehy added a commit that referenced this issue Nov 26, 2019
* Initial HCS design document

Signed-off-by: Steven Sheehy <steven.sheehy@hedera.com>

* Minor tweaks

Signed-off-by: Steven Sheehy <steven.sheehy@hedera.com>

* Review feedback

Signed-off-by: Steven Sheehy <steven.sheehy@hedera.com>

* Meeting feedback

Signed-off-by: Steven Sheehy <steven.sheehy@hedera.com>

* Small tweaks

Signed-off-by: Steven Sheehy <steven.sheehy@hedera.com>

* Review feedback #2

Signed-off-by: Steven Sheehy <steven.sheehy@hedera.com>

* Use reactive-grpc & split consensus service

Signed-off-by: Steven Sheehy <steven.sheehy@hedera.com>

* Change realm column to realm_num

Signed-off-by: Steven Sheehy <steven.sheehy@hedera.com>
@steven-sheehy steven-sheehy added enhancement Type: New feature P3 parser Area: File parsing labels Nov 26, 2019
steven-sheehy added a commit that referenced this issue Dec 30, 2019
* Initial HCS design document

Signed-off-by: Steven Sheehy <steven.sheehy@hedera.com>

* Minor tweaks

Signed-off-by: Steven Sheehy <steven.sheehy@hedera.com>

* Review feedback

Signed-off-by: Steven Sheehy <steven.sheehy@hedera.com>

* Meeting feedback

Signed-off-by: Steven Sheehy <steven.sheehy@hedera.com>

* Small tweaks

Signed-off-by: Steven Sheehy <steven.sheehy@hedera.com>

* Review feedback #2

Signed-off-by: Steven Sheehy <steven.sheehy@hedera.com>

* Use reactive-grpc & split consensus service

Signed-off-by: Steven Sheehy <steven.sheehy@hedera.com>

* Change realm column to realm_num

Signed-off-by: Steven Sheehy <steven.sheehy@hedera.com>
Signed-off-by: Steven Sheehy <steven.sheehy@hedera.com>
apeksharma added a commit that referenced this issue Mar 1, 2020
- Move topic messages, contract result, file data, and live hashes from
  RecordFileLogger to PostgresWritingRecordParsedItemHandler
- Logic for Transaction and batching will be moved together in followup

Signed-off-by: Apekshit Sharma <apekshit.sharma@hedera.com>
apeksharma added a commit that referenced this issue Mar 2, 2020
- Move topic messages, contract result, file data, and live hashes from
  RecordFileLogger to PostgresWritingRecordParsedItemHandler
- Logic for Transaction and batching will be moved together in followup

Signed-off-by: Apekshit Sharma <apekshit.sharma@hedera.com>
apeksharma added a commit that referenced this issue Mar 2, 2020
- Move topic messages, contract result, file data, and live hashes from
  RecordFileLogger to PostgresWritingRecordParsedItemHandler
- Logic for Transaction and batching will be moved together in followup

Partially fixes #566 

Signed-off-by: Apekshit Sharma <apekshit.sharma@hedera.com>
steven-sheehy added a commit that referenced this issue Apr 1, 2020
Signed-off-by: Steven Sheehy <steven.sheehy@hedera.com>
steven-sheehy added a commit that referenced this issue Apr 1, 2020
Signed-off-by: Steven Sheehy <steven.sheehy@hedera.com>
steven-sheehy added a commit that referenced this issue Sep 2, 2021
Signed-off-by: Steven Sheehy <steven.sheehy@hedera.com>
5 participants