Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use structs when appending rows to a BatchInsertBuilder #2578

Open
tamirms opened this issue May 11, 2020 · 3 comments · Fixed by #2604
Open

Use structs when appending rows to a BatchInsertBuilder #2578

tamirms opened this issue May 11, 2020 · 3 comments · Fixed by #2604
Assignees
Labels

Comments

@tamirms
Copy link
Contributor

tamirms commented May 11, 2020

Most of the Horizon ingestion uses a pattern where a bunch of rows are inserted into a table using a BatchInsertBuilder. For example, the TransactionProcessor uses a TransactionBatchInsertBuilder to record all the transactions for a given ledger in the history_transactions table in the Horizon DB.

TransactionBatchInsertBuilder is a wrapper around BatchInsertBuilder which ultimately calls Row() to append a transaction to the current batch:

// Row adds a new row to the batch. All rows must have exactly the same columns
// (map keys). Otherwise, error will be returned. Please note that rows are not
// added one by one but in batches when `Exec` is called (or `MaxBatchSize` is
// reached).
func (b *BatchInsertBuilder) Row(row map[string]interface{}) error 

map[string]interface{} isn't an ideal representation for a row because there aren't any guarantees provided by the type system that two map[string]interface{} values have the same length and the same set of keys.

In #2562 , we had to fix an issue where a TransactionBatchInsertBuilder was calling Row() on two map[string]interface{} values which were not consistent with each other. This manifested in a runtime error which blocked ingestion.

If we can pass history.Transaction instances to Row() instead of a map[string]interface{} values, we can know that all row values will have the same number of columns and the same set of column names. This would make ingestion more robust to the kind of bug we had to fix in #2562 .

If we use the following signature func (b *BatchInsertBuilder) Row(row interface{}) error we can use reflection to extract all the columns by looking at all struct fields with the db tag.

@bartekn
Copy link
Contributor

bartekn commented May 29, 2020

Let's keep this open to apply this change to other batch inserters.

@bartekn bartekn reopened this May 29, 2020
@ire-and-curses
Copy link
Contributor

What are the remaining batch inserters? Let's list them here so it's easy to pick this up and work on.

@bartekn
Copy link
Contributor

bartekn commented May 29, 2020

I think I found all of them:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants