-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sqldb: add invoice schema and sql queries #7354
sqldb: add invoice schema and sql queries #7354
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, impressive work! :)
Did a first pass-through in preparation of the review club next week.
sqldb/test_postgres.go
Outdated
@@ -0,0 +1,33 @@ | |||
//go:build test_db_postgres | |||
// +build test_db_postgres | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: Could you explain why we are naming the test files with the prefix test
instead of the suffix test
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, comes from taproot_assets
but as you noticed it would be good to change it to match the Go's good practices
This feedback will be added in #7343
sqldb/sqlite.go
Outdated
@@ -0,0 +1,112 @@ | |||
package sqldb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: would it be reasonable to only include this file when building for example with the kvdb_sqlite package, this came to my mind because later in the test
files you mark the with specific tags like test_db_postgres
or test_db_sqlite
so was wondering why in the test code but not here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work @positiveblue . Added some qs and need to research the SEQUENCES
stuff a bit
sqldb/postgres.go
Outdated
postgresFS := newReplacerFS(sqlSchemas, map[string]string{ | ||
"BLOB": "BYTEA", | ||
"INTEGER PRIMARY KEY": "SERIAL PRIMARY KEY", | ||
"TIMESTAMP": "TIMESTAMP WITHOUT TIME ZONE", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about the implications of storing with or without the time zone. E.g. assuming I'm running my node in my home in utc+0 and later migrating that to a remote server that is in a different timezone, my gut feeling is that would be a conflict here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's ok to use no timezone until we make sure to display all printed timestamps as UTC or alternatively transform to the user's timezone (when printing).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sputn1ck your gut feeling is right here: time/timezone == software engineering headache
As Andras said, the idea is to not use timezone (and everything is sitll consistent) until you want to show it to the user
last_amount_paid_msat BIGINT NOT NULL, | ||
|
||
is_amp BOOLEAN NOT NULL, | ||
is_hodl BOOLEAN NOT NULL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: hodl 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
haha I checked it in our code and we call them hodl invoices
I think it's they real name hold-invoices though...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool PR, great work so far @positiveblue 🚀 🚀
sqldb/postgres.go
Outdated
postgresFS := newReplacerFS(sqlSchemas, map[string]string{ | ||
"BLOB": "BYTEA", | ||
"INTEGER PRIMARY KEY": "SERIAL PRIMARY KEY", | ||
"TIMESTAMP": "TIMESTAMP WITHOUT TIME ZONE", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's ok to use no timezone until we make sure to display all printed timestamps as UTC or alternatively transform to the user's timezone (when printing).
sqldb/sqlite.go
Outdated
@@ -0,0 +1,112 @@ | |||
package sqldb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
65b3307
to
3972d08
Compare
0fd6898
to
c9c2e47
Compare
c902354
to
c9d9247
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking really good to me, awesome work @positiveblue 🥇
ad5df9e
to
3297bf7
Compare
@bhandras: review reminder |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great PR 🤓, learned a lot how to cleanly setup sql schemas with sqlc.
Just had some nits and questions.
Regarding your remarks:
-
I would prefer int64 for the primary key, especially because sql dbs can easily deployed on other hosts and compacted on the fly so I consider the additonal space not as critical. On the other hand a migration later one is not a big deal as well?
-
So regarding the payment information when fetching an invoice, do I understand it right that you propose just adding the latest successful payment rather then all when using MPP or AMP for example?
-
Very much in favour of making the feature set space requirement more efficient but also keep the possibility to have it separate for each invoice just in case we want to be more flexible?
amount_msat BIGINT NOT NULL, | ||
|
||
-- Delta to use for the time-lock of the CLTV extended to the final hop. | ||
cltv_delta INTEGER NOT NULL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: For Bolt12 invoices there will be a "cltv_delta" included in the blinded path, however there will be no specific final_hop delta as in bolt11 invoices, there might be still a receiver required delta but it will be handled differently I think. So my question is, maybe we should allow a NULL value then?
is_keysend BOOLEAN NOT NULL, | ||
|
||
-- Timestamp of when this invoice was created. | ||
created_at TIMESTAMP NOT NULL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so for every update there will be a corresponding invoice_event
?
id INTEGER PRIMARY KEY, | ||
|
||
-- The uint64 htlc id stored as text. | ||
htlc_id TEXT NOT NULL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wondering why the set_id is stored as bytes but he htlc_id we go for text ? Is text more efficient that BLOB?
-- not have leave this empty, like Keysends. | ||
payment_request TEXT UNIQUE, | ||
|
||
-- The invoice state. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the next commit you filter for pending_only which only hits for 0 and 3, would it make sense then to define the states in the comments here as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can leave that for the application layer (Go) given that "pending_only"/"pending" is not a state by itself but an invoice is considered pending when its state is "open"/"accepted"
|
||
-- name: SelectAMPInvoicePayments :many | ||
SELECT aip.*, ip.* | ||
FROM amp_invoice_payments aip LEFT JOIN invoice_payments ip ON aip.settled_index = ip.id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: Could you elaborate why we want to show information from the invoice_payments table here? Will those not be equal to the ones in the amp_invoice_payments table?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, sql generates a struct where fields in both tables get a standard name, fore example:
type SelectAMPInvoicePaymentsRow struct {
SetID []byte
State int16
CreatedAt time.Time
SettledIndex sql.NullInt32
InvoiceID int32
ID sql.NullInt32
SettledAt sql.NullTime
AmountPaidMsat sql.NullInt64
InvoiceID_2 sql.NullInt32
}
We could "avoid" the overlap and specify a .*
for one of the tables and only the distinct columns in the other table. The problem is that next time that we add a column to the second table, we need to update this query too. Because it is not much data, I think it's better to leave it as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, now I understand thanks.
CREATE TABLE IF NOT EXISTS invoice_event_types( | ||
id INTEGER PRIMARY KEY, | ||
|
||
description TEXT NOT NULL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: We do not use ENUM types here because we would need to redefine them in the go code which makes things more complicated than having everything in one place ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, pending @ziggie1984's comments.
Great to see this coming together, will be super nice to use SQL finally! 🚀 🚀
sqlc is a tool that generates fully type-safe idiomatic code from SQL. The result is Go code can then used execute the queries in the database. The noraml flow looks like: - The developer write some sql that will update the schema in the database: new tables, indices, etc - The developer updates the set of queries that will use the new schema. - `sqlc` generates type-safe interfaces to those queries. - The developer can then write application code that calls the methods generated by sqlc. The tool configuration needs to live in the repo's root and its name is `sqlc.yaml`. LND will support out of the box sqlite and postgres. The sql code needs to be (almost) the same for both engines, so we cannot use custom functions like `ANY` in postgres. The SQLC config file needs to define what is the target engine, we will set postgres but the generated code can be executed by sqlite too. In some specific cases, we will `match and replace` some sql lines to be sure the table definitions are valid for the targeted engine.
This is the schema for "ordinal" BOLT11 invoices. The invoices table aims to keep an entry for each invoice, BOLT11 or not, that will be supported. Invoice related HTLCs will be stored in a separete table than forwarded htlcs. SQLite does not support `SEQUENCE`. We achieve atomic autoincrementals using primary keys with autoincrement/serial. An invoice `AddIndex` translates to `invoices(id)` while `SettleIndex` is `invoice_payments(id)`.
Set of queries to deal with invoices. A couple of things to take into account: - Because the queries are not rewritten at runtime, we cannot have a generic `INSERT` with different tuples. - Because the queries are not rewritten at runtime, we cannot build one query with only the filters that matter for that queries. The two options are a combinatorial approach (a new query for every permutation) or a generic query using the pattern ``` SELECT * FROM table WHERE ( -- Can be read as: -- Match the filter 1 value if filter_1 != nil column_1 >= sqlc.narg('filter_1') OR sqlc.narg('filter_1') IS NULL ) AND ( column_2 >= sqlc.narg('filter_2') OR sqlc.narg('filter_2') IS NULL ) ... ```
Schema for AMP invocies. AMP invoices can be paid multiple times and each payment to an AMP invoice is identified by a `set_id`. The A in AMP stands for `Atomic`. All the htlcs belonging to the same AMP payment (share the same set_id) will be resolved at the same time with the same result: settled/canceled. AMP invoices do not have an "invoice preimage". Instead, each htcl has its own hash/preimage. When a new htlc is added the hash for that htlc is attached to it. When all the htlcs of a set_id have been received we are able to compute the preimage for each one of them.
run `make sqlc` All the code in this commit is auto-generated.
322e460
to
8fcb404
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome work @positiveblue 🥇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great Job 🎉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent PR! This series beings a marked improvement for the general persistence situation within lnd.
Left some non-blocking comments that can be followed up on later. Some may end up changing the schema, but we can do so in a breaking manner, as people can't yet use the definitions, etc.
LGTM 🌪️
|
||
-- The encoded payment request for this invoice. Some invoice types may | ||
-- not have leave this empty, like Keysends. | ||
payment_request TEXT UNIQUE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have a bit of room to further decouple things here. Eg: in the future a logical invoice will have both a BOLT 11 and BOLT 12 invoice format, along w/ w/e might arise in the future. So we could spin this out into another table that then references this main invoice table.
Similarly, not every invoice will have a 1:1 payment hash to pre-image relationship. Eg: AMP invoices only have a payment_addr
as the hash+preimage are generated by the sender.
state SMALLINT NOT NULL, | ||
|
||
-- The accumulated value of all the htlcs settled for this invoice. | ||
amount_paid_msat BIGINT NOT NULL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This in theory could be derived via a view that sums over the set of payments to the invoice itself. Another factor here is the set of active HTLCs as well.
-- The accumulated value of all the htlcs settled for this invoice. | ||
amount_paid_msat BIGINT NOT NULL, | ||
|
||
-- This field will be true for AMP invoices. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So then the general invoice feature table was dropped in favor of this/
CREATE INDEX IF NOT EXISTS invoice_htlc_custom_records_htlc_id_idx ON invoice_htlc_custom_records(htlc_id); | ||
|
||
-- invoice_payments contains the information of a settled invoice payment. | ||
CREATE TABLE IF NOT EXISTS invoice_payments ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
In the future we can add another event table here for stuff like: cancel, accept, hodl, etc.
@@ -0,0 +1,54 @@ | |||
-- amp_invoices_payments | |||
CREATE TABLE IF NOT EXISTS amp_invoice_payments ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps also nice to have easy access to the amount here as well? Then can just read this out rather than needing to also fetch+sum over the set of invoice HTLCs.
SELECT aip.*, ip.* | ||
FROM amp_invoice_payments aip LEFT JOIN invoice_payments ip ON aip.settled_index = ip.id | ||
WHERE ( | ||
set_id = sqlc.narg('set_id') OR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you look at the way the queries exist as is, I think what you also want here is based on the payment_addr
as well. Then given that, you can fetch them all for a payment addr w/o an intermediate look up for the invoice itself.
CREATE TABLE IF NOT EXISTS invoice_event_types( | ||
id INTEGER PRIMARY KEY, | ||
|
||
description TEXT NOT NULL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the future if we want very strict typing here, we can make a set "enum values" table, then point to that itself (all items unique, essentially pre-populated).
You can also do stuff like:
type CHECK(type in ('THIS', THAT'))
etc
This is the first PR for #6288
In this PR we define the tables, indexes and queries for our invoice schema.
We aim to support sqlite/postgres out of the box. The sql code in the migration/query files need to work for both engines with the minimum changes and we want to avoid using specific functionalities like postgres
SEQUENCE
s.We will use sqlc to generate go code from SQL. The tool uses the config defined in
sqlc.yaml
. By default we will set the engine to postgres but the generated code also works for sqlite.The invoice schema can be divided in three parts:
set_id
) and the extra data of each htlc.Because the queries are not generated at runtime, we need to write generic ones that allow us to filter/order using different filter parameters that can be set to
NULL
when we don't want to apply in a specific query.For filtering, we use the pattern
We also can also use
Ordering is a bit more complicated, because we cannot add
ASC/DESC
in aCASE-WHEN-THEN-ELSE
so we need to unroll the casesA good example of a real query using this patterns is
FilterInvocies
.There are a couple of things that I am considering to add, but I would like to know others opinions.
Currently, row ids (INTEGER PRIMARY KEY) are mapped to
int32
. That means that we have a counter valid until2,147,483,647
which is 1 invoice/second for the next ~68 years. In Go code we useuint64
for them.uint64
is not a supported sql type, but we may want to use the largest integer value (int64) for this.Invoice payments are stored in their own table. Some invoice types can have a
one invoice -> many peyments
so aJOIN
make sense in this case. However, it could be interesting to add information about the "latest" payment in the invoice table. In that way we would fetch the invoice and its payment information in one trip to the db, whenever it's possible.
Most of the invoices will have the same feature set, based on their type. Maybe we could add an "features_set_enum" instead of storing them for each invoice. That would also gives us the flexibility of having a different feature set for the same kind of invoice in case we add it only to new invoices.
Here is a diagram of the proposed schema.
NOTE: the structure of this PR has changed so some of the old comments apply now to other PRs. This one is only about the SQL schema.