docs: describe protocol version 2.0 #90
Conversation
I read the definition of the new status hash and I think EPS should be able to implement it. The changes to …
@SomberNight Thank you so much for taking the time to think about this deeply, research it, and do PoC implementations to test feasibility. After having discussed this with you on IRC, I am pretty optimistic about these changes and the new status_hash mechanism, which solves a long-standing problem.

Note that there is a small performance hit the first time you compute the status hash for a huge history of, say, 1 million items (in Python, 300 msec versus 2 seconds or something like that on my machine). But this cost is paid only once, since the 10k-tx checkpointing thing will solve having to compute too long a set of hashes. That cost can also be mitigated with additional design (such as pushing off the response to such a request to some low-priority background task), or it can simply be paid upfront, since it will only be incurred once. And there aren't that many 500k or 1 million tx addresses (although there are more than one would expect).

Anyway, I'm a fan of these changes. I haven't yet tried a PoC implementation to see if there are any gotchas, but reading the new spec it seems very sane and reasonable.
One additional comment and/or question: I know for BTC you guys definitely need …

On Fulcrum, one thought I had was that in the BCH case (but definitely not in the BTC case), I may opt to not offer that by default (or maybe do offer it by default but make it potentially opt-out). This makes it less painful for server admins to update to the new version, since the (very slow to build) spent_txo index won't need to be built the first time they run the updated version. Now, maybe I am overcomplicating things, and maybe I should just make them eat the cost. But aside from having to wait X hours for it to build that index, it may also be unpopular due to the additional space requirements.

So my thinking is that maybe in the BCH case I will "extend" this protocol to add additional keys to server.features, so that a server can advertise if it lacks the index (if the key is missing, one can assume it has the index). What are your thoughts on this? I know this is not your primary concern, and since this is mostly a BCH issue, I know you have plenty to do already, but I was wondering if you had recommendations on what to call this key. I was thinking, as a BCH extension, in the …
Something very similar could be achieved with the current protocol using SHA256 midstate. But making it recursive would make things easier on the implementation's side.
My thoughts on your post @shesek
Edit: Another thought on the new status hash: I think EPS/BWT servers won't even take advantage of the possibility of caching hashes, but will just recalculate them from the start each time. This should be fine, because clients would just be attacking themselves if they DoS their own personal server. Plus it helps keep the server stateless.
Agreed, of course. But this still adds another attack vector for users with an insecure setup (I suspect there are quite a few of these, unfortunately), which should be taken into account.
Oh, yes, nice! That is much better. The Electrum server will still have to occasionally poll for this, but it doesn't require checking each block separately. But what happens if the output gets funded and then spent immediately after, before the Electrum server had a chance to poll?
I would consider that this RPC command could be used in the future for other things too, either by Electrum itself or by third-party software that leverages Electrum servers. But I agree that if it's expected that the Electrum Lightning implementation wouldn't normally subscribe to spent outpoints, then it could be good enough for now.
You cannot expect that. Electrum needs to know if an outpoint has been spent, so the server needs to distinguish between 3 different cases: utxo does not exist, utxo exists and is unspent, and utxo was spent.
Here are two possible ways to solve the edge case of an outpoint being immediately spent after it is created. The node could check … Another way is to add a method to this protocol which does nothing but notify the server about what address will be later requested in …
Distinguishing between spent txos and non-existent txos in a generic manner that works for any txo is inherently incompatible with pruning. It seems to me that this could only work with pruning if we loosen the requirements by making some assumptions about electrum's specific usage patterns, and tailoring the electrum server-side solution to work specifically for this.
How could it tell that it's a lightning transaction? Wouldn't it have to import all p2wsh addresses to be sure?
For the pruning / no-… case:
This would indeed make things easier. The server will simply have to import the addresses, and all the information for the relevant txos will be available in the Bitcoin Core wallet, without any specialized logic for tracking txos. If we can guarantee that the address notification is always sent before the funding transaction confirms, then this becomes trivial. But even if not, because the address is known, the server could more easily issue a rescan to look for recent funding/spending transactions (say, in the last 144 blocks or so?), without having to check individual blocks manually.
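The import-then-bounded-rescan idea above can be sketched as follows. This is a hypothetical helper, not spec text: `importaddress` and `rescanblockchain` are real Bitcoin Core RPC names, but the planning function, its call shapes, and the 144-block window are illustrative assumptions.

```python
def plan_core_calls(address, tip_height, window=144):
    """Sketch of the Core RPC calls a server could issue when it learns an
    address late: import with rescan disabled, then rescan only a recent
    window of blocks instead of the whole chain. Purely illustrative."""
    start = max(0, tip_height - window)
    return [
        # importaddress(address, label, rescan) -- rescan=False avoids a full scan
        ("importaddress", [address, "", False]),
        # rescanblockchain(start_height) -- look back ~window blocks only
        ("rescanblockchain", [start]),
    ]
```

The point is that because the address (not just the outpoint) is known, the lookback can be a cheap bounded rescan rather than a manual per-block check.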
I quite like it that the protocol no longer uses addresses (but script hashes). I refuse to reintroduce them! :D
Yes, or rather than importing straight away it could save every p2wsh address in a 'txid' -> 'address' map or dict.
Yes(!) This is a much better idea than a separate protocol method. That should totally solve the edge case.
docs/protocol-basics.rst
If the history contains ``n`` txs, the status is ``status_n``. The ``tx_n``
series consists of, first the confirmed txs touching the script hash,
followed by the mempool txs touching it; formatted as described above, as
bytearrays of length 40 or 48.
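The recursive status hash quoted above can be sketched like this. This is a hedged reading, not the spec's exact algorithm: the 32-zero-byte initial value is an assumption, and each `item` is assumed to already be the 40- or 48-byte encoding of one history entry.

```python
import hashlib

def status_hash(items):
    """Sketch of a recursive status hash: status_k is derived from
    status_{k-1} and the k-th history item. Initial value is assumed."""
    status = bytes(32)  # assumed starting point, not taken from the spec
    for item in items:
        status = hashlib.sha256(status + item).digest()
    return status
```

The recursion is what enables caching: a client holding `status_k` can fold in only the new items instead of rehashing the entire history from scratch.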
I see long-term problems mixing mempool and confirmed history here; I would recommend moving to separate tracking of mempool and confirmed histories. Of course that is more complexity in the short-term.
Do you mean specifically for the status, for get_history, or in general?
In general
**Signature**

.. function:: blockchain.scripthash.get_history(scripthash)
.. function:: blockchain.scripthash.get_history(scripthash, from_height=0, to_height=-1,
                                                client_statushash=None, client_height=None)
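For concreteness, a request against the new signature could look like the following. This is an illustrative sketch only: the scripthash value is a placeholder, and the use of named params (rather than positional) is an assumption, not something the quoted signature mandates.

```python
import json

# Hypothetical paginated get_history request; values are placeholders.
request = {
    "jsonrpc": "2.0",
    "id": 7,
    "method": "blockchain.scripthash.get_history",
    "params": {
        "scripthash": "00" * 32,      # placeholder script hash
        "from_height": 650000,        # start of the requested page
        "to_height": -1,              # -1: up to the chain tip
        "client_statushash": None,    # what the client last computed
        "client_height": None,        # height that status corresponds to
    },
}
wire = json.dumps(request)
```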
It seems wrong to be putting the burden of figuring out where the client is (presumably the point of the client_statushash and client_height) on the server. I would have preferred to see something where client state doesn't enter the protocol.
Right. This is the best I could come up with though. There are too many constraints when trying to support long histories...
Still, I think it is possible to implement this server-side in a way where it is relatively cheap to figure out the state of the client using these fields for well-behaving clients. By properly crediting resource usage cost to sessions, malicious or buggy clients can be banned.
Force-pushed from adba763 to fc86e26.
Regarding a new get_history: from a wallet's point of view, what we really need is the most recent history. The Gemmer wallet already runs into the 'history too large' error on multiple occasions with normal use. But from Gemmer's point of view, it only needs to fetch the most recent 3 transactions (including unconfirmed) from the ElectrumX server, regardless of how many historic transactions there are for a given address. If the API allowed a parameter for how many recent transactions the client wants, that would actually reduce a lot of server burden for long histories. In fact, we would have no problem with the server hard-limiting that parameter to a max of e.g. 1k for performance reasons. So basically what we are suggesting is something like (…). Then, for a typical wallet or explorer, it would not burden the server for long-history addresses. For example, Gemmer would do …, which could be very efficient on the server for long-history addresses, compared to the current situation where they run into 'history too large'.
Maybe your altcoin needs that, but it is useless for Bitcoin. There are many considerations to keep in mind:
One way to achieve this is what the pre-1.5 protocol does: define the status hash in a way that lets the client notice it is missing txs. This proposal does the same. However, the client will then necessarily have to have downloaded all txs to calculate the status hash itself. So I see no point in designing an API that allows fetching just the most recent txs. If you want to know whether there are txs you are missing, that's already how it works: you compare the status hashes. Then, to be able to recalculate the status hash yourself, trustlessly, you need to obtain all missing txs.
One can imagine an optimization in the "happy" case where you are able to download only the txs you believe you are missing. You don't actually need the entire history to calculate the status hash in the case where everything lines up; only as a fallback, if you cannot reconcile, would you go ahead and try the full download, and then if that fails, decide which server you believe and try again. Most of the time nobody is lying to anybody, and being able to detect omission is already captured by the status hash. The "download the last few txs" approach would probably be enough 99% of the time, and may save some load. Although I believe that with the changes in 1.5 it should already be possible to retrieve a partial history toward the "end" now, right?
Yes. You can call …
The issue is that you have no idea how many txs you are missing: the status either matches or differs.
Yeah, just start with what you last saw, and if that fails to reconcile, get a full history. Anyway, I was merely pointing out that it's currently possible to do it with the 1.5 spec, and that it is a useful optimization for wallet implementors to consider in their interaction with the server...
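The happy-path reconciliation being discussed can be sketched as follows, assuming a recursive status hash (the chaining itself is from the proposal; the function names and the 32-zero-byte cached starting value in the test are illustrative assumptions).

```python
import hashlib

def extend_status(status, new_items):
    """Fold newly fetched history items into a cached status hash."""
    for item in new_items:
        status = hashlib.sha256(status + item).digest()
    return status

def reconciles(cached_status, recent_items, server_status):
    """Happy-path check: do the recently fetched txs alone explain the
    server's new status? If not, fall back to a full history download."""
    return extend_status(cached_status, recent_items) == server_status
```

If `reconciles` returns `True`, the client avoided re-downloading its entire history; if `False`, it cannot tell how many txs it is missing and must fetch the full history as a fallback.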
In most of our usage scenarios, including wallets and explorers, servers are generally trusted, and applications don't try to remember the transaction history they have downloaded from the server. I see where the design is coming from; however, it seems to me that the focus on SPV has really made an impact on its usability in general. The problem with the …
I see. Indeed, if you wanted to create a block explorer (using a trusted Electrum server as backend) that can efficiently show the (e.g.) 100 most recent txs for an address, that would need a different … Indeed, the use case we had in mind here is different. I am not sure if the currently proposed … I suppose one thing the proposed …
Ok, sure, we can call it 2.0. We can try to use semver for the protocol going forward.
Force-pushed from 8b08ebc to 795c783.
Wow, thank you so much man! I really appreciate it. This means I can rename my BCH-specific extensions that I'm about to push out as 1.5.0 :)
Yeah, except we do some version negotiation to change the "personality" of the server depending on what the client is (in order to behave as older clients expect without limiting things for newer clients).
I'm not sure if I understand your goal, but maybe protocol extensions themselves should be versioned?
Correct. That's what version negotiation is for.
Then I suggest you name the extensions …
Dude, don't worry about it, seriously. This is handled correctly and doesn't need to be discussed here. :)
Why is the fee used here?
Please ask such specific questions in-line (comment on a specific line, instead of in the main thread), so that discussions can be tracked better.
from it, and those funding it), both confirmed and unconfirmed (in mempool).

2. Order confirmed transactions by increasing height (and position in the
   block if there are more than one in a block).
The script hash status is dependent on the transaction ordering.
If I order transactions incorrectly, I get the wrong script status hash and I'll do some extra requests which increase the server load.
This is more likely to happen if I have multiple transactions involving a script in the same block.
How should the client know the (relative) position of a transaction in the block?
Should the client infer it from the order of the returned list in previous calls to get_history?
Does it make sense to use the same ordering that is used for mempool transactions for confirmed transactions too?
AFAICT it would be simpler for client libraries to implement correctly, and indirectly it should help the server too.
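One consistent total order along the lines discussed above can be sketched as a sort key. This is an illustrative assumption, not spec text: it assumes mempool entries are marked with `height <= 0` (as in protocol 1.4's `get_history` result) and that confirmed entries carry a `pos` field for position-in-block.

```python
def history_sort_key(entry):
    """Sort key putting confirmed txs first, ordered by
    (height, position-in-block), with mempool txs after all of them.
    Field names `height` and `pos` are illustrative assumptions."""
    height = entry["height"]
    pos = entry.get("pos", 0)
    if height <= 0:
        return (1, 0, pos)   # mempool bucket: sorts after every confirmed tx
    return (0, height, pos)  # confirmed bucket
```

A client sorting with a key like this would reproduce the same sequence regardless of the order in which it learned about the transactions, which is the property needed for a deterministic status hash.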
output of :func:`blockchain.scripthash.get_mempool` appended to the
list. Each confirmed transaction is a dictionary with the following
keys:
The output is a dictionary, always containing all three of the following items:
Has it been considered to also return the statushash computed by the server?
@@ -499,6 +581,172 @@ Unsubscribe from a script hash, preventing future notifications if its :ref:`sta
Note that :const:`False` might be returned even for something subscribed to earlier,
because the server can drop subscriptions in rare circumstances.

blockchain.outpoint.subscribe
If I do not want to subscribe but only get the status once, I would have to blockchain.outpoint.subscribe and then blockchain.outpoint.unsubscribe. Are there any plans on adding a blockchain.outpoint.status (or a similar name) that would return the same as blockchain.outpoint.subscribe, but without the notifications and the subscription?
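Until such a method exists, the subscribe-then-unsubscribe workaround described above can be sketched client-side. The method names are from this proposal; the blocking `send` helper and the result shape are hypothetical.

```python
def outpoint_status_once(send, tx_hash, out_index):
    """One-shot outpoint status query emulated with subscribe +
    unsubscribe. `send(method, params)` is a hypothetical blocking
    JSON-RPC call returning the server's result."""
    status = send("blockchain.outpoint.subscribe", [tx_hash, out_index])
    send("blockchain.outpoint.unsubscribe", [tx_hash, out_index])
    return status
```

The downside, compared to a dedicated status method, is an extra round trip plus a window in which the server allocates (and then immediately drops) subscription state.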
One thing I found kind of annoying when implementing an Electrum client is that there's no way to know which request a message is for without accessing internal state, which makes deserialization annoying. It doesn't seem to be solvable, but just in case: is there any way to modify the protocol to allow it?
If the name of the type of the response was included somehow, you could know how to deserialize the message without looking up internal state.

A suggestion/wish-list item for the new protocol version: some way of detecting whether an address is unused. We all want to practice good address hygiene and avoid address reuse, and a protocol method for checking if an address is unused would help with this. My use case: I collect XPUBs from users, allowing them to make withdrawals from my service to new addresses each time they withdraw. These XPUBs could also receive funds from other services, so I need to check that the next address in the derivation path is unused before sending to it. I can do this by looking up the history; however, this is inefficient if the address has received lots of transactions. Checking if an address is unused is really just a special case of looking up the history, where we short-circuit on finding the first history element and return early. I originally brought this up in romanz/electrs#920
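The request-matching issue raised above is usually handled client-side by keeping a map from pending JSON-RPC ids to the method that was called. A minimal sketch (the class and its method names are illustrative, not part of any protocol or library):

```python
import itertools
import json

class RpcSession:
    """Client-side bookkeeping mapping JSON-RPC ids back to the calling
    method, so each response can be deserialized appropriately."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = {}  # request id -> method name

    def build_request(self, method, params):
        rid = next(self._ids)
        self._pending[rid] = method
        return json.dumps({"jsonrpc": "2.0", "id": rid,
                           "method": method, "params": params})

    def classify(self, raw):
        """Return (method, payload): the originating method and result for
        a response, or the method and params for a server notification."""
        msg = json.loads(raw)
        if "id" in msg and msg["id"] in self._pending:
            return self._pending.pop(msg["id"]), msg.get("result")
        return msg.get("method"), msg.get("params")
```

This is exactly the "internal state" the comment complains about; embedding a type name in responses would remove the need for it, at the cost of redundancy on the wire.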
This PR describes an updated Electrum Protocol, version 2.0 (formerly named version 1.5).
Some of these ideas have already been discussed and implemented as part of #80, however this PR includes even more changes, most notably pagination of long histories (taking into consideration ideas from kyuupichan/electrumx#348 and kyuupichan/electrumx#82).
While the changes described in #80 are already non-trivial and would be useful in themselves, I have realised I would also like to fix the long-standing issue of serving addresses with long histories, and IMO it would be better to only bump the protocol version once (and to minimise the number of db upgrades/times operators have to resync).
This PR is only documentation; I would like feedback before implementing it.
@romanz, @chris-belcher, @shesek, @cculianu, @ecdsa, @kyuupichan
Compared to the existing Electrum Protocol (1.4.2), the changes are:

- The `server.version` message must be the first message sent by the client. That is, version negotiation must happen before any other messages.
- `blockchain.scripthash.get_history` changed significantly to allow pagination of long histories.
- New method `blockchain.outpoint.subscribe`, to subscribe to a transaction outpoint, look up its spender tx, and get a notification when it gets spent.
- New method `blockchain.outpoint.unsubscribe`, to unsubscribe from a TXO.
- The `height` argument for `blockchain.transaction.get_merkle` is now optional. (related: Compressed block headers & merkle proofs over low bandwidth communications #43)
- A `mode` argument was added to `blockchain.estimatefee`. (see allow passing estimate_mode to estimatefee kyuupichan/electrumx#1001)
- `blockchain.scripthash.get_mempool` previously did not define an order for mempool transactions. We now mandate a specific ordering.

Re pagination of long histories: