-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optionally displays vin.prevOut {...} for searchrawtransactions API only. #487
Conversation
@@ -314,6 +314,6 @@ func (b *BlockChain) FetchTransactionStore(tx *btcutil.Tx) (TxStore, error) { | |||
// the main chain without including fully spent trasactions in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment should be updated to reflect reality. The flag controls the behavior now.
I've reviewed about half of this so far and want to say very nice work on keeping with the style, adding the appropriate additional tests, etc. One thing I'd like to request is that the pull request also updates the searchrawtransactions section of the JSON-RPC API documentation. Also, I'm not sure the extra flag to specify whether or not to add the extra prevOut information is needed. The more important part, in my opinion, was that @Roasbeef: Can you review this as well since it's working in the address index area. |
@davecgh hey thanks. This is my first excursion with both go and btcd, so I'm just trying to go with the flow. I'll make the changes you requested. I wasn't aware of the json-rpc docs. thx for the pointer. I added the prevout option for a few reasons (1) backwards compatibility. (2) pushback about the feature. (3) so end-user can avoid the extra processing and bytes if they don't need those values. In practice it doesn't seem to add much processing overhead when the flag is on, but I would need to test with some large transactions with many non-coinbase inputs to get good perf numbers. I can remove the flag if that is the consensus. I haven't seen other API's offering that flag. |
hmm, there seems to be a serious performance problem with my implementation when querying certain addresses with many large transactions. Without prevOut time btcctl searchrawtransactions 1XPTgDRhN8RFnzniWCddobD9iKZatrvH4 1 0 100000 > /tmp/laz2 real 0m2.721s $ grep -c blockhash /tmp/laz1 3320 With prevOut $time btcctl searchrawtransactions 1XPTgDRhN8RFnzniWCddobD9iKZatrvH4 1 0 100000 1 > /tmp/laz1 real 8m38.345s $ grep -c blockhash /tmp/laz2 3320 So, ~2 seconds vs ~8 minutes. Ouch! Other addresses don't seem to have much of a hit. without prevOut $ time btcctl searchrawtransactions 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa 1 0 100000 > /tmp/gen2 real 0m2.334s $ grep -c blockhash /tmp/gen2 986 with prevOut $ time btcctl searchrawtransactions 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa 1 0 100000 1 > /tmp/gen1 real 0m4.156s $ grep -c blockhash /tmp/gen1 986 So ~2 seconds vs ~4 seconds. not [as] horrible. The first address had 3320 transactions and the second one had 986, but the time difference for the prevOut case was enormous: ~8 minutes vs ~4 secs. I am guessing that the first had a lot more inputs. A strange thing is that dstat reports writes of 30-60Mb/sec and 0 reads while the API is running. The system has free mem and does not appear to be swapping. I will review the code some more and read up on golang profiling. helpful suggestions appreciated.... |
Without digging into the specifics a little more I can't say for certain, but I would wager the reason that address is taking so long is because it has a ton of inputs and you're hitting one of the main reasons for PR #380. To summarize, currently the blocks are stored in leveldb and therefore in order to get a specific transaction means leveldb has to load the entire block since it doesn't provide any way to reach into a specific offset of its values. That particular address is referencing thousands of transactions which in turn means it's having to load thousands of blocks in order to get at them. That problem is fixed with #380 since it stores the blocks on the file system instead and therefore allows any piece of a block, whether it be a transaction, header, script, etc, without having to load the entire thing. That said, while it will certainly be significantly faster with that PR, it's still going to take a little longer to get data for addresses which involve thousands of transactions like the one in question since they will involve thousands of disk seeks to load the relevant transactions. I've done a large amount of testing in regards to this particular topic and I'd be willing to bet the performance will be more than acceptable once #380 is in even without any other changes to this PR since I've loaded 100+ thousand transactions across as many blocks with the new code on an old Hitachi 7200RPM disk in a few seconds versus the minutes it takes now to do the same. |
ok, I found the source of slowdown. as usual PEBKAC. ;-) My code was stupidly doing the DB lookup for all transaction inputs for every input, when it only needs to do it once. After correcting this, the time went from ~8 minutes to ~6 seconds. And the prevout case is approx 3x slower than without prevout. Which makes sense because it is doing extra DB work. without prevOut btcd@btc2:~$ time btcctl searchrawtransactions 1XPTgDRhN8RFnzniWCddobD9iKZatrvH4 1 0 100000 > /tmp/laz-orig real 0m2.762s with prevOut btcd@btc2:~$ time btcctl searchrawtransactions 1XPTgDRhN8RFnzniWCddobD9iKZatrvH4 1 0 100000 1 > /tmp/laz-prevout-cache real 0m6.244s This is an address with 3000+ transactions and LOTS of inputs (haven't counted). So I think that is reasonable. But the approx 3x slowdown is an argument for keeping the vinextra flag, for folks that don't need the extra data. I'll update the pull request code in a bit. |
That would certainly do it! What I said above still applies though, so once that other PR and the requisite supporting code is in, this will benefit from it too. |
ok, I think its ready for review. Let me know if you want vinextra flag removed. as stated above, I think there is some value to it, as performance can be worse when loading transactions with lots of inputs. ( eg ~6.2 seconds vs ~2.7 seconds, as stated above. ) |
|
||
// we don't call fetchInputTransactions() here because it may not be | ||
// necessary so we defer until it is. ( lazy load ) | ||
var txStore *blockchain.TxStore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since blockchain.TxStore
is already a reference type, there is no need to make this a pointer. Also, why not just do the input fetch here when vinExtra
is set rather than checking on every loop iteration? Something like:
var txStore blockchain.TxStore
if vinExtra == 1 {
tx := btcutil.NewTx(mtx)
txStoreNew, err := s.server.txMemPool.fetchInputTransactions(tx, true)
if err == nil {
txStore = txStoreNew
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I figured there must be a cleaner way. thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, I was doing the fetchInputTransactions() in the loop as an optimization to avoid making that call at all if all inputs are coinbase and don't have prevOuts. Admittedly this is not the common case and probably rare-ish, but can happen for miners; especially p2pool miners typically have lots of coinbase tx.
I suppose though that in that case, the DB lookup will be pretty fast because it finds no prev transactions. So I went ahead and moved it to the top as per @davecgh's suggestion.
ok. ready! |
vinList := make([]btcjson.VinPrevOut, len(mtx.TxIn)) | ||
|
||
var txStore blockchain.TxStore | ||
if vinExtra == 1 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought, this (and the other one below) should be != 0
. 0 is false, non-zero is true. With this check as it is, specifying 2 would be treated as false which wouldn't make much sense.
That said, I actually would rather the parameter be a boolean. I know the existing Verbose
field is an int
, but that is only because it is intentionally being compatible with the Bitcoin Core searchrawtranscations
RPC call which for some reason uses an integer instead of a JSON bool. Since this is extended in a custom way for btcd, I'd rather use the right type for the job.
EDIT: However, I'm a little torn because it would be inconsistent too. So I suppose leaving it an int
for consistency is a stronger case. At any rate, it should be checking for != 0
.
@davecgh , I incorporated your suggested code. thanks. btw, I went through the same thought process with int vs bool. I first implemented it as bool but then changed to int for consistency. |
oh, I tested that the latest code returns correct data for both a coinbase tx and prevOut tx. Also, tested with flag set to 0,1,2 and omitted. |
Thanks for the updates. This looks good now code wise to me. Also, before this can be merged, we'll need to prepare a separate PR for btcrpcclient which updates it to use the new return type and offer the new flag. Also, please rebase and squash this to a single commit which follows the model git commit message format. OK pending the separate PR for |
@@ -632,12 +632,12 @@ The following is an overview of the RPC methods which are implemented by btcd, b | |||
| | | | |||
|---|---| | |||
|Method|searchrawtransactions| | |||
|Parameters|1. address (string, required) - bitcoin address <br /> 2. verbose (int, optional, default=true) - specifies the transaction is returned as a JSON object instead of hex-encoded string <br />3. skip (int, optional, default=0) - the number of leading transactions to leave out of the final response <br /> 4. count (int, optional, default=100) - the maximum number of transactions to return| | |||
|Parameters|1. address (string, required) - bitcoin address <br /> 2. verbose (int, optional, default=true) - specifies the transaction is returned as a JSON object instead of hex-encoded string <br />3. skip (int, optional, default=0) - the number of leading transactions to leave out of the final response <br /> 4. count (int, optional, default=100) - the maximum number of transactions to return <br /> 5. (vinextra numeric, optional, default=0) - Specify that extra data from previous output will be returned in vin| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To maintain consistency with the existing doc entries, the documentation for this new parameter should read:
5. vinextra (int, optional, default=0) - specify that extra data from previous output will be returned in vin
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch. Agreed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
fwiw, I had it the way it was because i copied the exact output from btcctl help searchrawtransactions, which prints numeric instead of int. ie:
Arguments: 1. address (string, required) The Bitcoin address to search for 2. verbose (numeric, optional, default=1) Specifies the transaction is returned as a JSON object instead of hex -encoded string 3. skip (numeric, optional, default=0) The number of leading transactions to leave out of the final response 4. count (numeric, optional, default=100) The maximum number of transactions to return 5. vinextra (numeric, optional, default=0) Specify that extra data from previous output will be returned in vin
b8a61a9
to
43774fe
Compare
requested changes made. squashed and rebased. should be good to go... I don't know anything about btcrpcclient or what changes will be required there. I definitely would prefer if one of you with experience can do it (probably much faster). otherwise will look into it when I have some more free time... |
Thanks. I'll make a |
Created btcsuite/btcrpcclient#61 to coincide with this change. |
For background on this pull request, see #485.
Implementation overview
Testing done
I've run a few different addresses against it on both testnet and mainnet, without problems. Also ran goclean.sh script and all tests pass.
Commentary
This code is working for my application's use case. Hopefully it can be incorporated into mainline.
I have one additional searchrawtransctions tweak to add for my use case, which is filtering out inputs/outputs that do not match the input address. This should result in a large performance gain and reduction in unnecessary data sent over the wire. I plan to submit that as a separate pull request when ready.
Usage
Example Output, case: flag vinextra = 1
Example Output, case: flag vinextra = 0 ( or omitted )