Documentation about peer/traversal stats #67

Open
ooninoob opened this issue Nov 24, 2022 · 4 comments

Comments

@ooninoob

Hello! Thank you for maintaining this project and publishing it as FLOSS. I've been experimenting with using it as part of the OONI project to measure Mainline DHT censorship.

However, it is not clear to me precisely what the exposed stats actually mean. For example, running the provided CLI tool returns the following:


$ ./dht get-peers
{
  "Peers": [],
  "DistinctPeerIps": 0,
  "TraversalStats": {
    "NumAddrsTried": 31,
    "NumResponses": 0
  },
  "ServerStats": {
    "GoodNodes": 2,
    "Nodes": 3,
    "OutstandingTransactions": 0,
    "SuccessfulOutboundAnnouncePeerQueries": 0,
    "BadNodes": 0,
    "OutboundQueriesAttempted": 31
  }
}

In this example, we can see NumResponses is zero, but in server stats GoodNodes is 2. This does not match my current mental model.

In a test case I wrote, I start a local dht.Server without any bootstrap nodes, then try to connect to it from another dht.Server (with only the first one as a bootstrap node). After an Announce, announce.TraversalStats() reports 1 peer tried, 0 responded, and announce.Peers is empty. Yet I can see using tcpdump that the first peer did respond to the query.

Trying with the default set of bootstrap nodes (13 addresses), I get 19 NumResponses (NumAddrsTried 46) and announce.Peers contains 22 entries.

It's not clear to me what NumResponses and the entries in announce.Peers are. Is the first one the number of traversed peers queried about the specific infohash, and the second the number of found peers who announce that infohash? If so, is NumResponses only counting replies from peers announcing that infohash?

To get more comprehensive results, maybe I should call get_peers recursively myself? In the meantime, I feel it's worth explaining more about what the exposed stats/data actually mean.

Thanks for your attention

@anacrolix
Owner

Thanks for the report, I'll take a deeper look soon. Without digging in, you might want to check that the node ID is "secure"; I can't remember whether that's done for you or requires an extra step during DHT server configuration.

@anacrolix
Owner

> However, it is not clear to me precisely what the exposed stats actually mean. For example, running the provided CLI tool returns the following:
>
> $ ./dht get-peers
> {
>   "Peers": [],
>   "DistinctPeerIps": 0,
>   "TraversalStats": {
>     "NumAddrsTried": 31,
>     "NumResponses": 0
>   },
>   "ServerStats": {
>     "GoodNodes": 2,
>     "Nodes": 3,
>     "OutstandingTransactions": 0,
>     "SuccessfulOutboundAnnouncePeerQueries": 0,
>     "BadNodes": 0,
>     "OutboundQueriesAttempted": 31
>   }
> }
>
> In this example, we can see NumResponses is zero, but in server stats GoodNodes is 2. This does not match my current mental model.

In this case, the --info-hash argument to get-peers was not given, so the Server was requesting the infohash 0000000000000000000000000000000000000000, which most implementations ignore. I've made the info-hash mandatory in 5c1bf59. On top of that, the get-peers traversal wrapper (and that one only!) was using an old code path that didn't propagate the ResponseFrom field, which is used to count NumResponses. I've fixed this oversight in 3231c91.
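To illustrate the kind of check that avoids the all-zero request, here is a minimal sketch of validating an infohash argument before starting a traversal. The helper name parseInfoHash is hypothetical and is not part of the library or the CLI; the sketch only assumes that an infohash is 20 bytes written as 40 hex characters.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
)

// parseInfoHash is a hypothetical helper, not part of anacrolix/dht.
// It decodes a 40-character hex string into a 20-byte infohash and
// rejects the all-zero value.
func parseInfoHash(s string) ([20]byte, error) {
	var ih [20]byte
	if len(s) != 2*len(ih) {
		return ih, fmt.Errorf("infohash must be %d hex characters, got %d", 2*len(ih), len(s))
	}
	if _, err := hex.Decode(ih[:], []byte(s)); err != nil {
		return ih, fmt.Errorf("invalid hex in infohash: %w", err)
	}
	if ih == ([20]byte{}) {
		return ih, errors.New("infohash is all zeroes; most nodes will ignore it")
	}
	return ih, nil
}

func main() {
	for _, arg := range []string{
		"631a31dd0a46257d5078c0dee4e66e26f73e42ac",
		"0000000000000000000000000000000000000000",
	} {
		if _, err := parseInfoHash(arg); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", arg)
	}
}
```

Failing early on a missing or zero infohash makes an empty result easier to interpret, since it can no longer be caused by the request itself being ignored.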

> In a test case I wrote, I start a local dht.Server without any bootstrap nodes, then try to connect to it from another dht.Server (with only the first one as a bootstrap node). After an Announce, announce.TraversalStats() reports 1 peer tried, 0 responded, and announce.Peers is empty. Yet I can see using tcpdump that the first peer did respond to the query.

This might run differently with the fixes above.

> Trying with the default set of bootstrap nodes (13 addresses), I get 19 NumResponses (NumAddrsTried 46) and announce.Peers contains 22 entries.
>
> It's not clear to me what NumResponses and the entries in announce.Peers are. Is the first one the number of traversed peers queried about the specific infohash, and the second the number of found peers who announce that infohash? If so, is NumResponses only counting replies from peers announcing that infohash?

NumResponses is the count of response-type replies to queries sent in the traversal, that is, replies for which this field is set non-nil:

// This is set non-nil if a query reply is a response-type as defined by the DHT BEP 5 (contains
// "r")
ResponseFrom *krpc.NodeInfo

The get_peers operation returns addresses of peers that have announced themselves for the target infohash.
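The two statements above describe separate things, which a small sketch may make clearer. It assumes a simplified, hypothetical reply type whose fields mirror the BEP 5 dictionary keys ("y", "values", "nodes"); it does not use the library's real types. A reply counts as a response whenever it is of type "r", whether or not it carries peers, while the peer list only grows when a response includes "values".

```go
package main

import "fmt"

// reply is a simplified, hypothetical view of a decoded KRPC message.
// Field names follow the BEP 5 dictionary keys, not the library's types.
type reply struct {
	Y      string   // "r" for a response, "e" for an error
	Values []string // peer addresses, present only if the node knows peers
	Nodes  []string // closer nodes, typically returned when no peers are known
}

// tally counts response-type replies and collects peers separately.
func tally(replies []reply) (numResponses int, peers []string) {
	for _, r := range replies {
		if r.Y != "r" {
			continue // error replies are not counted as responses
		}
		numResponses++
		peers = append(peers, r.Values...)
	}
	return numResponses, peers
}

func main() {
	replies := []reply{
		{Y: "r", Nodes: []string{"203.0.113.7:6881"}},    // response without peers
		{Y: "r", Values: []string{"198.51.100.4:51413"}}, // response with one peer
		{Y: "e"}, // error reply
	}
	n, peers := tally(replies)
	fmt.Println("responses:", n, "peers:", len(peers)) // prints "responses: 2 peers: 1"
}
```

Under this reading, a traversal can report many responses and still return no peers, if none of the responding nodes has stored an announce for the infohash.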

> To get more comprehensive results, maybe I should call get_peers recursively myself? In the meantime, I feel it's worth explaining more about what the exposed stats/data actually mean.

That shouldn't be necessary. I think if you run the above get-peers cmdline invocation with the fixes (go get github.com/anacrolix/dht/v2@master) it should make more sense now. Let me know if I've missed anything.

@ooninoob
Author

ooninoob commented Nov 30, 2022

Thanks for being so quick about this!

> I think if you run the above get-peers cmdline invocation with the fixes (go get github.com/anacrolix/dht/v2@master) it should make more sense now.

The data looks better:


$ ./dht get-peers --info-hash 631a31dd0a46257d5078c0dee4e66e26f73e42ac
2022-11-30T12:00:17+0100 NIL [main.main.func7:138]: public ip: "80.67.176.183"
2022-11-30T12:00:17+0100 NIL [main.main.func7:145]: dht server on [::]:60181 with id f85bc430872ceda4f3ce067caadc2bba3d8c8b5b
2022-11-30T12:00:29+0100 NIL [main.GetPeers:53]: *dht.Announce 0xc0001b41b0 of 631a31dd0a46257d5078c0dee4e66e26f73e42ac on dht server on [::]:60181 (node id f85bc430872ceda4f3ce067caadc2bba3d8c8b5b) contacted 43 nodes
{
  "Peers": [
    {
      "Addr": "185.148.3.176:50516",
      "Frequency": 2
    },
    {
      "Addr": "70.66.134.138:6881",
      "Frequency": 3
    },
    {
      "Addr": "208.78.42.171:7926",
      "Frequency": 3
    },
    {
      "Addr": "208.78.42.171:22084",
      "Frequency": 3
    },
    {
      "Addr": "208.78.42.171:36491",
      "Frequency": 3
    },
    {
      "Addr": "85.73.138.103:33839",
      "Frequency": 4
    },
    {
      "Addr": "185.101.35.79:18542",
      "Frequency": 4
    },
    {
      "Addr": "198.167.193.189:50000",
      "Frequency": 4
    },
    {
      "Addr": "212.233.32.215:19152",
      "Frequency": 4
    },
    {
      "Addr": "98.19.178.189:28311",
      "Frequency": 6
    }
  ],
  "DistinctPeerIps": 8,
  "TraversalStats": {
    "NumAddrsTried": 43,
    "NumResponses": 23
  },
  "ServerStats": {
    "GoodNodes": 8,
    "Nodes": 8,
    "OutstandingTransactions": 0,
    "SuccessfulOutboundAnnouncePeerQueries": 0,
    "BadNodes": 0,
    "OutboundQueriesAttempted": 43
  }
}

However, it's not entirely clear to me why Peers contains 10 entries but the server stats' Nodes is 8. Is it because Nodes/GoodNodes only account for "distinct peer IPs"? I also see that SuccessfulOutboundAnnouncePeerQueries is 0, but from a quick look at the code it seems it's only used when announcing to the DHT an infohash we have, right? On the other hand, from a quick read it's not clear to me what "OutstandingTransactions" is.

> This might run differently with the fixes above.

The OONI probe test case produces the same kind of results as before:

{
  "queries": null,
  "runs": [
    {
      "bootstrap_nodes": [
        "[::]:51248"
      ],
      "bootstrap_num": 1,
      "peers_tried_num": 1,
      "peers_responded_num": 0,
      "infohash_peers_num": 0,
      "infohash_peers": null,
      "failure": "No DHT peers were found"
    }
  ],
  "failure": "No DHT peers were found",
  "failed_step": "run"
}

> NumResponses is the count of the response-type replies to queries sent in the traversal, non-nil here (...) The get_peers operation returns addresses of peers that have announced themselves for target infohash.

These two sentences sound contradictory to me. Is NumResponses supposed to be incremented for every response to traversal queries, or only for responses matching the target infohash? For the moment, my test bootstrap node does not announce any infohash, so maybe that's why it says 0 responses. However, I've just tried running an AnnounceTraversal for the target infohash on the bootstrap node before connecting to it from the probe node, and that produces the same results.

That's also strange because in both cases I can see in Wireshark that the bootstrap node produces what seems to be a non-empty response (and for sure non-nil and containing "r"). I've attached the Wireshark dump so you can take a look if you have time. It contains 3 query/response pairs, the last of which is made after I tried to announce the target infohash from the bootstrap node. I had to rename it to .txt so that GitHub would allow the upload; renaming it to .pcapng should make it open in Wireshark.

dht.txt

Of course this is not an important matter, as the code still produces some values for real-world usage (and detects blocking successfully), so I could just ignore the broken test case. Don't let it bother you if you have a lot to deal with. Also, if you would rather I carefully study the source code and read the Kademlia paper, that's a valid response to my requests... I'm just taking a shortcut (because I know nothing about DHT beyond a high-level overview) by asking you questions, as you are more familiar with the matter.

EDIT: I added a BitTorrent test to the OONI probe and it also fails (here) at the metainfo query. Could it be that bootstrap nodes are ignored as part of the announce response because they're usually just a bootstrap node and not an actual peer? I'll provide more info/logs over there...

@anacrolix
Owner

> However, it's not entirely clear to me why Peers contains 10 entries but the server stats' Nodes is 8. Is it because Nodes/GoodNodes only account for "distinct peer IPs"? I also see that SuccessfulOutboundAnnouncePeerQueries is 0, but from a quick look at the code it seems it's only used when announcing to the DHT an infohash we have, right? On the other hand, from a quick read it's not clear to me what "OutstandingTransactions" is.

Peers are addresses returned by nodes close to the target infohash. Nodes/GoodNodes are actual DHT participants that have been added to our server's routing table.
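As an illustration that the peer figures are independent of the routing table, the DistinctPeerIps value of 8 in the output quoted earlier can be reproduced from the Peers list alone. This is a sketch under the assumption that the field counts unique IP addresses regardless of port; the function distinctIPs is hypothetical and not the library's implementation.

```go
package main

import (
	"fmt"
	"net/netip"
)

// samplePeers are the ten addresses from the get-peers output in this thread.
var samplePeers = []string{
	"185.148.3.176:50516", "70.66.134.138:6881",
	"208.78.42.171:7926", "208.78.42.171:22084", "208.78.42.171:36491",
	"85.73.138.103:33839", "185.101.35.79:18542",
	"198.167.193.189:50000", "212.233.32.215:19152", "98.19.178.189:28311",
}

// distinctIPs counts unique IP addresses among "ip:port" strings,
// ignoring the port. Entries that fail to parse are skipped.
func distinctIPs(addrs []string) int {
	seen := make(map[netip.Addr]struct{})
	for _, a := range addrs {
		ap, err := netip.ParseAddrPort(a)
		if err != nil {
			continue
		}
		seen[ap.Addr()] = struct{}{}
	}
	return len(seen)
}

func main() {
	// 208.78.42.171 appears on three ports, so 10 addresses collapse to 8 IPs.
	fmt.Println(len(samplePeers), "peer addresses,", distinctIPs(samplePeers), "distinct IPs")
	// prints "10 peer addresses, 8 distinct IPs"
}
```

That Nodes is also 8 in that run appears to be a coincidence: Nodes counts routing-table entries, while the 8 here comes only from deduplicating peer addresses.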

SuccessfulOutboundAnnouncePeerQueries should only occur during an announce.

OutstandingTransactions are transactions that haven't timed out, and are still being considered by a traversal. It should be zero unless you terminate a query in the middle of something and don't give it time to clean up.

If any of the above aren't documented on their associated structs, they should be 😬.
