feat(network): improve efficiency of known peers handling #2074

Merged: 2 commits merged on Mar 1, 2024

onur-ozkan
Member

@onur-ozkan onur-ozkan commented Feb 26, 2024

Fixes #2073

cc @KomodoPlatform/qa

@artemii235
Member

@onur-ozkan @smk762 Have you already confirmed that #2073 is really fixed in this PR branch?

@shamardy
Collaborator

Have you already confirmed that #2073 is really fixed in this PR branch?

It would be good to add a test for stats collection rpc (if possible) as a test is required for any hotfixes to avoid future regressions.

@onur-ozkan
Member Author

onur-ozkan commented Feb 27, 2024

request-response from libp2p doesn't let you communicate with the peers that aren't in your connection/network already (which makes sense). So if you want to collect information from the peer, it should be either present in the hard-coded seed list, or in the mm2 configuration field "seednodes". And when they exceed the capacity, they don't get connected to your node.

@shamardy
Collaborator

shamardy commented Feb 27, 2024

request-response from libp2p doesn't let you communicate with the peers that aren't in your connection/network already (which makes sense). So if you want to collect information from the peer, it should be either present in the hard-coded seed list, or in the mm2 configuration field "seednodes". And when they exceed the capacity, they don't get connected to your node.

As I remember, I modified this behaviour a bit in the original PR by adding a reserved peer store; ref. #1026 (comment).

Signed-off-by: onur-ozkan <work@onurozkan.dev>
@onur-ozkan changed the title from "chore(network): increase mesh capacity configuration" to "feat(network): bootstrap swarm dial when AddReservedPeer invoked" on Feb 27, 2024
@smk762

smk762 commented Feb 27, 2024

I've retested this both locally and on the stats collection node (inside docker). Neither would return results for all registered nodes, and with each restart the successful subset would change.
[5 images]

To eliminate this being due to the target seednodes not accepting connections due to being at their limit, would they need to be updated also for this PR to take effect?

@onur-ozkan
Member Author

To eliminate this being due to the target seednodes not accepting connections due to being at their limit, would they need to be updated also for this PR to take effect?

They don't need the update. Errors should disappear after a couple of seconds and not occur again. Are they continuously logging errors?

@smk762

smk762 commented Feb 28, 2024

The list populates early on, then previously connected nodes lose connection just after the 2 minute mark.

peers.mp4

The stats collection interval is 10 seconds, as is the refresh interval of the results shown on the right, which use the query

SELECT t1.* FROM stats_nodes t1
INNER JOIN
(
    SELECT name, MAX(timestamp) AS max_timestamp
    FROM stats_nodes
    GROUP BY name
) t2 ON t1.name = t2.name AND t1.timestamp = t2.max_timestamp ORDER BY t1.name;

to return the latest result for each registered node, sorted by name.

The stats collection node in the video is not using docker. All registered nodes are present in the MM2.json seednodes config. I left the scanner running, and it is still only returning the few seednodes displayed at the end of the above video.

[image]

and again after an hour.
[image]

@onur-ozkan
Member Author

Are you using this (c16a9eb) version?

I tested the following nodes multiple times (copied from https://github.com/smk762/kmd_ntx_stats_docker/blob/master/code/scripts/collect_seednode_stats.py#L48):

declare -A nodes=(
    ['chmex_EU']='{"IP":"1.eu.seed.adex.dexstats.info","PeerID":"12D3KooWGP4ryfJHXjfnbXUWP6FJeDLiif8jMT8obQvCKMSPUB8X"}'
    ['chmex_NA']='{"IP":"1.na.seed.adex.dexstats.info","PeerID":"12D3KooWDNUgDwAAuJbyoS5DiRbhvMSwrUh1yepKsJH8URcFwPp3"}'
    ['chmex_SH']='{"IP":"1.sh.seed.adex.dexstats.info","PeerID":"12D3KooWE8Ju9SZyZrfkUgi25gFKv1Yc6zcQZ5GXtEged8rmLW3t"}'
    ['cipi_AR']='{"IP":"cipi_ar.cipig.net","PeerID":"12D3KooWMsfmq3bNNPZTr7HdhTQvxovuR1jo5qvM362VQZorTk3F"}'
    ['cipi_EU']='{"IP":"cipi_eu.cipig.net","PeerID":"12D3KooWBhGrTVfaK9v12eA3Et84Y8Bc6ixfZVVGShsad2GBWzm3"}'
    ['cipi_NA']='{"IP":"cipi_na.cipig.net","PeerID":"12D3KooWBoQYTPf4q2bnsw8fUA2LKoknccVLrAcF1caCa48ev8QU"}'
    ['caglarkaya_EU']='{"IP":"eu.caglarkaya.net","PeerID":"12D3KooWEg7MBp1P9k9rYVBcW5pa8tsHhyE5UuGAAerCARLzZBPn"}'
    ['computergenie_EU']='{"IP":"cgeu.computergenie.gay","PeerID":"12D3KooWGkPFi43Nq6cAcc3gib1iECZijnKZLgEf1q1MBRKLczJF"}'
    ['computergenie_NA']='{"IP":"cg.computergenie.gay","PeerID":"12D3KooWCJWT5PAG1jdYHyMnoDcxBKMpPrUVi9gwSvVLjLUGmtQg"}'
    ['dragonhound_AR']='{"IP":"ar.smk.dog","PeerID":"12D3KooWSUABQ2beSQW2nXLiqn4DtfXyqbJQDd2SvmgoVwXjrd9c"}'
    ['dragonhound_DEV']='{"IP":"dev.smk.dog","PeerID":"12D3KooWEnrvbqvtTowYMR8FnBeKtryTj9RcXGx8EPpFZHou2ruP"}'
    ['dragonhound_EU']='{"IP":"s7eu.smk.dog","PeerID":"12D3KooWDgFfyAzbuYNLMzMaZT9zBJX9EHd38XLQDRbNDYAYqMzd"}'
    ['dragonhound_NA']='{"IP":"s7na.smk.dog","PeerID":"12D3KooWSmizY35qrfwX8qsuo8H8qrrvDjXBTMRBfeYsRQoybHaA"}'
    ['fediakash_AR']='{"IP":"fediakash.mooo.com","PeerID":"12D3KooWCSidNncnbDXrX5G6uWdFdCBrMpaCAqtNxSyfUcZgwF7t"}'
    ['gcharang_DEV']='{"IP":"mm-dev.lordofthechains.com","PeerID":"12D3KooWMEwnQMPUHcGw65xMmhs1Aoc8WSEfCqTa9fFx2Y3PM9xg"}'
    ['gcharang_SH']='{"IP":"mm-sh.lordofthechains.com","PeerID":"12D3KooWHAk9eJ78pwbopZMeHMhCEhXbph3CJ8Hbz5L1KWTmPf8C"}'
    ['gcharang_AR']='{"IP":"mm-ar.lordofthechains.com","PeerID":"12D3KooWDsFMoRoL5A4ii3UonuQZ9Ti2hrc7PpytRrct2Fg8GRq9"}'
    ['mcrypt_SH']='{"IP":"mcrypt2.v6.rocks","PeerID":"12D3KooWCDAPYXtNzC3x9kYuZySSf1WtxjGgasxapHEdFWs8Bep3"}'
    ['nodeone_NA']='{"IP":"nodeone.computergenie.gay","PeerID":"12D3KooWBTNDr6ih5efzVSxXtDv9wcVxHNj8RCvUnpKfKb6eUYet"}'
    ['sheeba_SH']='{"IP":"sheeba.computergenie.gay","PeerID":"12D3KooWC1P69a5TwpNisZYBXRgkrJDjGfn4QZ2L4nHZDGjcdR2N"}'
    ['smdmitry_AR']='{"IP":"mm2-smdmitry-ar.smdmitry.com","PeerID":"12D3KooWJ3dEWK7ym1uwc5SmwbmfFSRmELrA9aPJYxFRrQCCNdwF"}'
    ['smdmitry2_AR']='{"IP":"mm2-smdmitry2-ar.smdmitry.com","PeerID":"12D3KooWEpiMuCc47cYUXiLY5LcEEesREUNpZXF6KZA8jmFgxAeE"}'
    ['smdmitry_EU']='{"IP":"mm2-smdmitry-eu.smdmitry.com","PeerID":"12D3KooWJTYiU9CqVyycpMnGC96WyP1GE62Ng5g93AUe9wRx5g7W"}'
    ['smdmitry_SH']='{"IP":"mm2-smdmitry-sh.smdmitry.com","PeerID":"12D3KooWQP7PNNX5DSyhPX5igPQKQhet4KX7YaDqiGuNnarr4vRX"}'
    ['strob_SH']='{"IP":"sh.strobfx.com","PeerID":"12D3KooWFY5TmKpusUJ3jJBYK4va8xQchnJ6yyxCD7wZ2pWVK23p"}'
    ['tonyl_AR']='{"IP":"ar.farting.pro","PeerID":"12D3KooWEMTeavnNtPPYr1u4aPFB6U39kdMD32SU1EpHGWqMpUJk"}'
    ['tonyl_DEV']='{"IP":"dev.farting.pro","PeerID":"12D3KooWDubAUWDP2PgUXHjEdN3SGnkszcyUgahALFvaxgp9Jcyt"}'
    ['van_EU']='{"IP":"van.computergenie.gay","PeerID":"12D3KooWMX4hEznkanh4bTShzCZNx8JJkvGLETYtdVw8CWSaTUfQ"}'
    ['webworker01_EU']='{"IP":"eu2.webworker.sh","PeerID":"12D3KooWGF5siktvWLtXoRKgbzPYHn4rib9Fu8HHJEECRcNbNoAs"}'
    ['webworker01_NA']='{"IP":"na2.webworker.sh","PeerID":"12D3KooWRiv4gFUUSy2772YTagkZYdVkjLwiXkdcrtDQQuEqQaJ9"}'
    ['who-biz_NA']='{"IP":"adex.blur.cash","PeerID":"12D3KooWQp97gsRE5LbcUPjZcP7N6qqk2YbxJmPRUDeKVM5tbcQH"}'
)

DNS resolution doesn't work for some of them (an entirely different issue, not related to us). The other nodes I can add to mm2 and start collecting version stats from.

These are the ones consistently failing with every request:

28 06:43:22, mm2_main::mm2::lp_stats:343] ERROR Node smdmitry2_AR responded to version request with error: Error on request the peer PeerId("12D3KooWEpiMuCc47cYUXiLY5LcEEesREUNpZXF6KZA8jmFgxAeE"): "Canceled". Request next peer
28 06:43:22, mm2_main::mm2::lp_stats:343] ERROR Node tonyl_AR responded to version request with error: Error on request the peer PeerId("12D3KooWEMTeavnNtPPYr1u4aPFB6U39kdMD32SU1EpHGWqMpUJk"): "Canceled". Request next peer
28 06:43:22, mm2_main::mm2::lp_stats:343] ERROR Node tonyl_DEV responded to version request with error: Error on request the peer PeerId("12D3KooWDubAUWDP2PgUXHjEdN3SGnkszcyUgahALFvaxgp9Jcyt"): "Canceled". Request next peer
28 06:43:22, mm2_main::mm2::lp_stats:343] ERROR Node gcharang_AR responded to version request with error: Error on request the peer PeerId("12D3KooWDsFMoRoL5A4ii3UonuQZ9Ti2hrc7PpytRrct2Fg8GRq9"): "Canceled". Request next peer
28 06:43:22, mm2_main::mm2::lp_stats:343] ERROR Node gcharang_DEV responded to version request with error: Error on request the peer PeerId("12D3KooWMEwnQMPUHcGw65xMmhs1Aoc8WSEfCqTa9fFx2Y3PM9xg"): "Canceled". Request next peer
28 06:43:22, mm2_main::mm2::lp_stats:343] ERROR Node who-biz_NA responded to version request with error: Error on request the peer PeerId("12D3KooWQp97gsRE5LbcUPjZcP7N6qqk2YbxJmPRUDeKVM5tbcQH"): "Canceled". Request next peer
28 06:43:22, mm2_main::mm2::lp_stats:343] ERROR Node chmex_EU responded to version request with error: Error on request the peer PeerId("12D3KooWGP4ryfJHXjfnbXUWP6FJeDLiif8jMT8obQvCKMSPUB8X"): "Canceled". Request next peer
28 06:43:22, mm2_main::mm2::lp_stats:343] ERROR Node gcharang_SH responded to version request with error: Error on request the peer PeerId("12D3KooWHAk9eJ78pwbopZMeHMhCEhXbph3CJ8Hbz5L1KWTmPf8C"): "Canceled". Request next peer
28 06:43:22, mm2_main::mm2::lp_stats:343] ERROR Node caglarkaya_EU responded to version request with error: Error on request the peer PeerId("12D3KooWEg7MBp1P9k9rYVBcW5pa8tsHhyE5UuGAAerCARLzZBPn"): "Canceled". Request next peer
28 06:43:22, mm2_main::mm2::lp_stats:343] ERROR Node chmex_SH responded to version request with error: Error on request the peer PeerId("12D3KooWE8Ju9SZyZrfkUgi25gFKv1Yc6zcQZ5GXtEged8rmLW3t"): "Canceled". Request next peer
28 06:43:22, mm2_main::mm2::lp_stats:343] ERROR Node chmex_NA responded to version request with error: Error on request the peer PeerId("12D3KooWDNUgDwAAuJbyoS5DiRbhvMSwrUh1yepKsJH8URcFwPp3"): "Canceled". Request next peer
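
For anyone who wants to pull the failing node names out of a log like the one above, here is a minimal sketch. It assumes the log lines follow exactly the format quoted above; `failing_nodes` is a hypothetical helper name, not something that ships with mm2.

```shell
# Sketch: list the node names found in
# "ERROR Node <name> responded to version request with error" log lines.
# Assumption: the log format matches the lines quoted in this thread.
failing_nodes() {
    # Reads log text on stdin and prints each failing node name once, sorted.
    sed -n 's/.*ERROR Node \([^ ]*\) responded to version request with error.*/\1/p' | sort -u
}
```

It could be used as `failing_nodes < mm2.log`, where `mm2.log` stands for wherever the mm2 output was saved.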

It's highly likely that there is an issue with their deployment.

I even excluded all the other nodes and tried these failing nodes alone a couple of times:

    ['gcharang_AR']='{"IP":"mm-ar.lordofthechains.com","PeerID":"12D3KooWDsFMoRoL5A4ii3UonuQZ9Ti2hrc7PpytRrct2Fg8GRq9"}'
    ['chmex_NA']='{"IP":"1.na.seed.adex.dexstats.info","PeerID":"12D3KooWDNUgDwAAuJbyoS5DiRbhvMSwrUh1yepKsJH8URcFwPp3"}'
    ['tonyl_DEV']='{"IP":"dev.farting.pro","PeerID":"12D3KooWDubAUWDP2PgUXHjEdN3SGnkszcyUgahALFvaxgp9Jcyt"}'
    ['who-biz_NA']='{"IP":"adex.blur.cash","PeerID":"12D3KooWQp97gsRE5LbcUPjZcP7N6qqk2YbxJmPRUDeKVM5tbcQH"}'
    ['chmex_EU']='{"IP":"1.eu.seed.adex.dexstats.info","PeerID":"12D3KooWGP4ryfJHXjfnbXUWP6FJeDLiif8jMT8obQvCKMSPUB8X"}'
    ['gcharang_SH']='{"IP":"mm-sh.lordofthechains.com","PeerID":"12D3KooWHAk9eJ78pwbopZMeHMhCEhXbph3CJ8Hbz5L1KWTmPf8C"}'
    ['smdmitry2_AR']='{"IP":"mm2-smdmitry2-ar.smdmitry.com","PeerID":"12D3KooWEpiMuCc47cYUXiLY5LcEEesREUNpZXF6KZA8jmFgxAeE"}'
    ['tonyl_AR']='{"IP":"ar.farting.pro","PeerID":"12D3KooWEMTeavnNtPPYr1u4aPFB6U39kdMD32SU1EpHGWqMpUJk"}'
    ['gcharang_DEV']='{"IP":"mm-dev.lordofthechains.com","PeerID":"12D3KooWMEwnQMPUHcGw65xMmhs1Aoc8WSEfCqTa9fFx2Y3PM9xg"}'
    ['caglarkaya_EU']='{"IP":"eu.caglarkaya.net","PeerID":"12D3KooWEg7MBp1P9k9rYVBcW5pa8tsHhyE5UuGAAerCARLzZBPn"}'
    ['chmex_SH']='{"IP":"1.sh.seed.adex.dexstats.info","PeerID":"12D3KooWE8Ju9SZyZrfkUgi25gFKv1Yc6zcQZ5GXtEged8rmLW3t"}'

But we can't connect to them; therefore, requests fail.

For the rest of the nodes, it works flawlessly all the time. Please note that some of the initial requests might fail, because the connection dials are just starting up. However, this is temporary and only lasts for a very short time (around 3-5 seconds).

@onur-ozkan
Member Author

onur-ozkan commented Feb 28, 2024

When you invoke the start_version_stat_collection RPC, you will see logs like these:

[image]

If a peer doesn't appear in the connection logs (Adding peer ...), there is some issue with it: you won't be able to connect to it, and all requests to it will fail continuously.
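
A rough sketch of that check, for a larger node list. The exact log line is only visible in the screenshot, so this assumes that a connection line contains the literal text "Adding peer" together with the peer ID; `missing_peers` and both file arguments are hypothetical.

```shell
# Sketch: print the expected peer IDs that never appear in an "Adding peer" log line.
# Assumption: a connection log line contains "Adding peer" and the peer ID.
# Arguments: $1 = saved mm2 log file, $2 = file with one expected peer ID per line.
missing_peers() {
    local log_file="$1" peers_file="$2" peer_id
    while IFS= read -r peer_id || [ -n "$peer_id" ]; do
        # Skip empty lines in the peer list.
        [ -n "$peer_id" ] || continue
        if ! grep -F "Adding peer" "$log_file" | grep -qF "$peer_id"; then
            printf '%s\n' "$peer_id"
        fi
    done < "$peers_file"
}
```

Any peer ID this prints would be one to look at on the deployment side first.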

@onur-ozkan
Member Author

All registered nodes are present in MM2.json seednodes config.

This isn't needed btw.

@smk762

smk762 commented Feb 28, 2024

Are you using this (c16a9eb) version?

Confirmed, recent video below. The seednodes param was removed from MM2.json. Running on a server in the CLI to eliminate any "bad connection" or docker effects. Nodes respond at first; then, just after the 3 min mark, most of them drop.

seednodes.mp4

I'll extend the query time to a minute (was 10 sec) and retry.

@onur-ozkan
Member Author

I'll extend the query time to a minute (was 10 sec) and retry.

I did a test with a 1-second query interval and ran it for an hour; I didn't see any problems such as dropped connections.

@onur-ozkan
Member Author

I did another test by adding the following nodes using the add_node_to_version_stat RPC:

declare -A nodes=(
	['fr1.cipig.net']='{"IP":"fr1.cipig.net","PeerID":"12D3KooWEaZpH61H4yuQkaNG5AsyGdpBhKRppaLdAY52a774ab5u"}'
	['icefyre.dragon-seed.com']='{"IP":"icefyre.dragon-seed.com","PeerID":"12D3KooWJDoV9vJdy6PnzwVETZ3fWGMhV41VhSbocR1h2geFqq9Y"}'
	['kalessin.dragon-seed.com']='{"IP":"kalessin.dragon-seed.com","PeerID":"12D3KooWPR2RoPi19vQtLugjCdvVmCcGLP2iXAzbDfP3tp81ZL4d"}'
	['balerion.dragon-seed.com']='{"IP":"balerion.dragon-seed.com","PeerID":"12D3KooWJWBnkVsVNjiqUEPjLyHpiSmQVAJ5t6qt1Txv5ctJi9Xd"}'
	['smaug.dragon-seed.com']='{"IP":"smaug.dragon-seed.com","PeerID":"12D3KooWEWzbYcosK2JK9XpFXzumfgsWJW1F7BZS15yLTrhfjX2Z"}'
	['falkor.dragon-seed.com']='{"IP":"falkor.dragon-seed.com","PeerID":"12D3KooWMrjLmrv8hNgAoVf1RfumfjyPStzd4nv5XL47zN4ZKisb"}'
	['drogon.dragon-seed.com']='{"IP":"drogon.dragon-seed.com","PeerID":"12D3KooWSmEi8ypaVzFA1AGde2RjxNW5Pvxw3qa2fVe48PjNs63R"}'
	['rhaegal.dragon-seed.com']='{"IP":"rhaegal.dragon-seed.com","PeerID":"12D3KooWAToxtunEBWCoAHjefSv74Nsmxranw8juy3eKEdrQyGRF"}'
	['viserion.dragon-seed.com']='{"IP":"viserion.dragon-seed.com","PeerID":"12D3KooWHKkHiNhZtKceQehHhPqwU5W1jXpoVBgS1qst899GjvTm"}'
)

The initial logs (the errors occurred because the connection dial was just started and not properly completed yet):

[image]

After that, here's how it proceeded:
[image]

Sent over 3000 requests and there hasn't been a single failure so far:

[image]

@smk762

smk762 commented Feb 28, 2024

Same drop after 3 min at 1 min intervals.

3mins.mp4

Perhaps I'm getting issues due to a larger set of registered nodes? I tried 1 sec intervals and it seemed to last longer, perhaps acting like a "keepalive" at that frequency, though it will result in a significantly larger MM2.db over the course of a notary season.

"Real world" conditions, I'd be running on a 15 min loop to check 4 times an hour to confirm a notary is eligible for a score for that hour by returning the correct version.

Is there a time period after which connected peers will update / change?

Here's my list of nodes for easier import:


./add_node_to_version_stat.sh alien_EU alien-eu.techloverhd.com 12D3KooWSCmjGYjmjEEiMYZyCZVuEYmGQCAtrMdpWcGSbGG39aHv
./add_node_to_version_stat.sh alien_NA alien-na.techloverhd.com 12D3KooWA9bym7s8gMdPVHcX872yjrz6Sq5rjpZAKBVFyoeWpJie
./add_node_to_version_stat.sh alien_SH alien-sh.techloverhd.com 12D3KooWBcVknefLZ3ZEfbFUHzfB2HzUjW4WLVDTe7TBqPmap9Cy
./add_node_to_version_stat.sh alienx_NA alienx-na.techloverhd.com 12D3KooWBXS7vcjYGQ5vy7nZj65FicpdxXsavPdLYB8gN7Ai3ruA
./add_node_to_version_stat.sh blackice_AR shadowbit-ar.mm2.kmd.sh 12D3KooWShhz3vfTqUXXVb9ivHeGBEEeMJvoda2ta8CVMhrX8RbZ
./add_node_to_version_stat.sh blackice_DEV shadowbit-dev.mm2.kmd.sh 12D3KooWDDZiyNn92StCdKXLLdxuYmkjJGPL5ezzyiJ2YVLMK56N
./add_node_to_version_stat.sh blackice_EU shadowbit-eu.mm2.kmd.sh 12D3KooWBT1UXwjqyavsDTVgWGeJkvrr8QgMScKpJF4oTLLgSk7k
./add_node_to_version_stat.sh chmex_AR 1.ar.seed.adex.dexstats.info 12D3KooWD3uwYqzDygMvU3jaJozEXfZiiRFnkVVwUgpu9kGqa5Yg
./add_node_to_version_stat.sh chmex_EU 1.eu.seed.adex.dexstats.info 12D3KooWGP4ryfJHXjfnbXUWP6FJeDLiif8jMT8obQvCKMSPUB8X
./add_node_to_version_stat.sh chmex_NA 1.na.seed.adex.dexstats.info 12D3KooWDNUgDwAAuJbyoS5DiRbhvMSwrUh1yepKsJH8URcFwPp3
./add_node_to_version_stat.sh chmex_SH 1.sh.seed.adex.dexstats.info 12D3KooWE8Ju9SZyZrfkUgi25gFKv1Yc6zcQZ5GXtEged8rmLW3t
./add_node_to_version_stat.sh cipi_AR cipi-ar.cipig.net 12D3KooWMsfmq3bNNPZTr7HdhTQvxovuR1jo5qvM362VQZorTk3F
./add_node_to_version_stat.sh cipi_EU cipi-eu.cipig.net 12D3KooWBhGrTVfaK9v12eA3Et84Y8Bc6ixfZVVGShsad2GBWzm3
./add_node_to_version_stat.sh cipi_NA cipi-na.cipig.net 12D3KooWBoQYTPf4q2bnsw8fUA2LKoknccVLrAcF1caCa48ev8QU
./add_node_to_version_stat.sh caglarkaya_EU eu.caglarkaya.net 12D3KooWEg7MBp1P9k9rYVBcW5pa8tsHhyE5UuGAAerCARLzZBPn
./add_node_to_version_stat.sh computergenie_EU cgeu.computergenie.gay 12D3KooWGkPFi43Nq6cAcc3gib1iECZijnKZLgEf1q1MBRKLczJF
./add_node_to_version_stat.sh computergenie_NA cg.computergenie.gay 12D3KooWCJWT5PAG1jdYHyMnoDcxBKMpPrUVi9gwSvVLjLUGmtQg
./add_node_to_version_stat.sh dragonhound_AR ar.smk.dog 12D3KooWSUABQ2beSQW2nXLiqn4DtfXyqbJQDd2SvmgoVwXjrd9c
./add_node_to_version_stat.sh dragonhound_DEV dev.smk.dog 12D3KooWEnrvbqvtTowYMR8FnBeKtryTj9RcXGx8EPpFZHou2ruP
./add_node_to_version_stat.sh dragonhound_EU s7eu.smk.dog 12D3KooWDgFfyAzbuYNLMzMaZT9zBJX9EHd38XLQDRbNDYAYqMzd
./add_node_to_version_stat.sh dragonhound_NA s7na.smk.dog 12D3KooWSmizY35qrfwX8qsuo8H8qrrvDjXBTMRBfeYsRQoybHaA
./add_node_to_version_stat.sh fediakash_AR fediakash.mooo.com 12D3KooWCSidNncnbDXrX5G6uWdFdCBrMpaCAqtNxSyfUcZgwF7t
./add_node_to_version_stat.sh gcharang_DEV mm-dev.lordofthechains.com 12D3KooWMEwnQMPUHcGw65xMmhs1Aoc8WSEfCqTa9fFx2Y3PM9xg
./add_node_to_version_stat.sh gcharang_SH mm-sh.lordofthechains.com 12D3KooWHAk9eJ78pwbopZMeHMhCEhXbph3CJ8Hbz5L1KWTmPf8C
./add_node_to_version_stat.sh gcharang_AR mm-ar.lordofthechains.com 12D3KooWDsFMoRoL5A4ii3UonuQZ9Ti2hrc7PpytRrct2Fg8GRq9
./add_node_to_version_stat.sh mcrypt_SH mcrypt2.v6.rocks 12D3KooWCDAPYXtNzC3x9kYuZySSf1WtxjGgasxapHEdFWs8Bep3
./add_node_to_version_stat.sh nodeone_NA nodeone.computergenie.gay 12D3KooWBTNDr6ih5efzVSxXtDv9wcVxHNj8RCvUnpKfKb6eUYet
./add_node_to_version_stat.sh sheeba_SH sheeba.computergenie.gay 12D3KooWC1P69a5TwpNisZYBXRgkrJDjGfn4QZ2L4nHZDGjcdR2N
./add_node_to_version_stat.sh smdmitry_AR mm2-smdmitry-ar.smdmitry.com 12D3KooWJ3dEWK7ym1uwc5SmwbmfFSRmELrA9aPJYxFRrQCCNdwF
./add_node_to_version_stat.sh smdmitry2_AR mm2-smdmitry2-ar.smdmitry.com 12D3KooWEpiMuCc47cYUXiLY5LcEEesREUNpZXF6KZA8jmFgxAeE
./add_node_to_version_stat.sh smdmitry_EU mm2-smdmitry-eu.smdmitry.com 12D3KooWJTYiU9CqVyycpMnGC96WyP1GE62Ng5g93AUe9wRx5g7W
./add_node_to_version_stat.sh smdmitry_SH mm2-smdmitry-sh.smdmitry.com 12D3KooWQP7PNNX5DSyhPX5igPQKQhet4KX7YaDqiGuNnarr4vRX
./add_node_to_version_stat.sh strob_SH sh.strobfx.com 12D3KooWFY5TmKpusUJ3jJBYK4va8xQchnJ6yyxCD7wZ2pWVK23p
./add_node_to_version_stat.sh tonyl_AR ar.farting.pro 12D3KooWEMTeavnNtPPYr1u4aPFB6U39kdMD32SU1EpHGWqMpUJk
./add_node_to_version_stat.sh tonyl_DEV dev.farting.pro 12D3KooWDubAUWDP2PgUXHjEdN3SGnkszcyUgahALFvaxgp9Jcyt
./add_node_to_version_stat.sh van_EU van.computergenie.gay 12D3KooWMX4hEznkanh4bTShzCZNx8JJkvGLETYtdVw8CWSaTUfQ
./add_node_to_version_stat.sh webworker01_EU eu2.webworker.sh 12D3KooWGF5siktvWLtXoRKgbzPYHn4rib9Fu8HHJEECRcNbNoAs
./add_node_to_version_stat.sh webworker01_NA na2.webworker.sh 12D3KooWRiv4gFUUSy2772YTagkZYdVkjLwiXkdcrtDQQuEqQaJ9
./add_node_to_version_stat.sh who-biz_NA adex.blur.cash 12D3KooWQp97gsRE5LbcUPjZcP7N6qqk2YbxJmPRUDeKVM5tbcQH
./add_node_to_version_stat.sh viserion viserion.dragon-seed.com 12D3KooWHKkHiNhZtKceQehHhPqwU5W1jXpoVBgS1qst899GjvTm
./add_node_to_version_stat.sh rhaegal rhaegal.dragon-seed.com 12D3KooWAToxtunEBWCoAHjefSv74Nsmxranw8juy3eKEdrQyGRF
./add_node_to_version_stat.sh drogon drogon.dragon-seed.com 12D3KooWSmEi8ypaVzFA1AGde2RjxNW5Pvxw3qa2fVe48PjNs63R
./add_node_to_version_stat.sh falkor falkor.dragon-seed.com 12D3KooWMrjLmrv8hNgAoVf1RfumfjyPStzd4nv5XL47zN4ZKisb
./add_node_to_version_stat.sh smaug smaug.dragon-seed.com 12D3KooWEWzbYcosK2JK9XpFXzumfgsWJW1F7BZS15yLTrhfjX2Z
./add_node_to_version_stat.sh balerion balerion.dragon-seed.com 12D3KooWJWBnkVsVNjiqUEPjLyHpiSmQVAJ5t6qt1Txv5ctJi9Xd
./add_node_to_version_stat.sh kalessin kalessin.dragon-seed.com 12D3KooWPR2RoPi19vQtLugjCdvVmCcGLP2iXAzbDfP3tp81ZL4d
./add_node_to_version_stat.sh icefyre icefyre.dragon-seed.com 12D3KooWJDoV9vJdy6PnzwVETZ3fWGMhV41VhSbocR1h2geFqq9Y
./add_node_to_version_stat.sh fr1.cipig.net fr1.cipig.net 12D3KooWEaZpH61H4yuQkaNG5AsyGdpBhKRppaLdAY52a774ab5u

and the import shell script:

source userpass
source rpc
curl --url "$url:$port" --data "{
    \"mmrpc\": \"2.0\",
    \"method\":\"add_node_to_version_stat\",
    \"userpass\":\"$userpass\",
    \"params\":{\"name\": \"${1}\",
    \"address\": \"${2}\",
    \"peer_id\": \"${3}\"}
}"
echo ""
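
As a possible variation on the script above, the whole list could be imported in one run from a file instead of one invocation per node. This is only a sketch: `build_payload`, `import_nodes` and the `nodes.txt` file (one "name address peer_id" triple per line) are hypothetical, and `url`, `port` and `userpass` are assumed to be set by the same `userpass` and `rpc` files as above.

```shell
# Sketch: build the add_node_to_version_stat request body for one node.
# Arguments: name, address, peer_id. The values are assumed to need no JSON escaping.
build_payload() {
    printf '{"mmrpc":"2.0","method":"add_node_to_version_stat","userpass":"%s","params":{"name":"%s","address":"%s","peer_id":"%s"}}' \
        "$userpass" "$1" "$2" "$3"
}

# Sketch: register every node listed in the file given as $1.
import_nodes() {
    local name address peer_id
    while read -r name address peer_id || [ -n "$name" ]; do
        # Skip empty lines in the node list.
        [ -n "$name" ] || continue
        curl --silent --url "$url:$port" --data "$(build_payload "$name" "$address" "$peer_id")"
        echo ""
    done < "$1"
}
```

After sourcing `userpass` and `rpc`, this would be run as `import_nodes nodes.txt`.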

@onur-ozkan
Member Author

Perhaps I'm getting issues due to a larger set of registered nodes? I tried 1 sec intervals and it seemed to last longer, perhaps acting like a "keepalive" at that frequency, though it will result in a significantly larger MM2.db over the course of a notary season.

I don't think so. I also did a test at a 10-second interval with

declare -A nodes=(
	['fr1.cipig.net']='{"IP":"fr1.cipig.net","PeerID":"12D3KooWEaZpH61H4yuQkaNG5AsyGdpBhKRppaLdAY52a774ab5u"}'
	['icefyre.dragon-seed.com']='{"IP":"icefyre.dragon-seed.com","PeerID":"12D3KooWJDoV9vJdy6PnzwVETZ3fWGMhV41VhSbocR1h2geFqq9Y"}'
	['kalessin.dragon-seed.com']='{"IP":"kalessin.dragon-seed.com","PeerID":"12D3KooWPR2RoPi19vQtLugjCdvVmCcGLP2iXAzbDfP3tp81ZL4d"}'
	['balerion.dragon-seed.com']='{"IP":"balerion.dragon-seed.com","PeerID":"12D3KooWJWBnkVsVNjiqUEPjLyHpiSmQVAJ5t6qt1Txv5ctJi9Xd"}'
	['smaug.dragon-seed.com']='{"IP":"smaug.dragon-seed.com","PeerID":"12D3KooWEWzbYcosK2JK9XpFXzumfgsWJW1F7BZS15yLTrhfjX2Z"}'
	['falkor.dragon-seed.com']='{"IP":"falkor.dragon-seed.com","PeerID":"12D3KooWMrjLmrv8hNgAoVf1RfumfjyPStzd4nv5XL47zN4ZKisb"}'
	['drogon.dragon-seed.com']='{"IP":"drogon.dragon-seed.com","PeerID":"12D3KooWSmEi8ypaVzFA1AGde2RjxNW5Pvxw3qa2fVe48PjNs63R"}'
	['rhaegal.dragon-seed.com']='{"IP":"rhaegal.dragon-seed.com","PeerID":"12D3KooWAToxtunEBWCoAHjefSv74Nsmxranw8juy3eKEdrQyGRF"}'
	['viserion.dragon-seed.com']='{"IP":"viserion.dragon-seed.com","PeerID":"12D3KooWHKkHiNhZtKceQehHhPqwU5W1jXpoVBgS1qst899GjvTm"}'
)

and none of them failed.

Maybe try testing only these nodes at a 10-second interval and see whether you run into any trouble? I'm starting to think that there are connection issues on your server. If you can, please also remove the database before the test.

@smk762

smk762 commented Feb 28, 2024

Most errors look like they are due to "28 12:47:13, mm2_p2p::behaviours::request_response:349] WARN Request 12 timed out". What is the timeout set to? Can we increase it?

@smk762

smk762 commented Feb 28, 2024

I set up everything on a different server for a comparison test. Initially, with only the 9 hardcoded seednodes registered, everything was responding. I registered the 30 notary seeds (in the same session), and strangely, none of the new ones would respond, and only 2/9 hardcoded seeds were still responding.

I restarted with a reduced set of 26 nodes I know have responded at least once during testing, as below:


./add_node_to_version_stat.sh alien_EU alien-eu.techloverhd.com 12D3KooWSCmjGYjmjEEiMYZyCZVuEYmGQCAtrMdpWcGSbGG39aHv
./add_node_to_version_stat.sh alien_NA alien-na.techloverhd.com 12D3KooWA9bym7s8gMdPVHcX872yjrz6Sq5rjpZAKBVFyoeWpJie
./add_node_to_version_stat.sh alien_SH alien-sh.techloverhd.com 12D3KooWBcVknefLZ3ZEfbFUHzfB2HzUjW4WLVDTe7TBqPmap9Cy
./add_node_to_version_stat.sh alienx_NA alienx-na.techloverhd.com 12D3KooWBXS7vcjYGQ5vy7nZj65FicpdxXsavPdLYB8gN7Ai3ruA
./add_node_to_version_stat.sh blackice_AR shadowbit-ar.mm2.kmd.sh 12D3KooWShhz3vfTqUXXVb9ivHeGBEEeMJvoda2ta8CVMhrX8RbZ
./add_node_to_version_stat.sh blackice_EU shadowbit-eu.mm2.kmd.sh 12D3KooWBT1UXwjqyavsDTVgWGeJkvrr8QgMScKpJF4oTLLgSk7k
./add_node_to_version_stat.sh cipi_AR cipi-ar.cipig.net 12D3KooWMsfmq3bNNPZTr7HdhTQvxovuR1jo5qvM362VQZorTk3F
./add_node_to_version_stat.sh cipi_EU cipi-eu.cipig.net 12D3KooWBhGrTVfaK9v12eA3Et84Y8Bc6ixfZVVGShsad2GBWzm3
./add_node_to_version_stat.sh cipi_NA cipi-na.cipig.net 12D3KooWBoQYTPf4q2bnsw8fUA2LKoknccVLrAcF1caCa48ev8QU
./add_node_to_version_stat.sh dragonhound_AR ar.smk.dog 12D3KooWSUABQ2beSQW2nXLiqn4DtfXyqbJQDd2SvmgoVwXjrd9c
./add_node_to_version_stat.sh dragonhound_DEV dev.smk.dog 12D3KooWEnrvbqvtTowYMR8FnBeKtryTj9RcXGx8EPpFZHou2ruP
./add_node_to_version_stat.sh dragonhound_EU s7eu.smk.dog 12D3KooWDgFfyAzbuYNLMzMaZT9zBJX9EHd38XLQDRbNDYAYqMzd
./add_node_to_version_stat.sh dragonhound_NA s7na.smk.dog 12D3KooWSmizY35qrfwX8qsuo8H8qrrvDjXBTMRBfeYsRQoybHaA
./add_node_to_version_stat.sh fediakash_AR fediakash.mooo.com 12D3KooWCSidNncnbDXrX5G6uWdFdCBrMpaCAqtNxSyfUcZgwF7t
./add_node_to_version_stat.sh smdmitry_AR mm2-smdmitry-ar.smdmitry.com 12D3KooWJ3dEWK7ym1uwc5SmwbmfFSRmELrA9aPJYxFRrQCCNdwF
./add_node_to_version_stat.sh smdmitry_EU mm2-smdmitry-eu.smdmitry.com 12D3KooWJTYiU9CqVyycpMnGC96WyP1GE62Ng5g93AUe9wRx5g7W
./add_node_to_version_stat.sh smdmitry_SH mm2-smdmitry-sh.smdmitry.com 12D3KooWQP7PNNX5DSyhPX5igPQKQhet4KX7YaDqiGuNnarr4vRX
./add_node_to_version_stat.sh viserion viserion.dragon-seed.com 12D3KooWHKkHiNhZtKceQehHhPqwU5W1jXpoVBgS1qst899GjvTm
./add_node_to_version_stat.sh rhaegal rhaegal.dragon-seed.com 12D3KooWAToxtunEBWCoAHjefSv74Nsmxranw8juy3eKEdrQyGRF
./add_node_to_version_stat.sh drogon drogon.dragon-seed.com 12D3KooWSmEi8ypaVzFA1AGde2RjxNW5Pvxw3qa2fVe48PjNs63R
./add_node_to_version_stat.sh falkor falkor.dragon-seed.com 12D3KooWMrjLmrv8hNgAoVf1RfumfjyPStzd4nv5XL47zN4ZKisb
./add_node_to_version_stat.sh smaug smaug.dragon-seed.com 12D3KooWEWzbYcosK2JK9XpFXzumfgsWJW1F7BZS15yLTrhfjX2Z
./add_node_to_version_stat.sh balerion balerion.dragon-seed.com 12D3KooWJWBnkVsVNjiqUEPjLyHpiSmQVAJ5t6qt1Txv5ctJi9Xd
./add_node_to_version_stat.sh kalessin kalessin.dragon-seed.com 12D3KooWPR2RoPi19vQtLugjCdvVmCcGLP2iXAzbDfP3tp81ZL4d
./add_node_to_version_stat.sh icefyre icefyre.dragon-seed.com 12D3KooWJDoV9vJdy6PnzwVETZ3fWGMhV41VhSbocR1h2geFqq9Y
./add_node_to_version_stat.sh fr1.cipig.net fr1.cipig.net 12D3KooWEaZpH61H4yuQkaNG5AsyGdpBhKRppaLdAY52a774ab5u

After initially populating the fresh DB, only 2 nodes that would be expected to be responsive were returning an error. A couple of minutes later, 2 more that had been responding initially were returning an error (4 total). After a couple more minutes, only 2 nodes were still returning a successful response. After 5-10 minutes, this remained unchanged: the same 2 nodes were still responding, and the rest were not.

I still believe the number of registered nodes could be a factor, and I don't think that exclusively testing the hardcoded seeds is an ideal test case for the intended use of this method.

Signed-off-by: onur-ozkan <work@onurozkan.dev>
@onur-ozkan changed the title from "feat(network): bootstrap swarm dial when AddReservedPeer invoked" to "feat(network): improve efficiency of known peers handling" on Feb 28, 2024
@onur-ozkan
Member Author

onur-ozkan commented Feb 28, 2024

064ee80 should improve peer connection handling so much that you shouldn't even notice connection drops (the peers will reconnect immediately), even if the server has a slow connection.

@smk762 smk762 self-requested a review February 29, 2024 06:22

@smk762 smk762 left a comment

Thanks! I can confirm that with the latest commit, all 26 registered nodes that are properly configured return a successful response within 1 minute of the stats loop starting, and the connections remain persistent, still returning successful responses without issue after an hour.

@shamardy shamardy merged commit 3df0e3f into dev Mar 1, 2024
26 of 32 checks passed
@shamardy shamardy deleted the increase-mesh-capacity branch March 1, 2024 13:41
dimxy added a commit to dimxy/komodo-defi-framework that referenced this pull request Mar 13, 2024
* dev:
  feat(indexeddb): advanced cursor filtering impl (KomodoPlatform#2066)
  update dockerhub destination repository (KomodoPlatform#2082)
  feat(event streaming): configurable worker path, use SharedWorker (KomodoPlatform#2080)
  fix(hd_tests): fix test_hd_utxo_tx_history unit test (KomodoPlatform#2078)
  feat(network): improve efficiency of known peers handling (KomodoPlatform#2074)
  feat(nft): enable eth with non fungible tokens (KomodoPlatform#2049)
  feat(ETH transport & heartbeats): various enhancements/features (KomodoPlatform#2058)
4 participants