You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Describe the bug
A couple of miners experienced that deals with the same client suddenly started to stop in "StorageDealsTransferring" with 0 bytes transferred, after successfully sealing deals from the client over many days. This issue is based on all the feedback and logs from both miners and the client in the Slack-thread https://filecoinproject.slack.com/archives/C01AZP8BKRQ/p1609164686470600
On the miner side
lotus-miner storage-deals list -v
Dec 28 14:47:07 bafyreigpl5vm64hhowjc2g5e7reppu6hngme5ff4lkaa3ykd6h2r5zveza 0 StorageDealTransferring f3sar6tat2mhqadlqjjxewmpqya3whpkzuqaevpo5vwfvdhnm3uztd7p67d6ozpsfkiwvrvqrhp3zu6psi23ma 16GiB 0 FIL 1052529 12D3KooWSDhUpT9YMLxRTEAf3maSR34M64yYm4XBHKeYhwtbaUo7-12D3KooWMpuRzgVuRS6m285cucGgnXWwTnsNtGzjsMJeEqS7UtJw-426
Data-Transfers shows that the transfer is ongoing, but with 0 bytes transferred:
lotus-miner data-transfers list
ID Status Receiving From Root Cid Initiated? Transferred Voucher
426 Ongoing ...hwtbaUo7 ...peoy6mfa N 0B ...7reppu6hngme5ff4lkaa3ykd6h2r5zveza"}}
My miner has a connection with the client:
lotus-miner net connect /ip4/X.--..--.XX/tcp/36359/p2p/12D3KooWSDhUpT9YMLxRTEAf3maSR34M64yYm4XBHKeYhwtbaUo7
connect 12D3KooWSDhUpT9YMLxRTEAf3maSR34M64yYm4XBHKeYhwtbaUo7: success
If I try to restart the channel I can see in the logs that the channel has timed out:
Client Deals Dec 28 21:47:07 bafyreigpl5vm64hhowjc2g5e7reppu6hngme5ff4lkaa3ykd6h2r5zveza 0 f023467 StorageDealStartDataTransfer N N baga6ea4seaqg3tsrp56bchrhumenu4cjnl57cpxudfiggettmlejqggtwf74coa 15.88 GiB 0 FIL 1052529 true
Client Transfers 426 Requested ...EqS7UtJw ...peoy6mfa Y 0B ...7reppu6hngme5ff4lkaa3ykd6h2r5zveza"}}
The client checked his network bandwidth and didn't have any problems with wget or scp to another machine. But there are no clear lotus transfers in nethogs.
Client also checked his socket limit:
use ulimit -n
get 1024
Then is use netstat |wc
There are about 850 connects. So it may be socket connections number limit.
Then I revised the limit to 64435 and restart servers.
Using nethogs command, I find the send rate of lotus reach 10M/s. It seems ok.
Then I check the transfers, I find the max transfer id reach 655, but the previous max transfer id is 428. I didn't add any new deals in these times.
Possible problems might be:
Socket Connection Limit
Lotus Client
To Reproduce
Steps to reproduce the behavior:
Client send deals to miners over a longer period of time
Miners are able to seal the deals for a longer period of time, and then suddenly at almost the same time multiple miners experience that deals stop in "StorageDealsTransferring" state
Version (run lotus version):
Daemon: 1.4.0+git.e9989d0e4+api1.0.0
Local: lotus version 1.4.0+git.e9989d0e4
Additional context
Ongoing discussion is also happening in the slack-channel. And this issue also seems to be related to #4946 & #5211
The text was updated successfully, but these errors were encountered:
After experiencing another batch of deals stuck in "StorageDealsTransferring"-state with the same client, I think we have pin-pointed to at least one reason why this happens.
I restarted both my lotus daemon/miner due to some maintenance, and when I got the next storage-deal proposal after restarting my miner, the deal got stuck in "StorageDealsTransferring". To solve the problem the client also restarted his lotus-process, and then the transfers started.
So steps to reproduce:
Client sends deal to a miner.Transfers work normally, and the miner seals the deal.
Miner restarts the miner. Client send a new deal after miner is back online.
Describe the bug
A couple of miners experienced that deals with the same client suddenly started to stop in "StorageDealsTransferring" with 0 bytes transferred, after successfully sealing deals from the client over many days. This issue is based on all the feedback and logs from both miners and the client in the Slack-thread https://filecoinproject.slack.com/archives/C01AZP8BKRQ/p1609164686470600
On the miner side
Data-Transfers shows that the transfer is ongoing, but with 0 bytes transferred:
My miner has a connection with the client:
If I try to restart the channel I can see in the logs that the channel has timed out:
On the client-side
Client Deals
Dec 28 21:47:07 bafyreigpl5vm64hhowjc2g5e7reppu6hngme5ff4lkaa3ykd6h2r5zveza 0 f023467 StorageDealStartDataTransfer N N baga6ea4seaqg3tsrp56bchrhumenu4cjnl57cpxudfiggettmlejqggtwf74coa 15.88 GiB 0 FIL 1052529 true
Client Transfers
426 Requested ...EqS7UtJw ...peoy6mfa Y 0B ...7reppu6hngme5ff4lkaa3ykd6h2r5zveza"}}
The client checked his network bandwidth and didn't have any problems with wget or scp to another machine. But there are no clear lotus transfers in nethogs.
Client also checked his socket limit:
Possible problems might be:
To Reproduce
Steps to reproduce the behavior:
Version (run
lotus version
):Daemon: 1.4.0+git.e9989d0e4+api1.0.0
Local: lotus version 1.4.0+git.e9989d0e4
Additional context
Ongoing discussion is also happening in the slack-channel. And this issue also seems to be related to #4946 & #5211
The text was updated successfully, but these errors were encountered: