"too many open files" without doing anything #5739

Open
fiatjaf opened this issue Nov 5, 2018 · 26 comments
Labels
kind/bug A bug in existing code (including security flaws)

Comments

@fiatjaf

fiatjaf commented Nov 5, 2018

Version information:

go-ipfs version: 0.4.18-
Repo version: 7
System version: amd64/linux
Golang version: go1.11.1

Type:

Bug

Description:

I start my daemon and see this:

Swarm listening on /p2p-circuit
Swarm announcing /ip4/10.147.17.230/tcp/4001
Swarm announcing /ip4/10.147.20.230/tcp/4001
Swarm announcing /ip4/127.0.0.1/tcp/4001
Swarm announcing /ip4/191.249.146.196/tcp/17619
Swarm announcing /ip4/192.168.15.5/tcp/4001
Swarm announcing /ip6/2804:7f2:2080:2760:9e2a:70ff:fe8a:769/tcp/4001
Swarm announcing /ip6/2804:7f2:2080:2760:fce1:6fc7:5cdb:4739/tcp/4001
Swarm announcing /ip6/::1/tcp/4001
Swarm announcing /ip6/fc99:d06b:a6f9:f0:ab56::1/tcp/4001
API server listening on /ip4/0.0.0.0/tcp/5001
Gateway (readonly) server listening on /ip4/127.0.0.1/tcp/7070
Daemon is ready
2018/11/05 09:58:41 http: Accept error: accept tcp4 0.0.0.0:5001: accept4: too many open files; retrying in 5ms
2018/11/05 09:58:41 http: Accept error: accept tcp4 0.0.0.0:5001: accept4: too many open files; retrying in 10ms
09:58:56.900 ERROR     engine: open /home/fiatjaf/.ipfs/datastore/004166.ldb: too many open files engine.go:258
2018/11/05 09:59:15 http: Accept error: accept tcp4 0.0.0.0:5001: accept4: too many open files; retrying in 5ms
2018/11/05 09:59:15 http: Accept error: accept tcp4 0.0.0.0:5001: accept4: too many open files; retrying in 10ms
09:59:27.073 ERROR     engine: open /home/fiatjaf/.ipfs/datastore/004166.ldb: too many open files engine.go:258
09:59:31.985 ERROR     engine: open /home/fiatjaf/.ipfs/datastore/004166.ldb: too many open files engine.go:258
09:59:31.987 ERROR     engine: open /home/fiatjaf/.ipfs/datastore/004166.ldb: too many open files engine.go:258
09:59:31.989 ERROR     engine: open /home/fiatjaf/.ipfs/datastore/004166.ldb: too many open files engine.go:258
09:59:31.991 ERROR     engine: open /home/fiatjaf/.ipfs/datastore/004166.ldb: too many open files engine.go:258
09:59:31.992 ERROR     engine: open /home/fiatjaf/.ipfs/datastore/004163.ldb: too many open files engine.go:258
09:59:31.994 ERROR     engine: open /home/fiatjaf/.ipfs/datastore/004163.ldb: too many open files engine.go:258
09:59:45.244 ERROR     engine: open /home/fiatjaf/.ipfs/datastore/004166.ldb: too many open files engine.go:258
09:59:45.247 ERROR     engine: open /home/fiatjaf/.ipfs/datastore/004166.ldb: too many open files engine.go:258
09:59:45.248 ERROR     engine: open /home/fiatjaf/.ipfs/datastore/004166.ldb: too many open files engine.go:258
09:59:53.314 ERROR     engine: open /home/fiatjaf/.ipfs/datastore/004166.ldb: too many open files engine.go:258
10:00:01.211 ERROR     engine: open /home/fiatjaf/.ipfs/datastore/004159.ldb: too many open files engine.go:258
10:00:01.398 ERROR     engine: open /home/fiatjaf/.ipfs/datastore/004159.ldb: too many open files engine.go:258
10:00:01.504 ERROR     engine: open /home/fiatjaf/.ipfs/datastore/004160.ldb: too many open files engine.go:258
10:00:03.630 ERROR     engine: open /home/fiatjaf/.ipfs/datastore/004166.ldb: too many open files engine.go:258
10:00:03.637 ERROR     engine: open /home/fiatjaf/.ipfs/datastore/004166.ldb: too many open files engine.go:258
10:00:03.656 ERROR     engine: open /home/fiatjaf/.ipfs/datastore/004163.ldb: too many open files engine.go:258
10:00:05.864 ERROR     engine: open /home/fiatjaf/.ipfs/datastore/004157.ldb: too many open files engine.go:258
10:00:09.517 ERROR     engine: open /home/fiatjaf/.ipfs/datastore/004166.ldb: too many open files engine.go:258

And it goes on forever.
The only command I've run is ipfs files ls /, but I don't know what my node is doing, whether it is serving some files or something else; maybe it is. Is it failing to accept a connection from my own CLI?

I don't know, but this seems related to ipfs/ipfs-companion#614 and #5738

@magik6k
Member

magik6k commented Nov 5, 2018

Can you try running export DPID=$(pidof ipfs); watch -n0 'printf "sockets: %s\nleveldb: %s\nflatfs: %s\n" $(ls /proc/${DPID}/fd/ -l | grep "socket:" | wc -l) $(ls /proc/${DPID}/fd/ -l | grep "\\/datastore\\/" | wc -l) $(ls /proc/${DPID}/fd/ -l | grep "\\/blocks\\/" | wc -l); netstat -anpt 2>/dev/null | grep "$DPID/ipfs" | sort -k6 | column -N "a,b,c,d,e,f,g" -J | jq ".table[].f" --raw-output | uniq -c'
and see what consumes most of the FDs?

@fiatjaf
Author

fiatjaf commented Nov 5, 2018

My column program doesn't have those options (apparently it is a BSD version, even though I'm on Ubuntu) and apparently can't output JSON. Where can I get a version like yours?

@eingenito
Contributor

Mine doesn't either. You can try just:
export DPID=$(pidof ipfs); watch -n0 'printf "sockets: %s\nleveldb: %s\nflatfs: %s\n" $(ls /proc/${DPID}/fd/ -l | grep "socket:" | wc -l) $(ls /proc/${DPID}/fd/ -l | grep "\\/datastore\\/" | wc -l) $(ls /proc/${DPID}/fd/ -l | grep "\\/blocks\\/" | wc -l)'
That should show counts of descriptors used by the various IPFS subsystems. @magik6k, what were you getting from netstat? Was that meant to give counts of connections by host?
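Alternatively, a quick way to compare the daemon's descriptor limit with its current usage on Linux (a minimal sketch assuming a single ipfs process, so pidof returns one PID):

export DPID=$(pidof ipfs)
grep "Max open files" /proc/${DPID}/limits   # the process's soft/hard descriptor limits
ls /proc/${DPID}/fd | wc -l                  # total descriptors currently open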

@fiatjaf
Author

fiatjaf commented Nov 10, 2018

The numbers hover around this:

sockets: 941
leveldb: 504
flatfs: 1

In the logs I'm also seeing Nov 10 21:37:16 menger ipfs[31509]: 21:37:16.833 ERROR providers: error reading providers: open /home/fiatjaf/.ipfs/datastore/033341.ldb: too many open files providers.go:263 every now and then, in the middle of the "too many open files" logs.

@fiatjaf
Author

fiatjaf commented Dec 4, 2018

I've stopped using my old repo since it had become unusable, and started a new one. Today I tried to run the daemon on the old repo again and saw this:

fiatjaf@menger ~> set -x IPFS_PATH ~/.ipfs/
fiatjaf@menger ~> ipfs daemon
Initializing daemon...
go-ipfs version: 0.4.18-
Repo version: 7
System version: amd64/linux
Golang version: go1.11.1
Successfully raised file descriptor limit to 2048.
16:29:45.227 ERROR   cmd/ipfs: failed to read listening addresses: route ip+net: netlinkrib: too many open files daemon.go:507
Swarm announcing /ip4/186.213.114.36/tcp/49052
16:29:45.494 ERROR     flatfs: could not store final value of disk usage to file, future estimates may be inaccurate flatfs.go:846
Error: serveHTTPApi: manet.Listen(/ip4/0.0.0.0/tcp/5001) failed: listen tcp4 0.0.0.0:5001: socket: too many open files
Received interrupt signal, shutting down...
(Hit ctrl-c again to force-shutdown the daemon.)
fiatjaf@menger ~ [1]>

(Please note that I didn't interrupt the daemon or send any signal to it.)

My other repo still works, however. Is this because it has far fewer files in it? So IPFS will not work if I have a lot of files?
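(As a stopgap while debugging, the limit can be pushed beyond the 2048 the daemon reports raising it to. This is only a sketch, assuming this 0.4.x build honors the IPFS_FD_MAX environment variable, and ulimit -n needs a high enough hard limit to succeed:)

ulimit -n 8192                      # raise the shell's soft limit before starting the daemon
env IPFS_FD_MAX=8192 ipfs daemon    # ask go-ipfs to raise its own limit to 8192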

@magik6k
Member

magik6k commented Dec 4, 2018

There is definitely something weird happening with leveldb. Can you run du -sh ~/.ipfs/*?

@fiatjaf
Author

fiatjaf commented Dec 4, 2018

fiatjaf@menger ~> du -sh ~/.ipfs/*
7.3G	/home/fiatjaf/.ipfs/blocks
8.0K	/home/fiatjaf/.ipfs/config
8.0K	/home/fiatjaf/.ipfs/config-v5
667M	/home/fiatjaf/.ipfs/datastore
4.0K	/home/fiatjaf/.ipfs/datastore_spec
20K	/home/fiatjaf/.ipfs/keystore
4.0K	/home/fiatjaf/.ipfs/version

@magik6k
Member

magik6k commented Dec 5, 2018

That's quite a large leveldb. Are you using the filestore/urlstore?

@fiatjaf
Author

fiatjaf commented Dec 5, 2018

I don't know what these words mean. Is it possible that I may be using them in this case?

@fiatjaf
Author

fiatjaf commented Dec 23, 2018

I now have two repos on my machine. The one that was showing this problem has some data I wanted to take out of it and move to the other node, but I can't, because it can't stay alive long enough to serve the data to the other. I see

fiatjaf@menger ~> env IPFS_PATH=/home/fiatjaf/.ipfs ipfs daemon
Initializing daemon...
go-ipfs version: 0.4.18-
Repo version: 7
System version: amd64/linux
Golang version: go1.11.1
Successfully raised file descriptor limit to 2048.
08:15:06.862 ERROR   cmd/ipfs: failed to read listening addresses: route ip+net: netlinkrib: too many open files daemon.go:507
Error: serveHTTPApi: manet.Listen(/ip4/0.0.0.0/tcp/5003) failed: listen tcp4 0.0.0.0:5003: socket: too many open files
fiatjaf@menger ~ [1]> 

Is there an emergency measure I can take just to save my files?

@schomatis
Contributor

I don't know what these words mean. Is it possible that I may be using them in this case?

Those are options in your IPFS configuration. Could you share your config and datastore_spec files so we can check?

@fiatjaf
Author

fiatjaf commented Dec 23, 2018

config

{
  "API": {
    "HTTPHeaders": {
      "Access-Control-Allow-Methods": [
        "PUT",
        "GET",
        "POST"
      ],
      "Access-Control-Allow-Origin": [
        "http://127.0.0.1:5001",
        "https://webui.ipfs.io"
      ],
      "Access-Control-Allow-Credentials": [
        "true"
      ]
    }
  },
  "Addresses": {
    "API": "/ip4/0.0.0.0/tcp/5003",
    "Announce": null,
    "Gateway": "/ip4/127.0.0.1/tcp/7073",
    "NoAnnounce": null,
    "Swarm": [
      "/ip4/0.0.0.0/tcp/4003",
      "/ip6/::/tcp/4003"
    ]
  },
  "Bootstrap": [
    "/ip4/127.0.0.1/tcp/4001/ipfs/QmPaTioHoNL66UA5oXqAKiTLALkqm4R39kuFtm4Kz99Kkh"
  ],
  "Datastore": {
    "BloomFilterSize": 0,
    "GCPeriod": "1h",
    "HashOnRead": false,
    "Spec": {
      "mounts": [
        {
          "child": {
            "path": "blocks",
            "shardFunc": "/repo/flatfs/shard/v1/next-to-last/2",
            "sync": true,
            "type": "flatfs"
          },
          "mountpoint": "/blocks",
          "prefix": "flatfs.datastore",
          "type": "measure"
        },
        {
          "child": {
            "compression": "none",
            "path": "datastore",
            "type": "levelds"
          },
          "mountpoint": "/",
          "prefix": "leveldb.datastore",
          "type": "measure"
        }
      ],
      "type": "mount"
    },
    "StorageGCWatermark": 90,
    "StorageMax": "10GB"
  },
  "Discovery": {
    "MDNS": {
      "Enabled": false,
      "Interval": 10
    }
  },
  "Experimental": {
    "FilestoreEnabled": true,
    "Libp2pStreamMounting": false,
    "ShardingEnabled": false
  },
  "Gateway": {
    "HTTPHeaders": {
      "Access-Control-Allow-Headers": [
        "X-Requested-With",
        "Range"
      ],
      "Access-Control-Allow-Methods": [
        "GET"
      ],
      "Access-Control-Allow-Origin": [
        "*"
      ]
    },
    "PathPrefixes": [],
    "RootRedirect": "",
    "Writable": false
  },
  "Identity": {
    "PeerID": "QmZFLSEiUELWrT91KGLNDzGx5dkzVpQhoBjoLi6t1U3jdP",
    "PrivKey": "CAASqAkwggSkAgEAAoIBAQDLpgiaJqyr4SdO5XOrTIvsXOvRhpnG1d+o4IscUnqbG6qDE3OxzHZvNyK+WKLsOWY+veVmxaSIFEuJXSALat3cIw5EKp+7fQscQNwzV5lW4pKZyET4bHinHwNWcaMXj8KA3tM8E1BJR4T82LhgYVFcbm+7Bt4wSiP6K9G5jLxy9hKCD+mA4U4cNGQWbLRtKkgVd5Y2QTHSB/QNjymPruaDmaPRFwsa5RObViJL0VJSnqmXOCxA5zWSq29qzpV5qiho5kyQqsKZ3COe9KJkdBGFvZzn5RAPzo2eJ79RmDQ3KWqwp/tO+CMlA6h1tFxSgM6EBjSFG/EZA5L1DP36y2NJAgMBAAECggEBALOaiAWjzC9+UBOV63CM/u6DePr+McsZvrqK5kUhPL5lJPmbAzMwttcZEkw7kdyyNslo4tPDxXq6I3BPMD7Bjk9in2dhDCTngA/35/xj6nmlM1PrO2C5EaOah3AKoqLaB9luK2/VPL6UE+aHH/zod0AEqgeRZA3EpXwyfzGcvGrJpfC9RpfoCIzMgV0a6y3iVjXih6ltpxZikqZknfI3WrH8uLJgG19pv5nRpSWxzgkwkeLoUikv7hh+pqG6LqtmLpUbwmkQNMfZh/fSOQ5ZqMTVXbUFLrytoRHUY4fB0nRz1tflP3aN/yuTg9NCmM96H0QGoHIoU+qqRQhBUs+LWA0CgYEA3/HKiYxcyDKEyhTDnyM4CbFe5B5e17DiCOxCTx0txfZh6kRoM2qoB5IsGXnNMUZGvC7WHt6QxbmACYdgL2bMsHYTRgE4z6Rx7qWvTrwStABkU3vmKoGT9FDHDaG6MENVinipki978g/FX+peZp/KkTQ/Rrw6SDKzpIP4gym44RcCgYEA6MyFCI3XyxsLR5l+ogxBQBYdG+6pKAE05vC097hgkaTSunzzl8GB1N9sQTTO8NLgiPqwoR/xxANwHZb/lG90/VXWFpp9GQ/z8fj237oRMWC8pLNJo7nRo1z9CEjw94A8DWo04hDnAHCJxZtGPq5hZoGlL4A2qv/FJmbPybNG2p8CgYAnbDg8aJI4x/PqYydgz2FhC3Fp9RK7I69W5Mhzhu505/+qruotCvyTgJ70ySVfJED1hcU53/JabGJmywcasR0df1u7OiHXI9rOqSooUSF1wI/oxmnpV7BFFSdFdhAByQi4/K7VRjiqjy4uyWJe7IhLcYgmGqKj7REEyBqqdGDQdwKBgQCbhI1WwpMnTuDBKyxqiu9IJb26fDwqymuR37m1R0nT4h0YkgKVHaNjFwKVqPaZ8PYo6/f1G4cCIB3U1pvUiITJ/H6xyPDLPloECwK5QO7dYreC+3a1VpxSmvs6fqfjX5o+h/XeE9aN96BCD1Hk68+LkA5O5kMfBxCob8ReBVLPFwKBgFOMYYVla0uNNru8UCxlOgEq9Uvfshwc2oztbrDZlLmihoTEd72jWZevq2PWBxWGfvOmtqW1VxBY05pxuvbJQykKbZby00VPOPHuGIYnHLz/7+4Tem2L163ANsFpd/MPgERGeY4Rr9JKA2lFpeC5QiAbHFOduAu3EU8r/xErwUJ"
  },
  "Ipns": {
    "RecordLifetime": "",
    "RepublishPeriod": "",
    "ResolveCacheSize": 128
  },
  "Mounts": {
    "FuseAllowOther": false,
    "IPFS": "/ipfs",
    "IPNS": "/ipns"
  },
  "Reprovider": {
    "Interval": "12h",
    "Strategy": ""
  },
  "Routing": {
    "Type": "none"
  },
  "SupernodeRouting": {
    "Servers": null
  },
  "Swarm": {
    "AddrFilters": null,
    "ConnMgr": {
      "GracePeriod": "",
      "HighWater": 2,
      "LowWater": 1,
      "Type": ""
    },
    "DisableBandwidthMetrics": false,
    "DisableNatPortMap": false,
    "DisableRelay": false,
    "EnableRelayHop": false
  },
  "Tour": {
    "Last": ""
  }
}

datastore_spec

{
  "mounts": [
    {
      "mountpoint": "/blocks",
      "path": "blocks",
      "shardFunc": "/repo/flatfs/shard/v1/next-to-last/2",
      "type": "flatfs"
    },
    {
      "mountpoint": "/",
      "path": "datastore",
      "type": "levelds"
    }
  ],
  "type": "mount"
}

@fiatjaf
Author

fiatjaf commented Dec 23, 2018

By the way, my datastore has increased to 1.2G since the last du I ran in a previous comment:

fiatjaf@menger ~> du -sh .ipfs/*
7.3G	.ipfs/blocks
8.0K	.ipfs/config
8.0K	.ipfs/config-v5
1.2G	.ipfs/datastore
4.0K	.ipfs/datastore_spec
20K	.ipfs/keystore
4.0K	.ipfs/version

That happened despite my having done nothing at all with this repo (aside from trying to start the daemon on it a few times and failing).

@schomatis
Contributor

"FilestoreEnabled": true,

@magik6k

@schomatis
Contributor

Note that my asking for your entire config file was definitely not good security practice (I should have asked for the ipfs config show output), since it reveals your node's private key. After this issue is resolved, be sure to regenerate it. Sorry.
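For reference, the command mentioned above prints the configuration with the private key omitted, so its output is the safer thing to paste (shown here with the repo path used elsewhere in this thread):

env IPFS_PATH=/home/fiatjaf/.ipfs ipfs config show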

@schomatis
Contributor

So maybe disabling the Filestore feature (which I'm not sure how it got turned on in the first place) could help alleviate the problem enough to let the node run long enough to extract the information you want:

ipfs config --json Experimental.FilestoreEnabled false

but if the data you're looking for was added through the Filestore in the first place, you won't be able to access it through IPFS (although that would also mean the data is saved in actual files outside the repo).

Also, note that you don't need the daemon running to query your local data; running the ipfs commands as a standalone CLI is enough to retrieve it (provided the data is indeed in your node, since the network will be unreachable in this scenario).
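For example, something along these lines, substituting the CID you're after and an output directory of your choice (a sketch reusing the repo path from your earlier commands):

env IPFS_PATH=/home/fiatjaf/.ipfs ipfs get /ipfs/<cid> -o <output-dir>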

@fiatjaf
Author

fiatjaf commented Dec 23, 2018

Don't worry about the private key; I deleted part of it before posting, and I also don't intend to use it anymore.

Oh, so Filestore means that thing where you add stuff to the repo without duplicating it? I was using that for a while, but gave up since it was confusing me.

fiatjaf@menger ~ [1]> env IPFS_PATH=/home/fiatjaf/.ipfs ipfs config --json Experimental.FilestoreEnabled false
fiatjaf@menger ~> env IPFS_PATH=/home/fiatjaf/.ipfs ipfs daemon
Initializing daemon...
go-ipfs version: 0.4.18-
Repo version: 7
System version: amd64/linux
Golang version: go1.11.1
Successfully raised file descriptor limit to 2048.
18:12:13.387 ERROR   cmd/ipfs: failed to read listening addresses: route ip+net: netlinkrib: too many open files daemon.go:507
Error: serveHTTPApi: manet.Listen(/ip4/0.0.0.0/tcp/5003) failed: listen tcp4 0.0.0.0:5003: socket: too many open files
Received interrupt signal, shutting down...
(Hit ctrl-c again to force-shutdown the daemon.)
fiatjaf@menger ~ [1]> 

To query the local data you mean using ipfs get? I get other bizarre behaviors running that, should I open another issue?

@schomatis
Contributor

Oh, so Filestore means that thing where you add some stuff to the repo without duplicating?

Yes.

To query the local data you mean using ipfs get?

Yes, the same command you intended to run with the daemon running.

I get other bizarre behaviors running that, should I open another issue?

Post it here (since I suspect they may be related to a more general problem in your environment) and we can open a separate issue later if necessary.

@fiatjaf
Author

fiatjaf commented Dec 23, 2018

fiatjaf@menger ~ [1]> env IPFS_PATH=/home/fiatjaf/.ipfs ipfs ls /ipfs/zdj7WWbxXhFaTUTPTY18ea7awwF3uoxJJJuewH5hX7WGTfgT3 
QmUD6HdcU56z2xRbob3tTGeVKpcSq5TrigA1T8SjShpLcT 1478737112  aulas-gugu/
QmRA3NWM82ZGynMbYzAgYTSXCVM14Wx1RZ8fKP42G6gjgj 184306      bitcoin.pdf
QmcRwn8zHMUthrnwqehb8a69NJ5A3hq3no9ntWRVXxkcTm 13690139868 cof/
QmdSKwDV7bHzLX5bXLHhjLKC75M8JZuUtjxALXpL2tyCJ6 1031757184  consciencia-de-imortalidade/
QmWt9b6yRDS5guYGrYepKSTWGeKgjq1LcVnpsh8SAVAyDX 1337291086  educacao-liberal-literatura/
QmYVs8vfvx1JoaEGQ3ZSdASioDNHuoJC9rd2ry58iBiZSK 738049702   hangouts/
QmbEVFkLyNjvYRSYsPeDjBaR6c4pPyUAWACzupf857Fkjy 1904309683  historia-essencial-da-filosofia/
Qmc5m94Gu7z62RC8waSKkZUrCCBJPyHbkpmGzEePxy2oXJ 9           index.html
QmXpZ7THy8g8mNYsYiYB1uHCQxY9abTm1agcyNpSSuM6Z5 350002940   introducao-ao-pensamento-filosofico-gugu/
QmSih5ptwYWQxwLFzuZVkAnL7CAF5Jt7gqCSgLSqzCDpDL 570626308   leitura-de-classicos-gugu/
QmcM7yrXV8nZe7gRcAiTBWys1ayU6Na4Yv1ZwEUibuAngF 447066449   literatura-e-autobiografia/
QmRynvoT4Uk2Ty1LWU7demU8B1FhmUhajRjhZidAky6j2t 2228709913  livros/
QmSkiJ1YnwNVvGMV7w5BPkpAuBX3NGw6cMJQkLbyeKazPz 4294        livros-olavo.html
QmV68mD6TnbCuSCpSuuhzgQuitCZv6XYvC3iLwywZcB7pD 637162079   lonergan-halifax/
Qmdvbn9TYm32aVeCxeQgfANHghMrc1cqV5S7h7rw42zpvt 667600380   mario-ferreira-dos-santos/
QmXoSbNBbEuhZbtrQqudEjcn4u89L6o8xNHXpav6KcRyhE 77551966    olavodecarvalho.org/
QmdU9XQDzMHnsadbwcTT3YeuY85vUTTGMyVAao1HJp9Y3q 7559748     pensadoresbrasileiros.home.comcast.net/
QmbKux7eQdtUHFVP2tYn2rJwzk4qk2poBLok81jdhxUtob 420213556   primeiro-evento-do-icls/
QmYn4B2mXTfyWdoziFy7TwU1sQMEtwbL4wdv9GJQNGaRNa 2611057944  religioes-do-mundo/
QmQr3RfaqfzJDjf4RssyKVX4i4RGrLgunJDvy1j4f3Mt2d 956177686   roger-scruton-crise-da-inteligencia/
QmSm7hsUWSBiMkF269iz8cYmyj9SCSvtLJ5b2gNtg5pP6C 4026661271  true-outspeak/
fiatjaf@menger ~> env IPFS_PATH=/home/fiatjaf/.ipfs ipfs get /ipfs/zdj7WWbxXhFaTUTPTY18ea7awwF3uoxJJJuewH5hX7WGTfgT3 -o w.alhur.es
Saving file(s) to w.alhur.es
 30.90 GiB / 30.90 GiB [===========================================================] 100.00% 0s
Error: failed to fetch all nodes

Why this is odd:

Yesterday I was trying to fetch this same hash with ipfs get from my other repo, no env (I thought I had these files on another node running on another computer, but actually I just had part of them [1]), and got to 1.88% after many hours, with the daemon running. Then I stopped it, stopped the daemon, and ran ipfs get again (again, no env, using the other repo), and got that instant 100% and the same error as above (please note that the 100% is wrong; actually nothing has been downloaded and the target directory is empty). Now if I run ipfs get without env I get Error: api not running.

[1]: I think I was able to fetch the index from the ipfs.io public gateway, since that w.alhur.es is/was a page being served from there.

@schomatis
Contributor

Now if I run ipfs get without env I get Error: api not running.

That is an unfortunate error message that we need to fix (#5784). It normally happens when the daemon doesn't clean up its state correctly; deleting the $IPFS_PATH/api file should be enough.
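In this case that would be something like (assuming the repo path used throughout this thread):

rm /home/fiatjaf/.ipfs/api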

@fiatjaf
Author

fiatjaf commented Dec 27, 2018

@schomatis can you help me with that failed ipfs get that reports a 100% but doesn't actually get anything?

@schomatis
Contributor

schomatis commented Dec 27, 2018

Yes, in general disregard the progress bar; in many cases it's not accurate. Trust the failed to fetch all nodes error, which indicates that some part of your data may not be present. Is it distributed between the two nodes? In that case I think you'll need to run the daemon to connect them, which will trigger the "too many open files" error, and I'm not sure how to continue from there.

@fiatjaf
Author

fiatjaf commented Dec 27, 2018

Well, the data should all be on this failing node, but if it is failing I think I'll stop assuming it's a bug and try to fetch each subdirectory individually to see what happens.

@schomatis
Contributor

Yes, it would help identify if there's a particular block missing or if we just have many blanks across the DAG.
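A rough way to do that from the shell, as a sketch reusing the repo path and root hash from the ipfs ls output above (the first column of ipfs ls is the child hash, and ipfs get exits non-zero when it can't fetch everything):

for cid in $(env IPFS_PATH=/home/fiatjaf/.ipfs ipfs ls /ipfs/zdj7WWbxXhFaTUTPTY18ea7awwF3uoxJJJuewH5hX7WGTfgT3 | awk '{print $1}'); do
  env IPFS_PATH=/home/fiatjaf/.ipfs ipfs get "$cid" || echo "incomplete: $cid"
done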

@Stebalien added the kind/bug label (A bug in existing code, including security flaws) on Dec 27, 2018
@Stebalien
Member

ipfs config --json Experimental.FilestoreEnabled false

Note: that won't really change anything at this point (except it might make it impossible to find some blocks). Really, I wouldn't disable it once you've enabled it.


Do you really have 941 connections? Try ipfs swarm peers | wc -l. I ask because I noticed that you have the DHT disabled (which is usually how one finds peers).

The ConnMgr section also looks a bit funky. You may want to set it to:

    "ConnMgr": {
      "GracePeriod": "30s",
      "HighWater": 100,
      "LowWater": 20,
      "Type": "basic"
    },

That should significantly reduce the number of open files (assuming you do have 941 open connections).
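If you'd rather apply that from the command line than edit the config file by hand, something like this should work (ipfs config --json replaces the whole ConnMgr object; restart the daemon afterwards):

ipfs config --json Swarm.ConnMgr '{"Type": "basic", "LowWater": 20, "HighWater": 100, "GracePeriod": "30s"}'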

@fiatjaf
Author

fiatjaf commented Dec 27, 2018

I disabled the DHT later, to see if things improved, and removed nodes from the bootstrap list and set LowWater/HighWater to 1/2. I don't know why I don't have a "Type" in ConnMgr.
