
Regression: Adding a lot of files to MFS will slow ipfs down significantly #8694

Open
3 tasks done
RubenKelevra opened this issue Jan 22, 2022 · 40 comments
Labels
kind/bug A bug in existing code (including security flaws) need/analysis Needs further analysis before proceeding P1 High: Likely tackled by core team if no one steps up topic/MFS Topic MFS topic/sharding Topic about Sharding (HAMT etc)

Comments

@RubenKelevra
Contributor

RubenKelevra commented Jan 22, 2022

Checklist

Installation method

built from source

Version

go-ipfs version: 0.13.0-dev-2a871ef01
Repo version: 12
System version: amd64/linux
Golang version: go1.17.6

Config

{
  "API": {
    "HTTPHeaders": {}
  },
  "Addresses": {
    "API": "/ip4/127.0.0.1/tcp/5001",
    "Announce": [],
    "AppendAnnounce": null,
    "Gateway": "/ip4/127.0.0.1/tcp/80",
    "NoAnnounce": [
      "/ip4/10.0.0.0/ipcidr/8",
      "/ip4/100.64.0.0/ipcidr/10",
      "/ip4/169.254.0.0/ipcidr/16",
      "/ip4/172.16.0.0/ipcidr/12",
      "/ip4/192.0.0.0/ipcidr/24",
      "/ip4/192.0.0.0/ipcidr/29",
      "/ip4/192.0.0.8/ipcidr/32",
      "/ip4/192.0.0.170/ipcidr/32",
      "/ip4/192.0.0.171/ipcidr/32",
      "/ip4/192.0.2.0/ipcidr/24",
      "/ip4/192.168.0.0/ipcidr/16",
      "/ip4/198.18.0.0/ipcidr/15",
      "/ip4/198.51.100.0/ipcidr/24",
      "/ip4/203.0.113.0/ipcidr/24",
      "/ip4/240.0.0.0/ipcidr/4",
      "/ip6/100::/ipcidr/64",
      "/ip6/2001:2::/ipcidr/48",
      "/ip6/2001:db8::/ipcidr/32",
      "/ip6/fc00::/ipcidr/7",
      "/ip6/fe80::/ipcidr/10"
    ],
    "Swarm": [
      "/ip4/0.0.0.0/tcp/443",
      "/ip6/::/tcp/443",
      "/ip4/0.0.0.0/udp/443/quic",
      "/ip6/::/udp/443/quic"
    ]
  },
  "AutoNAT": {},
  "Bootstrap": [
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN",
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa",
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb",
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt",
    "/ip4/104.131.131.82/tcp/4001/p2p/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ",
    "/ip4/104.131.131.82/udp/4001/quic/p2p/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ"
  ],
  "DNS": {
    "Resolvers": null
  },
  "Datastore": {
    "BloomFilterSize": 0,
    "GCPeriod": "1h",
    "HashOnRead": false,
    "Spec": {
      "mounts": [
        {
          "child": {
            "path": "blocks",
            "shardFunc": "/repo/flatfs/shard/v1/next-to-last/2",
            "sync": false,
            "type": "flatfs"
          },
          "mountpoint": "/blocks",
          "prefix": "flatfs.datastore",
          "type": "measure"
        },
        {
          "child": {
            "compression": "none",
            "path": "datastore",
            "type": "levelds"
          },
          "mountpoint": "/",
          "prefix": "leveldb.datastore",
          "type": "measure"
        }
      ],
      "type": "mount"
    },
    "StorageGCWatermark": 90,
    "StorageMax": "500GB"
  },
  "Discovery": {
    "MDNS": {
      "Enabled": false,
      "Interval": 10
    }
  },
  "Experimental": {
    "AcceleratedDHTClient": false,
    "FilestoreEnabled": false,
    "GraphsyncEnabled": false,
    "Libp2pStreamMounting": false,
    "P2pHttpProxy": false,
    "StrategicProviding": false,
    "UrlstoreEnabled": false
  },
  "Gateway": {
    "APICommands": [],
    "HTTPHeaders": {
      "Access-Control-Allow-Headers": [
        "X-Requested-With",
        "Range",
        "User-Agent"
      ],
      "Access-Control-Allow-Methods": [
        "GET"
      ],
      "Access-Control-Allow-Origin": [
        "*"
      ]
    },
    "NoDNSLink": false,
    "NoFetch": false,
    "PathPrefixes": [],
    "PublicGateways": null,
    "RootRedirect": "",
    "Writable": false
  },
  "Identity": {
    "PeerID": "xxx"
  },
  "Internal": {},
  "Ipns": {
    "RecordLifetime": "96h",
    "RepublishPeriod": "",
    "ResolveCacheSize": 2048
  },
  "Migration": {
    "DownloadSources": null,
    "Keep": ""
  },
  "Mounts": {
    "FuseAllowOther": false,
    "IPFS": "/ipfs",
    "IPNS": "/ipns"
  },
  "Peering": {
    "Peers": null
  },
  "Pinning": {},
  "Plugins": {
    "Plugins": null
  },
  "Provider": {
    "Strategy": ""
  },
  "Pubsub": {
    "DisableSigning": false,
    "Router": "gossipsub"
  },
  "Reprovider": {
    "Interval": "12h",
    "Strategy": "all"
  },
  "Routing": {
    "Type": "dhtserver"
  },
  "Swarm": {
    "AddrFilters": [
      "/ip4/10.0.0.0/ipcidr/8",
      "/ip4/100.64.0.0/ipcidr/10",
      "/ip4/169.254.0.0/ipcidr/16",
      "/ip4/172.16.0.0/ipcidr/12",
      "/ip4/192.0.0.0/ipcidr/24",
      "/ip4/192.0.0.0/ipcidr/29",
      "/ip4/192.0.0.8/ipcidr/32",
      "/ip4/192.0.0.170/ipcidr/32",
      "/ip4/192.0.0.171/ipcidr/32",
      "/ip4/192.0.2.0/ipcidr/24",
      "/ip4/192.168.0.0/ipcidr/16",
      "/ip4/198.18.0.0/ipcidr/15",
      "/ip4/198.51.100.0/ipcidr/24",
      "/ip4/203.0.113.0/ipcidr/24",
      "/ip4/240.0.0.0/ipcidr/4",
      "/ip6/100::/ipcidr/64",
      "/ip6/2001:2::/ipcidr/48",
      "/ip6/2001:db8::/ipcidr/32",
      "/ip6/fc00::/ipcidr/7",
      "/ip6/fe80::/ipcidr/10"
    ],
    "ConnMgr": {
      "GracePeriod": "3m",
      "HighWater": 700,
      "LowWater": 500,
      "Type": "basic"
    },
    "DisableBandwidthMetrics": false,
    "DisableNatPortMap": true,
    "RelayClient": {},
    "RelayService": {},
    "Transports": {
      "Multiplexers": {},
      "Network": {
        "QUIC": false
      },
      "Security": {}
    }
  }
}

Description

I've been running 2a871ef, compiled with Go 1.17.6, on Arch Linux for some days on one of my servers.

I had trouble with my MFS datastore after updating (I couldn't delete a file). So I reset my datastore and started importing the data again.

I'm using a shell script that adds the files and folders individually. Because of #7532, I can't use ipfs files write, so instead I use ipfs add, followed by an ipfs files cp /ipfs/$cid /path/to/file and an ipfs pin rm $cid.

For the ipfs add I set size-65536 as the chunker and blake2b-256 as the hashing algorithm, and use raw leaves.
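As a sketch, the per-file workflow described above amounts to a small helper like the following (the function name, MFS root, and sample paths are assumptions for illustration; the chunker, hash, and raw-leaves flags are the ones named above):

```shell
#!/usr/bin/env bash
set -eu

# Settings from the report above.
CHUNKER="size-65536"
HASH="blake2b-256"
MFS_ROOT="/x86-64.archlinux.pkg.pacman.store"   # assumed MFS destination

# Add one file, link it into MFS, then drop the implicit pin from `ipfs add`.
import_file() {
    local src="$1" dest="$2" cid
    cid=$(ipfs add --quieter --chunker="$CHUNKER" --hash="$HASH" --raw-leaves "$src")
    ipfs files cp "/ipfs/$cid" "$MFS_ROOT$dest"
    ipfs pin rm "$cid"
}

# Only run against a live daemon.
if command -v ipfs >/dev/null 2>&1 && ipfs swarm peers >/dev/null 2>&1; then
    import_file ./community/some-package.pkg.tar.zst /community/some-package.pkg.tar.zst
fi
```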


After 3 days there was basically no I/O on the machine, and ipfs was consistently using around 1.6 cores without making any real progress. At that time only this one script was running against the API, with no concurrency. The automatic garbage collector of ipfs is off.

There are no experimental settings activated and I'm using flatfs.

I did some debugging, all operations were still working, just extremely slow:

$ time /usr/sbin/ipfs --api=/ip4/127.0.0.1/tcp/5001 files stat --hash --offline /x86-64.archlinux.pkg.pacman.store/community
bafybeianfwoujqfauris6eci6nclgng72jttdp5xtyeygmkivzyss4xhum

real	0m59.164s
user	0m0.299s
sys	0m0.042s

and

$ time /usr/sbin/ipfs --api=/ip4/127.0.0.1/tcp/5001 files stat --hash --offline --with-local /x86-64.archlinux.pkg.pacman.store/community
bafybeie5kkzcg6ftmppbuauy3tgtx2f4gyp7nhfdfsveca7loopufbijxu
Local: 20 GB of 20 GB (100.00%)

real	4m55.298s
user	0m0.378s
sys	0m0.031s

This is while my script was still running on the API and waiting minutes on each response.

Here's my memory dump etc. while the issue occurred: /ipfs/QmPJ1ec2CywWLFeaHFaTeo6g56S5Bqi3g3MEF1a3JrL8zk

Here's a dump after I stopped the import of files and the CPU usage dropped down to like 0.3 cores: /ipfs/QmbotJhgzc2SBxuvGA9dsCFLbxd836QBNFYkLhdqTCZwrP

Here's what the memory looked like as the issue occurred (according to atop 1):

MEM
tot    31.4G
free    6.6G
cache   1.1G
dirty   0.1M
buff   48.9M
slab    7.1G
slrec   3.7G
shmem   2.0M
shrss   0.0M
vmbal   0.0M
zfarc  15.6G
hptot   0.0M
hpuse   0.0M

The machine got 10 dedicated cores from an AMD EPYC 7702 and 1 TB of SSD storage via NAS.

@RubenKelevra RubenKelevra added kind/bug A bug in existing code (including security flaws) need/triage Needs initial labeling and prioritization labels Jan 22, 2022
@RubenKelevra
Contributor Author

The shell script I'm using is open source, so you should be able to reproduce this:

git clone https://github.com/RubenKelevra/rsync2ipfs-cluster.git rsync2ipfs-cluster
cd rsync2ipfs-cluster
git reset --hard 1fd9712371f0315a35a80e9680340655ba751d7a
bash bin/rsync2cluster.sh --create --arch-config

This will rsync the Arch package mirror, loop over the files, and import them into the local MFS.

Just make sure you have enough space in ~ for the download (69GB) and on the IPFS node to write this into the storage.

@aschmahmann
Contributor

@RubenKelevra do you know which version caused the regression? Have you tried with v0.11.0? v0.12.0 was a very targeted release which should not have disturbed much, so understanding when this issue emerged would be very helpful.

@RubenKelevra
Contributor Author

Hey @aschmahmann, I started the import on 0.11 yesterday. As soon as I'm home, I can report whether this is happening there too.

While an offline import works without slowdown, I still sometimes get errors back, which looks like the ipfs add returns too quickly and the ipfs files cp command can't access the CID yet.

This seems to be a separate issue, and probably not a regression, as I never tried importing offline before.

@RubenKelevra
Contributor Author

I can confirm this issue for 0.11 as well, so it's not a new thing.

$ ipfs version --all
go-ipfs version: 0.11.0-67220edaa
Repo version: 11
System version: amd64/linux
Golang version: go1.17.6

The next step for me is to try the binary from dist.ipfs.io to rule out any build issues.

@RubenKelevra
Contributor Author

The next step for me is to try the binary from dist.ipfs.io to rule out any build issues.

I can confirm the issue for the binary from dist.ipfs.io as well.

@aschmahmann
Contributor

aschmahmann commented Jan 30, 2022

I can confirm this issue for 0.11 as well, so it's not a new thing.

Thanks that's very helpful. Is this a v0.10.0 -> v0.11.0 thing? When was the last known version before the behavior started changing? In any event, having a more minimal reproduction would help (e.g. making a version of the script that works from a local folder rather than relying on rsync).

If this is v0.11.0 related then my suspicion is that you have directories that were small enough you could transfer them through go-ipfs previously, but large enough that MFS will now automatically shard them (could be confirmed by looking at your MFS via ipfs dag get and seeing if you have links like FF in your directories). IIRC I saw some HAMT checks in your profile dump which would support this.

If so then what exactly about sharded directories + MFS is causing the slow down should be looked at. Some things I'd start with investigating are:

  • The modifications of the sharded directories are more expensive for repeated MFS updates, since you have to modify multiple blocks at a time
  • The limit checks for automatic sharding/unsharding are too expensive for repeated MFS modifications
  • Bulking up writes and flushing would likely help here, although if going down this road I'd be careful. My suspicion is that MFS flush has not been extensively tested, probably even more so with sharded directories
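The ipfs dag get check suggested above could be sketched like this; the looks_sharded helper and its two-hex-digit prefix heuristic are assumptions for illustration (HAMT shard links carry a hex bucket prefix before the entry name), not an official API:

```shell
#!/usr/bin/env bash
set -eu

# Heuristic (an assumption, not an official check): HAMT shard links start
# with a two-hex-digit bucket prefix before the original entry name.
looks_sharded() {
    grep -Eq '"Name":"[0-9A-F]{2}'
}

# Run the actual inspection only against a live daemon; the path is the one
# from the report above.
if command -v ipfs >/dev/null 2>&1 && ipfs swarm peers >/dev/null 2>&1; then
    dir_cid=$(ipfs files stat --hash /x86-64.archlinux.pkg.pacman.store/community)
    if ipfs dag get "$dir_cid" | looks_sharded; then
        echo "directory $dir_cid appears to be HAMT-sharded"
    else
        echo "directory $dir_cid appears to be a plain UnixFS directory"
    fi
fi
```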

@RubenKelevra
Contributor Author

Thanks that's very helpful. Is this a v0.10.0 -> v0.11.0 thing? When was the last known version before the behavior started changing?

I think the last time I ran a full import I was on 0.9.1.

I just started the import to make sure that's correct.

In any event, having a more minimal reproduction would help (e.g. making a version of the script that works from a local folder rather than relying on rsync).

Sure, if you want to avoid any rsync, just comment out L87. I think that should work.

The script will still expect a repository directory like from Manjaro or Arch to work properly, but you can just reuse the same repository without having to update it between each try.

If so then what exactly about sharded directories + MFS is causing the slow down should be looked at. Some things I'd start with investigating are:

  • The modifications of the sharded directories are more expensive for repeated MFS updates, since you have to modify multiple blocks at a time
  • The limit checks for automatic sharding/unsharding are too expensive for repeated MFS modifications
  • Bulking up writes and flushing would likely help here, although if going down this road I'd be careful. My suspicion is that MFS flush has not been extensively tested, probably even more so with sharded directories

Sounds like a reasonable suspicion, but on the other hand, this shouldn't lead to response times of minutes for simple operations.

I feel like we're dealing with some kind of locked operation which gets "overwritten" by new data fed into ipfs while it's running, so we pile up tasks waiting on a lock.

This would explain why it starts fast and gets slower and slower until it's basically down to a crawl.

@RubenKelevra
Contributor Author

RubenKelevra commented Feb 5, 2022

Ah and additionally, I used sharding previously just for testing, but decided against it. So the import was running fine with sharding previously (like with 0.4 or something).

Previously, there was no need for sharding, which makes me wonder why IPFS would do sharding if it's not necessary.

@RubenKelevra
Contributor Author

@aschmahmann I've installed 0.9.1 from dist.ipfs.io and I can confirm, the bug is not present in this version.

@aschmahmann
Contributor

Ok, so to clarify your performance/testing looks like:

  • v0.12.0-rc1 ❌
  • v0.11.0 ❌ (includes automatic UnixFS sharding)
  • v0.10.0 ❓ (includes a bunch of code moving to the newer IPLD libraries)
  • v0.9.1 ✔️

Previously there was no need for sharding, which makes me wonder why IPFS would do sharding if it's not necessary.

TLDR: Two reasons. 1) Serializing the block to check if it exceeds the limit before re-encoding it is expensive, so having some conservative estimate is reasonable 2) Maxing out the block size isn't necessarily optimal. For example, if you keep writing blocks up to 1MB in size then every time you add an entry you create a duplicate block of similar size which can lead to a whole bunch of wasted space that you may/may not want to GC depending on how accessible you want your history to be. #7022 (comment)


Thanks for your testing work so far. If you're able to keep going here, understanding if v0.10.0 is ✔️ or ❌ would be helpful. Additionally/alternatively, you could try v0.11.0 and jack up the internal variable controlling the auto-sharding threshold to effectively turn it off by doing ipfs config --json Internal.UnixFSShardingSizeThreshold "\"1GB\"" (1GB is obviously huge and will create blocks too big to transfer, but will make it easy to identify if this is what's causing the performance issue).
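A guarded sketch of that diagnostic step (the 1GB value is the deliberately huge threshold suggested above, not a recommended production setting):

```shell
# Effectively disable auto-sharding for diagnosis only; restart the daemon
# afterwards so the new threshold takes effect.
if command -v ipfs >/dev/null 2>&1 && ipfs config show >/dev/null 2>&1; then
    ipfs config --json Internal.UnixFSShardingSizeThreshold '"1GB"'
    # Read it back to verify the setting was applied.
    ipfs config Internal.UnixFSShardingSizeThreshold
fi
```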

I also realized this internal flag was missing from the docs 🤦 so I put up #8723

@BigLep BigLep moved this to 🥞 Todo in IPFS Shipyard Team Mar 2, 2022
@BigLep BigLep removed the status in IPFS Shipyard Team Mar 2, 2022
@BigLep BigLep removed the need/triage Needs initial labeling and prioritization label Mar 3, 2022
@BigLep BigLep added the need/author-input Needs input from the original author label Mar 3, 2022
@BigLep
Contributor

BigLep commented Mar 25, 2022

We're going to close this because we don't have additional info to dig in further. Feel free to reopen with the requested info if this is still an issue. Thanks.

@BigLep BigLep closed this as completed Mar 25, 2022
@RubenKelevra
Contributor Author

@aschmahmann was this fixed? I updated to the 0.13.0-rc1 and ran into serious performance issues again.

Have you tried adding many files to the MFS with simple ipfs add / ipfs files cp / ipfs pin rm calls, or tried my script yet?


@RubenKelevra
Contributor Author

@aschmahmann I set ipfs config --json Internal.UnixFSShardingSizeThreshold "\"1MB\"", so 1MB, not 1GB, since this should work in theory.

But I still see 30 second delays for removing a single file in the MFS.

I think this was more due to large repinning operations by the cluster daemon, as the MFS folders need to be pinned locally on every change.

I created a ticket on the cluster project for this.

Furthermore, I see (at least with a few file changes) no large hangs when using 1 MB sharding.

But I haven't yet tested the full import I had originally trouble with – and what this ticket is about.

@RubenKelevra
Contributor Author

RubenKelevra commented Jun 5, 2022

@aschmahmann I can confirm this issue with the suggested ipfs config --json Internal.UnixFSShardingSizeThreshold "\"1MB\"" with a current master (a72753b) as well as the current stable (0.12.2) from the ArchLinux repo.

(1 MB should never be exceeded on my datasets, as sharding wasn't necessary before to store the folders.)

The changes to the MFS grind to a halt after a lot of consecutive operations: single ipfs files cp /ipfs/$CID /path/to/file commands take 1-2 minutes, while the IPFS daemon uses 4-6 cores worth of CPU power.

All other MFS operations are blocked as well, so you get response times in the minutes for simple ls operations.

@BigLep please reopen as this isn't fixed and can be reproduced

@RubenKelevra
Contributor Author

I'll take my project pacman.store, with the package mirrors for Manjaro, Arch Linux, etc., down until this is solved. I don't like running 0.9 anymore due to its age, and I would need to downgrade the whole server again.

I just cannot share packages that are days or even weeks old due to safety concerns, so I don't want to do any harm here.

The URLs will just return empty directories for now.

@aschmahmann
Contributor

aschmahmann commented Jun 16, 2022

@RubenKelevra might be worth checking if #9042 fixes the problem (it should be in master later today). There was a performance fix in go-unixfs regarding the automatic sharding (ipfs/go-unixfs#120)

@RubenKelevra
Contributor Author

@RubenKelevra might be worth checking if #9042 fixes the problem (it should be in master later today). There was a performance fix in go-unixfs regarding the automatic sharding (ipfs/go-unixfs#120)

Great, thanks. Will check it out as soon as I'm back at home :)

@github-actions

Oops, seems like we needed more information for this issue, please comment with more details or this issue will be closed in 7 days.

@RubenKelevra
Contributor Author

RubenKelevra commented Jun 23, 2022

@aschmahmann I can confirm this bug for 88d8815

@Jorropo Jorropo removed need/author-input Needs input from the original author kind/stale labels Jun 23, 2022

@lidel lidel added P1 High: Likely tackled by core team if no one steps up topic/MFS Topic MFS topic/sharding Topic about Sharding (HAMT etc) labels Jun 27, 2022
@lidel lidel moved this to 🥞 Todo in IPFS Shipyard Team Jun 27, 2022
@lidel lidel added the need/analysis Needs further analysis before proceeding label Jun 27, 2022
@RubenKelevra
Contributor Author

RubenKelevra commented Jun 27, 2022

@RubenKelevra you mean it is still broken, even after the switch to go-unixfs v0.4.0?

Yeah. CPU load piles up and a simple ipfs files cp /ipfs/$CID /path/in/mfs takes minutes to complete.

I think there's just something running concurrently, and somehow work needs to be done again and again to apply the change to the MFS, as other parts of it are still changing. But that's just a guess. Could be anything else, really.

@schomatis
Contributor

Asked them for specific commands that are executed just before the crash

@lidel We might be conflating different topics in the same issue here; let's have a new issue for that report when it comes, and please ping me.

@lidel
Member

lidel commented Jun 27, 2022

@schomatis ack, moved panic investigation to #9063


@Jorropo
Contributor

Jorropo commented Jul 25, 2022

@dhyaniarun1993 I am confident that this is another issue (I couldn't find the existing issue, so if you want, open a new one, even though we know what this is).
ipfs ls fetches blocks one by one, so it's a sequential process to read all the 40k files.


@Jorropo
Contributor

Jorropo commented Jul 25, 2022

@dhyaniarun1993 I don't want to spam this issue, so I'll mark our conversation off-topic. FYI, please open a new issue.

The resolution (walking from / to /xyz) is fine enough.
The issue is:

/xyz/0
/xyz/1
/xyz/2
/xyz/3
/xyz/4
/xyz/5

Kubo will fetch 0, 1, 2, 3, 4 and 5 one-by-one (instead of 32 by 32 in parallel for example).

EDIT: GitHub won't let me hide my own messages... :'(
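As a hedged client-side workaround sketch (not a fix for the sequential behavior in Kubo itself): prefetch the directory's child blocks in parallel before listing, so the subsequent ipfs ls is served mostly from the local blockstore. The CID and parallelism level here are assumptions for illustration.

```shell
#!/usr/bin/env bash
set -eu

PARALLELISM=32
DIR_CID="bafyexampledircid"   # hypothetical directory CID

if command -v ipfs >/dev/null 2>&1 && ipfs swarm peers >/dev/null 2>&1; then
    # `ipfs refs` lists the direct child CIDs of the directory node;
    # fetch them $PARALLELISM at a time instead of one by one.
    ipfs refs "$DIR_CID" \
        | xargs -P "$PARALLELISM" -n 1 ipfs block get >/dev/null
    ipfs ls "$DIR_CID"
fi
```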

@CMB

CMB commented Aug 4, 2022

I have the same issue. I'm maintaining a package mirror with approximately 400,000 files. ipfs files cp gets progressively slower as files are added to MFS.

Here's the question: I already have a table of names and CIDs of all of the files; I track them in a database. Is there a way I could create the directory structure in one fell swoop from my list, rather than adding links to MFS one at a time?
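One hedged alternative, sketched under the assumption that the mirror also exists as a plain directory tree on disk (the paths and names below are hypothetical): add the whole tree with a single recursive ipfs add, then attach the resulting root to MFS with one ipfs files cp, instead of linking 400,000 files one at a time. Building the tree purely from an existing name/CID list without the files on disk isn't covered by this sketch.

```shell
#!/usr/bin/env bash
set -eu

SRC_DIR="./mirror"   # assumed local copy of the package tree
MFS_DEST="/mirror"   # assumed MFS destination for it

if command -v ipfs >/dev/null 2>&1 && ipfs swarm peers >/dev/null 2>&1; then
    # One recursive add produces the complete directory DAG in a single pass.
    root_cid=$(ipfs add -r --quieter --pin=false "$SRC_DIR")
    # Replace the old tree with the new root in one MFS operation.
    ipfs files rm -r "$MFS_DEST" 2>/dev/null || true
    ipfs files cp "/ipfs/$root_cid" "$MFS_DEST"
fi
```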

@BigLep
Contributor

BigLep commented Sep 1, 2022

@Jorropo : what are the next steps here?

@BigLep BigLep assigned Jorropo and unassigned aschmahmann Nov 17, 2022
@Jorropo Jorropo removed their assignment Mar 4, 2024