Harness tests random panic #10932

@guillaumemichel

Checklist

Installation method

dist.ipfs.tech or ipfs-update

Version

Kubo version: 0.38.0-dev-dd3f59db5
Repo version: 17
System version: amd64/linux
Golang version: go1.25.0

Config

{
  "API": {
    "HTTPHeaders": {}
  },
  "Addresses": {
    "API": "/ip4/127.0.0.1/tcp/5001",
    "Announce": [],
    "AppendAnnounce": [],
    "Gateway": "/ip4/127.0.0.1/tcp/8080",
    "NoAnnounce": [],
    "Swarm": [
      "/ip4/0.0.0.0/tcp/4001",
      "/ip6/::/tcp/4001",
      "/ip4/0.0.0.0/udp/4001/webrtc-direct",
      "/ip4/0.0.0.0/udp/4001/quic-v1",
      "/ip4/0.0.0.0/udp/4001/quic-v1/webtransport",
      "/ip6/::/udp/4001/webrtc-direct",
      "/ip6/::/udp/4001/quic-v1",
      "/ip6/::/udp/4001/quic-v1/webtransport"
    ]
  },
  "AutoConf": {},
  "AutoNAT": {},
  "AutoTLS": {},
  "Bitswap": {},
  "Bootstrap": [
    "auto"
  ],
  "DNS": {
    "Resolvers": {
      ".": "auto"
    }
  },
  "Datastore": {
    "BlockKeyCacheSize": null,
    "BloomFilterSize": 0,
    "GCPeriod": "1h",
    "HashOnRead": false,
    "Spec": {
      "mounts": [
        {
          "mountpoint": "/blocks",
          "path": "blocks",
          "prefix": "flatfs.datastore",
          "shardFunc": "/repo/flatfs/shard/v1/next-to-last/2",
          "sync": false,
          "type": "flatfs"
        },
        {
          "compression": "none",
          "mountpoint": "/",
          "path": "datastore",
          "prefix": "leveldb.datastore",
          "type": "levelds"
        }
      ],
      "type": "mount"
    },
    "StorageGCWatermark": 90,
    "StorageMax": "10GB"
  },
  "Discovery": {
    "MDNS": {
      "Enabled": true
    }
  },
  "Experimental": {
    "FilestoreEnabled": false,
    "Libp2pStreamMounting": false,
    "OptimisticProvide": false,
    "OptimisticProvideJobsPoolSize": 0,
    "P2pHttpProxy": false,
    "UrlstoreEnabled": false
  },
  "Gateway": {
    "DeserializedResponses": null,
    "DisableHTMLErrors": null,
    "ExposeRoutingAPI": null,
    "HTTPHeaders": {},
    "NoDNSLink": false,
    "NoFetch": false,
    "PublicGateways": null,
    "RootRedirect": ""
  },
  "HTTPRetrieval": {},
  "Identity": {
    "PeerID": "12D3KooWKYwvPGz6qqaisCuX7MG4wQLLFjHX2pts56xRMRq8ve2H"
  },
  "Import": {
    "BatchMaxNodes": null,
    "BatchMaxSize": null,
    "CidVersion": null,
    "HashFunction": null,
    "UnixFSChunker": null,
    "UnixFSDirectoryMaxLinks": null,
    "UnixFSFileMaxLinks": null,
    "UnixFSHAMTDirectoryMaxFanout": null,
    "UnixFSHAMTDirectorySizeThreshold": null,
    "UnixFSRawLeaves": null
  },
  "Internal": {},
  "Ipns": {
    "DelegatedPublishers": [
      "auto"
    ],
    "RecordLifetime": "",
    "RepublishPeriod": "",
    "ResolveCacheSize": 128
  },
  "Migration": {},
  "Mounts": {
    "FuseAllowOther": false,
    "IPFS": "/ipfs",
    "IPNS": "/ipns",
    "MFS": "/mfs"
  },
  "Peering": {
    "Peers": null
  },
  "Pinning": {
    "RemoteServices": {}
  },
  "Plugins": {
    "Plugins": null
  },
  "Provider": {},
  "Pubsub": {
    "DisableSigning": false,
    "Router": ""
  },
  "Reprovider": {},
  "Routing": {
    "DelegatedRouters": [
      "auto"
    ]
  },
  "Swarm": {
    "AddrFilters": null,
    "ConnMgr": {},
    "DisableBandwidthMetrics": false,
    "DisableNatPortMap": false,
    "RelayClient": {},
    "RelayService": {},
    "ResourceMgr": {},
    "Transports": {
      "Multiplexers": {},
      "Network": {},
      "Security": {}
    }
  },
  "Version": {}
}

Description

Some harness tests panic randomly when run locally and individually.

➜  cli git:(master) go test -run ^TestProvider$ -v -count 1
=== RUN   TestProvider
=== PAUSE TestProvider
=== CONT  TestProvider
=== RUN   TestProvider/Provider.Enabled=true_announces_new_CIDs_created_by_ipfs_add
=== PAUSE TestProvider/Provider.Enabled=true_announces_new_CIDs_created_by_ipfs_add
=== RUN   TestProvider/Provider.Enabled=true_announces_new_CIDs_created_by_ipfs_add_--pin=false_with_default_strategy
=== PAUSE TestProvider/Provider.Enabled=true_announces_new_CIDs_created_by_ipfs_add_--pin=false_with_default_strategy
=== RUN   TestProvider/Provider.Enabled=true_announces_new_CIDs_created_by_ipfs_block_put_--pin=false_with_default_strategy
=== PAUSE TestProvider/Provider.Enabled=true_announces_new_CIDs_created_by_ipfs_block_put_--pin=false_with_default_strategy
=== RUN   TestProvider/Provider.Enabled=true_announces_new_CIDs_created_by_ipfs_dag_put_--pin=false_with_default_strategy
=== PAUSE TestProvider/Provider.Enabled=true_announces_new_CIDs_created_by_ipfs_dag_put_--pin=false_with_default_strategy
=== RUN   TestProvider/Provider.Enabled=false_disables_announcement_of_new_CID_from_ipfs_add
=== PAUSE TestProvider/Provider.Enabled=false_disables_announcement_of_new_CID_from_ipfs_add
=== RUN   TestProvider/Provider.Enabled=false_disables_manual_announcement_via_RPC_command
=== PAUSE TestProvider/Provider.Enabled=false_disables_manual_announcement_via_RPC_command
=== RUN   TestProvider/Reprovide.Interval=0_disables_announcement_of_new_CID_too
=== PAUSE TestProvider/Reprovide.Interval=0_disables_announcement_of_new_CID_too
=== RUN   TestProvider/Manual_Reprovider_trigger_does_not_work_when_periodic_Reprovider_is_disabled
=== PAUSE TestProvider/Manual_Reprovider_trigger_does_not_work_when_periodic_Reprovider_is_disabled
=== RUN   TestProvider/Manual_Reprovider_trigger_does_not_work_when_Provider_system_is_disabled
=== PAUSE TestProvider/Manual_Reprovider_trigger_does_not_work_when_Provider_system_is_disabled
=== RUN   TestProvider/Provide_with_'all'_strategy
=== PAUSE TestProvider/Provide_with_'all'_strategy
=== RUN   TestProvider/Provide_with_'pinned'_strategy
=== PAUSE TestProvider/Provide_with_'pinned'_strategy
=== RUN   TestProvider/Provide_with_'pinned+mfs'_strategy
=== PAUSE TestProvider/Provide_with_'pinned+mfs'_strategy
=== RUN   TestProvider/Provide_with_'roots'_strategy
=== PAUSE TestProvider/Provide_with_'roots'_strategy
=== RUN   TestProvider/Provide_with_'mfs'_strategy
=== PAUSE TestProvider/Provide_with_'mfs'_strategy
=== RUN   TestProvider/Reprovides_with_'all'_strategy_when_strategy_is_''_(empty)
=== PAUSE TestProvider/Reprovides_with_'all'_strategy_when_strategy_is_''_(empty)
=== RUN   TestProvider/Reprovides_with_'all'_strategy
=== PAUSE TestProvider/Reprovides_with_'all'_strategy
=== RUN   TestProvider/Reprovides_with_'pinned'_strategy
=== PAUSE TestProvider/Reprovides_with_'pinned'_strategy
=== RUN   TestProvider/Reprovides_with_'roots'_strategy
=== PAUSE TestProvider/Reprovides_with_'roots'_strategy
=== RUN   TestProvider/Reprovides_with_'mfs'_strategy
=== PAUSE TestProvider/Reprovides_with_'mfs'_strategy
=== RUN   TestProvider/Reprovides_with_'pinned+mfs'_strategy
=== PAUSE TestProvider/Reprovides_with_'pinned+mfs'_strategy
=== RUN   TestProvider/provide_clear_command_removes_items_from_provide_queue
=== PAUSE TestProvider/provide_clear_command_removes_items_from_provide_queue
=== RUN   TestProvider/provide_clear_command_with_quiet_option
=== PAUSE TestProvider/provide_clear_command_with_quiet_option
=== RUN   TestProvider/provide_clear_command_works_when_provider_is_disabled
=== PAUSE TestProvider/provide_clear_command_works_when_provider_is_disabled
=== RUN   TestProvider/provide_clear_command_returns_JSON_with_removed_item_count
=== PAUSE TestProvider/provide_clear_command_returns_JSON_with_removed_item_count
=== CONT  TestProvider/Provider.Enabled=true_announces_new_CIDs_created_by_ipfs_add
=== CONT  TestProvider/Reprovides_with_'mfs'_strategy
=== CONT  TestProvider/provide_clear_command_works_when_provider_is_disabled
=== CONT  TestProvider/provide_clear_command_returns_JSON_with_removed_item_count
=== CONT  TestProvider/Provide_with_'all'_strategy
=== CONT  TestProvider/Reprovides_with_'roots'_strategy
=== CONT  TestProvider/Reprovides_with_'pinned'_strategy
=== CONT  TestProvider/Reprovides_with_'all'_strategy
=== CONT  TestProvider/Reprovides_with_'all'_strategy_when_strategy_is_''_(empty)
=== CONT  TestProvider/Provide_with_'mfs'_strategy
=== CONT  TestProvider/Provide_with_'roots'_strategy
=== CONT  TestProvider/provide_clear_command_removes_items_from_provide_queue
=== CONT  TestProvider/Provide_with_'pinned+mfs'_strategy
=== CONT  TestProvider/Reprovides_with_'pinned+mfs'_strategy
=== CONT  TestProvider/provide_clear_command_with_quiet_option
=== CONT  TestProvider/Provide_with_'pinned'_strategy
=== CONT  TestProvider/Provider.Enabled=false_disables_manual_announcement_via_RPC_command
=== CONT  TestProvider/Manual_Reprovider_trigger_does_not_work_when_Provider_system_is_disabled
=== CONT  TestProvider/Manual_Reprovider_trigger_does_not_work_when_periodic_Reprovider_is_disabled
=== CONT  TestProvider/Reprovide.Interval=0_disables_announcement_of_new_CID_too
=== CONT  TestProvider/Provider.Enabled=true_announces_new_CIDs_created_by_ipfs_dag_put_--pin=false_with_default_strategy
=== CONT  TestProvider/Provider.Enabled=false_disables_announcement_of_new_CID_from_ipfs_add
=== CONT  TestProvider/Provider.Enabled=true_announces_new_CIDs_created_by_ipfs_block_put_--pin=false_with_default_strategy
=== CONT  TestProvider/Provider.Enabled=true_announces_new_CIDs_created_by_ipfs_add_--pin=false_with_default_strategy
--- FAIL: TestProvider (2.91s)
    --- PASS: TestProvider/provide_clear_command_returns_JSON_with_removed_item_count (1.45s)
    --- PASS: TestProvider/provide_clear_command_with_quiet_option (1.45s)
    --- PASS: TestProvider/provide_clear_command_works_when_provider_is_disabled (1.49s)
    --- PASS: TestProvider/Provide_with_'all'_strategy (1.59s)
    --- PASS: TestProvider/Provider.Enabled=true_announces_new_CIDs_created_by_ipfs_add (1.65s)
    --- PASS: TestProvider/provide_clear_command_removes_items_from_provide_queue (1.69s)
    --- PASS: TestProvider/Provide_with_'pinned'_strategy (2.10s)
    --- PASS: TestProvider/Provide_with_'mfs'_strategy (2.18s)
    --- PASS: TestProvider/Reprovides_with_'all'_strategy_when_strategy_is_''_(empty) (2.25s)
    --- PASS: TestProvider/Provide_with_'roots'_strategy (2.42s)
    --- PASS: TestProvider/Reprovides_with_'all'_strategy (2.42s)
    --- PASS: TestProvider/Reprovides_with_'pinned+mfs'_strategy (2.42s)
    --- PASS: TestProvider/Reprovides_with_'mfs'_strategy (2.56s)
    --- PASS: TestProvider/Provide_with_'pinned+mfs'_strategy (2.73s)
    --- PASS: TestProvider/Provider.Enabled=true_announces_new_CIDs_created_by_ipfs_dag_put_--pin=false_with_default_strategy (1.16s)
    --- FAIL: TestProvider/Provider.Enabled=true_announces_new_CIDs_created_by_ipfs_block_put_--pin=false_with_default_strategy (0.80s)
panic: runtime error: index out of range [0] with length 0 [recovered, repanicked]

goroutine 37 [running]:
testing.tRunner.func1.2({0x30d7380, 0xc000f0bb90})
	/home/guissou/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.25.0.linux-amd64/src/testing/testing.go:1872 +0x237
testing.tRunner.func1()
	/home/guissou/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.25.0.linux-amd64/src/testing/testing.go:1875 +0x35b
panic({0x30d7380?, 0xc000f0bb90?})
	/home/guissou/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.25.0.linux-amd64/src/runtime/panic.go:783 +0x132
github.com/ipfs/kubo/test/cli/harness.Nodes.Connect({0xc000cae590, 0x2, 0x2})
	/home/guissou/Documents/ipfs/kubo/test/cli/harness/nodes.go:52 +0x21b
github.com/ipfs/kubo/test/cli.TestProvider.func1(0xc000382fc0?, 0x2, 0x3599108)
	/home/guissou/Documents/ipfs/kubo/test/cli/provider_test.go:21 +0xf3
github.com/ipfs/kubo/test/cli.TestProvider.func7(0xc000382fc0)
	/home/guissou/Documents/ipfs/kubo/test/cli/provider_test.go:72 +0x72
testing.tRunner(0xc000382fc0, 0xc0004d8108)
	/home/guissou/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.25.0.linux-amd64/src/testing/testing.go:1934 +0xea
created by testing.(*T).Run in goroutine 34
	/home/guissou/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.25.0.linux-amd64/src/testing/testing.go:1997 +0x465
exit status 2
FAIL	github.com/ipfs/kubo/test/cli	2.926s

The culprit is Nodes.Connect in test/cli/harness/nodes.go:

func (n Nodes) Connect() Nodes {
	wg := sync.WaitGroup{}
	for i, node := range n {
		for j, otherNode := range n {
			if i == j {
				continue
			}
			node := node
			otherNode := otherNode
			wg.Add(1)
			go func() {
				defer wg.Done()
				node.Connect(otherNode)
			}()
		}
	}
	wg.Wait()
	for _, node := range n {
		firstPeer := node.Peers()[0]
		if _, err := firstPeer.ValueForProtocol(multiaddr.P_P2P); err != nil {
			log.Panicf("unexpected state for node %d with peer ID %s: %s", node.ID, node.PeerID(), err)
		}
	}
	return n
}
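
The panicking line is firstPeer := node.Peers()[0] (nodes.go:52): Peers() occasionally returns an empty slice even after wg.Wait() has returned, which matches the index out of range [0] with length 0 above. For illustration only, here is a sketch (not a proposed fix) of the final loop with a bounds check, reusing the methods from the snippet above, so the failure at least produces a descriptive message:

	for _, node := range n {
		// Sketch: guard the empty-peer-list case before indexing, so a node
		// that never established (or lost) its connections fails with a
		// descriptive message instead of an index-out-of-range runtime panic.
		peers := node.Peers()
		if len(peers) == 0 {
			log.Panicf("node %d with peer ID %s has no peers after Connect", node.ID, node.PeerID())
		}
		if _, err := peers[0].ValueForProtocol(multiaddr.P_P2P); err != nil {
			log.Panicf("unexpected state for node %d with peer ID %s: %s", node.ID, node.PeerID(), err)
		}
	}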

In TestProvider, it isn't always the same subtest that fails; multiple (possibly all) subtests can trigger the panic.

Other tests that use the same code path, such as TestTransports, also panic randomly, though it is harder to reproduce there (perhaps because there are fewer subtests).

I wasn't able to reproduce the panic when running all the tests in /test/cli with go test ., nor in CI.


The fix suggested in #10929 doesn't solve the issue: the node still panics after the connection timeout, at

log.Panicf("node %d with peer ID %s has no peers after connection timeout", node.ID, node.PeerID())


This may be a timing issue caused by the high parallelization of the tests. Notably, even the timeout-based wait from #10929 finds no peers, which suggests the peer list is not merely slow to update; the node genuinely ends up with no peers.
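
If parallelism is the trigger, a stress reproducer along the following lines might make the flake easier to hit deterministically. This is a hypothetical sketch: TestConnectFlake is not an existing test, and the harness calls (NewT, NewNodes, Init, StartDaemons) are assumed from the stack trace and the usual test/cli harness usage.

	package cli

	import (
		"fmt"
		"testing"

		"github.com/ipfs/kubo/test/cli/harness"
	)

	// TestConnectFlake is a hypothetical stress test: it spins up many
	// parallel two-node harnesses and connects them, mimicking the subtest
	// parallelism that seems to trigger the panic in Nodes.Connect.
	func TestConnectFlake(t *testing.T) {
		for i := 0; i < 20; i++ {
			t.Run(fmt.Sprintf("attempt-%d", i), func(t *testing.T) {
				t.Parallel()
				nodes := harness.NewT(t).NewNodes(2).Init().StartDaemons()
				nodes.Connect() // panics intermittently at nodes.go:52
			})
		}
	}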

Labels

kind/bug: A bug in existing code (including security flaws)
need/triage: Needs initial labeling and prioritization
