Skip to content
This repository has been archived by the owner on Aug 2, 2021. It is now read-only.

Fix data races in TestLDBStoreCollectGarbage #1123

Closed
wants to merge 4 commits into from
Closed

Conversation

janos
Copy link
Member

@janos janos commented Jan 16, 2019

This PR addresses race condition described in #1118 and a similar one in gc tests.

This change solves data races, but it does not solve failing tests when go test tool is called with -race. Tests are failing with not all expected chunks deleted by the garbage collection. If tests are run without the -race flag, they are passing as usual. Currently, the flaky test is disabled.

We agreed on the standup not to dedicate more time to this failing tests, as regular tests are passing and we are working on a different gc implementation.

@frncmx
Copy link
Contributor

frncmx commented Jan 18, 2019

I think we could use just a read lock, couldn't we?

@frncmx
Copy link
Contributor

frncmx commented Jan 18, 2019

@janos I'm a little bit confused about the status of this PR, could you help me out?

The test does not fix TestLDBStoreCollectGarbage as stated in the commit and above you say we don't want to spend more time on this issue.

Shall we just close this with a no fix?

[correction: fixes the data race but causes the test to fail for other reasons.]

@frncmx frncmx assigned frncmx and unassigned janos Jan 22, 2019
@frncmx
Copy link
Contributor

frncmx commented Jan 22, 2019

As discussed, on stand-up I'm trying to figure what to do with this.

Problem:

$ go test -race -count 1 -run TestLDBStoreCollectGarbage github.com/ethereum/go-ethereum/swarm/storage -loglevel 0
--- FAIL: TestLDBStoreCollectGarbage (59.98s)
    --- FAIL: TestLDBStoreCollectGarbage/B/50/200 (1.25s)
        ldbstore_test.go:559: expected surplus chunk 150 to be missing, but got no error
    --- FAIL: TestLDBStoreCollectGarbage/B/100/400 (3.15s)
        ldbstore_test.go:559: expected surplus chunk 300 to be missing, but got no error
FAIL
FAIL	github.com/ethereum/go-ethereum/swarm/storage	62.153s

Context:

  • LDB is going to be replaced by the new localstore implementation => limit time and effort
  • tests are failing because the pressure caused by -race

1st (ugly,cheap) idea: skip the 2 failing test cases when -race provided.
Learning:

  • there is no testing.Race() as there is testing.Verbose()
  • -race is a build flag to alternate between norace.go and race.go
  • there is a const called race.Enabled however that cannot be accessed because race is an internal package
  • nice, because Go really protects me from choosing the ugly solution

2nd idea: just drop the new locks and the 2 debug log lines

  • that also solves the race
  • still, the two test cases fail with -race
    => Failure is totally unrelated to the change.

3rd idea: debug
results: seemingly we must trigger another GC round, simplest way to add before the failing for loop ldb.collectGarbage()

Ferenc Szabo added 2 commits January 22, 2019 19:01
As the context parameter is not used and fixing it is not trivial,
let's rather just drop that.

Why is it not trivial?
The usual select with <-ctx.Done() would not work, as ldb.gc.runC
is always expected to have 1 element when the GC is not running.
Also: #1151
Test fails unreliably when run with `-race` flag. As waitGC() does not
work as expected, it might just continue (not block) if GC is not
running already. We might end up with or without a chunk, depending
on goroutine scheduling.

As localstore rewrite is at the finish line and this blocks
#741 => Skip.
@frncmx
Copy link
Contributor

frncmx commented Jan 23, 2019

Calling ldb.collectGarbage() directly in test did not solve the flakiness as expected. The above mentioned test might still fail when run with -race. That blocks #741 => I vote to skip the test until we fix waitGC() (#1151) or ldb gets replaced by the new localstore.

@frncmx frncmx requested review from nolash, gluk256 and nonsense and removed request for frncmx January 23, 2019 09:58
@frncmx frncmx assigned frncmx and unassigned janos Jan 23, 2019
@janos
Copy link
Member Author

janos commented Jan 23, 2019

@frncmx I agree.

@frncmx
Copy link
Contributor

frncmx commented Jan 23, 2019

opened upstream ethereum/go-ethereum#18512

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants