Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update1.8 #19029

Merged
merged 65 commits into from
Feb 20, 2019
Merged

Update1.8 #19029

merged 65 commits into from
Feb 20, 2019

Conversation

holiman
Copy link
Contributor

@holiman holiman commented Feb 9, 2019

Trying my hands at cherrypicking here. An attempt to backport the bugfixes that are live on master but not yet released on 1.8. PTAL and check if it looks correct.

@holiman holiman requested a review from karalabe as a code owner February 9, 2019 16:50
@karalabe karalabe modified the milestones: 1.9.0, 1.8.23 Feb 12, 2019
karalabe and others added 21 commits February 19, 2019 10:53
* build: use sftp for launchpad uploads

* .travis.yml: configure sftp export

* build: update CI docs

(cherry picked from commit 3de19c8)
…eum#19035)

* common/fdlimit: cap on MacOS file limits, fixes ethereum#18994

* common/fdlimit: fix Maximum-check to respect OPEN_MAX

* common/fdlimit: return error if OPEN_MAX is exceeded in Raise()

* common/fdlimit: goimports

* common/fdlimit: check value after setting fdlimit

* common/fdlimit: make comment a bit more descriptive

* cmd/utils: make fdlimit happy path a bit cleaner

(cherry picked from commit f48da43)
(cherry picked from commit dcc045f)
(cherry picked from commit 37e5a90)
* cmd/swarm/swarm-snapshot: add binary to create network snapshots

* cmd/swarm/swarm-snapshot: refactor and extend tests

* p2p/simulations: remove unused triggerChecks func and fix linter

* internal/cmdtest: raise the timeout for killing TestCmd

* cmd/swarm/swarm-snapshot: add more comments and other minor adjustments

* cmd/swarm/swarm-snapshot: remove redundant check in createSnapshot

* cmd/swarm/swarm-snapshot: change comment wording

* p2p/simulations: revert Simulation.Run from master

https://github.com/ethersphere/go-ethereum/pull/1077/files#r247078904

* cmd/swarm/swarm-snapshot: address pr comments

* swarm/network/simulations/discovery: removed snapshot write to file

* cmd/swarm/swarm-snapshot, swarm/network/simulations: removed redundant connection event check, fixed lint error

(cherry picked from commit 34f11e7)
…thereum#18404)

* swarm/network: fix skipped tests related to suggestPeer

* swarm/network: rename depth to radius

* swarm/network: uncomment assertHealth and improve comments

* swarm/network: remove commented code

* swarm/network: kademlia suggestPeer algo correction

* swarm/network: kademlia suggest peer

 * simplify suggest Peer code
 * improve peer suggestion algo
 * add comments
 * kademlia testing improvements
   * assertHealth -> checkHealth (test helper)
   * testSuggestPeer -> checkSuggestPeer (test helper)
   * remove testSuggestPeerBug and TestKademliaCase

* swarm/network: kademlia suggestPeer cleanup, improved comments

* swarm/network: minor comment, discovery test default arg

(cherry picked from commit bcb2594)
(cherry picked from commit 257bfff)
frncmx and others added 22 commits February 19, 2019 13:06
…reum#19028)

* swarm/storage: fix HashExplore concurrency bug ethersphere#1211

*  swarm/storage: lock as value not pointer

* swarm/storage: wait for  to complete

* swarm/storage: fix linter problems

* swarm/storage: append to nil slice

(cherry picked from commit 3d22a46)
* swarm/network/stream: newStreamerTester cleanup only if err is nil

* swarm/network/stream: raise newStreamerTester waitForPeers timeout

* swarm/network/stream: fix data races in GetPeerSubscriptions

* swarm/storage: prevent data race on LDBStore.batchesC

ethersphere/swarm#1198 (comment)

* swarm/network/stream: fix TestGetSubscriptionsRPC data race

ethersphere/swarm#1198 (comment)

* swarm/network/stream: correctly use Simulation.Run callback

ethersphere/swarm#1198 (comment)

* swarm/network: protect addrCountC in Kademlia.AddrCountC function

ethersphere/swarm#1198 (comment)

* p2p/simulations: fix a deadlock calling getRandomNode with lock

ethersphere/swarm#1198 (comment)

* swarm/network/stream: terminate disconnect goruotines in tests

* swarm/network/stream: reduce memory consumption when testing data races

* swarm/network/stream: add watchDisconnections helper function

* swarm/network/stream: add concurrent counter for tests

* swarm/network/stream: rename race/norace test files and use const

* swarm/network/stream: remove watchSim and its panic

* swarm/network/stream: pass context in watchDisconnections

* swarm/network/stream: add concurrent safe bool for watchDisconnections

* swarm/storage: fix LDBStore.batchesC data race by not closing it

(cherry picked from commit 3fd6db2)
* swarm/network: new saturation for  implementation

* swarm/network: re-added saturation func in Kademlia as it is used elsewhere

* swarm/network: saturation with higher MinBinSize

* swarm/network: PeersPerBin with depth check

* swarm/network: edited tests to pass new saturated check

* swarm/network: minor fix saturated check

* swarm/network/simulations/discovery: fixed renamed RPC call

* swarm/network: renamed to isSaturated and returns bool

* swarm/network: early depth check

(cherry picked from commit 2af2472)
…m#19049)

swarm/network/stream: remove netstore internal wg
swarm/network/stream: run individual tests with t.Run

(cherry picked from commit 3ee09ba)
* swarm/pss: split pss and keystore

* swarm/pss: moved whisper to keystore

* swarm/pss: goimports fixed

(cherry picked from commit 12ca3b1)
* swarm/network: DRY out repeated giga comment

I not necessarily agree with the way we wait for event propagation.
But I truly disagree with having duplicated giga comments.

* p2p/simulations: encapsulate Node.Up field so we avoid data races

The Node.Up field was accessed concurrently without "proper" locking.
There was a lock on Network and that was used sometimes to access
the  field. Other times the locking was missed and we had
a data race.

For example: ethereum#18464
The case above was solved, but there were still intermittent/hard to
reproduce races. So let's solve the issue permanently.

resolves: ethersphere/swarm#1146

* p2p/simulations: fix unmarshal of simulations.Node

Making Node.Up field private in 13292ee
broke TestHTTPNetwork and TestHTTPSnapshot. Because the default
UnmarshalJSON does not handle unexported fields.

Important: The fix is partial and not proper to my taste. But I cut
scope as I think the fix may require a change to the current
serialization format. New ticket:
ethersphere/swarm#1177

* p2p/simulations: Add a sanity test case for Node.Config UnmarshalJSON

* p2p/simulations: revert back to defer Unlock() pattern for Network

It's a good patten to call `defer Unlock()` right after `Lock()` so
(new) error cases won't miss to unlock. Let's get back to that pattern.

The patten was abandoned in 85a79b3,
while fixing a data race. That data race does not exist anymore,
since the Node.Up field got hidden behind its own lock.

* p2p/simulations: consistent naming for test providers Node.UnmarshalJSON

* p2p/simulations: remove JSON annotation from private fields of Node

As unexported fields are not serialized.

* p2p/simulations: fix deadlock in Network.GetRandomDownNode()

Problem: GetRandomDownNode() locks -> getDownNodeIDs() ->
GetNodes() tries to lock -> deadlock

On Network type, unexported functions must assume that `net.lock`
is already acquired and should not call exported functions which
might try to lock again.

* p2p/simulations: ensure method conformity for Network

Connect* methods were moved to p2p/simulations.Network from
swarm/network/simulation. However these new methods did not follow
the pattern of Network methods, i.e., all exported method locks
the whole Network either for read or write.

* p2p/simulations: fix deadlock during network shutdown

`TestDiscoveryPersistenceSimulationSimAdapter` often got into deadlock.
The execution was stuck on two locks, i.e, `Kademlia.lock` and
`p2p/simulations.Network.lock`. Usually the test got stuck once in each
20 executions with high confidence.

`Kademlia` was stuck in `Kademlia.EachAddr()` and `Network` in
`Network.Stop()`.

Solution: in `Network.Stop()` `net.lock` must be released before
calling `node.Stop()` as stopping a node (somehow - I did not find
the exact code path) causes `Network.InitConn()` to be called from
`Kademlia.SuggestPeer()` and that blocks on `net.lock`.

Related ticket: ethersphere/swarm#1223

* swarm/state: simplify if statement in DBStore.Put()

* p2p/simulations: remove faulty godoc from private function

The comment started with the wrong method name.

The method is simple and self explanatory. Also, it's private.
=> Let's just remove the comment.

(cherry picked from commit 50b872b)
* cmd/swarm/swarm-smoke: first version trigger has-chunks on timeout

* cmd/swarm/swarm-smoke: finalize trigger to chunk debug

* cmd/swarm/swarm-smoke: fixed httpEndpoint for trigger

* cmd/swarm/swarm-smoke: port

* cmd/swarm/swarm-smoke: ws not rpc

* cmd/swarm/swarm-smoke: added debug output

* cmd/swarm/swarm-smoke: addressed PR comments

* cmd/swarm/swarm-smoke: renamed track-timeout and track-chunks

(cherry picked from commit 62d7688)
…m#19117)

* swarm: Reinstate Pss Protocol add call through swarm service

* swarm: Even less self

(cherry picked from commit d88c6ce)
JerzyLa and others added 3 commits February 19, 2019 17:34
This PR is replacing the metrics.influxdb.host.tag cmd-line flag with metrics.influxdb.tags - a comma-separated key/value tags, that are passed to the InfluxDB reporter, so that we can index measurements with multiple tags, and not just one host tag.

This will be useful for Swarm, where we want to index measurements not just with the host tag, but also with bzzkey and git commit version (for long-running deployments).

(cherry picked from commit 21acf0b)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.