Implemented NetworkConditioner #141

Antonito · 2021-06-23T12:34:58Z

This PR adds a NetworkConditioner on vnet.Net – which behaves in addition to the Router's settings.

A NetworkConditioner can be provided at vnet.Net creation time, or later.
The change in itself is not atomic, rather each value of the struct is updated atomically. It shouldn't cause any issue though (maybe invalid behavior on 1 or 2 packets at most) and permits to avoid any per-packet locking mechanism – which would hurt performances.

A user can use one of the preset (the values are taken from Apple's NetworkLinkConditioner), or provide their own custom one.

This is useful in situation such as when you need to unit test behavior when a specific user drops their connection.

I'm not 100% sure the handleDownLink and handleUpLink calls are at the best place nor return the correct values?

codecov · 2021-06-23T12:36:16Z

Codecov Report

Merging #141 (2dfbb94) into master (8ae6e0d) will increase coverage by 0.57%.
The diff coverage is 89.58%.

@@            Coverage Diff             @@
##           master     #141      +/-   ##
==========================================
+ Coverage   83.16%   83.74%   +0.57%     
==========================================
  Files          26       28       +2     
  Lines        1740     1882     +142     
==========================================
+ Hits         1447     1576     +129     
- Misses        190      197       +7     
- Partials      103      109       +6

Flag	Coverage Δ
go	`83.73% <89.58%> (+0.58%)`	⬆️
wasm	`76.66% <81.25%> (+0.41%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
vnet/net.go	`73.02% <50.00%> (-0.95%)`	⬇️
vnet/network_conditioner.go	`76.47% <76.47%> (ø)`
vnet/network_conditioner_presets.go	`100.00% <100.00%> (ø)`
test/bridge.go	`79.05% <0.00%> (+0.52%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8ae6e0d...2dfbb94. Read the comment docs.

NetworkConditioner allows to replicate in a predictable way network condition, such as a 3G or LTE network. There is a list of pre-defined NetworkConditionerPreset, or the user can provided their own.

enobufs · 2021-06-27T20:27:29Z

@Antonito @Sean-Der

Before I look closely into the code, are we really sure we'd want to introduce network impairment tool per instance of Net (NIC) when we already have the same at Router level?

RouterConfig:

MinDelay time.Duration
MaxJitter time.Duration
Router:
AddChunkFilter(filter ChunkFilter)
- type ChunkFilter func(c Chunk) bool
  o ChunkFilter is a handler users can add to filter chunks. If the filter returns false, the packet will be dropped.

See: https://pkg.go.dev/github.com/pion/transport@v0.12.3/vnet

Notes:

Router already had a queue (incl. solutions to atomicity around queuing ops/complexity) and was convenient to implement above. The benefit was, Net can be simplified by synchronously writing into the queue on the router.
You can always cascade routers to mimic per-NIC configuration if necessary. (Remember, the NIC in the real world is actually a router with one local endpoint.)

I am just wondering if this is worth the effort / maintenance cost for the increased complexity just for the simulation/testing.

enobufs

I believe you'd need to implement queues (outbound & inbound), just like Router does in order to overcome the unwanted blocking operations I pointed out. Please let me know if I did not correctly capture your intention. Also, please consider what I suggested earlier (if the existing solution really cannot help your problems at all, etc).

enobufs · 2021-06-28T06:16:18Z

vnet/network_conditioner.go

+
+	// Apply constant latency if required
+	if sleepNeeded := time.Duration(atomic.LoadInt64((*int64)(&n.UpLink.Latency))) - bandwidthLimiterSleep; sleepNeeded > 0 {
+		time.Sleep(sleepNeeded)


This is blocking the calling thread. WriteTo() would block when the send-buffer is full (which does not happen much with UDP) and I believe this is not what you are trying to simulate with this network conditioner.

enobufs · 2021-06-28T06:21:03Z

vnet/network_conditioner.go

+
+	// Apply constant latency if required
+	if sleepNeeded := time.Duration(atomic.LoadInt64((*int64)(&n.DownLink.Latency))) - bandwidthLimiterSleep; sleepNeeded > 0 {
+		time.Sleep(sleepNeeded)


If I'm not mistaken (it's been a while since I wrote this code.. :P), you are blocking the calling thread which is the processChunks routine from a router. During this blockage, the route cannot process other chunks unrelated to this NIC. I believe this is not what you intended...

Antonito · 2021-06-28T14:58:01Z

Thanks a lot for your comments @enobufs

Totally right about the blocked processChunks, you're correct I intended to add latency to a single client not all of them, I missed that completely.
I'm all for building on top of existing code & keep things simple, and also want to provide the simplest way to test things (I find WebRTC tests can easily become quite verbose, so the simpler the better).

With the current PR, things in a typical scenario looks like this:

wan, err := vnet.NewRouter(&vnet.RouterConfig{
	CIDR:          "0.0.0.0/0",
	LoggerFactory: logging.NewDefaultLoggerFactory(),
})
assert.NoError(t, err)

vnetA := vnet.NewNet(&vnet.NetConfig{
	NetworkConditioner: vnet.NewNetworkConditioner(vnet.NetworkConditionerPresetNone),
})
assert.NoError(t, wan.AddNet(vnetA))

assert.NoError(t, wan.Start())
defer func() { _ = wan.Stop() }()

// Later, the user can dynamically change the NetworkConditioner, to simulate network disconnection for example
vnetA.SetNetworkConditioner(vnet.NewNetworkConditioner(vnet.NetworkConditionerPresetFullLoss))

By implementing on top of the Router, we'd get something like this:

wan, err := vnet.NewRouter(&vnet.RouterConfig{
	CIDR:          "0.0.0.0/0",
	LoggerFactory: logging.NewDefaultLoggerFactory(),
})
assert.NoError(t, err)

subnetA, err := vnet.NewRouter(&vnet.RouterConfig{
	CIDR:          "1.0.0.0/0",
	LoggerFactory: logging.NewDefaultLoggerFactory(),
	NetworkConditioner: vnet.NewNetworkConditioner(vnet.NetworkConditionerPresetNone),
})
assert.NoError(t, err)
assert.NoError(t, wan.AddRouter(vnetA))

vnetA := vnet.NewNet(&vnet.NetConfig{})
assert.NoError(t, subnetA.AddNet(vnetA))

assert.NoError(t, wan.Start())
defer func() { _ = wan.Stop() }()

I think we could introduce a helper function which would wrap the Router+Net creation, but then we'd have to handle CIDR (or let the user provide it, but I feel that's unnecessary complexity), since a subrouter can't have the same CIDR as its parent (or am I wrong on this?)

A bit out of the scope of this PR, but I've been wanting to simulate network changes, to test behavior in mobile-like situations (wifi<->lte switches). With this PR, we can change the behavior of the network, but that's still an inaccurate simulation since the address of the client doesn't change. I think it's something that could be handled at the router's level, and that would make sense to handle network condition on the same level in this situation.

If I'm not mistaken, the current implementation of Router has a single queue to handle incoming and outgoing packets (processChunks). To handle downlink/uplink conditions logic would require a bit of change.

The way I see it, there are 3 possibilities:

NetworkConditions at the Net level, just like this PR does, but with inbound & outbound queues
- Simplest API for the user
- A bit of double use with the Router's logic
NetworkConditions at the router level
- API a bit more verbose
- Would still require logic modifications
- MinDelay is already specified in NeworkCondition, MaxJitter isn't. Ideally we'd have to get rid of MinDelay and add MaxJitter to NetworkCondition, but that's a breaking change I guess?
Have a mix of the two
- We should still merge NetworkCondition with MaxJitter/MinDelay
- Router's logic can stay the same, have the Net implement the NetworkCondition logic without queues (just as it is now)
- Force NetworkCondition-ed Net to be wrapped in a dedicated subrouter. The Net will still block the processChunks routine, but that isn't an issue anymore here.
- Downside is it's a bit less rigorous from a design perspective, and we still have the more verbose API

( I'd also love your thoughts on this @Sean-Der )

enobufs · 2021-06-28T17:20:53Z

My only concern is if per NIC delay is that important for this level of testing. Primary goal of vnet was to emulate NAT so that we could test with various types of NAT. Network impairment was introduced later on, on the router because it already had a queue, so I agree, it was a shortcut and not truly representing real world's complex scenario, but that was good enough to find most of bugs with race conditions.

I used to use WAN emulators such as this. This helped me to evaluate overall performance and to be reassured before shipping a product to the real world, but most of the bugs were found just by introducing simple network impairment between point A and point B, with a tool similar to comcast (using iptables/netem). The vnet does this simple stuff and even more quickly because everything is on the test code to achieve the same.

Having said that, I understand what you are try to do (introducing more controls and helping user side complexity with presets also), and I am not too strong about the above thoughts. Maybe I am too minimalistic..

Antonito · 2021-06-29T11:34:33Z

I still think there's value in enabling users to simulate network conditions via unit tests:

they're very simple to write, and provide consistent results
easy to integrate in existing CI/CD pipelines
they should help catch a good amount of network related problems (if not most?) & allow to test scenarios that aren't (easily) testable as of today

I also agree we should not overcomplicate stuff. I feel the parameters I added to NetworkCondition provide a good amount of flexibility (there are exactly the same as what iOS Network Conditioner Link provides), without making the code too complex.
For scenarios requiring more controls, sure, let's redirect users to more complete tools such as the ones you linked.

I get your point though, and the solutions I proposed might not be best ways to solve this. I dislike having network impairment controls on both the Router and Net, it goes against the idea of simplicity we're seeking.

Antonito · 2021-06-29T11:47:02Z

After wrapping my head against all of this again, I think the best solution is to keep the impairment logic at the vNet level?

the queues add a bit of complexity, but not too much
keeps the API simple for the end user
MinDelay at the Router level can be removed, as it's already handled by DownLink/UpLink Latency
MaxJitter can be added to NetworkCondition and removed at the router level. This adds the possibility to have different DownLink and Uplink jitter.
Router-wide settings can be re-enabled by simply applying a config to all users? (but in that case, are router-wide settings really still useful/wanted?)

The only downside I see with this approach is we introduce breaking changes. But on the other hand, I don't think a lot of code depends on vNet, and especially the network impairment stuff, so that's probably ok?

I've been debating between this, and the Have a mix of the two... solution. However I think the later introduces more complexity, both for the end user (have to manage CIDR) and in the package (still have to move MaxJitter and MinDelay).
Also, keeping the vNet implementation without queues and forcing users to wrap network-impaired vNet into a dedicated sub-router still leaves the possibility of enabling network impairments on non-wrapped vNet – which would lead to unwanted & confusing behavior, which we should avoid.

If someone has a better solution, I'm all for it :)

enobufs · 2021-06-29T21:51:06Z

If we were to emulate network more resembles to the real network, then we should keep the network impairment on the router and improve them for better.

"Network conditioner" settings are from the aspect of one endpoint. It is a special case and does not cover scenarios like having multiple endpoints sharing the same subscriber (i.e. Comcast cable, AT&T fiber, etc) line. The subscriber line is in between your residential router and the internet (other "WAN" routers), while the network speed within the LAN is very fast, etc. Mac's Network Link Conditioner is an alternative solution (things are happening only on your PC) when you don't have a way to introduce network impairment elsewhere.

I use Comcast and most of the bottle neck comes from the subscriber line, particularly the upload speed is very limited with Comcast cable. I'm sure lots of packet loss is happening behind my cable modem (a residential router).

What's missing in the Net class (~= NIC) is the socket buffer (currently not emulated). Write()/WriteTo() call would block if the send (socket) buffer becomes full (until the deadline expires). But if that is UDP, this blockage rarely happen. Also, if it is wifi, there would be relatively large packet loss due to radio signal interference (microwave in the kitchen, etc).

So, if I were to improvement vnet further, I'd look into:

Introduce a way to limit up/down bandwidths separately from bandwidth within the subnet. (as I mentioned above regarding the "subscriber line")
We could Introduce packet loss settings to Net (NIC) as well.

stv0g · 2023-04-12T06:41:26Z

Hey @Antonito,

are you still interested in getting this merged?
If so, could you please resolve the conflicts and push a new version?

Thanks :)

Antonito · 2023-04-12T15:09:42Z

Hey @Antonito,

are you still interested in getting this merged? If so, could you please resolve the conflicts and push a new version?

Thanks :)

Hey @stv0g!

I'm still interested in getting this merged, however I don't have the bandwidth to work on this.
If someone else wants to take ownership of this PR, it's fine by me :)

stv0g · 2023-04-12T16:52:10Z

Thanks @Antonito,

as a maintainer, I need somebody who feels responsible for getting this PR merged and bring it forward.
Its quite normal that initial authors have to abandon their PRs due to resource constraints or a shift of focus.

As this seems to be the case, I would like to close the PR until we find somebody who wants to work on it to get it merged.

Please feel free to reopen the PR if you have the bandwidth :)

Antonito requested a review from enobufs June 23, 2021 12:34

Antonito force-pushed the feat/network-condition branch 2 times, most recently from 215bc56 to 299835f Compare June 23, 2021 22:54

Implemented NetworkConditioner

2dfbb94

NetworkConditioner allows to replicate in a predictable way network condition, such as a 3G or LTE network. There is a list of pre-defined NetworkConditionerPreset, or the user can provided their own.

Antonito force-pushed the feat/network-condition branch from 299835f to 2dfbb94 Compare June 23, 2021 23:43

Antonito mentioned this pull request Jun 26, 2021

Implement DTLS restart pion/webrtc#1846

Open

enobufs requested changes Jun 28, 2021

View reviewed changes

Sean-Der mentioned this pull request Sep 13, 2021

Bandwidth Estimator pion/interceptor#25

Open

10 tasks

mengelbart mentioned this pull request Sep 21, 2021

Add Token Bucket Filter for bandwidth limitation #151

Merged

enobufs mentioned this pull request Oct 16, 2022

Redefining vnet #207

Closed

stv0g closed this Apr 12, 2023

stv0g deleted the feat/network-condition branch April 17, 2023 17:19

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implemented NetworkConditioner #141

Implemented NetworkConditioner #141

Antonito commented Jun 23, 2021

codecov bot commented Jun 23, 2021 •

edited

Loading

enobufs commented Jun 27, 2021 •

edited

Loading

enobufs left a comment

enobufs Jun 28, 2021 •

edited

Loading

enobufs Jun 28, 2021

Antonito commented Jun 28, 2021

enobufs commented Jun 28, 2021 •

edited

Loading

Antonito commented Jun 29, 2021

Antonito commented Jun 29, 2021

enobufs commented Jun 29, 2021 •

edited

Loading

stv0g commented Apr 12, 2023

Antonito commented Apr 12, 2023

stv0g commented Apr 12, 2023

Implemented NetworkConditioner #141

Implemented NetworkConditioner #141

Conversation

Antonito commented Jun 23, 2021

codecov bot commented Jun 23, 2021 • edited Loading

Codecov Report

enobufs commented Jun 27, 2021 • edited Loading

enobufs left a comment

Choose a reason for hiding this comment

enobufs Jun 28, 2021 • edited Loading

Choose a reason for hiding this comment

enobufs Jun 28, 2021

Choose a reason for hiding this comment

Antonito commented Jun 28, 2021

enobufs commented Jun 28, 2021 • edited Loading

Antonito commented Jun 29, 2021

Antonito commented Jun 29, 2021

enobufs commented Jun 29, 2021 • edited Loading

stv0g commented Apr 12, 2023

Antonito commented Apr 12, 2023

stv0g commented Apr 12, 2023

codecov bot commented Jun 23, 2021 •

edited

Loading

enobufs commented Jun 27, 2021 •

edited

Loading

enobufs Jun 28, 2021 •

edited

Loading

enobufs commented Jun 28, 2021 •

edited

Loading

enobufs commented Jun 29, 2021 •

edited

Loading