Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implemented NetworkConditioner #141

Closed
wants to merge 1 commit into from
Closed

Conversation

Antonito
Copy link
Member

This PR adds a NetworkConditioner on vnet.Net – which behaves in addition to the Router's settings.

A NetworkConditioner can be provided at vnet.Net creation time, or later.
The change in itself is not atomic, rather each value of the struct is updated atomically. It shouldn't cause any issue though (maybe invalid behavior on 1 or 2 packets at most) and permits to avoid any per-packet locking mechanism – which would hurt performances.

A user can use one of the preset (the values are taken from Apple's NetworkLinkConditioner), or provide their own custom one.

This is useful in situation such as when you need to unit test behavior when a specific user drops their connection.

I'm not 100% sure the handleDownLink and handleUpLink calls are at the best place nor return the correct values?

@Antonito Antonito requested a review from enobufs June 23, 2021 12:34
@codecov
Copy link

codecov bot commented Jun 23, 2021

Codecov Report

Merging #141 (2dfbb94) into master (8ae6e0d) will increase coverage by 0.57%.
The diff coverage is 89.58%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #141      +/-   ##
==========================================
+ Coverage   83.16%   83.74%   +0.57%     
==========================================
  Files          26       28       +2     
  Lines        1740     1882     +142     
==========================================
+ Hits         1447     1576     +129     
- Misses        190      197       +7     
- Partials      103      109       +6     
Flag Coverage Δ
go 83.73% <89.58%> (+0.58%) ⬆️
wasm 76.66% <81.25%> (+0.41%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
vnet/net.go 73.02% <50.00%> (-0.95%) ⬇️
vnet/network_conditioner.go 76.47% <76.47%> (ø)
vnet/network_conditioner_presets.go 100.00% <100.00%> (ø)
test/bridge.go 79.05% <0.00%> (+0.52%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8ae6e0d...2dfbb94. Read the comment docs.

@Antonito Antonito force-pushed the feat/network-condition branch 2 times, most recently from 215bc56 to 299835f Compare June 23, 2021 22:54
NetworkConditioner allows to replicate in a predictable way
network condition, such as a 3G or LTE network.
There is a list of pre-defined NetworkConditionerPreset,
or the user can provided their own.
@enobufs
Copy link
Member

enobufs commented Jun 27, 2021

@Antonito @Sean-Der

Before I look closely into the code, are we really sure we'd want to introduce network impairment tool per instance of Net (NIC) when we already have the same at Router level?

RouterConfig:

  • MinDelay time.Duration
  • MaxJitter time.Duration
    Router:
  • AddChunkFilter(filter ChunkFilter)
    • type ChunkFilter func(c Chunk) bool
      o ChunkFilter is a handler users can add to filter chunks. If the filter returns false, the packet will be dropped.

See: https://pkg.go.dev/github.com/pion/transport@v0.12.3/vnet

Notes:

  • Router already had a queue (incl. solutions to atomicity around queuing ops/complexity) and was convenient to implement above. The benefit was, Net can be simplified by synchronously writing into the queue on the router.
  • You can always cascade routers to mimic per-NIC configuration if necessary. (Remember, the NIC in the real world is actually a router with one local endpoint.)

I am just wondering if this is worth the effort / maintenance cost for the increased complexity just for the simulation/testing.

Copy link
Member

@enobufs enobufs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you'd need to implement queues (outbound & inbound), just like Router does in order to overcome the unwanted blocking operations I pointed out. Please let me know if I did not correctly capture your intention. Also, please consider what I suggested earlier (if the existing solution really cannot help your problems at all, etc).


// Apply constant latency if required
if sleepNeeded := time.Duration(atomic.LoadInt64((*int64)(&n.UpLink.Latency))) - bandwidthLimiterSleep; sleepNeeded > 0 {
time.Sleep(sleepNeeded)
Copy link
Member

@enobufs enobufs Jun 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is blocking the calling thread. WriteTo() would block when the send-buffer is full (which does not happen much with UDP) and I believe this is not what you are trying to simulate with this network conditioner.


// Apply constant latency if required
if sleepNeeded := time.Duration(atomic.LoadInt64((*int64)(&n.DownLink.Latency))) - bandwidthLimiterSleep; sleepNeeded > 0 {
time.Sleep(sleepNeeded)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm not mistaken (it's been a while since I wrote this code.. :P), you are blocking the calling thread which is the processChunks routine from a router. During this blockage, the route cannot process other chunks unrelated to this NIC. I believe this is not what you intended...

@Antonito
Copy link
Member Author

Thanks a lot for your comments @enobufs

Totally right about the blocked processChunks, you're correct I intended to add latency to a single client not all of them, I missed that completely.
I'm all for building on top of existing code & keep things simple, and also want to provide the simplest way to test things (I find WebRTC tests can easily become quite verbose, so the simpler the better).

With the current PR, things in a typical scenario looks like this:

wan, err := vnet.NewRouter(&vnet.RouterConfig{
	CIDR:          "0.0.0.0/0",
	LoggerFactory: logging.NewDefaultLoggerFactory(),
})
assert.NoError(t, err)

vnetA := vnet.NewNet(&vnet.NetConfig{
	NetworkConditioner: vnet.NewNetworkConditioner(vnet.NetworkConditionerPresetNone),
})
assert.NoError(t, wan.AddNet(vnetA))

assert.NoError(t, wan.Start())
defer func() { _ = wan.Stop() }()

// Later, the user can dynamically change the NetworkConditioner, to simulate network disconnection for example
vnetA.SetNetworkConditioner(vnet.NewNetworkConditioner(vnet.NetworkConditionerPresetFullLoss))

By implementing on top of the Router, we'd get something like this:

wan, err := vnet.NewRouter(&vnet.RouterConfig{
	CIDR:          "0.0.0.0/0",
	LoggerFactory: logging.NewDefaultLoggerFactory(),
})
assert.NoError(t, err)

subnetA, err := vnet.NewRouter(&vnet.RouterConfig{
	CIDR:          "1.0.0.0/0",
	LoggerFactory: logging.NewDefaultLoggerFactory(),
	NetworkConditioner: vnet.NewNetworkConditioner(vnet.NetworkConditionerPresetNone),
})
assert.NoError(t, err)
assert.NoError(t, wan.AddRouter(vnetA))

vnetA := vnet.NewNet(&vnet.NetConfig{})
assert.NoError(t, subnetA.AddNet(vnetA))

assert.NoError(t, wan.Start())
defer func() { _ = wan.Stop() }()

I think we could introduce a helper function which would wrap the Router+Net creation, but then we'd have to handle CIDR (or let the user provide it, but I feel that's unnecessary complexity), since a subrouter can't have the same CIDR as its parent (or am I wrong on this?)

A bit out of the scope of this PR, but I've been wanting to simulate network changes, to test behavior in mobile-like situations (wifi<->lte switches). With this PR, we can change the behavior of the network, but that's still an inaccurate simulation since the address of the client doesn't change. I think it's something that could be handled at the router's level, and that would make sense to handle network condition on the same level in this situation.

If I'm not mistaken, the current implementation of Router has a single queue to handle incoming and outgoing packets (processChunks). To handle downlink/uplink conditions logic would require a bit of change.

The way I see it, there are 3 possibilities:

  • NetworkConditions at the Net level, just like this PR does, but with inbound & outbound queues
    • Simplest API for the user
    • A bit of double use with the Router's logic
  • NetworkConditions at the router level
    • API a bit more verbose
    • Would still require logic modifications
    • MinDelay is already specified in NeworkCondition, MaxJitter isn't. Ideally we'd have to get rid of MinDelay and add MaxJitter to NetworkCondition, but that's a breaking change I guess?
  • Have a mix of the two
    • We should still merge NetworkCondition with MaxJitter/MinDelay
    • Router's logic can stay the same, have the Net implement the NetworkCondition logic without queues (just as it is now)
    • Force NetworkCondition-ed Net to be wrapped in a dedicated subrouter. The Net will still block the processChunks routine, but that isn't an issue anymore here.
    • Downside is it's a bit less rigorous from a design perspective, and we still have the more verbose API

( I'd also love your thoughts on this @Sean-Der )

@enobufs
Copy link
Member

enobufs commented Jun 28, 2021

My only concern is if per NIC delay is that important for this level of testing. Primary goal of vnet was to emulate NAT so that we could test with various types of NAT. Network impairment was introduced later on, on the router because it already had a queue, so I agree, it was a shortcut and not truly representing real world's complex scenario, but that was good enough to find most of bugs with race conditions.

I used to use WAN emulators such as this. This helped me to evaluate overall performance and to be reassured before shipping a product to the real world, but most of the bugs were found just by introducing simple network impairment between point A and point B, with a tool similar to comcast (using iptables/netem). The vnet does this simple stuff and even more quickly because everything is on the test code to achieve the same.

Having said that, I understand what you are try to do (introducing more controls and helping user side complexity with presets also), and I am not too strong about the above thoughts. Maybe I am too minimalistic..

@Antonito
Copy link
Member Author

I still think there's value in enabling users to simulate network conditions via unit tests:

  • they're very simple to write, and provide consistent results
  • easy to integrate in existing CI/CD pipelines
  • they should help catch a good amount of network related problems (if not most?) & allow to test scenarios that aren't (easily) testable as of today

I also agree we should not overcomplicate stuff. I feel the parameters I added to NetworkCondition provide a good amount of flexibility (there are exactly the same as what iOS Network Conditioner Link provides), without making the code too complex.
For scenarios requiring more controls, sure, let's redirect users to more complete tools such as the ones you linked.

I get your point though, and the solutions I proposed might not be best ways to solve this. I dislike having network impairment controls on both the Router and Net, it goes against the idea of simplicity we're seeking.

@Antonito
Copy link
Member Author

After wrapping my head against all of this again, I think the best solution is to keep the impairment logic at the vNet level?

  • the queues add a bit of complexity, but not too much
  • keeps the API simple for the end user
  • MinDelay at the Router level can be removed, as it's already handled by DownLink/UpLink Latency
  • MaxJitter can be added to NetworkCondition and removed at the router level. This adds the possibility to have different DownLink and Uplink jitter.
  • Router-wide settings can be re-enabled by simply applying a config to all users? (but in that case, are router-wide settings really still useful/wanted?)

The only downside I see with this approach is we introduce breaking changes. But on the other hand, I don't think a lot of code depends on vNet, and especially the network impairment stuff, so that's probably ok?

I've been debating between this, and the Have a mix of the two... solution. However I think the later introduces more complexity, both for the end user (have to manage CIDR) and in the package (still have to move MaxJitter and MinDelay).
Also, keeping the vNet implementation without queues and forcing users to wrap network-impaired vNet into a dedicated sub-router still leaves the possibility of enabling network impairments on non-wrapped vNet – which would lead to unwanted & confusing behavior, which we should avoid.

If someone has a better solution, I'm all for it :)

@enobufs
Copy link
Member

enobufs commented Jun 29, 2021

If we were to emulate network more resembles to the real network, then we should keep the network impairment on the router and improve them for better.

"Network conditioner" settings are from the aspect of one endpoint. It is a special case and does not cover scenarios like having multiple endpoints sharing the same subscriber (i.e. Comcast cable, AT&T fiber, etc) line. The subscriber line is in between your residential router and the internet (other "WAN" routers), while the network speed within the LAN is very fast, etc. Mac's Network Link Conditioner is an alternative solution (things are happening only on your PC) when you don't have a way to introduce network impairment elsewhere.

I use Comcast and most of the bottle neck comes from the subscriber line, particularly the upload speed is very limited with Comcast cable. I'm sure lots of packet loss is happening behind my cable modem (a residential router).

What's missing in the Net class (~= NIC) is the socket buffer (currently not emulated). Write()/WriteTo() call would block if the send (socket) buffer becomes full (until the deadline expires). But if that is UDP, this blockage rarely happen. Also, if it is wifi, there would be relatively large packet loss due to radio signal interference (microwave in the kitchen, etc).

So, if I were to improvement vnet further, I'd look into:

  • Introduce a way to limit up/down bandwidths separately from bandwidth within the subnet. (as I mentioned above regarding the "subscriber line")
  • We could Introduce packet loss settings to Net (NIC) as well.

@stv0g
Copy link
Member

stv0g commented Apr 12, 2023

Hey @Antonito,

are you still interested in getting this merged?
If so, could you please resolve the conflicts and push a new version?

Thanks :)

@Antonito
Copy link
Member Author

Hey @Antonito,

are you still interested in getting this merged? If so, could you please resolve the conflicts and push a new version?

Thanks :)

Hey @stv0g!

I'm still interested in getting this merged, however I don't have the bandwidth to work on this.
If someone else wants to take ownership of this PR, it's fine by me :)

@stv0g
Copy link
Member

stv0g commented Apr 12, 2023

Thanks @Antonito,

as a maintainer, I need somebody who feels responsible for getting this PR merged and bring it forward.
Its quite normal that initial authors have to abandon their PRs due to resource constraints or a shift of focus.

As this seems to be the case, I would like to close the PR until we find somebody who wants to work on it to get it merged.

Please feel free to reopen the PR if you have the bandwidth :)

@stv0g stv0g closed this Apr 12, 2023
@stv0g stv0g deleted the feat/network-condition branch April 17, 2023 17:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants