Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: create Happy Eyeballs dialer #176

Merged
merged 30 commits into from
Feb 14, 2024
Merged

feat: create Happy Eyeballs dialer #176

merged 30 commits into from
Feb 14, 2024

Conversation

fortuna
Copy link
Contributor

@fortuna fortuna commented Feb 8, 2024

No description provided.

@fortuna fortuna requested a review from jyyi1 February 8, 2024 02:18
Copy link
Contributor

@jyyi1 jyyi1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I have added some comments, also I would suggest to update the PR title to feat: add HappyEyeballsDialer

CONTRIBUTING.md Show resolved Hide resolved
// Channel to wait for before a new dial attempt. It starts
// with a closed channel that doesn't block because there's no
// wait initially.
var attemptDelayCh <-chan struct{} = newClosedChan()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit it seems this is the only place newClosedChan() is used. Maybe we can simplify close the channel here and deletenewClosedChan function to save some lines?

Suggested change
var attemptDelayCh <-chan struct{} = newClosedChan()
attemptDelayCh := make(chan struct{})
close(attemptDelayCh)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your code doesn't work because attemptDelayCh must be of type <-chan struct{}, not chan struct{}, otherwise the assignment attemptDelayCh = waitCtx.Done() doesn't compile.

Instead of creating a channel, closing, and then assigning to attemptDelayCh, I found it better to keep that logic out of the core logic, which is already complicated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SG, maybe var attemptDelayCh <-chan struct{} = make(chan struct{}) would work here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that too 😅
The problem is that you can't close a read-only channel.

transport/happyeyeballs.go Outdated Show resolved Hide resolved
transport/happyeyeballs.go Outdated Show resolved Hide resolved
transport/happyeyeballs.go Outdated Show resolved Hide resolved
Comment on lines 153 to 155
resolutionDelayCtx, cancelResolutionDelay := context.WithTimeout(searchCtx, 50*time.Millisecond)
defer cancelResolutionDelay()
readyToDialCh = resolutionDelayCtx.Done()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can these lines be simplified with the following? Because we already have a case <-searchCtx.Done() statement in the select below, which will be triggered once searchCtx is cancelled. In addition, if we linked searchCtx to readyToDialCh, will there be any potential race conditions, because both case <-searchCtx.Done() and case <-readyToDialCh will be ready when searchCtx is cancelled?

Suggested change
resolutionDelayCtx, cancelResolutionDelay := context.WithTimeout(searchCtx, 50*time.Millisecond)
defer cancelResolutionDelay()
readyToDialCh = resolutionDelayCtx.Done()
readyToDialCh = time.Deplay(50*time.Millisecond)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming you meant time.NewTimer(50*time.Milisecond).C. I wasn't sure if we needed to drain the Timer.C channel, but it seems it's buffered and we don't need to?

In any case, I tried based on your suggestion, but the type of channel is different (<-chan time.Time), so it requires more changes, not really simpler. Also, the timer sends the time once, versus a context closes the channel, which is easier to reason about (a read will never block).

Another important point is that I still need to stop the timer, so I need to pass it to the goroutine to call Stop in case the dial returns before the timer triggers.

On the race condition, that's a valid point. A cancellation will potentially trigger an extra dial. I tried different ways to trigger, but couldn't. In any case, I changed the code to use the background context instead.

transport/happyeyeballs.go Outdated Show resolved Hide resolved
transport/happyeyeballs.go Outdated Show resolved Hide resolved
transport/happyeyeballs.go Show resolved Hide resolved
@fortuna fortuna changed the title Create Happy Eyeballs dialer feat: create Happy Eyeballs dialer Feb 9, 2024
Copy link
Contributor Author

@fortuna fortuna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments addressed

CONTRIBUTING.md Show resolved Hide resolved
transport/happyeyeballs.go Outdated Show resolved Hide resolved
// Channel to wait for before a new dial attempt. It starts
// with a closed channel that doesn't block because there's no
// wait initially.
var attemptDelayCh <-chan struct{} = newClosedChan()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your code doesn't work because attemptDelayCh must be of type <-chan struct{}, not chan struct{}, otherwise the assignment attemptDelayCh = waitCtx.Done() doesn't compile.

Instead of creating a channel, closing, and then assigning to attemptDelayCh, I found it better to keep that logic out of the core logic, which is already complicated.

transport/happyeyeballs.go Outdated Show resolved Hide resolved
transport/happyeyeballs.go Outdated Show resolved Hide resolved
transport/happyeyeballs.go Outdated Show resolved Hide resolved
transport/happyeyeballs.go Show resolved Hide resolved
transport/happyeyeballs.go Outdated Show resolved Hide resolved
Comment on lines 153 to 155
resolutionDelayCtx, cancelResolutionDelay := context.WithTimeout(searchCtx, 50*time.Millisecond)
defer cancelResolutionDelay()
readyToDialCh = resolutionDelayCtx.Done()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming you meant time.NewTimer(50*time.Milisecond).C. I wasn't sure if we needed to drain the Timer.C channel, but it seems it's buffered and we don't need to?

In any case, I tried based on your suggestion, but the type of channel is different (<-chan time.Time), so it requires more changes, not really simpler. Also, the timer sends the time once, versus a context closes the channel, which is easier to reason about (a read will never block).

Another important point is that I still need to stop the timer, so I need to pass it to the goroutine to call Stop in case the dial returns before the timer triggers.

On the race condition, that's a valid point. A cancellation will potentially trigger an extra dial. I tried different ways to trigger, but couldn't. In any case, I changed the code to use the background context instead.

@fortuna fortuna requested a review from jyyi1 February 9, 2024 22:30
Copy link
Contributor

@jyyi1 jyyi1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code LGTM, with some questions to be resolved. But the code flow seems a little bit complicated, hopefully we did not miss any race conditions or edge cases.

Comment on lines +88 to +90
// Indicates to attempts that the dialing process is done, so they don't get stuck.
ctx, dialDone := context.WithCancel(ctx)
defer dialDone()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I mean we did not do anything special with dialDone in the last review. Why not just simply using the ctx parameter directly in the code below?

Suggested change
// Indicates to attempts that the dialing process is done, so they don't get stuck.
ctx, dialDone := context.WithCancel(ctx)
defer dialDone()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no guarantee that the input context will be cancelled. In this code, I cancel the context to make sure all subtasks are notifed and can complete.
If I remove the dialDone, the subtasks may hang for a while.

// Channel to wait for before a new dial attempt. It starts
// with a closed channel that doesn't block because there's no
// wait initially.
var attemptDelayCh <-chan struct{} = newClosedChan()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SG, maybe var attemptDelayCh <-chan struct{} = make(chan struct{}) would work here?

}

func TestHappyEyeballsStreamDialer_DialStream(t *testing.T) {
t.Run("Works with IPv4 hosts", func(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the benefit of using t.Run instead of defining a test function func TestIPv4HostShouldWork() {}?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The subtests all get grouped nicely in the output and in the code, and I don't have to repeat the prefix all the time. Specially when I keep renaming things.

image

"github.com/stretchr/testify/require"
)

type colletcStreamDialer struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
type colletcStreamDialer struct {
type collectStreamDialer struct {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Contributor Author

@fortuna fortuna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree the flow is a bit complicated. I couldn't figure out a simpler way (it was a lot worse before). At least I have ~100% coverage for this code.

FYI, I generalized the API. Now you can customize the resolution phase to get v1 or v2 behavior. You can also inject hard-coded IPs, which will be quite useful.

I added documentation examples that are helpful illustrations.

image

Comment on lines +88 to +90
// Indicates to attempts that the dialing process is done, so they don't get stuck.
ctx, dialDone := context.WithCancel(ctx)
defer dialDone()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no guarantee that the input context will be cancelled. In this code, I cancel the context to make sure all subtasks are notifed and can complete.
If I remove the dialDone, the subtasks may hang for a while.

// Channel to wait for before a new dial attempt. It starts
// with a closed channel that doesn't block because there's no
// wait initially.
var attemptDelayCh <-chan struct{} = newClosedChan()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that too 😅
The problem is that you can't close a read-only channel.

"github.com/stretchr/testify/require"
)

type colletcStreamDialer struct {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

}

func TestHappyEyeballsStreamDialer_DialStream(t *testing.T) {
t.Run("Works with IPv4 hosts", func(t *testing.T) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The subtests all get grouped nicely in the output and in the code, and I don't have to repeat the prefix all the time. Specially when I keep renaming things.

image

@fortuna fortuna requested a review from jyyi1 February 10, 2024 00:03
Copy link
Contributor

@jyyi1 jyyi1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution!

@fortuna fortuna merged commit 3694a32 into main Feb 14, 2024
6 checks passed
@fortuna fortuna deleted the fortuna-heb branch February 20, 2024 14:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants