Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failed to connect redis with tls #13

Closed
punyapatsw opened this issue Jan 3, 2023 · 14 comments · Fixed by #17
Closed

Failed to connect redis with tls #13

punyapatsw opened this issue Jan 3, 2023 · 14 comments · Fixed by #17
Labels
bug Something isn't working evaluation needed

Comments

@punyapatsw
Copy link

When connecting to redis via tls and setting "insecureSkipTLSVerify: true," the error "ERRO[0001] Uncaught (in promise) EOF executor=shared-iterations scenario=redisPerformance" will occur.
To get around this problem, I removed "c.redisOptions.Dialer = vuState.Dialer.DialContext" from func connect() in "redis/client.go."

func (c *Client) connect() error {
	// A nil VU state indicates we are in the init context.
	// As a general convention, k6 should not perform IO in the
	// init context. Thus, the Connect method will error if
	// called in the init context.
	vuState := c.vu.State()
	if vuState == nil {
		return common.NewInitContextError("connecting to a redis server in the init context is not supported")
	}

	// If the redisClient is already instantiated, it is safe
	// to assume that the connection is already established.
	if c.redisClient != nil {
		return nil
	}

	// If k6 has a TLSConfig set in its state, use
	// it has redis' client TLSConfig too.
	if vuState.TLSConfig != nil {
		c.redisOptions.TLSConfig = vuState.TLSConfig
	}

	// use k6's lib.DialerContexter function has redis'
	// client Dialer
	// c.redisOptions.Dialer = vuState.Dialer.DialContext

	// Replace the internal redis client instance with a new
	// one using our custom options.
	c.redisClient = redis.NewUniversalClient(c.redisOptions)

	return nil
}
@imiric
Copy link
Contributor

imiric commented Jan 26, 2023

Hi there, thanks for reporting this issue. And sorry for the slow response.

insecureSkipTLSVerify is a k6-only option and doesn't apply for the Redis connection made by this extension. There's no reason that it couldn't, so I think we should make that work as expected. EDIT: Actually, nevermind, this should work already, since it's set in the vuState.TLSConfig.

What you accomplished with your workaround is to disable the k6-specific dialer, but go-redis would fallback to using a default dialer instead, so you're likely only masking the EOF error, but you would still be unable to connect.

Since you want to use insecureSkipTLSVerify I'm assuming you're using self-signed certificates. The solution in this case is to use k6's tlsAuth option to load your custom certificate and key. This should work, since as you see in the Client.connect() method there, we're setting TLSConfig as specified in k6.1

I tried to make this work, by creating the certs with Redis' own test script, starting the server with them, creating the .pem files from .crts, and loading them in the script with tlsAuth, but... I still got the same EOF error. 😞

I'm sure it's not a server issue, as I can connect with redis-cli fine by specifying the certs. I wasn't able to see anything beyond that EOF error, even digging into go-redis methods. The server just logs:

Error accepting a client connection: error:1408F10B:SSL routines:ssl3_get_record:wrong version number (conn: fd=9)

I'm not 100% sure if I created the .pem file correctly, so I need to confirm that.1

But, in theory, this should work the same way as it does for HTTP and WebSocket connections in k6, and you wouldn't need to use insecureSkipTLSVerify.

In any case, this needs further investigation beyond what I can dedicate time to right now, so let's look into it later. If someone else has time to dig into this, please do so. 🙏

cc: @oleiade

EDIT: BTW, I'm not sure if it's relevant, but the suggested way to load TLS certs in redis/go-redis#1306 (comment) is via tls.LoadX509KeyPair("fullchain.pem", "privkey.pem"), which k6 currently doesn't call. I'm not sure if this is a strict requirement, or if the current way k6 loads them would be enough, but might be something to look into as well.

Footnotes

  1. A very late update, and correction: I was confusing client certs, which is what our tlsAuth option is for, with CA certs, which you would need in this case to use a custom server certificate. This is why the tlsAuth option didn't work for me. k6 does not support CA certs yet, so this will need a more cohesive implementation to fix, as Théo indicated below. Sorry for the confusion. 2

@imiric imiric added bug Something isn't working evaluation needed labels Jan 26, 2023
@punyapatsw
Copy link
Author

Hi There,

My script is able to connect to Redis without any issues by using a workaround.
This is GCP Memstore Redis, which only provides a CA cert. That's why I tried to use insecureSkipTLSVerify.
Normally, I can connect to the server with redis-cli by using the --cacert flag.
redis-cli -h localhost -p 6378 -a "auth-key" --tls --cacert ./ca-cert.pem

Note: The script is executed locally, using gcloud to tunnel to GCP Memstore.

@oleiade
Copy link
Member

oleiade commented Jan 27, 2023

@punyapatsw my understanding is that your initial issue is solved as result. Is that correct?

If that's the case I would then close it in favor #14 , which intends to improve the documentation regarding using TLS connections with redis 🙇🏻

@punyapatsw
Copy link
Author

Yes, my issue was solved by using the default dialer instead, but I am not sure about the consequences.
Also, using the default dialer required editing and recompiling the library. Will documenting the process be sufficient?
It would be helpful if there were documentation about using a CA pool, since in my case I couldn't use k6's tlsAuth option.

@oleiade
Copy link
Member

oleiade commented Jan 27, 2023

I see @punyapatsw thanks a lot for coming back to us. We will keep the issue open, and I intend to work on this in the upcoming weeks to see if I streamline support for TLS in the module somehow 👍🏻

@oleiade
Copy link
Member

oleiade commented Feb 10, 2023

Hey @punyapatsw 👋🏻

We took time with the k6 team to investigate what might be causing the issue. I can confirm k6 causes it 🙇🏻

The reason this happens is roughly the following:

  • the redis extension tries to be a good citizen of k6 and reuses its Go Dialer
  • the k6 current Dialer implementation doesn't support TLS the same way the Redis library we use under the hood does.
  • the way k6 currently handles TLS works is bound to HTTP and was not designed to work with direct TCP/UDP connections like the redis extension intends.

We have discussed and established a plan forward:

  1. We decided to wait to add a specific behavior in k6 for this use case, the changes involved are too risky, and we prefer to wait for some refactoring before even attempting this move.
  2. We decided that the Redis extension should wrap the k6 Dialer, and add the necessary steps to establish a TLS handshake itself in the meantime (the idea there is to mimic the code from the Go standard library tls package right after calling the DialContext method from k6).
  3. We want to use this opportunity to add the ability to pass certificates directly to the redis Client. That way, it would be possible to emulate multiple clients establishing unrelated TLS connections in different VUs.

Solving this issue is not so trivial, and my gut feeling is that we can expect to see it addressed either in v0.44 or v0.45.

@vukor
Copy link

vukor commented Jun 8, 2023

Hey guys, FYI, the same issue is reproduced with AWS ElastiCache via TLS protocol. So I've built a forked k6 version according to @punyapatsw's comment while the issue is not resolved:

  • Patch k6:
git clone https://github.com/grafana/xk6-redis.git
cd xk6-redis
# Comment 'c.redisOptions.Dialer = vuState.Dialer.DialContext' line
diff --git a/redis/client.go b/redis/client.go
index 43b8511..90eb5c2 100644
--- a/redis/client.go
+++ b/redis/client.go
@@ -1084,7 +1084,7 @@ func (c *Client) connect() error {

 	// use k6's lib.DialerContexter function has redis'
 	// client Dialer
-	c.redisOptions.Dialer = vuState.Dialer.DialContext
+	//c.redisOptions.Dialer = vuState.Dialer.DialContext
  • Build k6 tool for Mac
docker run --rm -e GOOS=darwin -u "$(id -u):$(id -g)" -v "${PWD}:/xk6" \
  grafana/xk6 build v0.44.1 \
  --with github.com/grafana/xk6-redis@latest=.
  • Build k6 tool for Linux
docker run --rm -u "$(id -u):$(id -g)" -v "${PWD}:/xk6" \
  grafana/xk6 build v0.44.1 \
  --with github.com/grafana/xk6-redis@latest=.

@oleiade
Copy link
Member

oleiade commented Jun 26, 2023

Hey @vukor 👋🏻

Thanks for the heads-up 🙇🏻 We haven't addressed the issues generally (as opposed to specifically for this use case) yet, but we have planned to give it a shot tentatively in the upcoming release of k6. No promises, though 🤞🏻

@imiric
Copy link
Contributor

imiric commented Jun 29, 2023

Hey folks, I took a closer look into this issue, and here are some notes we could discuss.

The main issue is that the k6 Dialer (vuState.Dialer.DialContext) always returns a net.Conn (an unencrypted TCP connection), and with TLS servers we should return a tls.Conn instead. This is the cause of the cryptic EOF error. In this case the Redis server logs:

Error accepting a client connection: error:1408F10B:SSL routines:ssl3_get_record:wrong version number (conn: fd=9)

... because it's expecting the TLS handshake, and k6 is not doing it, so the connection is closed abruptly.

I confirmed that if I establish a tls.Conn and set insecureSkipTLSVerify: true, it works fine. This is essentially what happens by commenting out the c.redisOptions.Dialer = vuState.Dialer.DialContext line, per the above workaround, since go-redis will default to a TLS connection. The problem with this is that it's not possible to also connect to unencrypted TCP-only servers. So the trick is knowing when to establish a TCP or a TLS connection...

The reason this works for HTTP in k6 is that http.Transport checks the URL scheme, and if it's https it does the TLS handshake after establishing the TCP connection. So we essentially need to do the same in xk6-redis.

We need for the script to specify whether it's connecting to a TCP or a TLS server, as we can't easily make this guess ourselves. Technically, we could establish a TCP connection, try to do the TLS handshake, and if it fails, fallback to TCP, but this requires timing out, which adds a connection delay and possible false negatives if the timeout is too short, etc., so it would be super hacky.

Instead, we should allow users to specify this in the Client constructor. There is a Redis TLS scheme for this (rediss://), which go-redis supports, albeit not transparently, but with this helper function that translates Redis URLs to Options which we can use to initialize the Client.

See how the Node Redis lib does this: https://github.com/redis/node-redis/blob/master/docs/client-configuration.md
They allow passing a URL, or a socket object with more configuration. Since we want to also support custom TLS certs, we should make this configurable anyway. Configuring the client via URL query strings would also be more flexible, for users that prefer it.

My only doubt is how we should handle connecting to a Redis cluster if the URLs have conflicting options. E.g. they use different database names.
I don't see this mentioned in Node Redis, but they do have a separate API for connecting to a cluster, so we should look into that.

So I think the way forward is:

  1. Refactor our Client constructor to be similar to the Node Redis API. This is a breaking change, but we're allowed to do those as an experimental k6 module. 😌
    This will ultimately result in a more familiar API for JS developers. It doesn't have to be a 1:1 replica (e.g. I don't think we should have a separate constructor for clusters), but at least the way options are specified per Redis node should be consistent.

  2. Do the conditional TCP/TLS connection change, which will effectively fix this issue.

  3. Add support for custom TLS certs, which we can tackle in a separate issue.

The first one is probably the biggest change. The second one is trivial, and the third one is somewhere in the middle.

The good news is that this doesn't require changes in k6, AFAICT. We can reuse k6's Dialer, so we can track metrics like received and sent data, reuse blocklists and DNS resolver settings, but the way we handle TLS will be slightly different.

@oleiade WDYT? I can draft a tentative API proposal right here, if you agree.

@oleiade
Copy link
Member

oleiade commented Jun 29, 2023

Thanks a lot for the detailed analysis indeed, much appreciated 🙇🏻

I think you're spot on, and your plan makes a ton of sense 🤝 The Node API looks good and user-friendly indeed, and I agree we should try to reproduce it on our side. I also agree that we should tackle 1., and 2. first and custom TLS certs in a separate issue to have a better separation of concern, which will probably make our lives easier.

Please go ahead with a draft proposal @imiric if you find the time and feel like it, and let me know if I can support you in any way 👍🏻 🎉

@imiric
Copy link
Contributor

imiric commented Jun 30, 2023

Alright, here's a draft of the API I have in mind:

// Simplest form using a URL string
new redis.Client('redis://user:pass@localhost:6379/0');

// Same as above, but passing an object
new redis.Client({
  socket: {
    host: 'localhost',
    port: 6379,
  },
  username: 'user',
  password: 'pass',
  database: 0,
});

// TLS with a URL
new redis.Client('rediss://user:pass@localhost:6379/0');

// TLS with an object
new redis.Client({
  socket: {
    host: 'localhost',
    port: 6379,
    tls: true,  // or tls: {}
  },
  username: 'user',
  password: 'pass',
  database: 0,
});

// TLS with a custom CA cert
new redis.Client({
  socket: {
    host: 'localhost',
    port: 6379,
    tls: {
      ca: open('ca.crt'),
    },
  },
});

// TLS client auth (mTLS)
new redis.Client({
  socket: {
    host: 'localhost',
    port: 6379,
    tls: {
      ca: open('ca.crt'),
      cert: open('client.crt'),
      key: open('client.key'),
    },
  },
});

// Cluster client, with just URLs
new redis.Client(
  cluster: {
    nodes: [
      'redis://user:pass@host1:6379/0',
      'redis://user:pass@host2:6379/0',
    ],
  },
);

// Cluster client with objects
new redis.Client({
  cluster: {
    nodes: [
      {
        socket: {
          host: 'host1',
          port: 6379,
        },
        username: 'user',
        password: 'pass',
        database: 0,
      },
      {
        socket: {
          host: 'host2',
          port: 6379,
        },
        username: 'user',
        password: 'pass',
        database: 0,
      },
    ],
  },
});

WDYT?

Obviously, not all properties under socket and tls would be supported as they are in Node (Redis), but we can expand this later.

I'm not certain on the repurposing of the Client constructor for clusters, but at the same time I don't see the need to have a separate constructor for it, since the Client will ultimately expose the same interface in either case. The only check needed to initialize this properly is whether the cluster property was specified or not.

If I find this to be cumbersome for whatever reason, I'll create a separate redis.Cluster constructor/object instead, which would be more aligned with Node Redis.

Similarly for allowing plain strings to be passed that will be interpreted as URLs. It's more convenient than having to pass {url: 'redis://...'}, and there's no ambiguity if the url will override the object values, or viceversa. Only a type check is needed. I don't think we'll have any issues with this.

So I've started working on this and hope to create a PR for at least the initial refactor sometime next week.

@oleiade
Copy link
Member

oleiade commented Jul 5, 2023

Great work @imiric I really love it! 👍🏻 👏🏻 🎉 Clean, comprehensible, and user-friendly, let's do it 🦖

Regarding clustering, I checked out the node-redis API, and what exists on the go-redis counterpart, and my preference goes to having dedicated constructor of some sort in JS, just like node-redis does.

I believe that would involve a bit more work to rewire stuff under the hood, but considering this is a specific use case users have been asking about, and we haven't fully supported yet, the hassle of making an explicit API for it feels worth it?

WDYT? 🙇🏻

@imiric
Copy link
Contributor

imiric commented Jul 5, 2023

@oleiade I'm not sure. I don't see a reason to have a separate constructor for what will essentially be a redis.UniversalClient underneath, which is a common interface for single-node, failover and cluster clients. go-redis already abstracts this for us, and while I considered splitting it up internally, I ultimately think we can keep using it and UniversalOptions.

The only benefit of using a separate constructor is to remove one level from the configuration object (i.e. the cluster property in my examples above), and to have a slightly simpler constructor, since we wouldn't be checking for that property. But these are very minor benefits, and as a user I would prefer to have a single import of Client and be able to use it for all 3 purposes, since the final API of interacting with Redis is the same. So the Node Redis API is slightly annoying in that sense, and I'd prefer to deviate from it in this case. Of course, given that I don't run into any complications. 🤞

Do you see another benefit I'm ignoring?

I believe that would involve a bit more work to rewire stuff under the hood, but considering this is a specific use case users have been asking about, and we haven't fully supported yet, the hassle of making an explicit API for it feels worth it?

But using Client to connect to a cluster should be supported right now, no? Users can pass multiple addrs, which go-redis would interpret as a ClusterClient. I haven't tested this personally, and we don't have tests for it, but it should work:tm:.

@oleiade
Copy link
Member

oleiade commented Jul 6, 2023

I don't see a reason to have a separate constructor for what will essentially be a redis.UniversalClient underneath

That's a fair point indeed 🤝 I guess the aspect that brought this to my mind was that in the proposed API, we do make a distinction between single node, and cluster, which the underlying UniversalClient does not. But I like your solution, it involves a bit more validation logic indeed, but composes nicely, and feels intuitive enough 👍🏻

But using Client to connect to a cluster should be supported right now, no? Users can pass multiple addrs, which go-redis would interpret as a ClusterClient. I haven't tested this personally, and we don't have tests for it, but it should work™️.

Cluster should be supported already, yes. We've had, however, a couple of users reporting issues around it (#15 and grafana/k6#3031).

After sleeping on it, I'm happy with your proposal 👏🏻 Let's dismiss the idea of a dedicated Cluster constructor indeed 👍🏻

imiric pushed a commit that referenced this issue Aug 10, 2023
To allow passing objects similar to node-redis. See
#13 (comment)
@imiric imiric linked a pull request Aug 10, 2023 that will close this issue
5 tasks
oleiade pushed a commit that referenced this issue Oct 12, 2023
To allow passing objects similar to node-redis. See
#13 (comment)
oleiade pushed a commit that referenced this issue Oct 12, 2023
To allow passing objects similar to node-redis. See
#13 (comment)
oleiade pushed a commit that referenced this issue Oct 17, 2023
To allow passing objects similar to node-redis. See
#13 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working evaluation needed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants