Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redis pubsub support #645

Merged
merged 2 commits into from
Aug 5, 2022
Merged

Redis pubsub support #645

merged 2 commits into from
Aug 5, 2022

Conversation

rukai
Copy link
Member

@rukai rukai commented May 17, 2022

Closes #491 (We now ensure we read a response for every request we send out, which fixes the hang)
Implements the transform_subscribe idea described in #541

This only implements support for RedisSinkSingle.
I will investigate RedisSinkCluster later on, not sure what will be involved for that one.

@rukai rukai force-pushed the redis_pubsub branch 13 times, most recently from 3858406 to 4a53481 Compare July 13, 2022 06:40
@rukai rukai force-pushed the redis_pubsub branch 4 times, most recently from 9fb49b7 to 42ae8e3 Compare July 13, 2022 07:12
@rukai rukai changed the title Redis pubsub prototype Redis pubsub support Jul 14, 2022
@rukai
Copy link
Member Author

rukai commented Jul 14, 2022

I just realized that this PR doesnt handle the case explained here: #537 (comment)
So subscribe a1 works fine but subscribe a1 a2 does not.

subscribe a1 a2 returns:

1) "subscribe"
2) "a1"
3) (integer) 1

When it should return:

1) "subscribe"
2) "a1"
3) (integer) 1
1) "subscribe"
2) "a2"
3) (integer) 2

I think we can still reasonably land the PR as is and fix this is in a follow up.

@rukai
Copy link
Member Author

rukai commented Jul 14, 2022

Oh another case we arent handling yet is:
subscribe a1
subscribe a2
unsubscribe
Which will return a unique unsubscribe response for both a1 and a2, but we currently expect there to be only a single unsubscribe response.
I added a commented out test case for it and I'll see if its reasonable to resolve in this PR, but it could also be delayed to afterwards.

@rukai
Copy link
Member Author

rukai commented Jul 14, 2022

Possible fix 1

A fix for both of these problems is to send "message", "subscribe" and "unsubscribe" responses down the pushed messages chain not just "message".
To do this while following the transform invariants we will need to send a Frame::None down the regular chain and alter the shotover logic a little to drop Frame::Nones instead of panicking.
That seems reasonable and possible.

However!
This still leaves a problem where:

batch1:
    subscribe foo
batch2:
    unsubscribe foo
    get a

may be processed as:

enter pubsub mode
response to `get a`
leave pubsub mode

This is because the pubsub/nonpubsub responses to batch2 are nonordered due to going down separate chains

Oh.
I wonder if this always occurred anyway due to the "message" responses arriving after an "unsubscribe" 0

batch1:
    subscribe foo
batch2:
    unsubscribe foo

may be processed as:

enter pubsub mode
leave pubsub mode
"message" blah

I guess this doesnt apply to fix 1 but does apply to fix 2?

If we remove any "message"'s that occur in the same batch as a corresponding "unsubscribe" will that avoid any instances of this issue?
I think it might not as we could still have a corresponding "message" from a previous batch occur after this if we are backlogged.
Maybe we need some kind of ordering field to batches?

Possible fix 2

Alter the redis Frame so that we can store all the "subscribe" and "unsubscribe" responses batched into a single response.
This is pretty invasive to the frame structure so I would much rather not do this, but its possibly the only way.

@rukai
Copy link
Member Author

rukai commented Jul 15, 2022

Alright this is my plan to fix the discovered issues:

Fix response/request sync

group subscribe and unsubscribe responses into a single message. Either:

  • as an extra variant in RedisFrame
  • a Message level abstraction shared with cassandra

Fix "message" appearing after leaving pubsub mode

Add a field to Wrapper called pushed_messages_state which contains an enum:

enum PushedMessagesState {
    SetDrop,
    SetProcess,
    MaintainPreviousState,
}

server.rs then uses this field to maintain a drop_pushed_messages: bool that determines whether to ignore pushed messages.

Remaining issue

There is still the issue of "message"s arriving while in pubsub mode but after they were specifically unsubscribed.
But:

  • I think clients should handle that a lot better than receiving messages while outside of pubsub mode
  • short of having server.rs keep track of what subscriptions are currently active there is no way to solve this

Conclusion

These changes will be quite invasive so I propose we continue with the review, land this PR and then attempt these changes in follow up PRs.

@benbromhead
Copy link
Member

Shouldn't all client requests still just traverse the normal request chain? Which would stop the out of order problem? My read was that these commands would head down the normal chain

E.g.

batch1:
    subscribe foo
batch2:
    unsubscribe foo
    get a

I thought the unsubscribe command would head down the same chain as the subscribe. It's only the "subscription" messages (not commands) that go down the other chain (or at least in my mind we should do it that way, if we don't)?

Give that those are all client requests, the order they are issued will be maintained. There may be some subscribbed messages that interleave that. But get a should always be after the subscribe and then unsubscribe as we should push those client requests all across the same chain.

@rukai
Copy link
Member Author

rukai commented Jul 15, 2022

A quick refresher on terms:
When we are in pubsub mode we may receive the following responses (the name comes from the identifying string at index 0 of the response array):

  • "subscribe" a response to a SUBSCRIBE request
  • "unsubscribe" a response to an UNSUBSCRIBE request
  • "message" not a response to a request but a subscription message that occurs when another client publishes to a subscribed channel.

I thought the unsubscribe command would head down the same chain as the subscribe. It's only the "subscription" messages (not commands) that go down the other chain (or at least in my mind we should do it that way, if we don't)?

Yep, that is how I have currently implemented it and keeping it like that seems to be the best solution.

Give that those are all client requests, the order they are issued will be maintained. There may be some subscribbed messages that interleave that. But get a should always be after the subscribe and then unsubscribe as we should push those client requests all across the same chain.

The problem is not the ordering of responses but the ordering of "message"s relative to the responses.
e.g.
redis will return:

"subscribe" response
"message" response (not in response to any request)
"unsubscribe" response (at this point we leave subscription mode)

and shotover may reorder this to:

"subscribe" response
"unsubscribe" response (at this point we leave subscription mode)
"message" response (not in response to any request)

Which is very incorrect as the "message" will be interpreted as the response to the next request as the "message" was received after we left subscription mode.

Oh.
A possible solution I didnt consider is to just say that shotover doesnt support unsubscribing and handle UNSUBSCRIBE commands by just immediately closing the connection.
Not sure how important unsubscribing is to the usual redis pubsub use case.

@conorbros conorbros enabled auto-merge (squash) August 5, 2022 01:43
@conorbros conorbros merged commit 17060be into shotover:main Aug 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix hang in redis_int_tests::basic_driver_tests::test_source_tls_and_single_tls
3 participants