-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use troupe crate #202
base: development
Are you sure you want to change the base?
Use troupe crate #202
Conversation
Note that this would merge into |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good so far. What else needs to be done?
squire_core/src/state/accounts.rs
Outdated
use troupe::{ | ||
ActorBuilder, | ||
ActorState, | ||
Permanent, | ||
Scheduler, | ||
sink::{ | ||
SinkActor, | ||
SinkClient, | ||
permanent::Tracker, | ||
}, | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: You should be able to use a wildcard import for this.
use troupe::{ | |
ActorBuilder, | |
ActorState, | |
Permanent, | |
Scheduler, | |
sink::{ | |
SinkActor, | |
SinkClient, | |
permanent::Tracker, | |
}, | |
}; | |
use troupe::prelude::* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I would have thought that wildcard is less preferable because it could clobber names. Though I'm guessing that Rust would tell you if that happened (unlike Python). Do you recommend it generally or just if I'm importing a ton of stuff like this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opinions on wildcard imports vary. You are correct that Rust will warn you about this (even between different versions of the same crate). But, generally speaking, if a crate has a prelude
module, this module should be thought of as "tools you need to use this crate" and/or "things you are going to commonly import".
As a general rule, I usually only use wildcard imports on prelude modules or something similar.
@@ -16,3 +16,14 @@ impl Debug for SessionCommand { | |||
} | |||
} | |||
} | |||
|
|||
impl Debug for AccountCommand { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessary for this PR, but I wonder if we could use the derive_more
crates macros to generate this.
|
Could you clarify - if an actor message send fails, what exactly happened? Should we just retry until it succeeds? (Should that retry just be wrapped in a helper?) I took a crash course on Erlang one time long ago but I think I have more to learn about the actor model. |
state.handle_new_onlooker(id, user, ws).await; | ||
if !state.handle_new_onlooker(id, user, ws).await { | ||
panic!("TODO what if this fails") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the point it gets here, we've already assumed that AnyUser::convert
returned Ok(...)
, so it seems we're already ready to panic. Should we just panic here? Does the router handle panics nicely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait that's not right. It's not panicking in the case of not Ok(token)
, it's just returning. So I should probably silently fail like before, then, right?
@@ -85,7 +94,9 @@ where | |||
match msg { | |||
GatheringHallMessage::NewGathering(id) => self.process_new_gathering(id).await, | |||
GatheringHallMessage::NewConnection(id, user, ws) => { | |||
self.process_new_onlooker(id, user, ws).await | |||
if !self.process_new_onlooker(id, user, ws).await { | |||
panic!("TODO what if this fails") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that this is the handler for an event fired off by this function call:
If the above event firing fails and we never get here, might panic, and maybe the router handles it (as I mentioned in a comment there). But what if this message send fails? I'd think we'd want to propagate that failure back to the router, but I don't think we can anymore. But from what I remember from Erlang, the actor model is supposed to be tolerant of errors. So I'm not really sure what to do.
I guess before I put this in, you were effectively ignoring the errors anyway. Should I just go back to that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From another thread:
The only time a message can fail is when a permanent actor has panicked.
Okay that's something I hadn't internalized. I see that track
waits for a response, and will panic if the actor is permanent and panics (not sure about transient actors that panic) whereas send
fires and forgets, ignoring(?) actor panics.
With that in mind, it seems that this panic would be ignored?
self.gatherings
.send(GatheringHallMessage::NewConnection(id, user, ws))
As well as the panics below in GatheringHallMessage::Persist
since they're scheduled with a delay:
scheduler.schedule(
Instant::now() + Duration::from_secs(5),
GatheringHallMessage::Persist,
);
Are these actually okay to ignore?
let tourns = ActorClient::builder(TournPersister::new(tourn_db.clone())).launch(); | ||
let gatherings = ActorBuilder::new(GatheringHall::new(tourns.clone())).launch(); | ||
let tourns = ActorBuilder::new(TournPersister::new(tourn_db.clone())).launch(); | ||
let gatherings = ActorBuilder::new(GatheringHall::<TournPersister>::new(tourns.clone())).launch(); | ||
AppState { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So just to be sure of my reasoning here: it looks like sessions
, accounts
, and gatherings
are clients attached the app state, so those handlers should be Permanent
. I think that tourns
stays attached to gatherings
, which means that that should also be Permanent
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. All of those should be permanent. The only actors that should be transient are the Gathering
s, which are internally managed by the GatheringHall
actor.
squire_sdk/src/client/network.rs
Outdated
@@ -61,7 +66,11 @@ pub enum NetworkCommand { | |||
|
|||
#[async_trait] | |||
impl ActorState for NetworkState { | |||
type Permanence = Transient; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NetworkState
attaches to a SquireClient
which is created at the top level of main
. I guess it should be Permanent
.
(I initially thought it was a per-request sort of thing because I misunderstood what url
was)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, there is no reason for this to fail.
squire_sdk/src/client/tournaments.rs
Outdated
@@ -24,7 +32,7 @@ use crate::{ | |||
/// A container for the channels used to communicate with the tournament management task. | |||
#[derive(Debug, Clone)] | |||
pub struct TournsClient { | |||
client: ActorClient<ManagerState>, | |||
client: ActorClient<Transient, ManagementCommand>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, ActorClient
? How did this compile? Maybe the front end doesn't compile when you run cargo shuttle run
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The troupe
actor model is built with compiling to WASM in mind. Both the front and back ends used it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I was rather unclear - My point is that ActorClient
is the old name that you used in actor.rs
, which became SinkClient
etc. I'm confused as to how cargo shuttle run
started without complaining about this. My hypothesis was that it wasn't compiling the front end (which means I probably have plenty more errors to fix).
I think
But I'm realizing now that the type checker should correct me if I'm getting permanence wrong. I just need to be able to compile the client. |
Hmm, basically all actors we had were permanent except for the |
Sure. The only time a message can fail is when a permanent actor has panicked. Messages to and from an actor use in-memory channels, not something like interprocess communication, etc. Under the current model, this is an unrecoverable error from an outsider's perspective. We do not have something like a manager model yet. |
2f8734c
to
e73e454
Compare
4d7a7d3
to
dfd9e48
Compare
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## development #202 +/- ##
===============================================
- Coverage 30.25% 29.28% -0.98%
===============================================
Files 77 76 -1
Lines 4022 3961 -61
===============================================
- Hits 1217 1160 -57
+ Misses 2805 2801 -4 ☔ View full report in Codecov by Sentry. |
e393898
to
bf0eac7
Compare
dfd9e48
to
78f4b0c
Compare
@@ -63,6 +63,8 @@ headers = { version = "0.4", optional = true } | |||
# To be moved | |||
hashbag = { version = "0.1.11", features = ["serde"] } | |||
derive_more = "0.99.17" | |||
hyper = "1.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The history of this one is a bit weird. Within the history of development
, we see hyper = "0.14"
added and removed.
Should this line be here in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ya, the history is a bit weird, but this line is fine. This change is needed regardless.
@@ -239,7 +259,7 @@ impl ManagerState { | |||
let (sink, stream) = ws.split(); | |||
let (broad, sub) = watch_channel(()); | |||
entry.get_mut().comm = Some((sink, broad)); | |||
scheduler.add_stream(stream); | |||
scheduler.attach_stream(stream.fuse()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like .fuse()
showed up during the rebase. I tried removing it, and I get a type error in the front end code. I guess this is something I would have seen if I had compiled the front end before the rebase.
But the weird thing is that .fuse()
is also called within troupe
, in attach_stream_inner
. Is this correct, and is it necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really more of an issue with troupe
. I remember writing this function signature and was torn on this question. fuse
adds (basically) no overhead, so it isn't a problem. You need to give attached_stream
a fused stream because I want the caller to acknowledge that the stream is capable of terminating.
Internally, the second call to fuse
can be gotten rid of. If you would like, you can open a PR on the troupe repo to address this.
@@ -64,18 +65,23 @@ pub enum GatheringHallMessage { | |||
/// through message passing and tokio tasks. | |||
#[derive(Debug)] | |||
pub struct GatheringHall<P: ActorState<Message = PersistMessage>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at this again and thinking about it, it seems like we can just remove this type restriction. (That it requires a phantom should have been a sign). I'm going to push a change that removes it, lmk what you think.
You added it here:
self.process_new_onlooker(id, user, ws).await | ||
if !self.process_new_onlooker(id, user, ws).await { | ||
panic!("process_new_onlooker failed") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So now I have to decide on these panics I added. I think the right call is context specific. But I will say that if ignore the error instead, it won't be a regression (and as such maybe we can handle it in a separate PR). The question is about whether to handle a failure to send now that we have access to it (bool
return value of send()
), and whether panic
is the right option.
process_new_onlooker
gets or inits a gathering. Supposing we get one and it fails to send. Rather than panic, it seems like maybe what needs to happen is that these gatherings should be removed from the collection? Since they've panicked or shut down?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I will say that if ignore the error instead, it won't be a regression
This is correct. This function returnsfalse
when the user attempts to connect to a tournament that has already shutdown for whatever reason. This is not an unrecoverable error for theGatheringHall
. A panic here means, amount other things, that all tournaments will fail to get new connections.
I think the best solution for now is to just ignore this error. I'm working to refactor most of the websocket management logic so that a user only ever has one WS open at a time. That will address most of the open questions around this problem.
sender.send(msg); | ||
if !sender.send(msg) { | ||
panic!("GatheringMessage::GetTournament failed") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is similar, maybe a gathering panicked or shut down. But this is a failure to write to disk. That's pretty bad, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to what I was saying before. A panic for Permanent
actors should be reserved for errors that are completely unrecoverable. If we failed to get a copy of the tournament, that's fine. The tournament just shutdown. That should not stop the entire tournament management system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This all looks mostly good. There are two panics in the GatheringHall
that should be removed. After that, I'll happily merge this.
There is a conflict. I will resolve that error before merging. |
Should I squash-rebase, or do you use the squash Github feature? I have a lot of junk commit messages. |
* Don't panic so frivolously * Remove type restriction on GatheringHall (we could have done this before the troupe crate IMO) * Drop a couple drops that we no longer need thanks to troupe doing it for us * Undo a weird commit that snuck in somehow via rebase * Derive/Implement `Debug` for a few things that we now need
de8a79d
to
b0b55f4
Compare
It compiles at least. Seems like the core stuff were handlers for stores that should be permanent. The stuff in clients were created with different ids and such so they should be transient. Though looking again, I'm unsure about whether the ManagerState should have been permanent.
send
now seems to return a boolean where it didn't before. Assuming the value implies success, I figured we may not want to ignore it, but I assume the panics I put in place are not the right answer.