-
Notifications
You must be signed in to change notification settings - Fork 162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
iroh start & stop commands #360
Conversation
iroh-util/src/lock.rs
Outdated
/// Attempt to remove a stray lock file that wasn't cleaned up, returns true | ||
/// if a lock is successfully deleted, and will only attempt to delete if the | ||
/// lock is not currently held | ||
pub fn try_cleanup_dead_lock(prog_name: &str) -> Result<bool> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs a test, and I need to review it when I'm not braindead, please scrutinize.
Sure, that works for me. I'm not a fan of the binary pid format, but not a hill I want to die on either. Exiting with a non-zero status totally makes sense. |
f20fc57
to
046d430
Compare
@fabricedesre, sounds good. I've changed the format to be a serialized UFT-8 string representation. |
@faassen I'd appreciate you smoke testing this on linux, iirc you have a fedora distro lying around, right? I'll also be testing on ubuntu |
@@ -0,0 +1,23 @@ | |||
//! iroh exit codes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could also move this into an errors package. I've just crammed it here for now, and don't think it's a big deal to move later once we start getting more rigorous about error categorization. Happy to move it now if someone wants to name the crate this belongs in
Yes, Fedora is my main platform so will do some testing tomorrow. In fact this command will be handy when I do manual testing with the CLI. |
note that it requires iroh binaries on $PATH. |
Finding two bugs:
The second problem is a way bigger conversation: do we want folks to be able to run more than one setup on their machine, if so how do we want them to do it, where would non-default locks live, etc |
localops/Cargo.toml
Outdated
@@ -0,0 +1,15 @@ | |||
[package] | |||
name = "localops" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is not private, should be iroh-localops
for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we for now just put this functionality into iroh-util
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd heavily prefer iroh-localops
, even just for the amount of conditional compilation that needs to go into this crate. Most of this work is about vetting dependencies for cross-platform compatibility
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how does that change when putting it in utils? we already have a lot of crates :/
localops/src/process.rs
Outdated
} | ||
|
||
#[cfg(any(target_os = "macos", target_os = "linux"))] | ||
fn stop_process(pid: u32) -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please make all apis that use u32
s currently, use a new type Pid
, that looks sth like this
#[cfg(unix)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Pid(nix::Pid);
#[cfg(not(unix))]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Pid; // TODO: fill in for each plattform when supported
#[cfg(unix)]
impl From nix::Pid for Pid {
// ..
}
impl std::fmt::Display for Pid {
// ..
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've marked this as a TODO for later
iroh-util/src/exitcodes.rs
Outdated
//! ``` | ||
|
||
/// Alias for the numeric type that holds system exit codes. | ||
pub type ExitCode = i32; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please use existing rust types instead of defining new ones :) https://doc.rust-lang.org/std/process/struct.ExitCode.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From that page:
ExitCode is intended to be consumed only by the standard library (via Termination::report()), and intentionally does not provide accessors like PartialEq, Eq, or Hash. Instead the standard library provides the canonical SUCCESS and FAILURE exit codes as well as From for ExitCode for constructing other arbitrary exit codes.
the definition is irritating:
pub struct ExitCode(imp::ExitCode);
this means you can't define a nonstandard exit code as a constant. I'd argue that std
authors are pointing us in the right direction: these are iroh exit codes, and should have their own type.
I'm going to rename the type to IrohExitCode
to further disambiguate, and await your clarification on why this isn't the right move 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mac bug:
iroh start
says "starting" and "success", but doesn't seem to create any lock files or launch any processes.
the individual binaries create and remove their lock files appropriately
if I start the individual binaries up, I can use iroh stop
to shut them down
iroh stop
also seems to have 4 output lines, but should only be reporting about 3 processes.
iroh-util/src/lock.rs
Outdated
/// Failure to parse contents of lock file | ||
#[error("Corrupt lock file contents at {0}")] | ||
CorruptLock(PathBuf), | ||
#[error("Cannot detrmine status of process holding this lock")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
detrmine -> determine
Err(anyhow!("daemonizing processes on windows is not supported")) | ||
} | ||
|
||
pub fn stop(pid: i32) -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for later: you don't really need this indirection, you can just cfg
the public method
iroh-rpc-client/src/status.rs
Outdated
|
||
self.iter += 1; | ||
if current == None { | ||
self.iter = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it on purpose that this iterator is infinite?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually it doesn't seem to be, I am confused about this logic here. what are you trying to do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean yes, but now I realize that that doesn't make sense b/c why would you re-use iterators in Rust. I'll remove this reset entirely
iroh-util/src/lock.rs
Outdated
/// is already running. | ||
/// The lock is released either when the object is dropped | ||
/// or when the program stops. | ||
pub fn acquire_or_exit(lock: &mut ProgramLock) -> &mut ProgramLock { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be a method on ProgramLock
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am confused, why does this exist here and down below again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tried a few variations, and couldn't get the runtime to hang onto the reference at the call site:
what I wanted:
fn main() {
let lock = ProgramLock::new("iroh-gateway")?.acquire_or_exit();
which caused the compiler to yell about an unused variable. changing lock
-> _lock
means the lock is Drop
d right away, an makes the lock not work at all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤦 forgot to remove this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lock -> _lock means the lock is Dropd right away,
That is not true, this is how drop guards work, they are kept for the current scope. It is only dropped immediately if it is _
.
.unwrap_or_else(|| std::ffi::OsStr::new("")) | ||
.to_str() | ||
.unwrap() | ||
.split('.') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need to split on .
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.lock
extension in the file name. There's a forthcoming method in rust nightly that'll make this much cleaner
if let Ok(file) = File::create(&self.path) { | ||
file_guard::try_lock(&file, Lock::Exclusive, 0, 1).is_err() | ||
// path exists, examine lock PID | ||
let pid = read_lock(&self.path)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks like a lot of duplication from is_locked
#[test] | ||
fn test_locks() { | ||
use nix::unistd::{fork, ForkResult::*}; | ||
use std::io::{Read, Write}; | ||
use std::time::Duration; | ||
|
||
// Start we no lock file. | ||
// Start with no lock file. | ||
let _ = std::fs::remove_file("test1.lock"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here as well, please use a temp directory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is more complicated than it seems. I'll make an issue & address in a follow up
/// running. | ||
pub async fn start(api: &impl Api) -> Result<()> { | ||
// start_services(api, HashSet::from(["store"])).await | ||
start_services(api, HashSet::from(["store", "p2p", "gateway"])).await |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these should really not be strings but an enum
}); | ||
|
||
// construct a new set from the intersection of missing & expected services | ||
let missing_services: HashSet<&str> = services |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HashSet::difference
is your friend
|
||
/// stop all services by sending SIGINT to any active daemons identified | ||
/// by lockfiles | ||
pub async fn stop(api: &impl Api) -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't this take a configuration of services as well?
b4f5f46
to
0ca8ca2
Compare
25ab856
to
1f28497
Compare
e622346
to
35208ca
Compare
supersedes #340. closes n0-computer/beetle#166. closes n0-computer/beetle#158, closes n0-computer/beetle#211
re-re-re worked start & stop commands that effectively do this on start:
all daemons now use lockfiles, and all lockfiles contain the PID of the owning process.
iroh stop
uses this PID to sendSIGINT
to each process.iroh stop
also makes a best effort to remove stray lockfiles.TODO:
cc @fabricedesre, this modifies lockfiles, writing the PID to the lockfile itself. I'd like to confirm you're ok with these changes. I'd also like to propose we exit with a non-zero status when a new process can't acquire the lock.