Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

resume from suspend requires manual refresh #287

Closed
smemsh opened this issue Mar 5, 2024 · 19 comments
Closed

resume from suspend requires manual refresh #287

smemsh opened this issue Mar 5, 2024 · 19 comments

Comments

@smemsh
Copy link
Contributor

smemsh commented Mar 5, 2024

ctrl-z suspend and then re-foreground, requires a ctrl-l to see screen contents on resume. The following is surely the wrong fix but it works:

--- a/src/osdep/signals.c
+++ b/src/osdep/signals.c
@@ -123,7 +123,10 @@ poll_fg(void *t_)
 static void
 sig_cont(struct terminal *term)
 {
-       if (!unblock_itrm()) resize_terminal();
+       if (!unblock_itrm()) {
+               resize_terminal();
+               redraw_all_terminals();
+       }
 }
 #endif

Fix only works without libev/libevent. ctrl-l still required with libevent.
Was not able to test with libev due to compile error:

Found ninja-1.10.1 at /usr/bin/ninja
[35/288] Compiling C object src/elinks.p/cookies_cookies.c.o
FAILED: src/elinks.p/cookies_cookies.c.o
cc [...]  -o src/elinks.p/cookies_cookies.c.o -c ../src/cookies/cookies.c
In file included from ../src/cookies/cookies.c:31:
../src/main/select.h:60:25: error: field ‘timer_event’ has incomplete type
   60 |         struct ev_timer timer_event;
      |                         ^~~~~~~~~~~
[40/288] Compiling C object src/elinks.p/cookies_dialogs.c.o
ninja: build stopped: subcommand failed.

My system has libev-4.33.

Note also, ctrl-z behavior with libevent is curious, it goes back to the invoking shell, but leaves elinks running in the background rather than stopped. I suppose that is a separate bug though.

@rkd77
Copy link
Owner

rkd77 commented Mar 6, 2024

Try this ^ change. If ok, I'll look at the libev case.

@smemsh
Copy link
Contributor Author

smemsh commented Mar 7, 2024

Regarding b71faa2, I still get the same compiler error with that applied. Note that the header for libev is ev.h, while libevent uses event.h. Also note, my system has both installed and both libraries present:

 $ ls -1 /usr/include/**/{ev,event}.h
/usr/include/event2/event.h
/usr/include/event.h
/usr/include/ev.h

 $ ls -1 /usr/lib/**/lib{ev,event}.a
/usr/lib/x86_64-linux-gnu/libev.a
/usr/lib/x86_64-linux-gnu/libevent.a

I have also verified that my system version 4.33 remains the latest available (apparently from http://dist.schmorp.de/libev/)

I am compiling using meson configure -D libevent=false -D libev=true (among other flags).

@smemsh
Copy link
Contributor Author

smemsh commented Mar 7, 2024

Regarding 0acfd90, the patch fixes both libevent and noevent, with the following caveats:

  • There is a delay before the refresh once resume occurs.
  • libevent [but not noevent] requires pressing <enter> after the ctrl-z to complete the suspend
  • In both cases, elinks resumes running in background ~1s after suspending. This is not how older elinks behaved, and differs from other programs like text editors, etc. With multiple jobs, we now have to determine and then specify job number to resume the elinks slave.

Typical use case is, suspend to quickly enter shell command, then fg to go back. Now we have to look at jobs output and then fg <number>. Originally I had thought this behavior was only with libevent, but now I can reproduce it with noevent, I think because there is a small delay until it wakes up.

Ideal behavior is, once backgrounded, the slave does not wake up until SIGCONT was received, and then does a synchronous screen refresh without waiting for a timer.

@smemsh
Copy link
Contributor Author

smemsh commented Mar 7, 2024

I went back to 050f304, which was just prior to libevent code being added (2017), and elinks still has the same behavior of starting again in background shortly after suspending, so I might be remembering wrong that this behavior has changed. It would be nice if elinks didn't do this, but it doesn't seem to be a regression.

@smemsh
Copy link
Contributor Author

smemsh commented Mar 7, 2024

In sig_tstp() handler, a child process is forked, which waits one second and then raises SIGCONT in the parent and _exit()s, so this explains suspend and then resume in background behavior. It then I guess checks periodically if it's in the foreground every FG_POLL_TIME (half second) and redraws once it is?

Does anyone know why this is done? Is this because the same code is used in master process and slave process? And master can't go to sleep without blocking all slaves?

Could we have a different code path for slaves (only) that just stays asleep and doesn't fork anything? Only master needs to stay awake right?

@rkd77
Copy link
Owner

rkd77 commented Mar 7, 2024

Ad fg) It is copied from links. It behaved like this for years and nobody complained.

Ad libev) which distribution is it?

@smemsh
Copy link
Contributor Author

smemsh commented Mar 7, 2024

Ok, but is there any reason we could not make slave better? At least it will be better for one user. I can try to come up with a patch. Do you see a technical reason why slave needs to be awake?

I am using Ubuntu 22.

@smemsh
Copy link
Contributor Author

smemsh commented Mar 7, 2024

Incidentally, I proposed the below patch in 2007, so someone did complain about it before. The original response from developers is here: https://marc.info/?l=elinks-dev&m=119638212521090&w=2

I have worked around the master needing to keep running with a master session created using dtach -n. If I were to come up with a patch that does not use a config option, but only forks in the master, would this be tenable?

diff -ur elinks-0.13-20071210/src/config/options.inc elinks-nosusp/src/config/options.inc
--- elinks-0.13-20071210/src/config/options.inc	2007-12-10 15:40:03.000000000 -0700
+++ elinks-nosusp/src/config/options.inc	2007-12-10 22:33:29.000000000 -0700
@@ -1280,6 +1280,17 @@
 		"sessions", OPT_SORT,
 		N_("Sessions settings.")),
 
+#if defined (SIGCONT) && defined(SIGTTOU)
+	/* XXX see comment in signals.c to understand this */
+	INIT_OPT_BOOL("ui.sessions", N_("Keep session master running"),
+		"keep_master_running", 0, 0,
+		N_("Cause a master suspended instance to revive itself\n"
+		"after 1s.  Be aware that unsetting this option will\n"
+		"put the onus on the user to either keep the master\n"
+		"session running or start it in the background\n"
+		"and hangup the terminal.\n")),
+#endif
+
 	INIT_OPT_BOOL("ui.sessions", N_("Keep session active"),
 		"keep_session_active", 0, 0,
 		N_("Keep the session active even if the last terminal exits.")),
diff -ur elinks-0.13-20071210/src/osdep/signals.c elinks-nosusp/src/osdep/signals.c
--- elinks-0.13-20071210/src/osdep/signals.c	2007-12-10 15:40:03.000000000 -0700
+++ elinks-nosusp/src/osdep/signals.c	2007-12-10 22:27:07.000000000 -0700
@@ -74,16 +74,40 @@
 	pid_t pid = getpid();
 
 	block_itrm();
+
 #if defined (SIGCONT) && defined(SIGTTOU)
-	if (!fork()) {
-		sleep(1);
-		kill(pid, SIGCONT);
-		/* Use _exit() rather than exit(), so that atexit
-		 * functions are not called, and stdio output buffers
-		 * are not flushed.  Any such things must have been
-		 * inherited from the parent process, which will take
-		 * care of them when appropriate.  */
-		_exit(0);
+	/*
+	 * If the master session is suspended, all slave
+	 * instances will block waiting for it to resume.
+	 * The following code is intended to prevent this by
+	 * causing any terminal-stopped instance to be
+	 * resumed after one second by a short-lived child
+	 * that continues its parent.  This is really only
+	 * needed by the session master, but we do it
+	 * unconditionally based on the setting of the
+	 * config option.
+	 *
+	 * This works out fine because load_config() is
+	 * never called from slave instances, so they will
+	 * only use the compiled-in default for the option;
+	 * slaves can be suspended happily without blocking
+	 * any other instances, so it shouldn't ever be
+	 * needed to enable this behavior in slave
+	 * instances.
+	 */
+	if (get_opt_bool("ui.sessions.keep_master_running", NULL)) {
+		if (!fork()) {
+			sleep(1);
+			kill(pid, SIGCONT);
+			/* Use _exit() rather than exit(),
+			 * so that atexit functions are not
+			 * called, and stdio output buffers
+			 * are not flushed.  Any such things
+			 * must have been inherited from the
+			 * parent process, which will take
+			 * care of them when appropriate.  */
+			_exit(0);
+		}
 	}
 #endif
 	raise(SIGSTOP);

@rkd77
Copy link
Owner

rkd77 commented Mar 7, 2024

I don't get it. Do you want to only send continuation signal for master process?
If yes, check the fg branch, if not, please explain on some examples. And write once again why current code is wrong.

@rkd77
Copy link
Owner

rkd77 commented Mar 7, 2024

Ad libev) It works for me, but need to install libev-libevent-dev , which uninstalls libevent-dev.

@smemsh
Copy link
Contributor Author

smemsh commented Mar 7, 2024

I don't get it. Do you want to only send continuation signal for master process? If yes, check the fg branch

Yes, It works perfectly. This compromise allows me to have a headless master somewhere that is never suspended and never interacted with, while all slaves suspend and never wake up until foregrounded. The headless master continues to have existing behavior, so it should satisfy everyone. Thank you!

@smemsh
Copy link
Contributor Author

smemsh commented Mar 7, 2024

The remaining issues:

  • small delay until refresh on resume, maybe this is acceptable (500ms max). synchronous would be better, but it's ok by me to wait 500ms.
  • have to press enter after ctrl-z with libevent, before suspend actually happens

As for libev, I have specified libevent=false along with libev=true, so I think it should ignore libevent headers, even if present. I'm fine just running without either. Ordinary select() is ok with me.

@smemsh
Copy link
Contributor Author

smemsh commented Mar 7, 2024

small delay until refresh on resume, maybe this is acceptable (500ms max). synchronous would be better, but it's ok by me to wait 500ms.

Actually I noticed that the 500ms delay on resume is absent on fg branch (with slave). It refreshes instantly every time, when foregrounded. So, I am completely satisfied with fg branch behavior, as long as not using libevent.

Note, with libevent, also it does not correctly restore invoking shell screen. It starts writing at the top over the existing text on screen that was present before starting elinks. Normally (and in noevent case), screen contents at time of elinks invocation would be restored, and it starts writing where it left off at invocation time (or last suspend/resume). FYI

@rkd77
Copy link
Owner

rkd77 commented Mar 7, 2024

There is no big difference if any (select vs libevent), so I would stay with current state.

@rkd77
Copy link
Owner

rkd77 commented Mar 7, 2024

FYI, there is config option ui.sessions.fork_on_start . When enabled you have only slaves.

@smemsh
Copy link
Contributor Author

smemsh commented Mar 7, 2024

There is no big difference if any (select vs libevent), so I would stay with current state.

Well, the difference is, libevent (1) requires pressing enter after ctrl-z to suspend, and (2) does not properly restore screen contents on suspend. Whether this is a "big difference" is subjective. If libevent doesn't provide any advantage over select, then I would question why it's available anyways... but the more the merrier!

ui.sessions.fork_on_start

Interesting, I didn't know about that. I think it will obviate my need for using dtach with master. Thanks!

I will close ticket once you make decision on whether fg branch is safe to merge.

@rkd77
Copy link
Owner

rkd77 commented Mar 7, 2024

The fg branch was already merged to master.

@smemsh smemsh closed this as completed Mar 7, 2024
@smemsh
Copy link
Contributor Author

smemsh commented Mar 8, 2024

ui.sessions.fork_on_start

... I think it will obviate my need for using dtach with master

Turns out, leaving one running headless still helps for startup overhead if there is not already an instance running somewhere. First instance to start takes a bit of time (0.36s on my system), whereas slaves connecting to existing session start instantaneously. Hence, use of dtach -n elinks once per boot remains a good practice. (Also preserves memory cache...)

@rkd77
Copy link
Owner

rkd77 commented Mar 8, 2024

How do you measure this?

smemsh added a commit to smemsh/elinks that referenced this issue Apr 10, 2024
…n_start

When ui.sessions.fork_on_start, we fork a process that has a different
fdout than get_output_handle() (which has hardcoded fdout of 1), so it
will never be considered term->master, yet this is the process we want
to wake up in SIGTSTP handler.

Since we cannot rely on term->master to determine if we are the master
process, we instead move master_pid to be set explicitly at the places
where we have information about whether our process is a master or a
slave: first on start, then once the interlink determination has been
made.

master_pid has to be set in both parent and child, because both will get
suspended and need to know which one needs to resume in background with
SIGCONT (the master).  We can't inherit from the parent because it's
unknown at the time of fork.

Previously, master_pid worked correctly with fork_on_start=0,
-no-connect or -dump, but not with fork_on_start=1.

See rkd77#287 for background.
smemsh added a commit to smemsh/elinks that referenced this issue Apr 12, 2024
…n_start

When ui.sessions.fork_on_start, we fork a process that has a different
fdout than get_output_handle() (which has hardcoded fdout of 1), so it
will never be considered term->master, yet this is the process we want
to wake up in SIGTSTP handler.

Since we cannot rely on term->master to determine if we are the master
process, we instead move master_pid to be set explicitly at the places
where we have information about whether our process is a master or a
slave: first on start, then once the interlink determination has been
made.

master_pid has to be set in both parent and child, because both will get
suspended and need to know which one needs to resume in background with
SIGCONT (the master).  We can't inherit from the parent because it's
unknown at the time of fork.

Previously, master_pid worked correctly with fork_on_start=0,
-no-connect or -dump, but not with fork_on_start=1.

See rkd77#287 for background.
rkd77 added a commit that referenced this issue Oct 4, 2024
rkd77 added a commit that referenced this issue Oct 4, 2024
rkd77 added a commit that referenced this issue Oct 4, 2024
rkd77 pushed a commit that referenced this issue Oct 4, 2024
…n_start

When ui.sessions.fork_on_start, we fork a process that has a different
fdout than get_output_handle() (which has hardcoded fdout of 1), so it
will never be considered term->master, yet this is the process we want
to wake up in SIGTSTP handler.

Since we cannot rely on term->master to determine if we are the master
process, we instead move master_pid to be set explicitly at the places
where we have information about whether our process is a master or a
slave: first on start, then once the interlink determination has been
made.

master_pid has to be set in both parent and child, because both will get
suspended and need to know which one needs to resume in background with
SIGCONT (the master).  We can't inherit from the parent because it's
unknown at the time of fork.

Previously, master_pid worked correctly with fork_on_start=0,
-no-connect or -dump, but not with fork_on_start=1.

See #287 for background.

(cherry picked from commit b2556aa)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants