From 533aad0902f5385b931027c1b1c88c5ca581c6cf Mon Sep 17 00:00:00 2001
From: Danila Fedorin <daniel.fedorin@hpe.com>
Date: Tue, 3 Dec 2024 11:48:55 -0800
Subject: [PATCH 1/2] Add a section to tech note about GPU-based halting

Signed-off-by: Danila Fedorin <daniel.fedorin@hpe.com>
---
 doc/rst/technotes/gpu.rst | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/doc/rst/technotes/gpu.rst b/doc/rst/technotes/gpu.rst
index 093142d647f2..6a040f17b89b 100644
--- a/doc/rst/technotes/gpu.rst
+++ b/doc/rst/technotes/gpu.rst
@@ -599,6 +599,42 @@ See the `asyncTaskComm
 benchmark for a full example of a pattern that benefits from oversubscribing
 GPUs.
 
+GPU-based Halting
+~~~~~~~~~~~~~~~~~
+
+Standard Chapel has a number of features that can cause a program to exit,
+or "halt". The 2.3 release of Chapel introduced the ability to execute halting
+functions on the GPU, allowing Chapel-generated GPU kernels to halt the
+execution of the whole program. This makes it possible both to invoke halts
+directly via Chapel's :proc:`~Errors.halt` and to invoke functions that
+themselves halt. In prior releases, doing so made code ineligible for GPU
+execution.
+
+The following program demonstrates this feature, printing "halt reached in
+GPU kernel".
+
+.. code-block:: chapel
+
+  on here.gpus[0] {
+    @assertOnGpu
+    foreach i in 1..10 {
+      halt();
+    }
+  }
+
+There are some caveats to the current implementation:
+
+* Presently, halting is implemented by setting a flag from the kernel that
+  is later accessed by the host program. As a consequence, kernel execution
+  proceeds past the ``halt()`` call; however, once the kernel
+  is executed, the program exits.
+* There is a race condition between several threads using the halt flag
+  on the same device, which can interfere with the behavior of ``halt()``.
+  This will be fixed in future releases.
+* String manipulation for printing halt messages requires a number of features
+  ill-suited for the GPU. As a result, at this time, functions that use
+  the string-enabled overloads of ``halt()`` will still not work on the GPU.
+
 Known Limitations
 -----------------
 

From 4720c26da200356366638d4eff21f12a98728558 Mon Sep 17 00:00:00 2001
From: Danila Fedorin <daniel.fedorin@hpe.com>
Date: Tue, 3 Dec 2024 12:01:43 -0800
Subject: [PATCH 2/2] Incorporate Engin's feedback

Signed-off-by: Danila Fedorin <daniel.fedorin@hpe.com>
---
 doc/rst/technotes/gpu.rst | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/doc/rst/technotes/gpu.rst b/doc/rst/technotes/gpu.rst
index 6a040f17b89b..6da1cec86f4e 100644
--- a/doc/rst/technotes/gpu.rst
+++ b/doc/rst/technotes/gpu.rst
@@ -624,16 +624,17 @@ GPU kernel".
 
 There are some caveats to the current implementation:
 
+* String manipulation for printing halt messages requires a number of features
+  ill-suited for the GPU. As a result, at this time, functions that use
+  the string-enabled overloads of ``halt()`` will still not work on the GPU.
+  This will be improved in future releases.
 * Presently, halting is implemented by setting a flag from the kernel that
   is later accessed by the host program. As a consequence, kernel execution
   proceeds past the ``halt()`` call; however, once the kernel
   is executed, the program exits.
-* There is a race condition between several threads using the halt flag
-  on the same device, which can interfere with the behavior of ``halt()``.
+* There is a race condition between several Chapel tasks using the same
+  device to launch kernels, which can interfere with the behavior of ``halt()``.
   This will be fixed in future releases.
-* String manipulation for printing halt messages requires a number of features
-  ill-suited for the GPU. As a result, at this time, functions that use
-  the string-enabled overloads of ``halt()`` will still not work on the GPU.
 
 Known Limitations
 -----------------