Interactive: refactor task done #795
for more information, see https://pre-commit.ci
          
Walkthrough
Removed per-task
Changes
Sequence Diagram(s)
sequenceDiagram
  autonumber
  participant Caller as Caller
  participant Exec as execute_tasks
  participant NoCache as _execute_task_without_cache
  participant Future as Future
  Note over Exec,NoCache: no-cache execution
  Caller->>Exec: submit task
  Exec->>NoCache: run(interface, task_dict)
  NoCache-->>Future: set_result / set_exception
  NoCache-->>Exec: return
  Exec->>Exec: _task_done(future_queue)
  Exec-->>Caller: futures
    sequenceDiagram
  autonumber
  participant Caller as Caller
  participant Exec as execute_tasks
  participant CacheH as _execute_task_with_cache
  participant Disk as FileCache
  participant Future as Future
  Note over Exec,CacheH: cached execution
  Caller->>Exec: submit task (cache_key)
  Exec->>CacheH: run(interface, task_dict, cache_directory, cache_key)
  alt cache hit
    CacheH->>Disk: read(cache_file)
    Disk-->>CacheH: data
    CacheH-->>Future: set_result
  else cache miss / compute
    CacheH->>Disk: write(cache_file)
    CacheH-->>Future: set_result / set_exception
  end
  CacheH-->>Exec: return
  Exec->>Exec: _task_done(future_queue)
  Exec-->>Caller: futures
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️  Outside diff range comments (2)
executorlib/task_scheduler/interactive/shared.py (2)
76-89: Always call task_done; prevent deadlock and stop after fatal task failures
If _execute_task_with_cache raises (it does re-raise), _task_done is skipped, potentially hanging future_queue.join(). Also, after helpers shut down the interface on error, the loop continues and will use a dead interface. Wrap dispatch in try/finally and break on error.
-        elif "fn" in task_dict and "future" in task_dict:
-            if error_log_file is not None:
-                task_dict["error_log_file"] = error_log_file
-            if cache_directory is None:
-                _execute_task_without_cache(interface=interface, task_dict=task_dict)
-            else:
-                _execute_task_with_cache(
-                    interface=interface,
-                    task_dict=task_dict,
-                    cache_directory=cache_directory,
-                    cache_key=cache_key,
-                )
-            _task_done(future_queue=future_queue)
+        elif "fn" in task_dict and "future" in task_dict:
+            if error_log_file is not None:
+                task_dict["error_log_file"] = error_log_file
+            try:
+                if cache_directory is None:
+                    _execute_task_without_cache(interface=interface, task_dict=task_dict)
+                else:
+                    _execute_task_with_cache(
+                        interface=interface,
+                        task_dict=task_dict,
+                        cache_directory=cache_directory,
+                        cache_key=cache_key,
+                    )
+            except Exception:
+                # Helpers set the Future exception and shut down the interface; stop processing.
+                break
+            finally:
+                _task_done(future_queue=future_queue)
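The deadlock risk described above can be reproduced with the standard library alone. The following is a minimal sketch, not executorlib code: the `worker` function, the `None` sentinel, and `run_demo` are hypothetical names chosen for illustration. It shows that placing `task_done()` in a `finally` clause keeps `Queue.join()` from hanging even when a task raises. Unlike the suggested patch, this sketch records the error and keeps consuming instead of breaking out of the loop.

```python
import queue
import threading


def worker(task_queue: queue.Queue, results: list) -> None:
    """Consume callables until a None sentinel arrives (hypothetical protocol)."""
    while True:
        task = task_queue.get()
        try:
            if task is None:
                break
            results.append(task())
        except Exception as exc:
            # Record the failure instead of letting it kill the worker thread.
            results.append(exc)
        finally:
            # Runs on success, on failure, and on the break above, so every
            # get() is matched by exactly one task_done().
            task_queue.task_done()


def run_demo() -> list:
    task_queue: queue.Queue = queue.Queue()
    results: list = []
    for task in (lambda: 1, lambda: 1 / 0, lambda: 3, None):
        task_queue.put(task)
    thread = threading.Thread(target=worker, args=(task_queue, results))
    thread.start()
    task_queue.join()  # returns because no task_done() call was skipped
    thread.join()
    return results
```

If the `finally` clause were replaced by a `task_done()` call after the `try` block and the exception were allowed to propagate, the failing task would never be acknowledged and `task_queue.join()` would block indefinitely.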
151-153: Guard Future state on cache hit to avoid InvalidStateError
On cache hits you call future.set_result without checking cancellation/done state. This can raise and bypass _task_done.
-        _, _, result = get_output(file_name=file_name)
-        future = task_dict["future"]
-        future.set_result(result)
+        _, _, result = get_output(file_name=file_name)
+        future = task_dict.get("future")
+        if future is not None and (not future.done()) and future.set_running_or_notify_cancel():
+            future.set_result(result)
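The failure mode behind this comment can be shown with `concurrent.futures.Future` directly. This is a small sketch under the assumption of Python 3.8 or newer, where `set_result` on a cancelled Future raises `InvalidStateError`; the helper names `deliver_cached_result` and `unguarded_set_result_fails` are hypothetical and do not exist in executorlib.

```python
from concurrent.futures import Future, InvalidStateError


def deliver_cached_result(future: Future, result: object) -> bool:
    """Set the result only if the Future is still pending.

    Returns True when the result was delivered, False when the Future was
    already cancelled or finished.
    """
    if future.done():
        return False
    # Transitions PENDING -> RUNNING, or returns False for a cancelled Future.
    if not future.set_running_or_notify_cancel():
        return False
    future.set_result(result)
    return True


def unguarded_set_result_fails() -> bool:
    """Show that calling set_result on a cancelled Future raises."""
    cancelled = Future()
    cancelled.cancel()
    try:
        cancelled.set_result(42)
    except InvalidStateError:
        return True
    return False
```

The `done()` check comes first because `set_running_or_notify_cancel()` raises `RuntimeError` when called on a Future that is already running or finished.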
🧹 Nitpick comments (1)
executorlib/task_scheduler/interactive/shared.py (1)
31-31: Docstring grammar
Tweak phrasing.
-    Execute a single tasks in parallel using the message passing interface (MPI).
+    Execute tasks in parallel using the message passing interface (MPI).
📒 Files selected for processing (1)
executorlib/task_scheduler/interactive/shared.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
executorlib/task_scheduler/interactive/shared.py (3)
executorlib/executor/base.py (1)
future_queue(44-51)
executorlib/task_scheduler/base.py (1)
future_queue(63-70)
executorlib/standalone/interactive/communication.py (1)
SocketInterface(14-153)
    f = task_dict.pop("future")
    if not f.done() and f.set_running_or_notify_cancel():
        try:
            f.set_result(interface.send_and_receive_dict(input_dict=task_dict))
        except Exception as thread_exception:
            interface.shutdown(wait=True)
            _task_done(future_queue=future_queue)
            f.set_exception(exception=thread_exception)
        else:
            _task_done(future_queue=future_queue)
Unify exception propagation; re-raise after setting Future exception
Without re-raising here, the outer loop keeps running with a shut-down interface, causing follow-up failures. Make this consistent with the cache path.
     if not f.done() and f.set_running_or_notify_cancel():
         try:
             f.set_result(interface.send_and_receive_dict(input_dict=task_dict))
         except Exception as thread_exception:
             interface.shutdown(wait=True)
             f.set_exception(exception=thread_exception)
+            raise thread_exception
📝 Committable suggestion
    f = task_dict.pop("future")
    if not f.done() and f.set_running_or_notify_cancel():
        try:
            f.set_result(interface.send_and_receive_dict(input_dict=task_dict))
        except Exception as thread_exception:
            interface.shutdown(wait=True)
            f.set_exception(exception=thread_exception)
            raise thread_exception
🤖 Prompt for AI Agents
In executorlib/task_scheduler/interactive/shared.py around lines 100 to 107, the
except block sets the Future exception and shuts down the interface but does not
re-raise, which leaves the outer loop running against a shut-down interface.
Update the except block to call interface.shutdown(wait=True), set the exception
on the Future, and then re-raise the caught thread_exception so that the
surrounding caller observes the failure, matching the cache path behavior.
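To make the control flow of this suggestion concrete, here is a self-contained sketch. It assumes a stand-in `FakeInterface` class and a simplified `run_all` loop, both hypothetical; only the method names `send_and_receive_dict` and `shutdown` are taken from the snippet under review, and the real interface behaves differently. The helper sets the exception on the Future and re-raises, so the caller can stop before touching the dead interface again.

```python
from concurrent.futures import Future


class FakeInterface:
    """Hypothetical stand-in for the real interface; not executorlib's API."""

    def __init__(self) -> None:
        self.alive = True

    def send_and_receive_dict(self, input_dict: dict):
        if not self.alive:
            raise RuntimeError("interface already shut down")
        return input_dict["fn"](*input_dict.get("args", ()))

    def shutdown(self, wait: bool = True) -> None:
        self.alive = False


def run_task(interface: FakeInterface, task_dict: dict) -> None:
    f = task_dict.pop("future")
    if not f.done() and f.set_running_or_notify_cancel():
        try:
            f.set_result(interface.send_and_receive_dict(input_dict=task_dict))
        except Exception as exc:
            interface.shutdown(wait=True)
            f.set_exception(exc)
            raise  # let the caller see the failure as well


def run_all(interface: FakeInterface, tasks: list) -> int:
    """Return the number of tasks attempted before stopping."""
    attempted = 0
    for task_dict in tasks:
        attempted += 1
        try:
            run_task(interface, task_dict)
        except Exception:
            break  # the interface is down; do not dispatch further tasks
    return attempted
```

With this arrangement the error is visible in two places: on the Future for whoever holds it, and in the dispatch loop, which can decide to stop.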
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #795      +/-   ##
==========================================
- Coverage   97.75%   97.75%   -0.01%     
==========================================
  Files          33       33              
  Lines        1473     1468       -5     
==========================================
- Hits         1440     1435       -5     
  Misses         33       33              
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️  Outside diff range comments (1)
executorlib/task_scheduler/interactive/shared.py (1)
150-153: Honor cancellation on cache hit to avoid InvalidStateError
On cache hits you call set_result without transitioning the Future to running. If the Future was cancelled, set_result raises InvalidStateError.
-        _, _, result = get_output(file_name=file_name)
-        future = task_dict["future"]
-        future.set_result(result)
+        _, _, result = get_output(file_name=file_name)
+        f = task_dict["future"]
+        if f.set_running_or_notify_cancel():
+            f.set_result(result)
♻️ Duplicate comments (1)
executorlib/task_scheduler/interactive/shared.py (1)
91-107: Exception is swallowed after shutting down the interface; verify contract or short-circuit subsequent tasks
Helpers set the Future exception and call shutdown(wait=True) but do not re-raise, so the loop keeps dequeuing and will try to use a shut-down interface until it sees a shutdown sentinel. If downstream always enqueues a shutdown item immediately after a failing task, this is fine; otherwise you'll churn through remaining tasks with spurious failures.
- If the design is "errors surface via Future, loop continues," consider guarding later sends when the interface is down (e.g., detect the closed state and directly set exceptions on subsequent Futures until shutdown is received).
- Alternatively, re-raise here and keep the above try/finally in execute_tasks to ensure task_done; tests would need to expect the exception at f.result() or at execute_tasks consistently.
Would you like a small follow-up patch that short-circuits further task execution once the interface has been shut down (while still draining the queue to the shutdown sentinel)?
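The short-circuit option mentioned above could look roughly like the following sketch. It is an assumption-laden illustration, not the project's implementation: the `drain` function, the `run` callback, and the `{"shutdown": True}` sentinel shape are hypothetical. After the first failure, remaining Futures receive an exception immediately while the queue is still drained to the sentinel, and `task_done()` is called for every item.

```python
import queue
from concurrent.futures import Future
from typing import Callable


def drain(task_queue: queue.Queue, run: Callable[[dict], object]) -> None:
    """Process task dicts until a shutdown item arrives.

    After the first failure, later tasks are failed fast instead of being
    sent to the (assumed dead) backend.
    """
    backend_failed = False
    while True:
        task_dict = task_queue.get()
        try:
            if task_dict.get("shutdown"):
                break
            future = task_dict["future"]
            if not future.set_running_or_notify_cancel():
                continue  # cancelled before it started; nothing to do
            if backend_failed:
                future.set_exception(
                    RuntimeError("backend was shut down by an earlier failure")
                )
                continue
            try:
                future.set_result(run(task_dict))
            except Exception as exc:
                backend_failed = True
                future.set_exception(exc)
        finally:
            # Acknowledge every dequeued item, including the shutdown item.
            task_queue.task_done()
```

The trade-off is that callers waiting on later Futures get a clear error right away, at the cost of tracking one extra flag in the loop.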
🧹 Nitpick comments (3)
executorlib/task_scheduler/interactive/shared.py (2)
79-89: Centralizing task_done is good; wrap helper in try/finally so the queue never hangs if a helper unexpectedly raises
This guarantees task_done is called even on unforeseen errors (e.g., InvalidStateError from Future state transitions).
-            if cache_directory is None:
-                _execute_task_without_cache(interface=interface, task_dict=task_dict)
-            else:
-                _execute_task_with_cache(
-                    interface=interface,
-                    task_dict=task_dict,
-                    cache_directory=cache_directory,
-                    cache_key=cache_key,
-                )
-            _task_done(future_queue=future_queue)
+            try:
+                if cache_directory is None:
+                    _execute_task_without_cache(interface=interface, task_dict=task_dict)
+                else:
+                    _execute_task_with_cache(
+                        interface=interface,
+                        task_dict=task_dict,
+                        cache_directory=cache_directory,
+                        cache_key=cache_key,
+                    )
+            finally:
+                _task_done(future_queue=future_queue)
31-31: Docstring grammar nit
"Execute a single tasks" → "Execute a single task" or "Execute tasks."
-    Execute a single tasks in parallel using the message passing interface (MPI).
+    Execute a single task in parallel using the message passing interface (MPI).
tests/test_mpiexecspawner.py (1)
536-547: Minor test consistency suggestion
For readability, consider aligning the non-cached failure test to the same pattern (execute_tasks runs; assert at f.result()).
📒 Files selected for processing (2)
executorlib/task_scheduler/interactive/shared.py (1 hunks)
tests/test_mpiexecspawner.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
tests/test_mpiexecspawner.py (2)
executorlib/standalone/serialize.py (1)
cloudpickle_register(9-28)
executorlib/task_scheduler/interactive/shared.py (1)
execute_tasks(16-88)
executorlib/task_scheduler/interactive/shared.py (3)
executorlib/task_scheduler/base.py (1)
future_queue(63-70)
executorlib/executor/base.py (1)
future_queue(44-51)
executorlib/standalone/interactive/communication.py (1)
SocketInterface(14-153)
🔇 Additional comments (1)
tests/test_mpiexecspawner.py (1)
538-546: Good: assert via Future on cached error path
Moving the assertion to f.result() matches the new behavior where execute_tasks doesn't raise and continues to process until shutdown.
Actionable comments posted: 0
🧹 Nitpick comments (3)
tests/test_mpiexecspawner.py (3)
446-451: Good change: assert on Future exception after execution
Calling execute_tasks(...) first and then asserting the TypeError via f.result() matches the new behavior where helpers set exceptions on the Future. Looks correct.
To avoid a redundant internal Queue.join() followed by the explicit q.join() in the test, either remove the explicit q.join() or pass queue_join_on_shutdown=False here. Suggested inline tweak:
 execute_tasks(
     future_queue=q,
     cores=1,
     openmpi_oversubscribe=False,
     spawner=MpiExecSpawner,
+    queue_join_on_shutdown=False,
 )
462-467: Same adjustment here is correct
The refactor to check the TypeError from f.result() is consistent with the internal change.
Apply the same optional join tweak to keep the test fully in control of queue joining:
 execute_tasks(
     future_queue=q,
     cores=1,
     openmpi_oversubscribe=False,
     spawner=MpiExecSpawner,
+    queue_join_on_shutdown=False,
 )
536-547: Cache-path failure test aligns with new semantics
Executing first and asserting the TypeError via the Future is appropriate for the cached path too.
Consider the same join behavior tweak here:
 execute_tasks(
     future_queue=q,
     cores=1,
     openmpi_oversubscribe=False,
     spawner=MpiExecSpawner,
     cache_directory="executorlib_cache",
+    queue_join_on_shutdown=False,
 )
Optionally, add a quick post-assert check that the Future is marked done (and/or f.exception() is a TypeError) to tighten the signal.
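The test pattern discussed in these comments (run first, then assert on the Future) can be sketched without MPI. In the example below, `run_and_capture` is a hypothetical stand-in for `execute_tasks` that only mimics the "store the error on the Future instead of raising" behavior; the `add` function and the test class name are likewise illustrative assumptions.

```python
import unittest
from concurrent.futures import Future


def run_and_capture(fn, future: Future, *args) -> None:
    """Hypothetical stand-in for execute_tasks: stores errors on the Future."""
    if future.set_running_or_notify_cancel():
        try:
            future.set_result(fn(*args))
        except Exception as exc:
            future.set_exception(exc)


def add(a, b):
    return a + b


class TestFutureCarriesError(unittest.TestCase):
    def test_error_surfaces_at_result(self):
        f = Future()
        # One argument is missing, so the call fails with a TypeError,
        # but run_and_capture itself does not raise.
        run_and_capture(add, f, 1)
        # Tightened checks: the Future is finished and holds the TypeError.
        self.assertTrue(f.done())
        self.assertIsInstance(f.exception(), TypeError)
        with self.assertRaises(TypeError):
            f.result()
```

Checking `f.done()` and `f.exception()` before `f.result()` distinguishes "the task failed with the expected error" from "the task never completed", which a bare `assertRaises` around `f.result()` cannot do on its own.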
📒 Files selected for processing (1)
tests/test_mpiexecspawner.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/test_mpiexecspawner.py (3)
executorlib/task_scheduler/interactive/shared.py (1)
execute_tasks(16-88)
executorlib/standalone/interactive/spawner.py (1)
MpiExecSpawner(141-158)
executorlib/standalone/serialize.py (1)
cloudpickle_register(9-28)
Summary by CodeRabbit