New examples for the updated documentation #495

Merged
merged 39 commits into main on Nov 20, 2024

Conversation

@jan-janssen (Member) commented Nov 12, 2024


Summary by CodeRabbit

  • New Features

    • Introduced several new Jupyter notebooks demonstrating local and high-performance computing (HPC) capabilities, including parallel processing and resource management.
    • Added a new shell script for executing Jupyter notebooks, enhancing the testing workflow.
    • Implemented functions for task submission and resource allocation within the notebooks.
    • Expanded the README.md to highlight key features and usage examples for executorlib.
    • Added new dependencies to enhance data visualization and graph-related functionalities.
  • Bug Fixes

    • Improved error handling in notebook execution processes.
  • Documentation

    • Enhanced documentation within notebooks to guide users on utilizing new features effectively.
    • Restructured installation and troubleshooting guides for clarity and usability.


coderabbitai bot (Contributor) commented Nov 12, 2024

Caution

Review failed

The pull request is closed.

Walkthrough

The changes introduce several new Jupyter notebooks and a shell script for executing notebooks using Papermill. The build_notebooks.sh script iterates through Jupyter notebooks in the notebooks directory, executing each and tracking failures. Additionally, the workflow configuration has been updated to use this script instead of direct Papermill commands. The new notebooks cover topics such as local parallel computation, HPC submission, resource allocation, and developer practices, each demonstrating the use of the Executor class from the executorlib library.
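
For orientation, the usage pattern these notebooks share looks roughly like the following minimal sketch (illustrative only; the exact notebook cells may differ):

from executorlib import Executor

# Submit a function call to a worker process and collect the result;
# the same interface is reused with the flux and pysqa backends.
with Executor(backend="local") as exe:
    future = exe.submit(sum, [1, 1])
    print(future.result())  # prints 2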

Changes

File Change Summary
.ci_support/build_notebooks.sh New script added to execute Jupyter notebooks using Papermill, tracking execution failures.
.github/workflows/notebooks.yml Updated workflow to run the new shell script instead of direct Papermill commands.
notebooks/1-local.ipynb New notebook demonstrating local parallel computation with Executor, including several functions.
notebooks/2-hpc-submission.ipynb New notebook focused on HPC submission using pysqa, showcasing task execution and dependency handling.
notebooks/3-hpc-allocation.ipynb New notebook on HPC resource management using Flux and SLURM, including task submission and monitoring.
notebooks/4-developer.ipynb New notebook discussing serialization methods and subprocess management, introducing a new function for shell command submission.
README.md Restructured to highlight key features, installation instructions, and usage examples for executorlib.
binder/environment.yml New dependencies added to enhance data visualization and graph functionalities.
docs/_toc.yml Updated to include new notebooks and remove outdated examples.
docs/installation.md Restructured to clarify installation requirements and details for various features.
docs/trouble_shooting.md New troubleshooting format addressing common issues users may encounter.
executorlib/__init__.py Import statements modified for clarity and consistency.
pyproject.toml Optional dependencies reorganized and new groups added for better clarity.
notebooks/examples.ipynb Deleted file that contained examples of using the executorlib.Executor class.

Poem

🐇 In the meadow where notebooks bloom,
Scripts now dance, dispelling gloom.
With each command, a task takes flight,
Parallel dreams in the soft moonlight.
Executors hum, as data flows,
A rabbit's joy in the code that grows! 🌼



@coderabbitai bot (Contributor) left a comment


Actionable comments posted: 7

🧹 Outside diff range and nitpick comments (20)
.ci_support/build_notebooks.sh (2)

1-3: Enhance script documentation and add dependency checks

Consider adding more comprehensive documentation and dependency validation at the start of the script.

 #!/bin/bash
-# execute notebooks
+# Purpose: Execute Jupyter notebooks using Papermill and track failures
+# Usage: ./build_notebooks.sh
+
+# Check for required dependencies
+if ! command -v papermill &> /dev/null; then
+    echo "Error: papermill is not installed" >&2
+    exit 1
+fi
+
 i=0;

1-11: Consider performance optimizations for CI pipeline

The current implementation executes notebooks sequentially, which might not be optimal for CI performance. Consider these improvements:

  1. Add parallel execution using GNU Parallel
  2. Implement resource management for memory-intensive notebooks
  3. Add timeout handling for notebooks that might hang

Example implementation using GNU Parallel:

#!/bin/bash
set -o pipefail  # propagate papermill failures through the tee pipeline below

# Function to process a single notebook
process_notebook() {
    local notebook="$1"
    local base_name=$(basename "$notebook" .ipynb)
    local output_file="${notebook%.*}-out.${notebook##*.}"
    local log_file="logs/${base_name}.log"
    
    echo "Processing: $notebook"
    if ! papermill "$notebook" "$output_file" -k python3 2>&1 | tee "$log_file"; then
        echo "Failed: $notebook" >&2
        return 1
    fi
    return 0
}

export -f process_notebook

# Execute notebooks in parallel with a maximum of 4 concurrent jobs
find notebooks -name "*.ipynb" -print0 | parallel -0 --jobs 4 process_notebook
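
For item 3, a per-notebook timeout could be layered on top using coreutils' timeout(1) inside the loop (a hedged sketch; the 300-second limit is an arbitrary assumption to be tuned):

# Count any notebook that runs longer than 5 minutes as a failure
if ! timeout 300 papermill "$notebook" "$output_file" -k python3; then
    echo "Timed out or failed: $notebook" >&2
    i=$((i+1))
fi
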
🧰 Tools
🪛 Shellcheck

[error] 4-4: Iterating over ls output is fragile. Use globs.

(SC2045)

.github/workflows/notebooks.yml (1)

37-37: Enhance error handling and debugging capabilities.

For better debugging of notebook execution failures in CI:

-          .ci_support/build_notebooks.sh
+          .ci_support/build_notebooks.sh 2>&1 | tee notebook_execution.log

Also, consider uploading the execution log and failed notebooks as artifacts:

      - name: Upload execution log
        if: always()
        uses: actions/upload-artifact@v3
        with:
          name: notebook-execution-logs
          path: |
            notebook_execution.log
            notebooks/*-out.ipynb
notebooks/4-developer.ipynb (2)

87-94: Enhance the example to be more educational

While the example demonstrates basic usage, consider enhancing it to:

  1. Show error handling
  2. Demonstrate asynchronous execution with multiple commands
  3. Include comments explaining security considerations

Here's an improved example:

with Executor(backend="local") as exe:
    # Submit multiple commands asynchronously
    futures = [
        exe.submit(
            submit_shell_command,
            ["echo", f"task {i}"],
            universal_newlines=True,
            shell=False,
        )
        for i in range(3)
    ]
    
    # Handle results and errors properly
    for future in futures:
        try:
            result = future.result(timeout=5)  # Add timeout for result retrieval
            print(f"Success: {result.strip()}")
        except Exception as e:
            print(f"Error: {str(e)}")

16-17: Complete the placeholder sections with educational content

The notebook has incomplete sections that could benefit from:

  1. Comparison between pickle and cloudpickle with security considerations
  2. Interactive examples demonstrating real-world scenarios

Would you like me to help generate content for these sections? I can create examples that demonstrate:

  • Serialization differences and best practices
  • Interactive command execution patterns
  • Error handling scenarios

Also applies to: 102-128
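
For the first item, a minimal sketch of the pickle versus cloudpickle difference might look like this (illustrative only; assumes cloudpickle is installed, which it is as an executorlib dependency):

import pickle

import cloudpickle

square = lambda x: x * x  # interactively defined objects trip up plain pickle

try:
    pickle.dumps(square)
except pickle.PicklingError as e:
    # pickle stores functions by reference and cannot look this one up
    print(f"pickle failed: {e}")

payload = cloudpickle.dumps(square)  # cloudpickle serializes the function by value
print(cloudpickle.loads(payload)(3))  # prints 9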

notebooks/2-hpc-submission.ipynb (6)

16-17: Add context about pysqa in the introduction

The notebook would benefit from a brief introduction explaining what pysqa is and its role in HPC job submission. Consider adding a markdown cell that covers:

  • What pysqa is
  • Its key features
  • Prerequisites for using it

54-58: Enhance the basic example with error handling and documentation

The example could be improved in several ways:

  1. Add error handling for potential submission failures
  2. Document the available backend options (why "pysqa_flux"?)
  3. Consider making the cache directory configurable

Here's a suggested improvement:

 %%time
+# Create an executor using the pysqa_flux backend
+# Other available backends: <list other options here>
 with Executor(backend="pysqa_flux", cache_directory="./cache") as exe:
     future_lst = [exe.submit(sum, [i, i]) for i in range(1, 4)]
-    print([f.result() for f in future_lst])
+    try:
+        results = [f.result() for f in future_lst]
+        print(results)
+    except Exception as e:
+        print(f"Error during execution: {e}")

67-69: Use more Pythonic function naming

Consider renaming add_funct to add_numbers or simply add to follow Python naming conventions.


86-94: Clarify the dependency chain example

The example demonstrates task dependencies but could be more explicit. Consider:

  1. Adding a comment explaining the dependency chain
  2. Showing the intermediate results
 with Executor(backend="pysqa_flux", cache_directory="./cache") as exe:
+    # Demonstrate task dependencies where each result feeds into the next addition
+    # Example: 4+4=8, 5+8=13, 6+13=19, 7+19=26
     future = None
     for i in range(4, 8):
         if future is None:
             future = exe.submit(add_funct, i, i)
         else:
             future = exe.submit(add_funct, i, future)
+        # Print intermediate results (optional)
+        print(f"Step {i}: {future.result()}")
     print(future.result())

146-156: Enhance cache cleanup with better error handling and logging

The cache cleanup could be improved with more specific error handling and status logging.

 import os
 import shutil
+import logging
+
+logging.basicConfig(level=logging.INFO)
 
 cache_dir = "./cache"
 if os.path.exists(cache_dir):
-    print(os.listdir(cache_dir))
+    logging.info(f"Found cache files: {os.listdir(cache_dir)}")
     try:
         shutil.rmtree(cache_dir)
-    except OSError:
-        pass
+        logging.info(f"Successfully cleaned up cache directory: {cache_dir}")
+    except OSError as e:
+        logging.error(f"Failed to clean up cache directory: {e}")

163-176: Complete the documentation for resource assignment and advanced configuration

The notebook has incomplete sections that are crucial for HPC documentation:

  1. Resource Assignment section needs implementation details for:
    • Thread-level parallelism
    • MPI configuration
    • GPU resource allocation
  2. Advanced Configuration section needs explanation of:
    • Config directory structure
    • Submission templates
    • Best practices

Would you like me to help create the content for these sections or open a GitHub issue to track this task?
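
As a seed for the Resource Assignment section, the per-task resource_dict pattern already used elsewhere in this PR could be shown up front (a sketch; resource keys beyond "cores" would need to be confirmed against the executorlib API):

# Request two cores so the MPI-parallel calc() function runs on two ranks;
# the result gathers the (input, size, rank) tuple from each rank
with Executor(backend="pysqa_flux", cache_directory="./cache") as exe:
    fs = exe.submit(calc, 3, resource_dict={"cores": 2})
    print(fs.result())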

notebooks/3-hpc-allocation.ipynb (4)

54-58: Consider adding error handling.

While the example is clear and effective, it would be more robust with proper error handling for potential Flux backend initialization failures or task submission errors.

 %%time
-with Executor(backend="flux") as exe:
-    future_lst = [exe.submit(sum, [i, i]) for i in range(2, 5)]
-    print([f.result() for f in future_lst])
+try:
+    with Executor(backend="flux") as exe:
+        future_lst = [exe.submit(sum, [i, i]) for i in range(2, 5)]
+        print([f.result() for f in future_lst])
+except Exception as e:
+    print(f"Error during execution: {e}")

67-69: Consider using operator.add instead of custom function.

The add_funct can be replaced with the built-in operator.add for better maintainability and performance.

+from operator import add
-def add_funct(a, b):
-    return a + b

256-258: Documentation needs completion.

The "Advanced Configuration" section contains a TODO item ("explain additional arguments") that needs to be addressed for completeness.

Would you like me to help generate the documentation for the additional arguments section?


240-275: SLURM section needs examples.

The SLURM section lacks concrete examples demonstrating resource assignment and combined usage with Flux. This is important for users to understand the practical implementation.

Would you like me to help generate example code cells demonstrating SLURM resource assignment and Flux integration?
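
A starting point might be the following sketch (the backend name "slurm" is an assumption here and should be checked against the backends this PR actually exposes):

# Submit tasks to a SLURM allocation with the same interface as the flux backend
with Executor(backend="slurm") as exe:
    future_lst = [exe.submit(sum, [i, i]) for i in range(2, 5)]
    print([f.result() for f in future_lst])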

notebooks/1-local.ipynb (5)

148-160: Address MPI CMA support warnings

The code generates warnings about Linux kernel CMA support. Consider adding documentation about these warnings and how to address them.

Add a markdown cell before the MPI examples explaining:

> Note: You may see warnings about Linux kernel CMA support. These warnings indicate that the system will fall back to an alternative single-copy mechanism, which might affect performance but not functionality. To resolve these warnings, you may need to adjust your system's ptrace settings.
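
For reference, the ptrace adjustment the note alludes to is usually Yama's ptrace scope (assuming the warnings come from Open MPI's CMA single-copy path; requires root and resets on reboot):

# Allow cross-process memory attach so CMA single-copy transfers can be used
sudo sysctl -w kernel.yama.ptrace_scope=0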

403-405: Simplify initialization function

The init_function returns a dictionary with an unused parameter 'l'.

def init_function():
-    return {"j": 4, "k": 3, "l": 2}
+    return {"j": 4, "k": 3}

511-518: Improve cache cleanup error handling

The cache cleanup code uses a broad try-except block. Consider handling specific exceptions and using a configurable cache directory.

-cache_dir = "./cache"
+cache_dir = os.getenv("EXECUTOR_CACHE_DIR", "./cache")
 if os.path.exists(cache_dir):
     print(os.listdir(cache_dir))
     try:
         shutil.rmtree(cache_dir)
-    except OSError:
+    except (OSError, PermissionError) as e:
+        print(f"Failed to remove cache directory: {e}")
         pass

564-572: Add error handling to dependency chain example

The dependency chain example should include error handling for potential failures in the chain.

 with Executor(backend="local") as exe:
     future = None
-    for i in range(1, 4):
-        if future is None:
-            future = exe.submit(add_funct, i, i)
-        else:
-            future = exe.submit(add_funct, i, future)
-    print(future.result())
+    try:
+        for i in range(1, 4):
+            if future is None:
+                future = exe.submit(add_funct, i, i)
+            else:
+                future = exe.submit(add_funct, i, future)
+        print(future.result())
+    except Exception as e:
+        print(f"Error in dependency chain: {e}")

1-723: Add section descriptions and expected outputs

The notebook would benefit from additional markdown cells explaining each section's purpose and expected outputs. Consider adding descriptions before each major section (Local Testing, Parallel Functions, Performance Optimization, Dependencies).

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between a4c7c7b and 362a25a.

📒 Files selected for processing (6)
  • .ci_support/build_notebooks.sh (1 hunks)
  • .github/workflows/notebooks.yml (1 hunks)
  • notebooks/1-local.ipynb (1 hunks)
  • notebooks/2-hpc-submission.ipynb (1 hunks)
  • notebooks/3-hpc-allocation.ipynb (1 hunks)
  • notebooks/4-developer.ipynb (1 hunks)
🧰 Additional context used
🪛 Shellcheck
.ci_support/build_notebooks.sh

[error] 4-4: Iterating over ls output is fragile. Use globs.

(SC2045)

🔇 Additional comments (6)
.github/workflows/notebooks.yml (1)

37-37: Verify the shell script's error propagation.

Ensure that notebook execution failures are properly propagated to the GitHub Actions workflow.

#!/bin/bash
# Check if the shell script handles errors properly
if [ -f ".ci_support/build_notebooks.sh" ]; then
    echo "Content of build_notebooks.sh:"
    cat .ci_support/build_notebooks.sh
    echo -e "\nChecking for error handling:"
    grep -E "set -e|exit|trap" .ci_support/build_notebooks.sh
fi
notebooks/4-developer.ipynb (1)

43-43: LGTM! Imports are appropriate and well-organized.

Also applies to: 53-53

notebooks/2-hpc-submission.ipynb (1)

1-208: Overall assessment: Good foundation but needs completion

The notebook provides a solid foundation for demonstrating HPC submission with pysqa, but there are a few areas that need attention:

  1. Complete the unfinished sections (Resource Assignment and Advanced Configuration)
  2. Add more comprehensive error handling
  3. Enhance documentation with context and explanations
  4. Consider adding a troubleshooting section

The examples are well-structured and demonstrate key concepts, but the documentation would benefit from these improvements to serve as a complete guide for users.

notebooks/3-hpc-allocation.ipynb (3)

34-35: LGTM!

Clean and specific import statement.


103-109: LGTM!

The calc function properly handles MPI rank and size retrieval.


284-285: LGTM!

The SLURM command generation import and implementation are well-documented and properly typed.

Comment on lines +8 to +11
# push error to next level
if [ $i -gt 0 ]; then
    exit 1;
fi;

🛠️ Refactor suggestion

Enhance error reporting with execution summary

The current error handling doesn't provide enough information about the execution results.

Apply this diff to improve error reporting:

-# push error to next level
-if [ $i -gt 0 ]; then
-    exit 1;
-fi;
+# Print execution summary
+total=$(find notebooks -name "*.ipynb" | wc -l)
+successful=$((total - i))
+
+echo "Notebook Execution Summary:"
+echo "-------------------------"
+echo "Total notebooks: $total"
+echo "Successfully executed: $successful"
+echo "Failed executions: $i"
+
+if [ $i -gt 0 ]; then
+    echo "Error: $i notebook(s) failed execution. Check logs directory for details." >&2
+    exit 1
+fi
+
+echo "All notebooks executed successfully!"

Comment on lines +4 to +6
for notebook in $(ls notebooks/*.ipynb); do
    papermill ${notebook} ${notebook%.*}-out.${notebook##*.} -k python3 || i=$((i+1));
done;

⚠️ Potential issue

Fix fragile file iteration and improve error logging

The current implementation has several issues:

  1. Using ls for iteration is fragile and will break with spaces in filenames
  2. Failed notebook executions aren't logged
  3. No progress indication during execution

Apply this diff to make the script more robust:

-for notebook in $(ls notebooks/*.ipynb); do
-    papermill ${notebook} ${notebook%.*}-out.${notebook##*.} -k python3 || i=$((i+1));
-done;
+# Create a log directory
+mkdir -p logs
+set -o pipefail  # so a papermill failure is not masked by the tee pipeline
+
+# Use shell globbing instead of ls
+for notebook in notebooks/*.ipynb; do
+    if [ ! -f "$notebook" ]; then
+        continue  # Skip if no notebooks found
+    fi
+    
+    echo "Processing: $notebook"
+    base_name=$(basename "$notebook" .ipynb)
+    output_file="${notebook%.*}-out.${notebook##*.}"
+    log_file="logs/${base_name}.log"
+    
+    if ! papermill "$notebook" "$output_file" -k python3 2>&1 | tee "$log_file"; then
+        echo "Failed to execute: $notebook" >&2
+        i=$((i+1))
+    fi
+done
🧰 Tools
🪛 Shellcheck

[error] 4-4: Iterating over ls output is fragile. Use globs.

(SC2045)

@@ -34,4 +34,4 @@ jobs:
       timeout-minutes: 5
       run: >
         flux start
-        papermill notebooks/examples.ipynb examples-out.ipynb -k "python3"
+        .ci_support/build_notebooks.sh

💡 Codebase verification

Increase the workflow timeout.

The current 5-minute timeout may be insufficient for executing all 5 notebooks. Consider increasing the timeout-minutes value to ensure successful completion.

🔗 Analysis chain

Consider increasing the workflow timeout.

The workflow is now executing multiple notebooks through the shell script, but still uses a 5-minute timeout. This might be insufficient when running all notebooks sequentially, especially those involving HPC and resource allocation examples.

Consider:

  1. Increasing the timeout-minutes value
  2. Adding timeouts per notebook in the shell script instead
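
For option 1, the change in .github/workflows/notebooks.yml is a one-line bump (the step name and the 20-minute value are assumptions to be tuned against observed run times):

    - name: Notebooks
      timeout-minutes: 20  # raised from 5 to cover all notebooks executed sequentially
      run: >
        flux start
        .ci_support/build_notebooks.sh
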
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Count notebooks and estimate required time
echo "Number of notebooks to be executed:"
fd .ipynb$ notebooks/ | wc -l

Length of output: 535


Script:

#!/bin/bash
# Count the number of Jupyter notebooks
echo "Total number of notebooks:"
fd .ipynb$ notebooks/ | wc -l

Length of output: 98

Comment on lines 63 to 68
"def submit_shell_command(\n",
" command: list, universal_newlines: bool = True, shell: bool = False\n",
"):\n",
" return subprocess.check_output(\n",
" command, universal_newlines=universal_newlines, shell=shell\n",
" )"

⚠️ Potential issue

Add error handling and timeout to prevent security issues

The shell command execution function needs additional safeguards:

  1. Error handling for subprocess failures
  2. Timeout parameter to prevent hanging processes
  3. Input validation for the command list

Consider this safer implementation:

 def submit_shell_command(
-    command: list, universal_newlines: bool = True, shell: bool = False
+    command: list, 
+    universal_newlines: bool = True, 
+    shell: bool = False,
+    timeout: int = 30
 ):
+    if not command or not isinstance(command, list):
+        raise ValueError("Command must be a non-empty list")
+    try:
         return subprocess.check_output(
-            command, universal_newlines=universal_newlines, shell=shell
+            command,
+            universal_newlines=universal_newlines,
+            shell=shell,
+            timeout=timeout
         )
+    except subprocess.CalledProcessError as e:
+        raise RuntimeError(f"Command failed with exit code {e.returncode}: {e.output}")
+    except subprocess.TimeoutExpired as e:
+        raise RuntimeError(f"Command timed out after {timeout} seconds")

Comment on lines 103 to 109
"def calc(i):\n",
" from mpi4py import MPI\n",
"\n",
" size = MPI.COMM_WORLD.Get_size()\n",
" rank = MPI.COMM_WORLD.Get_rank()\n",
" return i, size, rank"
]

🛠️ Refactor suggestion

Optimize MPI usage and add error handling

Consider these improvements for the MPI example:

  1. Move the mpi4py import to the top level to avoid import overhead in each task
  2. Add error handling for MPI initialization
  3. Document the expected output format
+from mpi4py import MPI
+
 def calc(i):
-    from mpi4py import MPI
-
-    size = MPI.COMM_WORLD.Get_size()
-    rank = MPI.COMM_WORLD.Get_rank()
-    return i, size, rank
+    try:
+        size = MPI.COMM_WORLD.Get_size()
+        rank = MPI.COMM_WORLD.Get_rank()
+        return i, size, rank
+    except Exception as e:
+        raise RuntimeError(f"MPI initialization failed: {e}")

+# Example showing MPI parallel execution with 2 cores
+# Returns a list of tuples: (input_value, total_processes, process_rank)
 with Executor(backend="pysqa_flux", cache_directory="./cache") as exe:
     fs = exe.submit(calc, 3, resource_dict={"cores": 2})
     print(fs.result())

Also applies to: 126-129

Comment on lines 164 to 170
"with flux.Flux() as handle:\n",
" rs = flux.resource.status.ResourceStatusRPC(handle).get()\n",
" rl = flux.resource.list.resource_list(handle).get()\n",
" print(\n",
" \"nodes: \", rs.nodelist, \" #cores: \", rl.all.ncores, \" #free: \", rl.free.ncores\n",
" )"
]

🛠️ Refactor suggestion

Add error handling for Flux API calls.

The resource status checks should handle potential Flux API errors gracefully.

 with flux.Flux() as handle:
+    try:
         rs = flux.resource.status.ResourceStatusRPC(handle).get()
         rl = flux.resource.list.resource_list(handle).get()
         print(
             "nodes: ", rs.nodelist, " #cores: ", rl.all.ncores, " #free: ", rl.free.ncores
         )
+    except Exception as e:
+        print(f"Error accessing Flux resources: {e}")

Comment on lines 123 to 128
"def calc(i):\n",
" from mpi4py import MPI\n",
"\n",
" size = MPI.COMM_WORLD.Get_size()\n",
" rank = MPI.COMM_WORLD.Get_rank()\n",
" return i, size, rank"

🛠️ Refactor suggestion

Remove duplicate function definition

The calc function is defined twice with identical implementation. Remove the duplicate definition at lines 330-336.

Also applies to: 330-336

@jan-janssen (Member, Author) commented:

Wait for fixes in #504

jan-janssen marked this pull request as ready for review November 20, 2024 10:39
jan-janssen merged commit 6adbf5f into main Nov 20, 2024
25 of 26 checks passed
jan-janssen deleted the examples branch November 20, 2024 10:39