Scheduler enhancements #7703

Merged
merged 13 commits into from
Nov 30, 2021

Commits on Nov 30, 2021

  1. Use cgroup limits in worker memory calculations

    Worker processes may have memory limits imposed by systemd, but
    /proc/meminfo reports the entire system memory regardless of these
    limits. As a result, the scheduler believes the worker has the entire
    system memory available and allocates it too many tasks.
    
    This change attempts to read cgroup memory limits for the worker
    process. It supports cgroups v1 and v2, and compares cgroup limits
    against the system memory and returns the most conservative values to
    prevent the worker from being allocated too many tasks and potentially
    triggering an OOM event.
    clinta authored and magik6k committed Nov 30, 2021
    e2a1ca7
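The "most conservative value" rule from this commit message can be illustrated with a minimal sketch. It is not the code from the PR: the function names `parseCgroupLimit` and `effectiveMemory` are hypothetical, and the sketch assumes the limit file contents (`memory.max` on cgroups v2, `memory.limit_in_bytes` on cgroups v1) have already been read into a string.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCgroupLimit parses the contents of a cgroup memory limit file.
// cgroups v2 writes the literal "max" when no limit is set; in that case,
// or when the value cannot be parsed, ok is false.
// (Hypothetical helper, not taken from the PR.)
func parseCgroupLimit(contents string) (limit uint64, ok bool) {
	s := strings.TrimSpace(contents)
	if s == "" || s == "max" {
		return 0, false
	}
	v, err := strconv.ParseUint(s, 10, 64)
	if err != nil {
		return 0, false
	}
	return v, true
}

// effectiveMemory returns the smaller of the system total and the cgroup
// limit. An unlimited cgroups v1 file holds a very large number, so taking
// the minimum covers that case without special handling.
func effectiveMemory(systemTotal uint64, cgroupContents string) uint64 {
	if limit, ok := parseCgroupLimit(cgroupContents); ok && limit < systemTotal {
		return limit
	}
	return systemTotal
}

func main() {
	const gib = 1 << 30
	// A 256 GiB host whose worker unit is limited to 64 GiB.
	fmt.Println(effectiveMemory(256*gib, "68719476736\n") / gib)
	// The same host with no cgroup limit.
	fmt.Println(effectiveMemory(256*gib, "max\n") / gib)
}
```

Falling back to the system total when the limit is missing or unparsable keeps the behavior identical to the pre-change calculation on hosts without cgroup limits.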
  2. Report memory used and swap used in worker res

    Attempting to report "memory used by other processes" in the MemReserved
    field fails to take into account that the system's memory used also
    includes memory used by ongoing tasks.
    
    To properly account for this, the worker should report the memory and
    swap used. The scheduler, which is aware of the memory requirements of
    each task, can then determine if there is sufficient memory available.
    clinta authored and magik6k committed Nov 30, 2021
    c4f4617
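One way a scheduler could combine the worker's reported usage with its own bookkeeping is sketched below. This is an assumption about the general approach, not the logic from the PR: the `workerMemory` type, its fields, and `canFit` are hypothetical names, and the sketch simply takes the larger of "usage reported by the worker" and "memory the scheduler has already attributed to running tasks" so that task memory is not counted twice.

```go
package main

import "fmt"

// workerMemory is a hypothetical, simplified view of a worker's report.
type workerMemory struct {
	Total uint64 // physical memory plus swap available to the worker
	Used  uint64 // memory plus swap in use, including running tasks
}

// canFit reports whether a task needing taskNeeds bytes fits on the worker.
// tracked is the memory the scheduler has already attributed to tasks
// running on this worker. The worker's reported usage already contains
// that task memory, so the two figures are not added together; the larger
// one is treated as the amount in use.
func canFit(w workerMemory, tracked, taskNeeds uint64) bool {
	used := tracked
	if w.Used > used {
		used = w.Used
	}
	if used >= w.Total {
		return false
	}
	return taskNeeds <= w.Total-used
}

func main() {
	const gib = 1 << 30
	w := workerMemory{Total: 128 * gib, Used: 100 * gib}
	fmt.Println(canFit(w, 64*gib, 56*gib)) // 28 GiB free, 56 GiB needed
	fmt.Println(canFit(w, 64*gib, 20*gib)) // 28 GiB free, 20 GiB needed
}
```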
  3. Use a float to represent GPU utilization

    Before this change workers can only be allocated one GPU task,
    regardless of how much of the GPU resources that task uses, or how many
    GPUs are in the system.
    
    This makes GPUUtilization a float which can represent that a task needs
    a portion, or multiple GPUs. GPUs are accounted for like RAM and CPUs so
    that workers with more GPUs can be allocated more tasks.
    
    A known issue is that PC2 cannot use multiple GPUs. Even if the worker
    has multiple GPUs and is allocated multiple PC2 tasks, those tasks will
    only run on the first GPU.
    
    This could result in unexpected behavior when a worker with multiple
    GPUs is assigned multiple PC2 tasks. But this should not surprise any
    existing users who upgrade, as any existing users who run workers with
    multiple GPUs should already know this and be running a worker per GPU
    for PC2. But now those users have the freedom to customize the GPU
    utilization of PC2 to be less than one and effectively run multiple PC2
    processes in a single worker.
    
    C2 is capable of utilizing multiple GPUs, and now workers can be
    customized for C2 accordingly.
    clinta authored and magik6k committed Nov 30, 2021
    93e4656
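Accounting for GPUs "like RAM and CPUs" can be pictured as a running sum of fractional utilization compared against the number of GPUs. The sketch below assumes exactly that and nothing more: `gpuFits` and its parameters are hypothetical names, not identifiers from the PR, and it ignores the PC2 single-GPU caveat described in the commit message.

```go
package main

import "fmt"

// gpuFits reports whether a task that needs gpuNeeded GPUs (a fraction
// such as 0.5, or a multiple such as 2.0) fits on a worker with gpuCount
// GPUs of which gpuUsed are already claimed by running tasks.
// Tasks that need no GPU always fit.
func gpuFits(gpuCount int, gpuUsed, gpuNeeded float64) bool {
	if gpuNeeded <= 0 {
		return true
	}
	return gpuUsed+gpuNeeded <= float64(gpuCount)
}

func main() {
	fmt.Println(gpuFits(1, 0.5, 0.5)) // second half-GPU task on one GPU
	fmt.Println(gpuFits(1, 1.0, 0.5)) // GPU already fully claimed
	fmt.Println(gpuFits(2, 0.0, 2.0)) // one task spanning two GPUs
}
```

With a utilization of 0.5 per task, a single-GPU worker accepts two such tasks at once, which is the "multiple PC2 processes in a single worker" case the message mentions.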
  4. Permit workers to override resource table

    In an environment with heterogeneous worker nodes, a universal resource
    table for all workers does not allow effective scheduling of tasks. Some
    workers may have different proof cache settings, changing the required
    memory for different tasks. Some workers may have a different count of
    CPUs per core-complex, changing the max parallelism of PC1.
    
    This change allows workers to customize these parameters with
    environment variables. A worker could set the environment variable
    PC1_MIN_MEMORY for example to customize the minimum memory requirement
    for PC1 tasks. If no environment variables are specified, the resource
    table on the miner is used, except for PC1 parallelism.
    
    If PC1_MAX_PARALLELISM is not specified, and
    FIL_PROOFS_USE_MULTICORE_SDR is set, PC1_MAX_PARALLELISM will
    automatically be set to FIL_PROOFS_MULTICORE_SDR_PRODUCERS + 1.
    clinta authored and magik6k committed Nov 30, 2021
    4ef8543
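The PC1 parallelism rule described in this commit message can be sketched as a small lookup function. The environment variable names come from the message; everything else is an assumption made for illustration: the function name `pc1MaxParallelism`, the `fallback` parameter standing in for the miner's resource table value, treating "set" as "present in the environment", and returning the fallback when FIL_PROOFS_MULTICORE_SDR_PRODUCERS is absent or not a number.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// pc1MaxParallelism resolves the PC1 parallelism for a worker.
// lookup has the same shape as os.LookupEnv so it can be replaced in tests.
// fallback stands in for the value from the miner's resource table.
// (Hypothetical helper, not taken from the PR.)
func pc1MaxParallelism(lookup func(string) (string, bool), fallback int) int {
	// An explicit override always wins.
	if v, ok := lookup("PC1_MAX_PARALLELISM"); ok {
		if n, err := strconv.Atoi(v); err == nil {
			return n
		}
	}
	// Otherwise derive the value from the multicore SDR settings:
	// the number of producers plus one.
	if _, ok := lookup("FIL_PROOFS_USE_MULTICORE_SDR"); ok {
		if v, ok := lookup("FIL_PROOFS_MULTICORE_SDR_PRODUCERS"); ok {
			if n, err := strconv.Atoi(v); err == nil {
				return n + 1
			}
		}
	}
	return fallback
}

func main() {
	// Reads the real process environment; 1 is an arbitrary fallback.
	fmt.Println(pc1MaxParallelism(os.LookupEnv, 1))
}
```

Passing the lookup function as a parameter keeps the rule testable without mutating the process environment.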
  5. 36868a8
  6. b961e1a
  7. c9a2ff4
  8. Fix docsgen

    magik6k committed Nov 30, 2021
    6d52d85
  9. f25efec
  10. fix sched tests

    magik6k committed Nov 30, 2021
    a597b07
  11. fix lint

    magik6k committed Nov 30, 2021
    001ecbb
  12. cf20b0b
  13. worker: Typo in resources cmd usage

    Co-authored-by: Aayush Rajasekaran <arajasek94@gmail.com>
    magik6k and arajasek committed Nov 30, 2021
    330cfc3