Memory Specification Appears Broken for SLURM #1714

@HarryMWinters

Environment:

  • aws-parallelcluster-2.6.0
[aws]
aws_region_name = us-east-1

[cluster default]
key_name = <ENTER-VALUE>
vpc_settings = vpc-public-east-1
master_instance_type = m4.2xlarge
compute_instance_type = m4.xlarge
fsx_settings = fs
# must be at least as big as the size on the AMI, which was created at 100GB
master_root_volume_size = 100
compute_root_volume_size = 100
scheduler = slurm
initial_queue_size = 1
maintain_initial_size = true
max_queue_size = 15000

# alinux2, py3.6.8, custom AMI
custom_ami=ami-0ac26e5989580566f

[vpc vpc-public-west-2]
vpc_id = vpc-<our-vpc-id>
master_subnet_id = subnet-<our-subnet>

[vpc vpc-public-east-1]
vpc_id = vpc-<a-different-subnet>

[global]
cluster_template = default
update_check = true
# if not set to false, AWS (bug) calls `ec2 run-instance --dryrun` to validate
# the ec2 account limits...but since pcluster uses an autoscaling group (ASG), 
# no point in checking these limits
sanity_check = false

[aliases]
ssh = ssh {CFN_USER}@{MASTER_IP} {ARGS}

[fsx fs]
shared_dir = /shared
storage_capacity = 3600
# sunday at midnight
weekly_maintenance_start_time = 7:00:00

Bug description and how to reproduce:
When the memory option is specified for a job on the cluster, the job fails to submit and sbatch returns the following error:

sbatch: error: Memory specification can not be satisfied
sbatch: error: Batch job submission failed: Requested node configuration is not available

To reproduce, create a shell script named test.sh:

#!/bin/bash
#SBATCH -p compute
#SBATCH -n 1
#SBATCH -N 1
#SBATCH --mem=64g
#SBATCH -o log.txt
sleep 30
echo Hello World

Then submit it with: sbatch test.sh
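
One way to narrow this down is to compare the requested memory with the memory Slurm has configured for each node. The following is a minimal sketch, assuming the partition is named compute as in the script above. mem_to_mb is a hypothetical helper written for this example, not a Slurm command, and the sinfo query only runs where the Slurm client tools are installed.

```shell
#!/bin/bash
# Convert a Slurm-style memory value (e.g. 64g, 512m, 2048) to megabytes.
# Assumption: a bare number means megabytes, as with sbatch --mem.
mem_to_mb() {
    local spec="$1"
    local num="${spec%[KkMmGgTt]}"   # numeric part, unit suffix stripped
    local unit="${spec#"$num"}"      # unit suffix, empty if none was given
    case "$unit" in
        K|k) echo $(( num / 1024 )) ;;
        M|m|"") echo "$num" ;;
        G|g) echo $(( num * 1024 )) ;;
        T|t) echo $(( num * 1024 * 1024 )) ;;
    esac
}

requested_mb=$(mem_to_mb 64g)
echo "requested: ${requested_mb} MB"

# Only query the scheduler when the Slurm client tools are available.
if command -v sinfo >/dev/null 2>&1; then
    # %N = node name, %m = memory per node in MB as configured in Slurm
    sinfo -h -N -p compute -o "%N %m" | while read -r node mem_mb; do
        if [ "$mem_mb" -lt "$requested_mb" ]; then
            echo "$node: ${mem_mb} MB configured, below the request"
        fi
    done
fi
```

If every node reports less memory than the request, sbatch has no node it can place the job on. Note that an m4.xlarge instance has 16 GiB of RAM, so it may be worth repeating the test with a smaller value such as --mem=8g to separate an oversized request from a node memory misconfiguration.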

Additional context:
This might be a duplicate of #1517. If so, is support for scheduling with memory options in Slurm planned?
