Description
After the shutdown on 1/28/2022, the OverSubscribe parameter reverted to EXCLUSIVE, so only one job can run on each node:
(base) [aronton@scopion ~]$ scontrol show partition scopion1
PartitionName=scopion1
AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
AllocNodes=ALL Default=NO QoS=N/A
DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=0 LLN=NO MaxCPUsPerNode=UNLIMITED
Nodes=scopion1[01-09]
PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=EXCLUSIVE
OverTimeLimit=NONE PreemptMode=OFF
State=UP TotalCPUs=432 TotalNodes=9 SelectTypeParameters=NONE
JobDefaults=(null)
DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED
Probable solution
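On 11/19/2021 we changed the OverSubscribe parameter to "NO", but we did not change it in /etc/slurm/slurm.conf, so the change was probably lost when Slurm restarted after the shutdown.
I think we have to change the setting in /etc/slurm/slurm.conf so that it persists across restarts.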
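A minimal sketch of the persistent fix, assuming the partition is defined on a single PartitionName=scopion1 line in /etc/slurm/slurm.conf. The other options on that line are not known here and should be left as they are. The helper name oversubscribe_of is illustrative only and is not part of Slurm; it just pulls the OverSubscribe value out of scontrol output so the result can be checked after the change.

```shell
# Print the OverSubscribe value found in `scontrol show partition` output
# read from stdin (hypothetical helper, not a Slurm command).
oversubscribe_of() {
    grep -o 'OverSubscribe=[^ ]*' | head -n 1 | cut -d= -f2
}

# Intended persistent setting in /etc/slurm/slurm.conf (assumption: keep
# whatever other options already sit on the partition line):
#   PartitionName=scopion1 Nodes=scopion1[01-09] OverSubscribe=NO
#
# After editing the file, on a host with Slurm installed:
#   scontrol reconfigure
#   scontrol show partition scopion1 | oversubscribe_of
```

If the last command still prints EXCLUSIVE, the partition line in slurm.conf was probably not the one that slurmctld read, or the daemon needs a restart rather than a reconfigure.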
Reference
https://slurm.schedmd.com/cons_res_share.html