SLURM errors: Socket timed out. What?

If the SLURM master (the process that listens for SLURM requests) is busy, you might receive the following error:

[bfreeman@rclogin12 ~]$ squeue -u bfreeman
squeue: error: slurm_receive_msg: Socket timed out on send/recv operation
slurm_load_jobs error: Socket timed out on send/recv operation

Since SLURM is scheduling 1 job every 2 seconds (let alone doing the calculations to schedule this job on 1 of approximately 1000 compute nodes), it's going to be a bit busy at times. Don't worry. Get up, stretch, pet your cat, grab a cup of coffee, and try again.
Last updated: November 19, 2014 at 17:29 pm

Posted in: c. Jobs and SLURM

CC BY-NC 4.0 This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. Permissions beyond the scope of this license may be available at Attribution.