SLURM errors: Socket timed out. What?

If the SLURM master (slurmctld, the process that listens for SLURM requests) is busy, commands such as squeue may fail with the following error:

[bfreeman@holylogin02 ~]$ squeue -u bfreeman
squeue: error: slurm_receive_msg: Socket timed out on send/recv operation
slurm_load_jobs error: Socket timed out on send/recv operation

SLURM schedules about 1 job every second, and for each job it has to work out a placement on 1 of approximately 100,000 compute nodes, so it is going to be a bit busy at times. The error only means that your request did not get a reply in time. Don't worry. Get up, stretch, pet your cat, grab a cup of coffee, and try again.
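
If you call squeue from a script rather than by hand, the "wait and try again" advice can be automated. The following is a minimal sketch in Python, not an official recipe: the function name run_with_retry, its parameters, and the default delays are all hypothetical choices made for illustration. It assumes the only failure worth retrying is the one whose message contains "Socket timed out", as in the output above, and it waits longer after each failure so that it does not add to the load on a busy scheduler.

```python
import subprocess
import time


def run_with_retry(cmd, attempts=5, delay=10.0, backoff=2.0, sleep=time.sleep):
    """Run cmd, retrying only when SLURM reports a socket timeout.

    Hypothetical helper: the name, parameters, and defaults are illustrative.
    Returns the subprocess.CompletedProcess of the last attempt.
    """
    result = None
    for attempt in range(1, attempts + 1):
        result = subprocess.run(cmd, capture_output=True, text=True)
        # Assumption: the timeout is recognised by this text on stderr.
        if "Socket timed out" not in result.stderr:
            # Success, or an unrelated error that retrying will not fix.
            return result
        if attempt < attempts:
            sleep(delay)       # give the scheduler time to catch up
            delay *= backoff   # wait longer before the next attempt
    return result


# Example use (the user name is taken from the session shown above):
#   result = run_with_retry(["squeue", "-u", "bfreeman"])
#   print(result.stdout if result.returncode == 0 else result.stderr)
```

Keeping the number of attempts small and the delays generous matters here: a tight retry loop would send even more requests to a scheduler that is already struggling to answer.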
Last updated: October 7, 2019 at 1:48 pm

CC BY-NC-SA 4.0 This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Permissions beyond the scope of this license may be available at Attribution.