Will single core/thread jobs run faster on the cluster?

In general, the cluster cores will not be any faster than the ones in your workstation; in fact, they may be slower if your workstation is relatively new. While we have a variety of chipsets available on the cluster, most of the cores are AMD and will be slower than many of the Intel chips that are most common in modern desktops and laptops. We use so many AMD chips because they let us purchase a larger number of cores and more RAM. That is the power of the cluster: it isn't designed to run single-core code as fast as possible, since the chips that do that are expensive. Instead, you trade raw chip speed for core count and gain speed and efficiency through parallelism. The cluster therefore excels at multicore jobs (using threads or MPI ranks) and at running many jobs that each take a single core (such as parameter sweeps or image processing). This way you leverage the parallel nature of the cluster and the 60,000 cores available.
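As a rough illustration of the "many single-core jobs" pattern, here is a minimal Python sketch of a parameter sweep in which each job picks its own parameters from its SLURM array task ID. The parameter grid (TEMPERATURES, SEEDS) and the function name params_for_task are hypothetical and only for illustration; SLURM_ARRAY_TASK_ID is the environment variable SLURM sets for each task of a job array.

```python
import itertools
import os

# Hypothetical parameter grid, purely for illustration.
TEMPERATURES = [0.5, 1.0, 1.5, 2.0]
SEEDS = [1, 2, 3]


def params_for_task(task_id, temperatures=TEMPERATURES, seeds=SEEDS):
    """Map an array task ID to one (temperature, seed) combination."""
    grid = list(itertools.product(temperatures, seeds))
    if not 0 <= task_id < len(grid):
        raise IndexError(f"task_id {task_id} is outside 0..{len(grid) - 1}")
    return grid[task_id]


def main():
    # SLURM sets SLURM_ARRAY_TASK_ID for each task of a job array.
    # Fall back to 0 so the script can also be tried on a workstation.
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    temperature, seed = params_for_task(task_id)
    print(f"task {task_id}: temperature={temperature}, seed={seed}")
    # ... the actual single-core work for this combination would go here ...


if __name__ == "__main__":
    main()
```

With this grid there are 12 combinations, so the script would typically be launched from a batch script submitted as a job array with indices 0-11 (for example, sbatch --array=0-11). Each task is an independent single-core job, which is the kind of workload where core count matters more than per-core speed.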

So if you have just one single-core job, the cluster isn't really a gain. If you have lots of jobs you need to get done, or your job is too large to fit on a single machine (because of its RAM requirements or its parallel nature), the cluster is the place to go. The cluster can also be useful for offloading work from your workstation: you can keep your workstation cores free for other tasks and send the longer-running work to the cluster.
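The trade-off can be made concrete with some back-of-the-envelope arithmetic. The sketch below uses assumed numbers only: an 8-core workstation, 500 cluster cores available to you at once, and cluster cores that take twice as long per job. It also ignores queue wait time and scheduling overhead, so treat it as an illustration rather than a prediction.

```python
import math


def batch_wall_time(n_jobs, cores, hours_per_job):
    """Wall-clock hours to finish n_jobs single-core jobs when `cores` run at once."""
    waves = math.ceil(n_jobs / cores)  # number of rounds needed to get through all jobs
    return waves * hours_per_job


# Assumed figures for illustration only.
WORKSTATION_CORES = 8
CLUSTER_CORES = 500          # cores you manage to get concurrently (assumption)
WORKSTATION_HOURS = 1.0      # hours per job on a workstation core
CLUSTER_HOURS = 2.0          # same job on a cluster core assumed to be 2x slower

for n_jobs in (1, 1000):
    local = batch_wall_time(n_jobs, WORKSTATION_CORES, WORKSTATION_HOURS)
    remote = batch_wall_time(n_jobs, CLUSTER_CORES, CLUSTER_HOURS)
    print(f"{n_jobs} job(s): workstation {local} h, cluster {remote} h")
```

Under these assumptions a single job takes 1 hour locally and 2 hours on the cluster, while 1000 jobs take 125 hours locally and 4 hours on the cluster.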

In addition, since the cluster cores are a different architecture from your workstation's, your code will need to be optimized differently for each. This is where compiler choice and compiler flags come in handy: they let you get the most out of both sets of cores. Even then, you may not get the same performance on the cluster as on your local machine. The main processor we have on the cluster is now 4 years old, and if you are using serial_requeue you could end up on anything from hardware bought today to hardware purchased 7 years ago. The natural development of processor technology alone accounts for about a factor of 2-4 in performance.
