Cannon 2.0

January 2024

Summary

FASRC is adding 216 Intel Sapphire Rapids nodes with 1TB of RAM each, 4 Intel Sapphire Rapids nodes with 2TB of RAM each, and 144 A100 80GB GPUs to the Cannon cluster. The Sapphire Rapids cores will be made available in the new ‘sapphire’ partition. The new A100 GPUs will be added to the ‘gpu’ partition. Partitions will be reorganized to account for the larger memory of the new nodes. FASSE will gain additional ‘fasse_bigmem’ and ‘fasse_gpu’ capacity. The base gratis fairshare will be increased to 200 on the Cannon cluster.

These updates will go live on January 22nd, 2024.

Overview

Cannon 2.0 represents the expansion of the liquid-cooled Cannon cluster, providing access to Intel's latest Sapphire Rapids processors and Nvidia's A100 GPUs to the Harvard research community. This update aims to reduce wait times and provide additional resources to the community. On the Cannon cluster, we observed that limited memory on the nodes was contributing to extended wait times for workflows requiring more memory. The additional memory in these new compute nodes will help bridge this gap.

 

Cannon 2.0 consists of: 

  • CPUs: 216 Lenovo SD650 V3 direct-water-cooled servers, each with 112 cores and 1TB of RAM, for a total of 24,192 cores of Intel 8480+ “Sapphire Rapids” processors. The interconnect is NDR 400 Gbps InfiniBand (IB) connected to a 400 Gbps IB core.
  • CPUs: 4 Lenovo SD650 V3 direct-water-cooled servers, each with 112 cores and 2TB of RAM, for a total of 448 cores of Intel 8480+ “Sapphire Rapids” processors. The interconnect is NDR 400 Gbps InfiniBand (IB) connected to a 400 Gbps IB core.
  • GPUs: 36 Lenovo SR670 V2 direct-water-cooled servers, each with four Nvidia A100 80GB GPUs, 64 CPU cores, and 1TB of RAM, for a total of 144 new GPUs.

 

As part of our standard process for new installs, we follow a phased (formerly called tiered) testing plan:

  • Phase 1: Lenovo burn-in, HPL benchmarking, Top500/Green500 runs (done)
  • Phase 2: Internal testing (done)
  • Phase 3: Harvard Community Grand Challenge runs (in progress)
  • Phase 4: Production (January 22nd, 2024)

 

Partitions

With the advent of Cannon 2.0, we have reconsidered the organization of the partitions. All the Sapphire Rapids nodes have 1TB of memory, which exceeds the capacity of our current ‘bigmem’ partition, rendering the name of that partition somewhat misleading. Additionally, the new A100s are the 80GB variety on the faster NDR fabric, which means we cannot simply merge them into the existing ‘gpu’ partition. Finally, we need to consider the future needs of FASSE.

Thus the updated partitions are as follows:

 

| Partition | Nodes | Cores per Node | CPU/GPU Types | Total Cores (or GPUs) | Usable Mem per Node (GB) | Time Limit |
|---|---|---|---|---|---|---|
| sapphire | 196 | 112 | Intel Sapphire Rapids | 21,952 | 1000 | 3 days |
| shared | 277 | 48 | Intel Cascade Lake | 13,296 | 184 | 3 days |
| hsph | 36 | 112 | Intel Sapphire Rapids | 4,032 | 1000 | 3 days |
| test | 12 | 112 | Intel Sapphire Rapids | 1,344 | 1000 | 12 hours |
| intermediate | 12 | 112 | Intel Sapphire Rapids | 1,344 | 1000 | 14 days |
| bigmem | 4 | 112 | Intel Sapphire Rapids | 448 | 2000 | 3 days |
| bigmem_intermediate | 3 | 64 | Intel Ice Lake | 192 | 2000 | 14 days |
| gpu | 36 | 64 | Intel Ice Lake, A100 80GB | 144 GPUs | 1000 | 3 days |
| gpu_test | 14 | 64 | Intel Ice Lake, A100 40GB (MIG 3g.20gb) | 112 MIG GPUs | 500 | 12 hours |
| remoteviz | 1 | 32 | Intel Cascade Lake | 32 | 380 | 3 days |
| unrestricted | 8 | 48 | Intel Cascade Lake | 384 | 184 | none |
| fasse | 42 | 48 | Intel Cascade Lake | 2,016 | 184 | 7 days |
| fasse_bigmem | 16 | 64 | Intel Ice Lake | 1,024 | 500 | 7 days |
| fasse_gpu | 4 | 64 | Intel Ice Lake, A100 40GB | 16 GPUs | 500 | 7 days |

 

The reshuffle of the Cannon 2.0 system aims to better serve the community: existing ‘bigmem’ jobs are absorbed into the new, higher-capacity ‘sapphire’ partition, and ‘bigmem_intermediate’ jobs into the ‘intermediate’ partition. Running ‘gpu_test’ in MIG mode doubles the effective number of GPUs, allowing for more simultaneous users.
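Once the new partitions are live, you can verify their layout yourself with Slurm’s sinfo utility. A minimal sketch (partition names taken from the table above; output columns will vary with site configuration):

```bash
# List the reorganized partitions with their default time limits and node counts
sinfo -p sapphire,intermediate,bigmem,gpu,gpu_test

# Show per-node core counts, memory, and GRES (GPUs) for one partition
sinfo -p sapphire -N -o "%N %c %m %G"
```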

All the Cannon changes will go live on Jan 22nd 2024. None of these changes require a downtime for the cluster nor interruption of user workflow or jobs. Existing jobs will finish on the older nodes. FASSE changes will occur throughout the week as nodes are moved over to the secure environment.

To take advantage of the new partitions, you will need to update your job scripts and adjust job parameters according to your needs. See the FAQ below for additional recommendations. Please feel free to join our office hours (https://www.rc.fas.harvard.edu/training/office-hours/) or contact us (https://www.rc.fas.harvard.edu/about/contact/) if you have any questions. To cite use of this resource, please see: https://www.rc.fas.harvard.edu/cluster/publications/
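As a hedged illustration of the kind of job-script update described above, the sketch below targets the new ‘sapphire’ partition; the job name, program, and resource numbers are placeholders to adjust to your own workload.

```bash
#!/bin/bash
#SBATCH -J my_analysis        # placeholder job name
#SBATCH -p sapphire           # new Sapphire Rapids partition
#SBATCH -c 16                 # cores per task; sapphire nodes have 112 cores each
#SBATCH --mem=64G             # memory request; sapphire nodes expose ~1000 GB usable
#SBATCH -t 2-00:00            # walltime; the sapphire limit is 3 days

# Replace with your actual workload
srun ./my_program
```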

Fairshare

This new hardware will increase the computational power of our public Cannon partitions by roughly 70% over Cannon 1.0. As a result, we have recalculated the base gratis fairshare for all groups. The base gratis fairshare on Cannon will be changed from 120 to 200. This new base gratis fairshare score will be applied to all groups on Cannon when the new hardware goes live.

For FASSE, the new hardware does not significantly impact the computational power of that cluster. As such, the base gratis fairshare for FASSE will remain the same at 100.
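If you want to see how the new base score affects your group once it is applied, Slurm’s sshare utility reports fairshare by account; ‘jharvard_lab’ below is a placeholder account name.

```bash
# Fairshare for a specific Slurm account (replace 'jharvard_lab' with your lab account)
sshare --accounts=jharvard_lab --all

# With no options, sshare prints the associations for your own user
sshare
```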

 

Cannon FAQ

Q. I use bigmem/bigmem_intermediate, do I need to update my scripts?

Yes, you should move to the updated ‘sapphire’ and ‘intermediate’ partitions, which now have 1TB of RAM per node.
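For example, a script that previously targeted ‘bigmem’ typically only needs its partition directive changed; the memory figure below is a placeholder.

```bash
# Before
#SBATCH -p bigmem

# After: the 1TB Sapphire Rapids nodes
#SBATCH -p sapphire
#SBATCH --mem=500G   # placeholder; request what your job actually needs
```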

Q. I use ultramem, do I need to update my scripts?

Yes, you should move to ‘bigmem’ or ‘bigmem_intermediate’. The updated partitions now have 2TB of memory per node, the same as ‘ultramem’ offered.

Q. I use gpu_mig, do I need to update my scripts?

Yes, you should move to ‘gpu_test’ which is set up in MIG mode and allows for experimentation with that feature.

Q. I use gpu_test, do I need to update my scripts?

Maybe. ‘gpu_test’ will be moving from V100s to A100s in MIG mode. Depending on your script, you may need to adjust for the new GPU type.
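A hedged sketch of a ‘gpu_test’ request under the new MIG setup follows; the exact GRES string for a MIG slice depends on the site configuration, so check sinfo -p gpu_test -o "%G" if a plain gpu:1 request is rejected.

```bash
#SBATCH -p gpu_test
#SBATCH --gres=gpu:1     # one MIG slice (3g.20gb); the exact GRES name may differ
#SBATCH -t 0-02:00       # gpu_test limit is 12 hours
#SBATCH --mem=16G        # placeholder memory request
```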

Q. I use test/intermediate/gpu, do I need to update my scripts?

No. Changes to these partitions do not necessitate updating your script.

Q. I use shared, do I need to update my scripts?

Maybe. ‘shared’ will remain as is, but if you want to leverage the new Sapphire Rapids nodes, you should consider updating your script to point to the ‘sapphire’ partition, or adding ‘sapphire’ as an additional partition (i.e., -p shared,sapphire). It is worth testing to see which partition gives you better performance.
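A minimal sketch of that multi-partition request, letting the scheduler start the job on whichever partition has free resources first:

```bash
#SBATCH -p shared,sapphire   # Slurm uses whichever partition can start the job sooner
#SBATCH -c 16                # placeholder core count
#SBATCH --mem=32G            # placeholder memory request
```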

Q. I am part of Kempner, does the gratis base fairshare impact me?

No. The base gratis fairshare only impacts your Cannon Slurm account, not the Kempner Slurm accounts.

 

FASSE FAQ

Q. I use fasse_bigmem, do I need to update my scripts?

No. Changes to this partition do not necessitate updating your script.

Q. I use fasse_gpu, do I need to update my scripts?

Maybe. ‘fasse_gpu’ will be moving from V100s to A100s. Depending on your script, you may need to adjust for the new GPU type.

 

Resources