Each year our primary data center, MGHPCC (Holyoke), performs a full power shutdown for electrical maintenance. This requires us to power down all FASRC systems at MGHPCC starting the evening before. It also gives us a window for maintenance, including the CentOS 7 cluster upgrade, that would otherwise require us to shut off various resources during normal operations. Note that this power event means all running jobs will be terminated, as power to the entire facility will be out.
- Monday 5/21/18 at 6PM: All running jobs will be terminated and we will begin powering down all devices at MGHPCC/Holyoke.
- Tuesday 5/22/18: Power will be out the entire day as MGHPCC performs their work. Harvard network updates will take place from 9PM to 1AM.
- Wednesday 5/23/18: We expect to be back to normal operations by approximately 8PM.
WHAT IS AFFECTED
- Resources in Holyoke will be affected for the duration of the event. This includes the compute cluster, scheduler, regal, and storage housed at MGHPCC.
- Resources in Boston and Cambridge, including storage, are likely to be affected by network updates which will take place Tuesday night 5/22/18 between 9PM and 1AM. Please plan accordingly.
A list of affected storage will be posted here as soon as it is complete.
CENTOS 7 UPGRADE (O3)
During this event, once basic power is available to us, we will also be upgrading all of our partitions to CentOS 7 as part of the "O3" (Odyssey 3) update. Please note this will affect all users of the cluster once complete on 5/23/18. Labs with their own partitions are being contacted individually to facilitate the upgrade of their partitions and to move early where possible to lessen the impact. NCF and ATLAS are not a part of this upgrade.
A CentOS 7 testing queue (test7) is available now for you to test your code prior to this upgrade. A CentOS 7 login node (login7.rc.fas.harvard.edu) is available as well. More details on O3/CentOS 7 and test resources can be found on our O3 page.
We encourage all cluster users to begin testing in advance of this upgrade.
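As a starting point for testing, a minimal smoke test might look like the sketch below. It assumes the cluster scheduler is Slurm and that test7 is exposed as a Slurm partition. The file name, job name, time limit, and memory request are hypothetical placeholders, not recommended values. Consult the O3 page for the actual limits on test7.

```shell
#!/bin/bash
# Sketch only: writes a small batch script for the test7 queue and submits it
# if Slurm is available. All names and resource values below are placeholders.

cat > test7_smoke.sh <<'EOF'
#!/bin/bash
# CentOS 7 testing queue named in this notice
#SBATCH --partition=test7
# Hypothetical job name
#SBATCH --job-name=o3-smoke
# Placeholder wall-time limit and memory request
#SBATCH --time=00:10:00
#SBATCH --mem=1G
# %j expands to the job ID
#SBATCH --output=o3-smoke-%j.out

# Record the OS release so you can confirm the job ran on CentOS 7.
cat /etc/redhat-release

# Replace the line below with the command that runs your own code.
echo "replace this line with your application command"
EOF

# Submit only when sbatch is on the PATH (for example, on a login node).
if command -v sbatch >/dev/null 2>&1; then
    sbatch test7_smoke.sh
else
    echo "sbatch not found; wrote test7_smoke.sh without submitting"
fi
```

Run this from the CentOS 7 login node (login7.rc.fas.harvard.edu) so that your environment matches the test partition, then check the output file for errors from your own code.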
Reminder: All jobs still running on Monday evening will be terminated. Because power will be out at MGHPCC and the cluster will be upgraded to CentOS 7, jobs cannot be paused; they must be stopped before we begin power-down.
We will notify the community via our email lists when we are back to normal operations. You can also check back here or on our Status Page.