
Monthly maintenance Oct 3, 2022 7am-11am

Regular monthly maintenance will take place Monday, Oct. 3, 2022, from 7am to 11am.

NOTICES

  • GPU_TEST and REMOTEVIZ PARTITIONS
    Due to failed nodes, the gpu_test partition is down to 2 nodes at the moment. We are working with the vendor to revive these nodes, but there is no ETA at this time. We are also investigating the root cause of the multiple failures.

    • We had already planned to modify the QoS (job limits) on the gpu_test partition due to misuse of the partition, but the failures forced us to implement the change yesterday.
      Going forward, gpu_test is limited to 1 job per user. That job is limited to a maximum of 16 cores and 90GB of memory.
    • Please note: gpu_test should not be used to avoid the GPU queues and scheduling. See Running Jobs (https://docs.rc.fas.harvard.edu/kb/running-jobs/#Slurm_partitions) for a list of available partitions, including the gpu partition(s).
    • The remoteviz node/partition is also down for the same reason. The remoteviz partition has only one node; its QoS remains the same.
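
The gpu_test limits above can be checked before submitting a job. The following is only a minimal sketch in Python: the function name and its parameters are hypothetical and are not part of Slurm or any FASRC tooling, and it assumes that any job you already have on the partition (running or pending) counts toward the 1-job limit, which this notice does not specify.

```python
# Limits stated in this notice for the gpu_test partition.
GPU_TEST_MAX_JOBS_PER_USER = 1
GPU_TEST_MAX_CORES = 16
GPU_TEST_MAX_MEM_GB = 90


def gpu_test_violations(cores, mem_gb, jobs_already_queued=0):
    """Return a list of reasons a request would exceed the gpu_test limits.

    Hypothetical helper for illustration only. An empty list means the
    request fits within the limits listed in this notice.
    """
    problems = []
    # Assumption: any existing job on the partition uses up the 1-job allowance.
    if jobs_already_queued >= GPU_TEST_MAX_JOBS_PER_USER:
        problems.append("only 1 job per user is allowed on gpu_test")
    if cores > GPU_TEST_MAX_CORES:
        problems.append(
            f"{cores} cores requested; the maximum is {GPU_TEST_MAX_CORES}"
        )
    if mem_gb > GPU_TEST_MAX_MEM_GB:
        problems.append(
            f"{mem_gb}GB memory requested; the maximum is {GPU_TEST_MAX_MEM_GB}GB"
        )
    return problems
```

A request for 16 cores and 90GB with no other gpu_test job fits the limits; anything larger belongs on one of the regular gpu partitions.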

Updates on our status page:  https://status.rc.fas.harvard.edu/cl8a94kcf17664hvoj8oksxanx

NETWORK MAINTENANCE 10/4

  • Please be aware that network maintenance on the fibre links at MGHPCC/Holyoke will take place the evening of Tuesday, Oct. 4, 2022, from 7 to 9PM.
    Any interruption should be very brief. However, as always, knock-on issues are possible.
    We will post any updates here, and FASRC staff will be monitoring for any related issues.
    See status at: https://status.rc.fas.harvard.edu/cl8lrzi9g806767hn1ugbv6rvv

GENERAL MAINTENANCE 10/3

  • Login node and VDI node reboots
    • Audience: Anyone logged into a login node or VDI/OOD node
    • Impact: Login and VDI/OOD nodes will be unavailable while updating and rebooting
  • Scratch cleanup ( https://docs.rc.fas.harvard.edu/kb/policy-scratch/ )
    • Audience: Cluster users
    • Impact: Files older than 90 days will be removed.
    • Reminder: Scratch 90-day file retention purging runs occur regularly, not just during maintenance periods.
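
To see which of your files might be affected by the 90-day retention, a short script can list files past that age. The following is a minimal sketch, not the actual purge tool: it assumes file age is judged by modification time (this notice does not say which timestamp the purge uses), and the function name is hypothetical.

```python
import os
import time

RETENTION_DAYS = 90  # retention period stated in this notice


def files_older_than(root, days=RETENTION_DAYS, now=None):
    """Yield paths under root last modified more than `days` days ago.

    Assumption: age is measured by modification time (st_mtime); the real
    scratch purge may use a different criterion.
    """
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # 86400 seconds per day
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so that symlinks are examined themselves, not followed
                if os.lstat(path).st_mtime < cutoff:
                    yield path
            except OSError:
                continue  # file vanished or is unreadable; skip it
```

Calling `files_older_than()` on your own scratch directory and printing each result gives a list of files to copy elsewhere if they are still needed.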

Updates on our status page: https://status.rc.fas.harvard.edu

Thanks!
FAS Research Computing
https://www.rc.fas.harvard.edu/
https://status.rc.fas.harvard.edu

Date: Oct 03 2022
Time: 7:00 am - 11:00 am