Monthly maintenance Oct 3, 2022 7am-11am
Regular monthly maintenance will take place on Monday, Oct. 3rd, 2022, from 7-11am.
NOTICES
- GPU_TEST and REMOTEVIZ PARTITIONS
Due to failed nodes, the gpu_test partition is down to 2 nodes at the moment. We are working with the vendor to revive these nodes, but there is no ETA at this time. We are also investigating the root cause of the multiple failures.
- We had already planned to modify the QoS (job limitations) on the gpu_test partition due to misuse of the partition, but the failures forced us to implement this yesterday.
Going forward, gpu_test is limited to 1 job per user. That job is limited to a maximum of 16 cores and 90GB of memory.
- Please note: gpu_test should not be used to avoid the GPU queues and scheduling. See Running Jobs (https://docs.rc.fas.harvard.edu/kb/running-jobs/#Slurm_partitions) for a list of available partitions, including the gpu partition(s).
- The remoteviz node/partition is also down for the same reason. The remoteviz partition consists of only one node; its QoS remains the same.
Updates on our status page: https://status.rc.fas.harvard.edu/cl8a94kcf17664hvoj8oksxanx
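As a rough illustration of the gpu_test limits described above (1 job per user, at most 16 cores and 90GB of memory), the following Python sketch checks a planned request against those numbers before it is submitted. The function and constant names are hypothetical and exist only for this example; they are not part of Slurm or of any FASRC tooling, and the scheduler's QoS remains the actual authority.

```python
# Limits quoted in this notice for the gpu_test partition.
GPU_TEST_MAX_JOBS_PER_USER = 1
GPU_TEST_MAX_CORES = 16
GPU_TEST_MAX_MEM_GB = 90


def gpu_test_problems(cores, mem_gb, jobs_already_queued=0):
    """Return a list of reasons a request exceeds the gpu_test limits.

    An empty list means the request fits within the limits in this notice.
    Hypothetical helper for illustration only.
    """
    problems = []
    if jobs_already_queued >= GPU_TEST_MAX_JOBS_PER_USER:
        problems.append("only 1 gpu_test job per user is allowed")
    if cores > GPU_TEST_MAX_CORES:
        problems.append(f"{cores} cores exceeds the {GPU_TEST_MAX_CORES}-core limit")
    if mem_gb > GPU_TEST_MAX_MEM_GB:
        problems.append(f"{mem_gb}GB exceeds the {GPU_TEST_MAX_MEM_GB}GB memory limit")
    return problems
```

For example, gpu_test_problems(16, 90) returns an empty list, while a request for 32 cores and 128GB would be reported as over both limits and may be better suited to one of the regular gpu partitions.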
- GLOBUS PERSONAL CLIENT - UPDATE BY DEC 17
If you are using the Globus Connect Personal client on your machine, please ensure you have updated and are running version 3.2 or greater by December 17th, 2022. You will not be able to use version 3.1 or below after that date.
https://docs.globus.org/ca-update-2022/#globus_connect_personal
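If you manage several machines, a small version comparison may help when checking clients against the 3.2 minimum. The Python sketch below assumes you have already read the version string from your client yourself and that it is a plain dotted number such as 3.1.6; the helper names are hypothetical and nothing here queries Globus.

```python
# Minimum Globus Connect Personal version stated in this notice.
MINIMUM_VERSION = (3, 2)


def parse_version(text):
    """Turn a dotted version string such as '3.2.0' into a tuple of ints.

    Assumes the string contains only digits and dots.
    """
    return tuple(int(part) for part in text.strip().split("."))


def is_supported(version_text):
    """Return True if the version is 3.2 or greater."""
    return parse_version(version_text) >= MINIMUM_VERSION
```

Comparing tuples of integers rather than raw strings avoids treating a hypothetical 3.10 as older than 3.2.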
- TRAINING
New training sessions, including monthly new user training, are available.
You can find a list and links to sign up here: https://www.rc.fas.harvard.edu/upcoming-training/
NETWORK MAINTENANCE 10/4
- Please be aware that network maintenance on the fibre links at MGHPCC/Holyoke will take place on the evening of Tuesday, Oct 4th, 2022, from 7-9PM.
Any interruption should be very brief. However, as always, there is a possibility of knock-on issues.
We will post any updates here and FASRC staff will be monitoring for any related issues.
See status at: https://status.rc.fas.harvard.edu/cl8lrzi9g806767hn1ugbv6rvv
GENERAL MAINTENANCE 10/3
- Login node and VDI node reboots
- Audience: Anyone logged into a login node or VDI/OOD node
- Impact: Login and VDI/OOD nodes will be unavailable while updating and rebooting
- Scratch cleanup ( https://docs.rc.fas.harvard.edu/kb/policy-scratch/ )
- Audience: Cluster users
- Impact: Files older than 90 days will be removed.
- Reminder: Scratch 90-day file retention purges run regularly, not just during maintenance periods.
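To see which of your files may fall under the 90-day retention rule, something like the Python sketch below can list candidates ahead of a purge. It assumes that file age is judged by modification time, which may not match how the actual purge decides; the scratch policy page linked above is the authority. The function name is hypothetical.

```python
import os
import time

RETENTION_DAYS = 90  # retention period stated in this notice


def files_older_than(root, days=RETENTION_DAYS, now=None):
    """Yield paths under root whose modification time is more than `days` days old.

    Assumption: age is measured by mtime; the real purge may use other criteria.
    """
    now = time.time() if now is None else now
    cutoff = now - days * 24 * 60 * 60
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if mtime < cutoff:
                yield path
```

Calling files_older_than on your own scratch directory and printing each result gives a list of files to copy elsewhere if they are still needed. The sketch only reads file metadata and never deletes anything.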
Updates on our status page: https://status.rc.fas.harvard.edu
Thanks!
FAS Research Computing
https://www.rc.fas.harvard.edu/
https://status.rc.fas.harvard.edu
