WHEN IS IT HAPPENING?
This is a reminder that PACE’s next Maintenance Period starts at 6:00 AM on Tuesday, 10/24/2023, and is tentatively scheduled to conclude by 11:59 PM on Thursday, 10/26/2023. PACE will release each cluster (Phoenix, Hive, Firebird, ICE, and Buzzard) as soon as its maintenance work is complete.
WHAT DO YOU NEED TO DO?
As usual, the scheduler will hold any job whose resource request would keep it running during the Maintenance Period, and will release it after maintenance is complete. During the Maintenance Period, all PACE-managed computational and storage resources will be unavailable. This includes Phoenix, Hive, Firebird, ICE, and Buzzard. Please plan accordingly for the projected downtime.
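To estimate whether a job is likely to be held, compare its requested walltime against the start of the Maintenance Period. The following is a minimal sketch, not the scheduler's actual logic: the function name would_be_held is hypothetical, and it assumes the job starts the moment it is submitted, whereas a real scheduler also accounts for queue wait time.

```python
from datetime import datetime, timedelta

# Start of the Maintenance Period as stated in this announcement (local time).
MAINTENANCE_START = datetime(2023, 10, 24, 6, 0)


def would_be_held(start_time: datetime, walltime: timedelta) -> bool:
    """Return True if a job starting at start_time with the requested
    walltime could still be running when maintenance begins.

    Assumption (illustrative only): the job starts immediately at
    start_time; queue wait time is ignored.
    """
    return start_time + walltime > MAINTENANCE_START


# A 48-hour request made 24 hours before maintenance overlaps the window.
print(would_be_held(datetime(2023, 10, 23, 6, 0), timedelta(hours=48)))
# A 12-hour request made at the same time finishes before maintenance.
print(would_be_held(datetime(2023, 10, 23, 6, 0), timedelta(hours=12)))
```

If the estimate shows an overlap, requesting a shorter walltime (where the work allows it) may let the job run before the Maintenance Period begins.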
WHAT IS HAPPENING?
ITEMS REQUIRING USER ACTION:
• [Firebird] Migrate from the Moab/Torque scheduler to the Slurm scheduler. If you are a Firebird user, we will contact you and help you rewrite your batch scripts and adjust your workflow for Slurm.
ITEMS NOT REQUIRING USER ACTION:
• [Network] Upgrade network switches
• [Network][Hive] Configure redundancy on Hive racks
• [Network] Upgrade firmware on InfiniBand network switches
• [Storage][Phoenix] Reconfigure old scratch storage
• [Storage][Phoenix] Upgrade Lustre controller and disk firmware, apply patches
• [Datacenter] Clean datacenter cooling tower
WHY IS IT HAPPENING?
Regular maintenance periods are necessary to reduce unplanned downtime and maintain a secure and stable system.
WHO IS AFFECTED?
All users across all PACE clusters.
WHO SHOULD YOU CONTACT FOR QUESTIONS?
Please contact PACE at pace-support@oit.gatech.edu with questions or concerns.