Dear PACE Users,
All PACE clusters, including Phoenix, Hive, Firebird, PACE-ICE, COC-ICE, and Buzzard, are ready for research. As usual, we have released all user jobs that were held by the scheduler.
Due to complications with the RHEL7.9 upgrade, 36% of Phoenix compute nodes remain under maintenance. We will work to return the cluster to full strength in the coming days. All node classes and queues have nodes available, and all storage is accessible.
Researchers who did not complete workflow testing on our Testflight environments on Phoenix and Hive, and Firebird users for whom a testing environment was not available, could experience errors related to the upgrade (see blog post). Please submit a support ticket to pace-support@oit.gatech.edu for assistance if you encounter any issues.
Our next maintenance period is tentatively scheduled to begin at 6:00 AM on Wednesday, May 11, 2022, and conclude by 11:59 PM on Friday, May 13, 2022. Additional maintenance periods are tentatively scheduled for August 10-12 and November 2-4.
The following tasks were part of this maintenance period:
ITEMS REQUIRING USER ACTION:
- [Complete on most nodes][System] Upgrade Phoenix, Hive, and Firebird clusters’ operating system to RHEL7.9
ITEMS NOT REQUIRING USER ACTION:
- [Deferred][Datacenter] Databank will repair/replace the DCR, requiring that all PACE compute nodes be powered off.
- [Complete][Storage/Hive] Upgrade GPFS controller firmware
- [Complete][Storage/Phoenix] Reintegrate storage previously borrowed for scratch into project storage
- [Complete][Storage/Phoenix] Replace redundant storage controller and cables
- [Complete][System] System configuration management updates
- [Complete][Network] Upgrade IB switch firmware
If you have any questions or concerns, please contact us at pace-support@oit.gatech.edu.
Best,
The PACE Team