[Update – 08/13/2021 – 10:00AM]
Dear PACE Users,
Our scheduled maintenance has completed ahead of schedule! All Coda datacenter clusters are ready for research. As usual, we have released all user jobs that were held by the scheduler. We appreciate everyone’s patience as we worked through these maintenance activities.
Our next maintenance period is tentatively scheduled to begin at 6:00AM on Wednesday, 11/03/2021, and to conclude by 11:59PM on Friday, 11/05/2021.
Here is an update on the tasks performed during this maintenance period.
ITEMS REQUIRING USER ACTION:
- None
ITEMS NOT REQUIRING USER ACTION:
- [Complete] [Datacenter] Databank replaced components of one of the transformers feeding the room, which required a complete power-off of the research hall that includes the PACE-managed clusters.
- [Complete] [Storage] Upgrade controller for the storage appliances: SFA200NV, SFA18KE
- [Complete] [Storage] Replace a miniSAS cable on Hive storage appliance: SFA14KXE
- [Complete] [Storage] Replace a failed hard drive on a pre-production OSG cluster
- [Complete] [System/Security] Operating system patch installs
- [Complete] [System/Security] Endpoint Protection Updates
- [Complete] [Benchmarks] Conduct IO500 and HPCG benchmarks for Hive and Phoenix clusters
- [Complete] [System] Update NVIDIA drivers and add NVIDIA-specific libraries
- [Complete] [System] Reboot scheduler nodes
If you have any questions or concerns, please do not hesitate to contact us at pace-support@oit.gatech.edu.
Best,
The PACE Team
[Original Message – 07/13/2021 – 4:15PM, updated on 08/04/2021 with the list of tasks]
Dear PACE Users,
This is another friendly reminder that our next Maintenance Period is scheduled to begin at 6:00AM on Wednesday, 08/11/2021, and is tentatively scheduled to conclude by 11:59PM on Friday, 08/13/2021. Please note that, as usual, jobs with resource requests that would be running during the Maintenance Period will be held by the scheduler until after the Maintenance Period. During the Maintenance Period, all PACE-managed computational and storage resources will be unavailable.
Please see the list of activities to be completed:
ITEMS REQUIRING USER ACTION:
- None
ITEMS NOT REQUIRING USER ACTION:
- [Datacenter] Databank will need to replace components of one of the transformers feeding the room. This work will require a complete power-off of the research hall that includes the PACE-managed clusters.
- [Storage] Upgrade controller for the storage appliances: SFA200NV, SFA18KE
- [Storage] Replace a miniSAS cable on Hive storage appliance: SFA14KXE
- [Storage] Replace a failed hard drive on a pre-production OSG cluster
- [System/Security] Operating system patch installs
- [System/Security] Endpoint Protection Updates
- [Benchmarks] Conduct IO500 and HPCG benchmarks for Hive and Phoenix clusters
- [System] Update NVIDIA drivers and add NVIDIA-specific libraries
- [System] Reboot scheduler nodes
If you have any questions or concerns, please do not hesitate to contact us at pace-support@oit.gatech.edu.
Best,
The PACE Team