[Update 2022/12/08, 5:52PM EST]
Work has been completed on the cable replacement on the redundant storage controller, and the associated systems connecting to the storage have been restored to normal. We were able to replace two cables on the controller without interruption to service.
[Update 2022/12/05, 9:00AM EST]
Summary: Phoenix project & scratch storage cable replacement on the redundant controller, with a possible outage and temporarily decreased performance afterward
Details: A cable connecting enclosures of the Phoenix Lustre device, hosting project and scratch storage, to the redundant controller needs to be replaced, beginning around 10AM Wednesday, December 8th, 2022. The cable replacement is expected to take about 3-4 hours. After the replacement, pools will need to be rebuilt over the course of about a day.
Impact: Because we are replacing a cable on the redundant controller while the main controller stays in service, there should not be an outage during the cable replacement. However, a similar replacement has previously caused storage to become unavailable, so an outage is possible. If this happens, your job may fail or run without making progress. If you have such a job, please cancel it and resubmit it once storage availability is restored. In addition, storage performance may be slower than usual for about a day following the repair as pools rebuild, so jobs may progress more slowly than normal. If your job runs out of wall time and is cancelled by the scheduler, please resubmit it to run again. PACE will monitor Phoenix Lustre storage throughout this procedure. If a loss of availability occurs, we will update you.
Please accept our sincere apology for any inconvenience that this temporary limitation may cause you. If you have any questions or concerns, please direct them to pace-support@oit.gatech.edu.