Our July 2016 maintenance is now substantially complete. Again, we sincerely apologize for the unfortunate additional unplanned downtime.
As previously communicated, we experienced an unexpected delay caused by the data migrations from the old scratch system to the newly acquired system. Some of these transfers are still in progress for a limited number of users. We have temporarily disabled access for these users to prevent jobs from running on incomplete scratch data, and we are reaching out to them individually with more details. These users will not be able to log in, and their previously submitted jobs will not run, until their scratch migration is complete. If you have not received a further notification from us and experience problems with logins or anything else, please let us know as soon as possible by sending an email to pace-support@oit.gatech.edu.
Scratch performance may be reduced as these migrations complete, and we are doing everything we can to finish these migrations as soon as possible.
We have brought compute nodes online and released previously submitted jobs. As usual, a number of compute nodes still need to be brought back online, and we are actively working to make them available as soon as possible.
DDN/GPFS work
The new DDN SFA-7700 system is now operational and serving scratch storage for all users. We updated client software versions on all nodes. We have encountered an anomaly that reduces the system's internal redundancy but does not affect normal operation. We expect to be able to rectify this while in production.
Electrical work
Tasks completed as described.
Bonus objectives
Network and local storage upgrades were implemented on the schedulers as planned. Additional diskless nodes were converted to diskful as planned.