Our May 2018 maintenance (https://blog.pace.gatech.edu/?p=6158) is complete ahead of schedule. We have brought compute nodes online and released previously submitted jobs. Login nodes are accessible and your data are available. As usual, a small number of straggling nodes remain, which we will address over the coming days.
Our next maintenance period is scheduled for Thursday, Aug 9 through Saturday, Aug 11, 2018.
Schedulers
Job-specific temporary directories (may require user action): Complete as planned. Please see the maintenance day announcement (https://blog.pace.gatech.edu/?p=6158) for how this affects your jobs.
ICE (instructional cluster) scheduler migration to a different server (may require user action): Complete as planned. Users should not notice any differences.
Systems Maintenance
ASDL cluster (requires no user action): Complete as planned. The bad CMOS batteries have been replaced, and the fileserver has a replacement CPU. The memory problems were related to the bad CPU and were resolved without replacing any memory DIMMs.
Replace PDUs on Rich133 H37 Rack (requires no user action): Deferred at the request of the cluster owner.
LIGO cluster rack replacement (requires no user action): Complete as planned.
Storage
GPFS filesystem client updates on all PACE compute nodes and servers (requires no user action): Complete as planned and tested. Please report any missing storage mounts to pace-support.
Run routine system checks on GPFS filesystems (requires no user action): Complete as planned, with no problems found!
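If you would like a quick way to spot a missing storage mount before reporting it, the sketch below may help. It only checks whether the directories you expect are reachable, which is a simple stand-in for a real mount check. The function name find_missing_dirs and the example paths are hypothetical placeholders, not PACE-provided tools or actual PACE paths; substitute the directories you normally use.

```python
import os


def find_missing_dirs(paths):
    """Return the paths that are not reachable as directories."""
    return [p for p in paths if not os.path.isdir(p)]


if __name__ == "__main__":
    # Hypothetical placeholders: replace with the directories you normally use.
    expected = [os.path.expanduser("~"), "/path/to/your/project/storage"]
    missing = find_missing_dirs(expected)
    if missing:
        print("Not reachable (consider reporting to pace-support):")
        for path in missing:
            print("  " + path)
    else:
        print("All expected directories are reachable.")
```

A directory that exists but is empty or stale would still pass this check, so treat a clean result as a first look rather than a guarantee.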
Network
IB network card firmware upgrades (requires no user action): Complete as planned.
Enable 10GbE on physical headnodes (requires no user action): Complete as planned.
Several improvements to the networking infrastructure (requires no user action): Complete as planned.