Early today, the InfiniBand Fabric located in the Rich Datacenter (where most PACE resources are located) developed issues reaching the subnet managers. After on-site troubleshooting, the subnet manager was initialized. As of 11:30 AM local time, the InfiniBand Fabric is operational.
Some jobs that were running during the outage may have been affected, and new jobs using MPI may also have encountered issues.
Please check your jobs for potential issues. We deeply apologize for any inconvenience this may have caused.