PACE A Partnership for an Advanced Computing Environment

June 26, 2020

[Resolved] Issue with InfiniBand Fabric and subnet managers

Filed under: Uncategorized — Semir Sarajlic @ 5:33 pm

Early today, the InfiniBand Fabric located in the Rich Datacenter (where most PACE resources are located) developed issues reaching the subnet managers. After on-site troubleshooting, the subnet manager was initialized. As of 11:30 AM local time, the InfiniBand Fabric is operational.

Jobs that were running during the outage may have been affected, and new jobs using MPI may also have encountered issues.

Please check your jobs for potential issues. We deeply apologize for any inconvenience this may have caused.
