PACE A Partnership for an Advanced Computing Environment

November 10, 2011

Updated: Network troubles, redux (FIXED)

Filed under: tech support — admin @ 8:38 pm

We’ve got the switch back.  The outage looks to have caused our virtual machine farm to reboot, so connections to head nodes will have been dropped.

This also affected the network path between compute nodes and the file servers.  With a little luck, the NFS traffic should resume, but you may want to check on any running jobs to make sure.

Word from the network team is that they were following published instructions from the switch vendor to integrate the two switches when the failure occurred.  We’ll be looking into this pretty intensely, as these switches are seeing a lot of deployments in other OIT functions.
