PACE A Partnership for an Advanced Computing Environment

June 17, 2014

Physical host failure for VMs – potential job impact

Filed under: News,tech support — Semir Sarajlic @ 1:05 pm

This morning, between approximately 3am and 8am, one of the physical hosts that make up our VM farm failed. The failure took several head nodes offline, along with one of the PACE-run software license servers.

For ALL PACE-run clusters, please double-check any jobs that started this morning or were running during this window, since they may have lost contact with their license server.

The following head nodes went offline, but have returned:

The following license server went offline, but has returned:

In the case of the head nodes, no jobs should have been affected and no data should have been lost while the nodes were offline.
