PACE: A Partnership for an Advanced Computing Environment

December 30, 2017

[Resolved] All PACE nodes temporarily offline due to storage trouble

Filed under: Uncategorized — Semir Sarajlic @ 10:16 pm

Update (12/31/2017, 10:15am): We have addressed the issue, and the majority of nodes have started running jobs again. As far as we can tell, this was caused by a network-related “event” internal to the system. We are working with the vendor to identify the exact root cause.

Original post: One of the primary storage systems (pace2) went offline today, potentially impacting running jobs that reference that system.

Our automated scripts took PACE nodes offline to prevent new jobs from starting. The nodes will be brought back online once the storage issues are addressed.

The PACE team is currently investigating the problem and will keep you updated.

We apologize for any delays caused by limited staff availability during the holidays.
