PACE A Partnership for an Advanced Computing Environment

May 17, 2013

PC1 & PB1 filesystems back online

Filed under: News,tech support — Tags: — Semir Sarajlic @ 3:25 am

Hey folks,

It looks like we may have finally found the issue tying up the PB1 file server and causing the occasional lockup of the PC1 file server. We’ve isolated the compute nodes that seemed to be generating the bad traffic, and have also isolated the processes that appear to have compounded the problem on a pair of shared nodes (thus linking the two server failures). With any luck, we’ll get those nodes back online once their other jobs complete or are cancelled.

Thank you for your patience while we tracked this problem down. We know it was quite inconvenient, but we now have a decent picture of what occurred, and thankfully it is very unlikely to happen again.
