PACE A Partnership for an Advanced Computing Environment

August 7, 2013

PC1 bad news, good news

Filed under: tech support — Semir Sarajlic @ 5:09 pm

UPDATE: 2013-08-07, 13:34

BEST NEWS OF ALL: /pc1 is now online, and should not fall over under heavy loads anymore.

Have at it, folks. Sorry it has taken this long to get to the final
resolution of this problem.

Earlier Status:
Bad news:

If you haven’t been able to tell, the /pc1 filesystem has failed again.

Good news:

We’ve been working on a new load for the OS for all storage boxes
which we had hoped to get out on last maintenance day (July 17), but
ran out of time to verify that it

  • was deployable
  • resolved the actual issue

Memo (Mehmet Belgin) greatly assisted me in testing this issue by finding some of the cases we’ve known to cause failures and replicating them against our test installs. Many of the test loads broke things, confirming our suspicions and also validating our new image. It will take heavy loads a LOT better than before.

With verification done, we have been planning to have all Solaris-based storage switched to this by the end
of the next maintenance day (October 15).

However, due to need, this will be going on the PC1 fileserver in just a little bit. We’ve
verified the process of how to do this without impacting any data
stored on the server, so we anticipate having this fileserver back up
and running at 1:30pm, and the bugs which have been causing this
problem since April will have been removed.

I’ll follow up with progress messages.
