PACE A Partnership for an Advanced Computing Environment

April 9, 2014

PACE quarterly maintenance – April 15-16 2014

Filed under: tech support — admin @ 7:16 pm

PACE Quarterly maintenance has begun

See this space for updates.

PACE Quarterly maintenance notification

It’s time again for our quarterly maintenance.  We will have the clusters down April 15 & 16.

As usual, we’ve instructed the schedulers to avoid running jobs that would cross into a planned maintenance window.  This prevents running jobs from being killed, but it also means that jobs you submit now may not run until after maintenance completes.  I suggest checking the wall times of the jobs you plan to submit and, if possible, adjusting them so the jobs complete before maintenance begins.  Jobs with longer wall times can still be submitted, but the scheduler will hold them and release them after maintenance completes.
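As a rough illustration of the wall time check described above, here is a minimal Python sketch. It is not a PACE tool. The function names are hypothetical, and the maintenance start time (midnight on April 15) is an assumption, since only the dates are announced here. It also assumes the job would start running immediately, so any time spent waiting in the queue shortens the usable window further.

```python
from datetime import datetime, timedelta

# Assumption: only the dates (April 15-16, 2014) are announced, not the hour
# at which maintenance begins; midnight on April 15 is a placeholder.
MAINTENANCE_START = datetime(2014, 4, 15, 0, 0)


def parse_walltime(text):
    """Convert an HH:MM:SS wall time string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def fits_before_maintenance(now, walltime, start=MAINTENANCE_START):
    """Return True if a job starting at `now` would end by `start`."""
    return now + parse_walltime(walltime) <= start


def longest_walltime(now, start=MAINTENANCE_START):
    """Return the longest HH:MM:SS wall time that still ends by `start`."""
    remaining = max(int((start - now).total_seconds()), 0)
    hours, rest = divmod(remaining, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


if __name__ == "__main__":
    now = datetime.now()
    print("Longest wall time that fits:", longest_walltime(now))
    print("Does 48:00:00 fit?", fits_before_maintenance(now, "48:00:00"))
```

A job whose requested wall time exceeds the value returned by `longest_walltime` would, under these assumptions, be held until after maintenance.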

Most of our activities this time around are not directly visible, with a couple of notable exceptions.

We will be upgrading the operating system on our TestFlight cluster from RedHat 6.3 to RedHat 6.5.  Please do test your codes on this cluster over the coming weeks and months, as we plan to roll it out (along with any needed fixes) to all other RedHat 6 clusters in July.  This update is expected to bring some performance improvements, as well as some critical security fixes.  Additionally, it adds support for the Intel Ivy Bridge platform, which many of you are ordering.  Any new Ivy Bridge platforms will start with RedHat 6.5.

Other user-visible changes include:

  • conclude the migration of the Athena cluster to RedHat 6.3.  We’ll plan to take Athena to 6.5 in July.
  • conclude the migration of the BioCluster /nv/pb4 filesystem to the DDN/GPFS space.
  • migrate mathlocal from /nv/hp24 to /nv/pma1 (Math cluster project space)
  • application of recommended and security patches to our Solaris storage systems.  This is a widespread update that will affect filesystems that start with /nv.  A rapid reversion process is available should unanticipated events occur.
  • firewall upgrades to increase bandwidth between PACE and campus

Less apparent changes include:

  • repairing some electrical distribution to compute node racks
  • minor software/firmware update to DDN to enable support of DDN/WOS evaluation
  • updates to VMware “hardware” levels, enabled by previous migration to VMware 5.1

As always, please follow our blog for communications, especially for announcements during our maintenance activities – and let us know of any concerns via pace-support@oit.gatech.edu.
