PACE: A Partnership for an Advanced Computing Environment

May 18, 2020

[Resolved] Home directory failures

Filed under: Uncategorized — Michael Weiner @ 12:23 pm

[Update 5/18/20 4:25 PM]

Reliable access to home directories was restored early this afternoon. The cause was a DNS issue on the GT network: the DNS server used to connect to the home and utility storage devices was responding slowly but was not completely down, so lookups did not fail over to the backup server. Working with OIT, we have reordered the DNS servers, and access is restored. Please contact us at pace-support@oit.gatech.edu with any questions.
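For readers curious why a slow server is harder to route around than a dead one, here is a minimal sketch of the failure mode. It is a simplified simulation, not the actual resolver configuration on the GT network: the `SimulatedServer` class, the `resolve` function, and all latency and timeout values are hypothetical and chosen only for illustration. It assumes a client that tries DNS servers in listed order and moves on only after a timeout.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class SimulatedServer:
    """Hypothetical DNS server model: a name, a response latency, and an up/down flag."""
    name: str
    latency_s: float
    up: bool = True


def resolve(servers: Sequence[SimulatedServer], timeout_s: float) -> Tuple[Optional[str], float]:
    """Try servers in order, moving to the next one only after a timeout.

    Returns (name of the server that answered, total seconds spent waiting).
    """
    waited = 0.0
    for server in servers:
        if server.up and server.latency_s <= timeout_s:
            # The server answered within the timeout, so no failover happens,
            # even though the answer may have been very slow.
            return server.name, waited + server.latency_s
        # No answer in time: the client waits out the full timeout, then fails over.
        waited += timeout_s
    return None, waited


# Illustrative values only (assumptions, not measurements from the outage).
primary_slow = SimulatedServer("primary", latency_s=4.5)
primary_down = SimulatedServer("primary", latency_s=0.0, up=False)
backup = SimulatedServer("backup", latency_s=0.02)

# Slow primary: every lookup is answered by the primary, just under the timeout.
slow_result = resolve([primary_slow, backup], timeout_s=5.0)
# Dead primary: the client times out once, then the backup answers.
down_result = resolve([primary_down, backup], timeout_s=5.0)
# Reordered list: the healthy server is asked first and answers quickly.
reordered_result = resolve([backup, primary_slow], timeout_s=5.0)

print(slow_result, down_result, reordered_result)
```

In this model the slow primary keeps answering just inside the timeout, so the backup is never consulted and every lookup stays slow. Listing the healthy server first avoids the slow one entirely, which mirrors the reordering described above.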

If your jobs failed because of the outage, please resubmit them.

[Issue began at approximately 2 PM on 5/17/20]

We are experiencing an intermittent outage on PACE affecting home directories and certain other mounted utility directories. We are currently working to restore access. Thank you to those of you who reported the issue to us this afternoon. This intermittent mount failure can cause the following issues:

  • Home directories not loading on login nodes
  • Login sessions starting with a “bash” prompt instead of “~” and displaying warning messages
  • Batch or interactive jobs failing immediately after launch because files cannot be loaded, with an error message such as “no such file or directory”
  • “pace-check-queue” and other PACE utilities failing to report information as expected
  • Home directories missing in file transfer utilities (scp or sftp)
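If you want to confirm whether your own session is affected by symptoms like those above, a quick check along the following lines may help. This is a rough sketch, not a PACE-provided tool: the `check_directory` helper is hypothetical, and it only tests whether a directory exists and can be listed from the current node.

```python
import os
from pathlib import Path


def check_directory(path: Path) -> str:
    """Return 'ok', 'missing', or 'unreadable' for the given directory."""
    if not path.is_dir():
        # The path does not resolve to a directory, as when a mount is absent.
        return "missing"
    try:
        # Reading one entry forces an actual read from the underlying mount.
        with os.scandir(path) as entries:
            next(entries, None)
    except OSError:
        return "unreadable"
    return "ok"


home = Path.home()
print(f"{home}: {check_directory(home)}")
```

Note that a stalled network mount can make the listing call itself hang rather than fail, so a long pause here is also a sign of trouble.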

 

If your jobs have failed, please wait until we have completed the repair, then resubmit them.

We will provide updates as they become available. Thank you for your patience.
