PACE: A Partnership for an Advanced Computing Environment

January 18, 2024

NetApp Storage Outage

Filed under: Uncategorized — Michael Weiner @ 5:22 pm

[Update 1/18/24 6:30 PM]

Access to storage has been restored, and all systems have full functionality. The Phoenix and ICE schedulers have been resumed, and queued jobs will now start.

Please resubmit any jobs that may have failed. If a running job is no longer progressing, please cancel it and resubmit.

The outage was caused by an update applied this afternoon to resolve a permissions issue affecting some users of the ICE shared directories. The update has been reverted.

Thank you for your patience as we resolved this issue.

[Original Post 1/18/24 5:20 PM]

Summary: An outage on PACE NetApp storage devices is affecting the Phoenix and ICE clusters. Home directories and software are not accessible.

Details: At approximately 5:00 PM, an issue began affecting access to NetApp storage devices on PACE. The PACE team is investigating at this time.

Impact: All NetApp storage devices on PACE are currently unreachable. This includes home directories on Phoenix and ICE, the pace-apps software repository on Phoenix and ICE, and course shared directories on ICE. Users may encounter errors upon login due to inaccessible home directories. We have paused the schedulers on Phoenix and ICE, so no new jobs will start. The Hive and Firebird clusters are not affected.

Please contact us at pace-support@oit.gatech.edu with any questions.
