PACE A Partnership for an Advanced Computing Environment

April 4, 2024

PACE clusters unreachable on the morning of April 4, 2024

Filed under: Uncategorized — Grigori Yourganov @ 10:54 am

The PACE clusters were not accepting new connections from 4 AM until 10 AM today (April 4, 2024). As part of the preparations to migrate the clusters to a new version of the operating system (Red Hat Enterprise Linux 9), a configuration management entry from the development environment was accidentally applied to production. It included the /etc/nologin file on the head nodes, which causes new logins to be refused. This has been fixed, and additional controls are in place to prevent a recurrence.
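For readers who want to see the mechanism: on Linux systems, sshd and PAM's pam_nologin module refuse non-root logins while /etc/nologin exists, and sessions that are already open are left alone. The following is a minimal sketch of a check that could flag this condition on a node, assuming Python 3 is available there. The function name `logins_blocked` is hypothetical and is not part of PACE's tooling.

```python
from pathlib import Path


def logins_blocked(nologin_path="/etc/nologin"):
    """Return True if the nologin file exists.

    While this file is present, non-root logins are refused.
    The default path is the standard location; the parameter
    exists only so the check can be pointed at another file.
    """
    return Path(nologin_path).exists()


if __name__ == "__main__":
    if logins_blocked():
        print("WARNING: /etc/nologin is present; new logins are being refused")
    else:
        print("OK: /etc/nologin is absent")
```

A check along these lines could be run periodically on login nodes as one possible safeguard. It illustrates the idea only and does not describe the controls that were actually put in place.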

Jobs and data transfers that were running during that period were not affected, nor were interactive sessions that started before the configuration change.

The clusters are now back online, and the scheduler is accepting jobs. We sincerely apologize for this accidental disruption.
