The Firebird cluster will migrate to the Slurm scheduler on October 24-26, 2023. PACE has developed a plan to transition researchers’ workflows smoothly. As you may be aware, PACE began the Slurm migration in July 2022, and we have already migrated the Hive, Phoenix, and ICE clusters successfully. Firebird is the last cluster in PACE’s transition from Torque/Moab to Slurm, which brings increased job throughput and better enforcement of scheduling policies. The new scheduler will also better support the new hardware that will soon be added to Firebird. We will update our software stack at the same time and offer orientation and consulting sessions to facilitate this migration.
Software Stack
In addition to the scheduler migration, the PACE Apps central software stack will be updated. This software stack supports the Slurm scheduler and already runs successfully on Phoenix, Hive, and ICE. The Firebird cluster will feature the provided applications listed in our documentation. Please review this list of non-CUI software we will offer on Firebird post-migration and let us know via email (pace-support@oit.gatech.edu) if any PACE-installed software you are currently using on Firebird is missing from the list. If you already replied to the application survey sent to Firebird PIs, there is no need to repeat your requests. Researchers who install or write custom software will need to recompile their applications against the new MPI and other libraries once the new system is ready.
We will freeze new software installations in the PACE central software stack for the Torque environment starting September 1, 2023. You can continue installing software in your local or shared space without interruption.
No Test Environment
Due to security and capacity constraints, it is infeasible to use a progressive rollout approach as we did for Phoenix and Hive. Hence, there will not be a test environment. For researchers who install or write their own software, we highly recommend the following:
- If you have access to Phoenix, compile your non-CUI software on Phoenix now and report any issues you encounter so that we can help you before the migration.
- Please report any self-installed CUI software you need that cannot be tested on Phoenix. We will try our best to have all dependent libraries ready and will give higher priority to assisting with reinstallation immediately after the Slurm migration.
Support
PACE will provide documentation, training sessions [register here], and support (consulting sessions and 1-1 sessions) to aid your workflow transition to Slurm. Documentation and a guide for converting job scripts from PBS to Slurm-based commands will be ready before the migration. We will offer Slurm training right after the migration; future communications will provide the schedule. You are welcome to join our PACE Consulting Sessions or to email us for support.
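To give a rough sense of what converting a job script from PBS to Slurm involves, here is a minimal Python sketch that rewrites a few common `#PBS` directives as `#SBATCH` directives. It is an illustration only, not PACE's conversion guide: the helper names (`convert_pbs_line`, `convert_script`) are hypothetical, the mapping covers only a handful of options, and some choices are assumptions (for example, mapping `ppn` to `--ntasks-per-node` and `-q` to `--partition`; a site may use `--qos` or `--cpus-per-task` instead). Please check the official PACE documentation for the settings that apply on Firebird.

```python
import re

# Illustrative mapping of common Torque/PBS options to Slurm options.
# Assumption: "-q" maps to a partition; some sites map queues to --qos instead.
SIMPLE_FLAGS = {
    "-N": "--job-name",
    "-A": "--account",
    "-q": "--partition",
    "-o": "--output",
    "-e": "--error",
}


def convert_pbs_line(line):
    """Return the Slurm line(s) corresponding to one line of a PBS script.

    Non-PBS lines pass through unchanged. Directives this sketch does not
    recognize are kept as a comment so they can be converted by hand.
    """
    stripped = line.strip()
    if not stripped.startswith("#PBS"):
        return [line.rstrip("\n")]

    parts = stripped.split(None, 2)  # "#PBS", flag, value
    if len(parts) < 3:
        return ["# TODO convert manually: " + stripped]
    _, flag, value = parts

    if flag in SIMPLE_FLAGS:
        return [f"#SBATCH {SIMPLE_FLAGS[flag]}={value}"]

    if flag == "-l":
        out = []
        for resource in value.split(","):
            nodes = re.fullmatch(r"nodes=(\d+)(?::ppn=(\d+))?", resource)
            walltime = re.fullmatch(r"walltime=(\S+)", resource)
            if nodes:
                out.append(f"#SBATCH --nodes={nodes.group(1)}")
                if nodes.group(2):
                    # Assumption: one task per requested processor.
                    out.append(f"#SBATCH --ntasks-per-node={nodes.group(2)}")
            elif walltime:
                out.append(f"#SBATCH --time={walltime.group(1)}")
            else:
                out.append("# TODO convert manually: #PBS -l " + resource)
        return out

    return ["# TODO convert manually: " + stripped]


def convert_script(text):
    """Convert every line of a PBS job script and return the new script text."""
    converted = []
    for line in text.splitlines():
        converted.extend(convert_pbs_line(line))
    return "\n".join(converted)
```

For example, calling `convert_script` on a script containing `#PBS -N myjob` and `#PBS -l nodes=2:ppn=4,walltime=01:00:00` yields `#SBATCH --job-name=myjob`, `#SBATCH --nodes=2`, `#SBATCH --ntasks-per-node=4`, and `#SBATCH --time=01:00:00`. Commands in the script body, such as `qsub` (replaced by `sbatch` under Slurm) or references to PBS environment variables, still need to be reviewed by hand.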
We are excited to launch Slurm on Firebird to improve Georgia Tech’s research computing infrastructure! Please contact us with any questions or concerns about this transition.