As announced earlier, we will remove a set of old MPI stacks (and applications that use them) from the PACE software repository after the April maintenance day. This is required by the planned upgrade of the schedulers (torque and moab), which use libraries that are incompatible with the old MPI stacks. Some MPI-related Python modules (e.g. mpi4py) are built on one of these old MPI versions (namely mvapich2/1.6) and they will also stop working with the new scheduler.
Old MPI versions are also known to have significant performance and scalability problems and are no longer supported by their developers, so their removal was inevitable regardless of the scheduler upgrade. Specifically, all versions older than “mvapich2/1.9” and “openmpi/1.6.2” are known to be incompatible and will be removed, along with the applications compiled with them. MPI stacks newer than these versions are compatible with the new scheduler version, so they will continue to be available. The PACE team is ready to help with any changes you need to make to replace these old MPI versions with newer ones, with minimal interruption to your research.
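To check whether a module you load today falls below the cutoffs above, something like the following sketch may help. It is not a PACE-provided tool: the function names are hypothetical, it assumes bash and GNU sort (for the -V version sort), it assumes the two threshold versions themselves are kept, and it only makes sense for module names in the old repository. Pre-release suffixes such as "rc1" are not handled.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a PACE tool): classify an old-repo MPI module
# name against the minimum versions named in this announcement.

# Minimum compatible version per MPI stack, taken from the text above.
min_version_for() {
  case "$1" in
    mvapich2) echo "1.9" ;;
    openmpi)  echo "1.6.2" ;;
    *)        return 1 ;;
  esac
}

# Succeeds if version $1 is strictly older than version $2.
# Uses GNU sort -V so that, for example, 1.10 sorts after 1.9.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Takes a name such as "mvapich2/1.6"; prints "remove", "keep", or "unknown".
mpi_module_status() {
  local stack version min
  case "$1" in
    */*) ;;                                  # expect "stack/version"
    *)   echo "unknown"; return 0 ;;
  esac
  stack="${1%%/*}"
  version="${1#*/}"
  if ! min="$(min_version_for "$stack")"; then
    echo "unknown"                           # not one of the two MPI stacks
    return 0
  fi
  if version_lt "$version" "$min"; then
    echo "remove"
  else
    echo "keep"
  fi
}
```

For example, "mpi_module_status mvapich2/1.6" prints "remove", while "mpi_module_status openmpi/1.8" prints "keep".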
We saw these problems as an opportunity to start creating a new and improved software repository almost from scratch, which not only fixes the MPI problems, but also provides added benefits such as:
* a cleaner MPI versioning without long, confusing subversions such as “1.9rc1” or “2.0ga”: You will see only a single subversion for each major release, e.g.,
mvapich2: 1.9, 2.0, 2.1, …
openmpi: 1.6, 1.7, 1.8, …
* latest software versions: We made a best effort to compile the most recent stable versions, unless they had compilation problems or proved to be buggy.
* a new python that allows parallelization without requiring an InfiniBand (IB) network: The current python uses mvapich2, which requires an IB network. The new python, on the other hand, will use openmpi, which can run on *any* node regardless of its network connection while still taking advantage of IB when available.
We will start offering this new repository as an alternative after the April maintenance day. Switching between the old and the new repository will be as easy as loading or unloading a module named “newrepo”. E.g.:
# Make sure there are no loaded modules
$ module purge
$ module load newrepo
… You are now using the new repo …
# Since newrepo is itself a module, 'module purge' will put you back in the old repo
$ module purge
… You are back in the old repo …
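If a job script needs to know which repository is active, one possible check is sketched below. It assumes the module system exports the colon-separated LOADEDMODULES variable, as Environment Modules and Lmod normally do; whether that holds on PACE systems is an assumption, and the function name using_newrepo is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical check (not a PACE tool): is the "newrepo" module loaded?
# Assumes the module system maintains LOADEDMODULES as a colon-separated list.
using_newrepo() {
  case ":${LOADEDMODULES:-}:" in
    *:newrepo:*|*:newrepo/*) return 0 ;;   # matches "newrepo" or "newrepo/<version>"
    *)                       return 1 ;;
  esac
}

if using_newrepo; then
  echo "new repo is active"
else
  echo "old repo is active"
fi
```

Running "module list" interactively gives the same information in human-readable form.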
The current plan is to decommission the old repository after the July maintenance day, so we strongly encourage you to try the new repository (which is still in beta) as soon as possible to ensure a smooth transition. If the new repository is working for you, continue to use it and never look back. If you notice problems or missing components, you can continue to use the old repository while we work on fixing them.
Please keep in mind that the new repo was created almost from scratch, so expect changes in module names, as well as a new set of dependencies and conflicts between modules. The PACE team is always ready to suggest modules for your applications or answer any other questions you may have.
We hope the new repository will make a positive contribution to your research environment with visible improvements in performance, stability and scalability.
Thanks!
PACE Team