Commit fdf56162 authored by Olli-Pekka Lehto's avatar Olli-Pekka Lehto Committed by Morris Jette

added script to help manage native and symmetric MPI runs within SLURM

Dear all,

As a quick fix, I have put together this script to help manage native and symmetric MPI runs within SLURM. It's a bit bare-bones at the moment, but I needed to get it working quickly :)

It does not provide tight integration between the scheduler and the MPI daemons, and it requires a slot on the host even when running entirely on the MIC, so it's far from an optimal solution, but it could serve as a stopgap.

It's inspired by the TACC Stampede documentation; they appear to have a similar script in place.

It's fairly simple: you provide the name of the MIC binary (with -m) and the host binary (with -c). The host MPI/OpenMP parameters are given as usual, and the Xeon Phi side parameters are passed as environment variables (MIC_PPN, MIC_OMP_NUM_THREADS). Currently it supports only one card per host, but extending it should be simple enough.
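For illustration, a symmetric-mode batch job using the wrapper might look roughly like the sketch below. The script name (mpirun-mic), the SBATCH resource values, and the binary names are placeholders I've chosen for the example, not fixed by the script; only the -m/-c options and the MIC_PPN / MIC_OMP_NUM_THREADS variables come from the description above.

```shell
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --time=00:10:00

# Xeon Phi side: MPI ranks per MIC card and OpenMP threads per rank,
# passed to the wrapper via environment variables.
export MIC_PPN=4
export MIC_OMP_NUM_THREADS=30

# Host side OpenMP threads are set the usual way.
export OMP_NUM_THREADS=8

# Symmetric run: host binary given with -c, MIC binary with -m.
# (Binary names here are placeholders.)
mpirun-mic -c ./hello.host -m ./hello.mic
```

A native-only MIC run would simply omit the -c option and still occupy a host slot, as noted above.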

Here are a couple of links to documentation:

Our prototype cluster documentation:
https://confluence.csc.fi/display/HPCproto/HPC+Prototypes#HPCPrototypes-XeonPhiDevelopment
Presentation at the PRACE Spring School in Umeå earlier this week:
https://www.hpc2n.umu.se/sites/default/files/1.03%20CSC%20Cluster%20Introduction.pdf

Feel free to include this in the contribs directory. It might need a bit of cleanup, though, and I don't know when I will have time to do that.

I have also added support for the TotalView debugger (provided it is installed and configured properly for Xeon Phi use).

Future ideas:

For the native MIC client, I've been testing it a bit and looking at ways to minimize the changes needed for support. The two major challenges seem to be scheduling and affinity.

I think it might be necessary to put it into a specific topology plugin, like the one for BG/Q, but that looks like a lot of work.

Best regards,
Olli-Pekka
parent 36a17d12