- The OpenMPI package shipped with OpenSUSE does not support InfiniBand and seems to have
  other problems that prevent its use on standard networks. It is better to
  install a newer version from the OFED project:
      http://download.opensuse.org/repositories/OFED:/Factory/openSUSE_Factory/OFED:Factory.repo
- There are a few interactivity problems to handle with ssh connections:
    * Host key checking must be non-interactive
    * If the ssh key is protected with a passphrase, ssh-agent forwarding should be enabled
  This can be achieved with the following options:
      mpirun --mca plm_rsh_agent "ssh -A -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null"
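  The same ssh options can also be supplied through the environment instead of
  on every command line. A minimal sketch, assuming the OpenMPI convention that
  an MCA parameter named X can be set with the environment variable OMPI_MCA_X;
  ./my_mpi_app and hosts.txt are placeholder names, not part of these notes:

```shell
# Equivalent of: mpirun --mca plm_rsh_agent "ssh -A -o ..."
# -A forwards the local ssh-agent; the two -o options skip the host key prompt.
OMPI_MCA_plm_rsh_agent="ssh -A -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null"
export OMPI_MCA_plm_rsh_agent

# Later invocations in this shell pick the setting up automatically, e.g.
# (placeholder application and hostfile):
#   mpirun --hostfile hosts.txt ./my_mpi_app
```

  Agent forwarding only helps if an agent is running and the key has been
  loaded into it (for example with ssh-add) before mpirun is started.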
- OpenMPI will try to identify the best network protocol to use, but it can also be configured
  manually. The 'self' module should always be present:
      mpirun --mca btl openib,tcp,self ...
- On OpenSUSE 13.1 there are communication problems if more than 4 nodes communicate
  over an IP-over-InfiniBand (IPoIB) network using the tcp protocol. MPI_Scatter (etc.) will
  just block after serving a few nodes. With the openib communication layer, everything works
  fine, i.e.:
      mpirun --mca btl openib,self   - runs fine
      mpirun --mca btl tcp,self      - hangs
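  One way to avoid picking the tcp layer by accident is to check whether the
  openib component was built into the local OpenMPI before choosing the list.
  A rough sketch, assuming that ompi_info lists components on lines of the form
  "MCA btl: openib (...)"; choose_btl is a hypothetical helper name, and a
  listed component does not prove that working InfiniBand hardware is present:

```shell
# Read `ompi_info` output on stdin and print a BTL list:
# prefer openib when the component is listed, otherwise fall back to tcp.
choose_btl() {
    if grep -q 'MCA btl: openib'; then
        echo "openib,self"
    else
        # Fallback only; on the setup described above this is the layer that hangs.
        echo "tcp,self"
    fi
}

# Typical use on a machine with OpenMPI installed:
#   mpirun --mca btl "$(ompi_info | choose_btl)" ...
```

  Reading from stdin keeps the helper easy to try out with saved ompi_info
  output from another node.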
- If slots are not configured in the hostfile, the scheduler may run multiple instances on a
  single cluster node despite the availability of more nodes. The hostfile line should look like: