1. InfiniBand support is included in official kernels since 3.2. For openSUSE 12.1:
     zypper ar http://download.opensuse.org/repositories/Kernel:/HEAD/standard/Kernel:HEAD.repo
   - VT-d may need to be disabled to use CUDA: add "iommu=soft" to the kernel options in GRUB.
   - The following modules should be loaded: mlx4_core, mlx4_ib, mlx4_en, ib_umad, rdma_ucm, ib_ipoib, ib_sdp
2. The SDP module is deprecated and unsupported (but there are patches from me as a temporary solution).
   - SDP allows transparent execution of TCP applications just by preloading libsdp.so:
     LD_PRELOAD="/usr/lib64/libsdp.so" ./tcp_application
3. The OFED stack may be installed from the openSUSE repository or from the tarball in Linux/Cluster:
     zypper ar http://download.opensuse.org/repositories/OFED:/Factory/openSUSE_11.3/ OFED
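The module list from item 1 can be checked against /proc/modules before anything else is configured. The following is a minimal sketch, not part of any OFED tool: the function name missing_modules and its optional file argument are assumptions made for this example (the argument only exists so that the check can be tried on a saved copy of /proc/modules).

```shell
#!/bin/sh
# Modules named in item 1 of the notes.
REQUIRED_MODULES="mlx4_core mlx4_ib mlx4_en ib_umad rdma_ucm ib_ipoib ib_sdp"

# Print every required module that is absent from a /proc/modules-style file.
# $1: file to inspect (hypothetical parameter, defaults to /proc/modules).
missing_modules() {
    modules_file="${1:-/proc/modules}"
    for mod in $REQUIRED_MODULES; do
        # The first field of each /proc/modules line is the module name.
        if ! awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }' "$modules_file"; then
            echo "$mod"
        fi
    done
}

# Usage (as root), loading whatever is still missing:
#   for mod in $(missing_modules); do modprobe "$mod"; done
```

Since ib_sdp is only available with the patches mentioned in item 2, it is expected to show up as missing on an unpatched kernel.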
1. One of the nodes should run the opensm daemon. To enable full-speed communication
   and IPoIB, the following line should be added to /etc/opensm/partitions.conf:
     Default=0x7fff,ipoib,rate=7,sl=0 : ALL=full ;
2. If a multichannel link is to be configured (multiple ports on a card or multiple cards)
   and the channels are independent (point-to-point connections or connections through
   different switches), an opensm instance should be launched for each port:
     opensm -g 0x0002c903009f7431 --daemon
     opensm -g 0x0002c903009f7432 --daemon
   The -g option specifies the GUID of the port for the instance. Partition configuration may stay
3. It is possible to preload libsdp.so for all applications and to limit the range
   of IP addresses for which the SDP protocol is used in /etc/libsdp.conf:
     use both server * 192.168.11.0/24:*
     use both client * 192.168.11.0/24:*
4. It is possible to set a node name to simplify service discovery:
     echo <system_name> > /sys/class/infiniband/mlx4_0/node_desc
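For the multi-port case in item 2, the per-port opensm commands can be generated from a list of port GUIDs instead of being typed by hand. This is only a sketch: the function name opensm_commands is made up for this example, and it assumes that "ibstat -p" prints one port GUID per line. The commands are printed rather than executed so that they can be reviewed first, because one instance per port is only appropriate when the ports really are on independent fabrics.

```shell
#!/bin/sh
# Read port GUIDs (one per line) from stdin and print one opensm command
# per GUID. Lines that are not a 0x-prefixed hex number are ignored.
opensm_commands() {
    grep -E '^0x[0-9a-fA-F]+$' | while read -r guid; do
        echo "opensm -g $guid --daemon"
    done
}

# Usage (assumes "ibstat -p" lists the local port GUIDs):
#   ibstat -p | opensm_commands
```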
Port information: ibstat
Hardware graph: iblinkinfo
Network diagnostics: ibdiagnet -ls 10 -lw 4x
Ping: Server: ibping -S, Client: ibping <Base_lid_of_server_reported_by_ibstat>
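The base LID needed by the ibping client can be picked out of the ibstat output on the server. A minimal sketch, assuming the usual "Base lid: N" lines in ibstat output; the function name base_lids is hypothetical:

```shell
#!/bin/sh
# Print the "Base lid" value of every port found in ibstat output on stdin.
base_lids() {
    awk -F': *' '/^[[:space:]]*Base lid:/ { print $2 }'
}

# Usage on the server:
#   ibstat | base_lids
# Then, on the client, pass one of the printed values to ibping.
```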
1. IPoIB may run in datagram (default) or connected mode. The datagram mode
   seems to be faster (although sources on the internet suggest otherwise). Send the
   /sys/class/net/ib0/mode
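The IPoIB mode of an interface is selected by writing "datagram" or "connected" into its sysfs mode file. The sketch below wraps that write in a small check so that a typo does not reach the kernel; the function name set_ipoib_mode and the optional second argument are assumptions introduced here, with the path from the notes as the default.

```shell
#!/bin/sh
# Write an IPoIB mode name into the interface's mode file.
# $1: "datagram" or "connected"
# $2: mode file (hypothetical parameter, defaults to the ib0 path from the notes)
set_ipoib_mode() {
    mode="$1"
    mode_file="${2:-/sys/class/net/ib0/mode}"
    case "$mode" in
        datagram|connected)
            echo "$mode" > "$mode_file"
            ;;
        *)
            echo "unknown IPoIB mode: $mode" >&2
            return 1
            ;;
    esac
}

# Usage (as root):
#   set_ipoib_mode connected
#   cat /sys/class/net/ib0/mode
```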
OFED: rdma_bw, rdma_lat, ib_read_bw, ib_write_bw, ib_send_bw, ib_read_lat, ib_write_lat, ib_send_lat
  Server: rdma_bw, Client: rdma_bw <ip>
  - Frequency throttling should be disabled:
    for name in /sys/devices/system/cpu/cpu[0-9]*; do echo "performance" > "$name/cpufreq/scaling_governor"; done
  mpi-selector --system --set mvapich2_gcc-1.7
NetPIPE: make; make mpi
  Server: ./NPtcp, Client: ./NPtcp -h <server_ip>
  MPICH2: mpirun -machinefile hostlist -np 2 ./NPmpi
  - The hostlist file should contain the local and remote InfiniBand IP addresses
  - The NPmpi application should be in the same location on both systems (SSH is used to connect and run it)
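Before a benchmark run it can be worth confirming that the governor change from the frequency throttling step actually took effect on every core. A minimal sketch under the assumption of the standard cpufreq sysfs layout; the function name non_performance_cpus and its optional root argument are hypothetical:

```shell
#!/bin/sh
# Print the sysfs directory of every CPU whose scaling governor is not
# "performance". CPUs without a readable cpufreq entry are skipped.
# $1: CPU sysfs root (hypothetical parameter, defaults to /sys/devices/system/cpu)
non_performance_cpus() {
    cpu_root="${1:-/sys/devices/system/cpu}"
    for gov in "$cpu_root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$gov" ] || continue
        if [ "$(cat "$gov")" != "performance" ]; then
            echo "${gov%/cpufreq/scaling_governor}"
        fi
    done
}

# Usage: no output means every core with cpufreq support is set to "performance".
#   non_performance_cpus
```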
rdma:                    3227 MB/s (1.49 us latency)
rdma over optical link:  3064 MB/s (1.77 us)
mvapich 1.7:             3146 MB/s (1.83 us)
SDP:                     2060 MB/s (5.79 us)
TCP:                      996 MB/s (14.6 us)
SRP: support for the SCSI RDMA Protocol (SRP) target driver. SRP allows an
initiator to access a block storage device on another host (the target)
over a network that supports RDMA (Linux 3.3).