- Ganesha is a user-space implementation of NFS. Working through the user-space
GlusterFS library, it reduces user/kernel context switches. Unlike the Gluster native
NFS support, which is limited to NFSv3, Ganesha supports NFSv4.1 and pNFS.
- It also supports RDMA transport using mooshika (SoftiWARP can be used to run it
over an Ethernet network, but it is not needed for native InfiniBand cards). However, the support
is still very unstable. I needed a few patches to be able to mount the NFS partition; even then,
it fails during data transfer. To me it is unclear whether the problem is in Ganesha or
in the actual RDMA support of the native Linux NFS client. In any case, Ganesha definitely crashes.
- To enable Ganesha, the kernel NFS server and the built-in NFS support in all volumes should be turned off:
gluster volume set pdv nfs.disable on
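Since the built-in NFS server has to be off on every volume, not just pdv, the per-volume command can be looped over all volumes (a sketch, assuming the gluster CLI is in PATH; volume names are site-specific):

```shell
# Disable the built-in Gluster NFS server on every existing volume,
# so that only Ganesha answers NFS requests.
for vol in $(gluster volume list); do
    gluster volume set "$vol" nfs.disable on
done
```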
- Gluster does not need any other changes. Ganesha should be configured for each Gluster volume individually;
no configuration is required for RDMA, it is enabled by default:
* It is possible to export subdirectories, just give the full path.
* Unlike standard NFS, it is possible to export both a directory and its subdirectories as separate shares
EXPORT {
    Export_Id = 1; Path = "/pdv"; Pseudo = "/pdv";
    Access_Type = RW; Squash = No_Root_Squash; SecType = "sys";
    FSAL { Name = "GLUSTER"; Hostname = localhost; Volume = "pdv"; }
}
- Example /etc/fstab entries:
192.168.26.170:/pdv /pdv nfs4 defaults,minorversion=1,_netdev,nofail,soft,nodiratime,noatime,noauto 0 0
192.168.26.170:/pdv /pdv nfs defaults,_netdev,nofail,soft,nodiratime,noatime 0 0
192.168.26.170:/pdv /pdv nfs _netdev,nofail,soft,vers=4,proto=rdma,port=20049,nodiratime,noatime 0 0
- Setting access rights (in the EXPORT {} block)
* There is no global option, and %include does not work in the EXPORT block
* Put the more specific clients first, with more rights:
clients = 141.52.64.104;
clients = 141.52.64.0/23;
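For context, in Ganesha these client lines normally sit inside CLIENT {} sub-blocks of the EXPORT; a hedged sketch of how the two entries above might be wrapped (the Access_Type values are assumptions, chosen to match "more specific first, with more rights"):

```
CLIENT { Clients = 141.52.64.104; Access_Type = RW; }
CLIENT { Clients = 141.52.64.0/23; Access_Type = RO; }
```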
- Version ganesha-2.2.0/ntirpc-0.1.1 performs well in synthetic benchmarks, but is very
slow for real data movement; ganesha-2.3.0pre/ntirpc-1.2.1 seems fine.
- NFS with high availability. It is possible to specify replicas in /etc/exports on the NFS server. The following
options have to be added:
-vers=4,replicas=/mnt@serv1.org:/mnt@serv2.org
In this case, the client will use whichever server is available. It is also possible to graft a directory
from serv2 into the tree exposed by serv1. This is done by adding a new export with the following options:
-vers=4,refer=/mnt/b@serv2
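Put together, the corresponding /etc/exports entries might look like this (a sketch; the wildcard client and the rw option are illustrative assumptions):

```
/mnt    *(rw,vers=4,replicas=/mnt@serv1.org:/mnt@serv2.org)
/mnt/b  *(rw,vers=4,refer=/mnt/b@serv2)
```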
- However, Ganesha does not support this. RH proposes floating IPs to provide availability and
recommends CTDB (with heartbeat) to achieve this. However, this is not enough, as there are caches
sitting in between: after the IP changes, these caches have to be invalidated.
Ganesha HA (standard way)
- Gluster can configure High Availability for NFS using the RedHat HA cluster stack (pcs, corosync, pacemaker). All
relevant packages have to be installed, but it is not necessary to configure them.
- However, some glusterfs modifications are required:
* We need to create a gluster_shared_storage volume and mount it under /var/run/gluster/shared_storage
gluster volume create gluster_shared_storage replica 2 transport rdma 192.168.11.131:/mnt/raid/gluster2/gss 192.168.11.132:/mnt/raid/gluster2/gss
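After creating the volume, it still has to be started and mounted at the expected path on every node; a sketch, reusing the addresses above:

```shell
# Start the shared volume and mount it where ganesha-ha expects it
# (run the mkdir/mount part on every cluster node).
gluster volume start gluster_shared_storage
mkdir -p /var/run/gluster/shared_storage
mount -t glusterfs 192.168.11.131:/gluster_shared_storage /var/run/gluster/shared_storage
```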
* There are two parameter formats of the pcs utility, depending on the version:
'pcs cluster setup --name cluster_nfs ...' or 'pcs cluster setup cluster_nfs ...'
It is necessary to configure the appropriate type in /usr/lib/ganesha/ganesha-ha.sh. With the current version in OpenSuSE, '--name' is required and,
therefore, the following line has to be commented out:
RHEL6_PCS_CNAME_OPTION=""
* Then, the ganesha pid file seems to be expected at '/var/run/ganesha.nfsd.pid'; besides, the location of the configuration file has to be specified.
The following lines have to be added to /etc/sysconfig/ganesha:
OPTIONS="-L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT -p /var/run/ganesha.nfsd.pid"
CONFFILE=/etc/ganesha/ganesha.conf
* If the InfiniBand addresses are not associated with names, we need to set the names in /etc/hosts
* An empty (yes, it is required) ganesha.conf should be added to /etc/ganesha, and ganesha-ha.conf has to be configured:
HA_VOL_SERVER="ibcompute2"
HA_CLUSTER_NODES="ibcompute1,ibcompute2"
VIP_ibcompute1="192.168.26.170"
VIP_ibcompute2="141.52.64.22"
- Then, the pcs daemons have to be started on all gluster nodes and authenticated:
pcs cluster auth -u hacluster ibcompute1 ibcompute2 ibcompute3
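Note that 'pcs cluster auth' authenticates against the hacluster system user, so that user needs a password and a running pcsd on each node first; a sketch (the -p form avoids the interactive prompt, <password> is a placeholder):

```shell
# On every node: give hacluster a password and start the pcs daemon.
passwd hacluster
systemctl enable --now pcsd
# Then, from one node, authenticate the cluster members.
pcs cluster auth -u hacluster -p <password> ibcompute1 ibcompute2
```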
- Then, we need to enable global ganesha support:
gluster nfs-ganesha enable
This will create the configured corosync cluster and register a number of pacemaker resources
for synchronization. Type 'pcs status' to see the configuration:
Clone Set: nfs-mon-clone [nfs-mon]
    Started: [ ibcompute1 ibcompute2 ]
Clone Set: nfs-grace-clone [nfs-grace]
    Started: [ ibcompute1 ibcompute2 ]
ibcompute1-cluster_ip-1 (ocf::heartbeat:IPaddr): Started ibcompute1
ibcompute1-trigger_ip-1 (ocf::heartbeat:Dummy): Started ibcompute1
ibcompute2-cluster_ip-1 (ocf::heartbeat:IPaddr): Started ibcompute2
ibcompute2-trigger_ip-1 (ocf::heartbeat:Dummy): Started ibcompute2
- And, enable ganesha for the specific volumes:
gluster vol set test1 ganesha.enable on
* This actually turns cache invalidation on, which has a significant performance impact:
gluster volume get test1 features.cache-invalidation
+ Possibly, it can be switched off, but I am unsure what effect this will have on caching:
gluster volume set test1 features.cache-invalidation off
- Parallel NFS. All gluster nodes holding the bricks should run ganesha. One of the servers should
run the NFS MDS (metadata server). It will be used by the clients to mount the partition (no high availability
yet). The MDS is configured in the ganesha configuration by adding:
GLUSTER { PNFS_MDS = true; }
The volume should have cache invalidation turned on (however, this really affects performance and stability;
data accesses often stall when this option is turned on):
gluster volume set test1 features.cache-invalidation on
Then the volumes should be mountable with (not tested):
mount -t nfs4 -o minorversion=1 ...
* Multiple MDS servers are currently neither supported nor recommended. All clients should connect to the
same MDS.
* The GlusterFS performance is not stable when cache-invalidation is turned on. Also, if PNFS_MDS is
enabled in the Ganesha configuration, the speed is horribly reduced. Either option brings the speed down
from ~100 MB/s to ~4 MB/s, so it is unusable as of now.