GlusterFS
GlusterFS Performance Tuning
When it comes to performance tuning, there are no magic values that work on all systems. The defaults in GlusterFS are configured at install time to provide the best performance over mixed workloads. To squeeze more performance out of GlusterFS, develop an understanding of the parameters below and how they may be used in your setup.
After making a change, be sure to restart all GlusterFS processes and begin benchmarking the new values.[1]
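For example, on a node using systemd the management daemon can be restarted, and stopping and starting a volume restarts its brick processes (unit names vary by distribution; on some Debian-based systems the service is called glusterfs-server):
# systemctl restart glusterd
# gluster volume stop <Volume name>
# gluster volume start <Volume name>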
Commands
# gluster peer probe <IP>
# gluster peer status
# gluster --remote-host=nod1 peer status
# gluster pool list
# gluster peer detach <node>
# gluster peer detach <node> force
# gluster volume create <Volume name> transport tcp <brick path>
# gluster volume create Svazek nod1:/blok1 nod2:/blok1
# gluster volume list
# gluster volume get <Volume name> all
Option                                  Value
------                                  -----
cluster.lookup-unhashed                 on
...
# gluster volume reset <Volume name> nfs.disable
# gluster volume set Svazek nfs.disable on
# gluster volume status <Volume name>
# gluster volume start <Volume name>
# gluster volume info <Volume name>
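Putting the commands above together, a simple two-node replicated volume could be set up roughly as follows (the replica count and brick paths are only illustrative, and force may be required if a brick sits on the root filesystem):
# gluster peer probe nod2
# gluster volume create Svazek replica 2 nod1:/blok1 nod2:/blok1
# gluster volume start Svazek
# gluster volume info Svazek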
Remove bricks
root@nod1:~# gluster volume remove-brick Svazek nod1:/blok1 start
volume remove-brick start: success
ID: c6ab64f7-d921-4e07-9350-0524b7d2a613
root@nod1:~# gluster volume remove-brick Svazek nod1:/blok1 status
     Node  Rebalanced-files          size       scanned      failures       skipped         status  run time in secs
---------  ----------------  ------------  ------------  ------------  ------------  -------------  ----------------
localhost                 0        0Bytes             0             0             0      completed              0.00
root@nod1:~# gluster volume remove-brick Svazek nod1:/blok1 commit
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
volume remove-brick commit: success
# mount -t glusterfs nod1:/Svazek /mnt
# mount -t glusterfs nod2:/Svazek /mnt -o backupvolfile-server=nod1
# mount -t glusterfs nod2,nod1:/Svazek /mnt
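To make such a mount persistent across reboots, an /etc/fstab entry along these lines can be used (the _netdev option just delays mounting until the network is up; hostnames are the ones from the examples above):
nod1:/Svazek  /mnt  glusterfs  defaults,_netdev,backupvolfile-server=nod2  0  0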
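The snapshot commands below assume a snapshot already exists. One is normally created first, which requires the bricks to sit on thin-provisioned LVM (consistent with the ThinGroup volume group shown in the output below); the volume name here is taken from the snapshot mount example further down:
# gluster snapshot create build1 buildinglab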
# gluster snapshot list
# gluster snapshot status
# gluster snapshot info
Snap Name : build1_GMT-2015.05.26-14.19.01
Snap UUID : 3b3b8e45-5c81-45ff-8e82-fea05b1a516a

    Brick Path        : nod1:/run/gluster/snaps/25fa0c73a6b24327927c0dc9f4f08dba/brick1/data
    Volume Group      : ThinGroup
    Brick Running     : No
    Brick PID         : N/A
    Data Percentage   : 40.80
    LV Size           : 80.00g

    Brick Path        : nod2:/run/gluster/snaps/25fa0c73a6b24327927c0dc9f4f08dba/brick2/data
    Volume Group      : ThinGroup
    Brick Running     : No
    Brick PID         : N/A
    Data Percentage   : 40.75
    LV Size           : 80.00g

    Brick Path        : nod3:/run/gluster/snaps/25fa0c73a6b24327927c0dc9f4f08dba/brick3/data
    Volume Group      : ThinGroup
    Brick Running     : No
    Brick PID         : N/A
    Data Percentage   : 40.82
    LV Size           : 80.00g
# gluster snapshot activate build1_GMT-2015.05.26-14.19.01
# gluster snapshot deactivate build1_GMT-2015.05.26-14.19.01
# mount -t glusterfs localhost:/snaps/build1_GMT-2015.05.26-14.19.01/buildinglab /mnt
root@nod1:~# gluster snapshot clone build-new build1_GMT-2015.05.26-14.19.01
snapshot clone: success: Clone build-new created successfully
root@nod1:~# mount
...
/dev/mapper/ThinGroup-build--new_0 on /srv/buildinglab type btrfs (rw,relatime,nodatasum,nodatacow,space_cache,autodefrag)
/dev/mapper/ThinGroup-build--new_0 on /run/gluster/snaps/25fa0c73a6b24327927c0dc9f4f08dba/brick2 type btrfs (rw,relatime,nodatasum,nodatacow,space_cache,autodefrag)
/dev/mapper/ThinGroup-build--new_0 on /run/gluster/snaps/build-new/brick2 type btrfs (rw,relatime,nodatasum,nodatacow,space_cache,autodefrag)
...
# gluster volume heal <Volume name> enable
# gluster volume heal git info
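If the info output shows entries still pending, a full heal can be triggered manually on the same volume:
# gluster volume heal git full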
GlusterFS Configuration
GlusterFS volumes can be configured with multiple settings. These can be set on a volume using the command below, replacing [VOLUME] with the volume to alter, [OPTION] with the parameter name and [PARAMETER] with the parameter value.
gluster volume set [VOLUME] [OPTION] [PARAMETER]
Example:
gluster volume set myvolume performance.cache-size 1GB
Or you can add the parameter to the glusterfs.vol config file.
vi /etc/glusterfs/glusterfs.vol
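In a volfile the same parameters appear as option lines inside the relevant translator block. The sketch below is only illustrative; the translator and subvolume names come from the volfile generated for your particular volume and will differ:
volume myvolume-io-cache
    type performance/io-cache
    option cache-size 1GB
    subvolumes myvolume-read-ahead
end-volume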
- performance.write-behind-window-size – the size in bytes to use for the per-file write-behind buffer. Default: 1MB.
- performance.cache-refresh-timeout – the time in seconds a cached data file will be kept until data revalidation occurs. Default: 1 second.
- performance.cache-size – the size in bytes to use for the read cache. Default: 32MB.
- cluster.stripe-block-size – the size in bytes of the unit that will be read from or written to on the GlusterFS volume. Smaller values are better for smaller files and larger sizes for larger files. Default: 128KB.
- performance.io-thread-count – the maximum number of threads used for IO. Higher numbers improve concurrent IO operations, provided your disks can keep up. Default: 16. (A combined example of setting several of these options follows the list.)
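For instance, a volume serving mostly large files might combine several of these options as follows (the values are purely illustrative, not recommendations):
# gluster volume set myvolume performance.cache-size 1GB
# gluster volume set myvolume performance.io-thread-count 32
# gluster volume set myvolume performance.write-behind-window-size 4MB
# gluster volume set myvolume performance.cache-refresh-timeout 4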
Tuning Points[2]
When mounting your storage for the GlusterFS layer, make sure it is configured for the type of workload you have.
- When mounting your GlusterFS storage from a remote server to your local server, be sure to disable direct-io, as this enables the kernel read-ahead and filesystem cache. This is sensible for most workloads where caching of files is beneficial.
- When mounting the GlusterFS volume over NFS, use noatime and nodiratime to avoid access-time updates over NFS (see the mount examples below).
- # gluster volume set $vol performance.io-thread-count 64[3]
Today’s CPUs are powerful enough to handle 64 threads per volume.
- # gluster volume set $vol client.event-threads XX
XX depends on the number of connections from the FUSE client to the server; you can get this number by running netstat, grepping for the server IP and counting the connections.
- # gluster volume set $vol server.event-threads XX
XX depends on the number of connections from the server to the client(s); you can get this number by running netstat, grepping for "gluster" and counting the connections (see the example below).
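A rough way to count those connections, as mentioned above (flags and the string to grep for may need adjusting on your system):
# netstat -tanp | grep <server IP> | wc -l
# netstat -tanp | grep gluster | wc -l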
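And the mount-related points above might translate into invocations like these; direct-io-mode=disable is the FUSE mount option meant, and the NFS line assumes the built-in gNFS server, which speaks NFSv3:
# mount -t glusterfs nod1:/Svazek /mnt -o direct-io-mode=disable
# mount -t nfs -o vers=3,noatime,nodiratime nod1:/Svazek /mnt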
I haven’t been working with GlusterFS for long so I would be very interested in your thoughts on performance. Please leave a comment below.