GLUSTER
GlusterFS is a very sophisticated GPLv3 file system that uses FUSE (File System in User Space). It allows you to aggregate disparate storage devices, which GlusterFS refers to as “storage bricks”, into a single storage pool or namespace. It is what is sometimes referred to as a “meta-file-system” which is a file system built on top of another file system. The storage in each brick is formatted using a local file system such as ext3 or ext4, and then GlusterFS uses those file systems for storing data (files and directories). GlusterFS is very interesting for many reasons. It allows you to aggregate and use disparate storage in a variety of ways. The nodes themselves are stateless; they don’t talk to each other. GlusterFS uses an elastic hashing algorithm instead of either a central or distributed metadata model. It is in use at a number of sites for very large storage arrays, for high performance computing, and for specific application arrays such as bioinformatics that have particular IO patterns (particularly nasty IO patterns in many cases). GlusterFS has a client and server structure. Servers are the storage bricks with each server running a GlusterFSD daemon to export the local fs as a volume. The GlusterFS client process connects to the servers via a custom protocol over TCP/IP or InfiniBand and creates composite virtual volumes from multiple remote servers. Files can be stored whole or stripped across multiple servers. The final volume can be mounted by the client using FUSE with NFS v3 or via gfapi client library. Native protocols can be re-exported via NFS v4 server, SAMBA, OpenStack storage (Swift) protocol using the UFO (Unified File and Object) translator. A lot of utility for GlusterFS comes from translators including file-based mirroring and replication, file load balancing, file stripping, scheduling, volume failover, disk caching, quotas, and snapshots.
ADVANTAGES
- No metadata server – it is fully distributed
- Scales to 100s of petabytes
- High availability – uses replication and heals automatically
- Gluster has no kernel dependencies
- Easy to install and manage
DISADVANTAGES
- Lose of a single server will cause loss of access to all files stored on that server
- If files are larger than sub-volume the write will fail
BEST USES
- Can provide HA(high availability)
- You have petabytes of data to manage
- Simple to use, basically a scale-out NAS
- Clients talk to Gluster storage via NFS