Monday, September 25, 2006

Why Use GPFS?

So we have GPFS-enabled systems and an application that uses these file systems to store MILLIONS of very small files. GPFS is supposed to allow multiple systems to share a filespace and data, but we continually have issues with backups. A lot of it is due to the number of files (9+ million in one file system alone), but the memory utilization of the scheduler is also a problem. The system has only 4 GB of memory, and when the scheduler is running it consumes at least 1 GB. So my question is: wouldn't a NAS-based file system work just as well? Granted, the GPFS file systems look local on each server, and in all respects are local, but the difference can't be that big, especially when the majority of the data is under 10K in size (mostly under 1K, to be exact). So does anyone have experience with GPFS to state otherwise?
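For anyone who wants to check the file count and size spread on their own file system before weighing in, here is a minimal Python sketch. The function name `size_histogram` and the bucket boundaries are my own assumptions, picked to match the 1K and 10K figures above; nothing here is specific to GPFS or TSM, and walking millions of files this way will take a while.

```python
import os
import stat
from collections import Counter

# Hypothetical bucket boundaries in bytes, matching the 1K / 10K figures above.
BUCKETS = [(1024, "under 1K"), (10 * 1024, "1K to 10K")]
OVERFLOW = "10K and over"


def size_histogram(root):
    """Walk `root` and count regular files per size bucket."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat so symlinks are not followed
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks, sockets, device nodes, etc.
            for limit, label in BUCKETS:
                if st.st_size < limit:
                    counts[label] += 1
                    break
            else:
                # No bucket matched, so the file is 10K or larger.
                counts[OVERFLOW] += 1
    return counts
```

Calling `size_histogram("/some/mount/point")` (path is a placeholder) returns a `Counter` keyed by bucket label, and summing its values gives the total number of regular files found.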

2 comments:

  1. I'm not sure using NAS would help. It all depends on the filesystem being used, not the hardware hosting it, and GPFS is a filesystem rather than hardware hosting a filesystem.

    To offer the same functionality with a NAS box, wouldn't that basically mean running GPFS on the NAS box, and so not getting rid of GPFS at all?

  2. Actually, NDMP on NAS doing image backups is a possibility, just not a very good one. The overall performance of NDMP is not exactly wonderful, and doing file-level restores can also give one headaches.

    The 5.4 client is supposed to allow file-level restores from a regular TSM image backup.

    As for GPFS, it is more usable than NAS for many things because of its cluster-aware file locking (databases, shared application files, collaborative environments, HPC, etc.). Additionally, there is tuning that can be done to improve overall access and throughput for GPFS.

    With a NAS device you pretty much have a black box. Great as long as it works, but it takes a sledgehammer to get into it when it doesn't.
