Monday, November 27, 2006
Well, I am still having issues with GPFS. It turns out mmbackup won't work with a filesystem this size either, and a chat with IBM support was not encouraging. Here is what one of our system admins found out:
The problem was eventually resolved by IBM GPFS developers. It turns out they never thought their filesystem would be used in this configuration (i.e. 100,000,000+ inodes on a 200GB filesystem). During the time the filesystem was down, we tried multiple times to copy the data off to a different disk. Due to the sheer number of files on the filesystem, every attempt failed. For instance, I found the following commands would have taken weeks to complete:
# cd $src; find . -depth -print | cpio -pamd $dest
# cd $src; tar cf - . | (cd $dest; tar xf -)
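One variation that might have helped: split the copy by top-level directory and run the streams in parallel instead of walking the whole tree in one pass. A rough sketch, assuming $src and $dest as above and top-level directories of roughly comparable size:

cd "$src"
for d in */; do
    # one tar stream per top-level directory, run in the background
    ( tar cf - "$d" | ( cd "$dest" && tar xf - ) ) &
done
wait    # block until every stream finishes

Loose files and dot-directories at the top level would still need a separate pass, and the win is bounded by how evenly the data is spread across those directories.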
Even with the snapshot, I don't think TSM is going to be able to solve this one. This will probably need to be done at the EMC level, where a bit-level copy can be made.
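At the host level, a bit-level copy is just a raw block copy of the underlying device, something like the line below (device names invented for illustration; a real EMC-level clone would use the array's own tools):

# dd if=/dev/emcpower_src of=/dev/emcpower_dest bs=1024k

The appeal is that a block copy runs in time proportional to the 200GB of capacity, not the 100 million inodes, which is exactly why the file-level copies above fall over.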
So GPFS is not all it was thought to be. Pass it along, and make sure you avoid GPFS for applications that will produce large numbers of files.
Monday, November 6, 2006
CDP For Unix?
Has anyone heard of when (if ever) Continuous Data Protection for Files will be available on the Unix platform? I could really use this feature with my GPFS system. Since the application creates hundreds of metadata files daily and is proprietary (hence no TDP support), I am getting killed by the backup timeframe: each volume already has in excess of 4 million files, and incrementals take close to 48 hrs. to finish. Anyone heard anything at symposiums or seminars?
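In the meantime, the only stock client options I can think of that might take the edge off are the ones below (the mount point paths are just placeholders for my GPFS directories):

* dsm.sys server stanza additions
* scan one directory at a time instead of holding the whole tree in memory
MEMORYEFFICIENTBACKUP  YES
* allow more parallel producer/consumer sessions with the server
RESOURCEUTILIZATION    5
* treat the biggest subtrees as their own filespaces so they back up independently
VIRTUALMOUNTPOINT      /gpfs/data/subdir1
VIRTUALMOUNTPOINT      /gpfs/data/subdir2

None of that changes the fundamental problem that an incremental still has to walk every inode, which is why CDP looks so attractive.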
Monday, September 25, 2006
Why Use GPFS?
So we have GPFS-enabled systems and an application that uses these file systems to store MILLIONS of very small files. The GPFS file system is supposed to allow multiple systems to share a filespace and data, but we continually have issues with backups. A lot of it is due to the number of files (9+ million in one file system alone), but also the memory utilization of the scheduler. The system has only 4 GB of memory, and when the scheduler is running it consumes at least 1 GB. So my question is: wouldn't a NAS-based file system work just as well? Granted, the GPFS file systems look local on each server, and in all respects are local, but the difference can't be that big, especially when the majority of the files are under 10K in size (mostly under 1K, to be exact). Anyone have experience with GPFS to state otherwise?
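The stopgap I keep coming back to is carving the backup up by directory and running several incrementals side by side instead of one monolithic scheduled pass. A rough sketch, with the directory names invented:

#!/bin/sh
# one incremental per top-level application directory; more than a few
# at once is probably all 4 GB of memory will tolerate
for d in /gpfs/fs1/app1 /gpfs/fs1/app2 /gpfs/fs1/app3; do
    dsmc incremental "$d" -subdir=yes >> /tmp/dsmc_$$.log 2>&1 &
done
wait

Each dsmc process still has to hold its own directory tree in memory, so this trades the scheduler's 1 GB footprint for several smaller ones; whether that nets out as a win depends on how the files are spread.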