Tuesday, October 9, 2007

Data DeDuplication - Been There Done That!

I just got off a pretty good NetApp webcast covering their VTL and FAS solutions. One of the items they discussed was the data deduplication feature in their NAS product. When the IBM rep spoke up, they discussed TSM's progressive incremental backup approach, and I find it interesting to contrast TSM's process with the growing segment of disk-based storage built around deduplication. The feature really does save TONS of space with the competing backup tools, since they usually follow the FULL+INC model and back up files even when they haven't changed. Deduplication saves them room by removing the duplicate unchanged files, but that only shows how superior TSM is, in that it doesn't require this kind of wasted processing in the first place. What would be interesting is to see how much space is saved on redundant OS files, but that is still minor compared to the weekly full process that wastes so much space.
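Out of curiosity, here is a quick sketch of what file-level dedup is actually doing under the covers: hash the content, store each unique blob once, and count everything else as space reclaimed. This is just an illustration in Python, not anything vendor-specific, and the staging directory path is made up.

import hashlib
import os

def file_digest(path: str) -> str:
    """SHA-256 of a file's contents, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def dedup_savings(root: str) -> int:
    """Bytes a file-level dedup engine would avoid storing a second time."""
    seen = set()         # content hashes already stored once
    duplicate_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                digest = file_digest(path)
                size = os.path.getsize(path)
            except OSError:
                continue  # unreadable or vanished file, skip it
            if digest in seen:
                duplicate_bytes += size
            else:
                seen.add(digest)
    return duplicate_bytes

if __name__ == "__main__":
    saved = dedup_savings("/backup/staging")  # hypothetical staging directory
    print(f"File-level dedup would reclaim roughly {saved / 2**30:.1f} GiB")

Run it against a directory full of weekly fulls and the reclaimed number gets big fast; run it against a progressive incremental store and there is very little left for dedup to find, which is the whole point.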

This brings us to the next item: disk-based backup. This is definitely going to grow over time, but costs are going to have to come down for it to fully replace tape. The two issues I see with disk-only backups are DRM/portability and capacity/cost. If you cannot afford duplicate sites with the data mirrored, you are left using a tape solution for offsite storage. Portability can also be an issue with disk. For example, we are migrating some servers from one data center to another and used TSM's export/import feature; we have also moved TSM tapes from one site to another and rebuilt the TSM environment. To do this with disk is more time consuming: you would need the same disk solution and the network capacity to mirror the data (slow on a limited connection), or you would have to move the whole hardware solution. Tape in this scenario is a lot easier to deal with.

Now when it comes to capacity vs. cost, there is a definite difference that will keep many on tape for years to come. Many customers want long-term retention of their data, say 30+ days for inactive files and TDP backups (sometimes longer for e-mail and SARBOX data). So what is the cost comparison for that kind of disk retention (into the petabytes) compared to tape? Currently it's no contest and tape wins in the cost vs. capacity realm, but hopefully that can someday change. So if any of you have disk-based or VTL solutions, chime in. I'd like to hear what you have to say and how it's worked for you.
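If you want to put rough numbers behind the capacity vs. cost argument, it is simple multiplication once you plug in your own per-TB figures. The prices in this little sketch are placeholders, not quotes; substitute whatever your vendors are charging.

def retention_cost(capacity_tb: float, price_per_tb: float, copies: int = 2) -> float:
    """Total media/storage cost for retained data, including extra copies
    (e.g. an offsite or mirrored copy)."""
    return capacity_tb * price_per_tb * copies

if __name__ == "__main__":
    capacity_tb = 1000.0        # 1 PB of retained backup data
    disk_price_per_tb = 0.0     # placeholder: your disk/VTL cost per usable TB
    tape_price_per_tb = 0.0     # placeholder: your tape media + slot cost per TB
    print("Disk:", retention_cost(capacity_tb, disk_price_per_tb))
    print("Tape:", retention_cost(capacity_tb, tape_price_per_tb))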

Wednesday, October 4, 2006

EMC & NDMP

Does anyone know the answer to this question? I have an EMC Clariion, and it looks like the easiest way to back up some problem file systems is via NDMP. Does the EMC Clariion support NDMP? Is there a specific configuration that does, or do they all support it natively? Does it need a Celerra NAS head? Any information from EMC-knowledgeable people is appreciated.
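In the meantime, one quick thing anyone can check from the TSM server side: NDMP agents conventionally listen on TCP port 10000, so a simple connection test against the NAS head's address will at least tell you whether an NDMP service is answering. This is only a reachability probe, not an NDMP handshake, and the host name below is made up.

import socket

NDMP_PORT = 10000  # conventional NDMP listener port

def ndmp_port_open(host: str, port: int = NDMP_PORT, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    host = "celerra-dm.example.com"  # hypothetical data mover / NAS head address
    print(f"NDMP port {NDMP_PORT} open on {host}: {ndmp_port_open(host)}")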

Monday, September 25, 2006

Why Use GPFS?

So we have GPFS-enabled systems and an application that uses these file systems to store MILLIONS of very small files. The GPFS file system is supposed to allow multiple systems to share a filespace and data, but we continually have issues with backups. A lot of it is due to the number of files (9+ million in one file system alone), but also the memory utilization of the scheduler. The system has only 4 GB of memory, and when the scheduler is running it consumes at least 1 GB. So my question is: wouldn't a NAS-based file system work just as well? Granted, the GPFS file systems look local on each server, and in all respects are local, but the difference can't be that big, especially when the majority of the data is under 10 KB in size (mostly under 1 KB, to be exact). So does anyone have experience with GPFS to state otherwise?
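For anyone curious just how lopsided the file-count vs. data-volume ratio is, a quick walk of the file system tells the story. Here is a rough Python census sketch; the mount point is just an example.

import os

def census(root: str, small_cutoff: int = 10 * 1024):
    """Count files, total bytes, and files under small_cutoff bytes."""
    total_files = 0
    total_bytes = 0
    small_files = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # skip files that vanish or can't be stat'd
            total_files += 1
            total_bytes += size
            if size < small_cutoff:
                small_files += 1
    return total_files, total_bytes, small_files

if __name__ == "__main__":
    files, size, small = census("/gpfs/appdata")  # hypothetical mount point
    print(f"{files:,} files, {size / 2**30:.1f} GiB total, "
          f"{small:,} files under 10 KB")

When nearly all of the files fall under the cutoff, the backup cost is dominated by per-file overhead (scanning, metadata, scheduler memory), not by moving data, which is exactly the pain we are seeing.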

Friday, October 14, 2005

NetApp TOC Issues

We recently found out that TOC file creation in TSM can fail when the NetApp volume has special characters in a filename. This has led people to believe that the backups are unsuccessful and our group would be unable to restore data. That assumption could not be farther from the truth: we can still restore an individual file, we just can't load a graphical representation into the web-based TSM client. Anyway, Tivoli's response was that we could identify the problem file because, when TOC creation fails, the error reports the filename that caused it. We would have to do this hundreds of times, though, since we have already identified at least 400+ files with special characters on our own. So I have good backups, I just can't restore them easily; the question then is how TSM reacts when trying to restore files with special characters.
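Rather than finding them one failed TOC at a time, a quick scan of the volume (mounted over NFS, say) will flag every filename with characters outside a conservative set up front. This is only a rough sketch, and the mount point is made up.

import os
import string

# Conservative "safe" character set; anything outside it gets flagged.
ALLOWED = set(string.ascii_letters + string.digits + " ._-")

def has_special_chars(name: str) -> bool:
    """True if the filename contains characters outside the allowed set."""
    return any(ch not in ALLOWED for ch in name)

def find_special_files(root: str):
    """Yield full paths of files whose names contain special characters."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if has_special_chars(name):
                yield os.path.join(dirpath, name)

if __name__ == "__main__":
    for path in find_special_files("/mnt/netapp_vol"):  # hypothetical mount
        print(path)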