Wednesday, January 14, 2009

ART: Restore Testing software for TSM

We have a new product that does something unusual.

ART (Automated Restore Testing for TSM) test-restores a random sample of files. And it does this for every node at your site, automatically, on a schedule.

We don't think anyone has ever done this kind of comprehensive restore testing before.

ART has uncovered problems at every customer site it has tested. The problems are usually operational, and often easy to fix.

If you'd like to give the free trial a whirl (it's full-featured but limited to 20% of your nodes), go to www.tsmworks.com/download. We appreciate feedback from TSM experts out there.
Thanks!

Wednesday, January 7, 2009

TSM & File System Support

First off, I hope everyone had a good holiday season. Now that we can focus on work again, I wanted to discuss file system types. I just had an incident where a Solaris server was using ZFS for some newer file systems. The admins had added them without consulting us, and we didn't catch it because TSM didn't even attempt to back them up. Our client level was 5.4.1.0, and ZFS support was added with the 5.4.1.2 update. Once I updated the client, the file systems were backed up successfully and showed the correct format. We did see that one file system was returning a type of UNKNOWN, and that should have alerted us, but we were not receiving errors or failures on the backup of the server in question.

So here is the question: how do you keep something like this from happening in the future as newer, more bleeding-edge file system types are added? Obviously you need to tell your Unix admins to work with you whenever they add a newer file system type, but if they don't alert you, and TSM doesn't report failures, how would you know? It's bound to happen as the Linux community adds newer, more robust file system types. Other than staying as current as possible with my TSM client levels (which won't always be the fix), what would you suggest?
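One possible way to catch this without relying on the admins is to compare each server's mount table against a list of file system types the installed client level is known to handle. Below is a minimal Python sketch of that idea, not a tested procedure. It assumes a mount table in the device / mount point / type column order used by /proc/mounts on Linux. The function names and both type lists are hypothetical placeholders; the supported list would have to be maintained by hand from the client documentation.

```python
# Hypothetical helper: flag mounted file systems whose type is not on a
# list the backup client is known to handle.

# Assumption: this set is maintained by hand from the client documentation
# for the installed level; the entries here are placeholders only.
SUPPORTED_TYPES = {"ufs", "vxfs", "ext3", "jfs2"}

# Pseudo file systems that are never backup candidates (illustrative list).
IGNORED_TYPES = {"proc", "sysfs", "tmpfs", "devpts", "devfs", "swap"}


def parse_mounts(text):
    """Parse mount-table lines laid out as: device, mount point, type, ..."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank, short, or commented lines.
        if len(fields) < 3 or line.lstrip().startswith("#"):
            continue
        mounts.append((fields[1], fields[2].lower()))
    return mounts


def unsupported_mounts(text, supported=SUPPORTED_TYPES, ignored=IGNORED_TYPES):
    """Return (mount_point, fs_type) pairs the backup client may skip."""
    return [
        (mount_point, fs_type)
        for mount_point, fs_type in parse_mounts(text)
        if fs_type not in supported and fs_type not in ignored
    ]


if __name__ == "__main__":
    # Made-up sample input, not output from a real server.
    sample = """\
/dev/dsk/c0t0d0s0 / ufs rw 0 0
pool1/data /data zfs rw 0 0
proc /proc proc rw 0 0
"""
    for mount_point, fs_type in unsupported_mounts(sample):
        print(f"WARNING: {mount_point} is type {fs_type}; verify it is being backed up")
```

Run from cron on each client, something like this would have flagged the ZFS mounts the day they appeared. It only tells you that a type is unfamiliar, not whether TSM actually backed it up, so it complements checking the backup results and does not replace it.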

Monday, November 24, 2008

LAN-Free Unknown Feature

My boss brought this article to my attention, so I thought I would pass it on. It discusses an unknown or little-discussed capability of LAN-Free to act as a "pass-thru" server for other clients. In other words, a normal client can connect to TSM through the LAN-Free agent that resides on another server. Basically the Storage Agent becomes a dummy TSM server. This can be helpful when you have backups that need to go to tape directly but the network connection between the client and the TSM server is hurt by inadequate bandwidth between switches or by firewall issues. You can check the article out here. I remember seeing this discussed on ADSM.org before, but have never attempted to try it.

Tuesday, November 18, 2008

Texas Government Pulls Plug on IBM

Well, if you get a chance, check out this article on ZDNet. It looks like IBM is losing the Texas government account due to backup failures and lost data. Having previously worked for IBM, what I would like to know is what is not being said. When IBM takes over an account on an outsourcing deal, they have to assume the horrible practices already in place, then do their best to convert to a better software/hardware solution and better processes. For example, when I was with IBM we had a remote site that was using Arcserve for their backups and had two weeks' worth of tapes. A couple of months into the contract they lost a server and needed to restore the data and guess what...they couldn't. The process in place was for them to rotate two weeks' worth of tapes continually. The tapes were four to five years old and had never been tested. They found out after the fact that the tapes had gone bad some time in the past, and they were pretty much out of luck. Is this IBM's fault? They took some of the blame even though IBM was only a couple of months into the contract and the smaller remote sites were secondary in the process change timeline.

The Morning News did report the following:

In a Nov. 3 letter to the governor’s office, IBM acknowledges the company overreached by assuming responsibility for existing technological conditions that are inadequate, inconsistent and not sustainable.

Saturday, October 25, 2008

AIX Image Backup Performance Issue

I am running 3 simultaneous image backup jobs on an AIX server, and the HBA shows only 20% utilization and my throughput sucks. I am using -imagetype=static, and the file systems in question are EMC clones split off and remounted on a new server. When I have done image backups before they were wicked fast...now not so much. Any ideas what I might have missed? I have an ACSLS STK 8500 with LTO-3 drives. Even over gig-ether I've seen better performance. We did check the CPU and memory usage and they are fine. Lots of I/O wait; not sure why I would see that with static image backups.

Wednesday, October 22, 2008

TSM Facebook Group

If you're on Facebook and would like to network with other TSM admins, check out the new TSM group. It's currently looking to grow and gives us all a place to network for information, jobs, tips, and just getting to know each other. Thanks to Henrik Wils for creating it!

Thursday, October 16, 2008

EMC Backup Advisor

I am currently working on a project to install and set up EMC Backup Advisor in our environment. If anyone has experience with the product, I'd like to hear your thoughts and opinions. So far it seems like it can monitor almost anything. Hopefully it doesn't become such a huge/complex tool that we'll spend half our time trying to manage it.