Wednesday, January 18, 2012

TSM Client Scheduler Issue

I recently had an issue with a handful of TSM clients that would not run their backups. The clients all back up to a TSM 5.5.2 server, all run Windows 2008, and all use TSM client version 6.2.3. The five clients had been missing their backups for days. What makes the situation more interesting is that other Windows 2008 servers with the same TSM client version are running their schedules without issue.

When I reviewed the TSM schedule log, the scheduler reported that it had received the schedule information and was waiting for the TSM server to initiate the schedule. The TSM server never attempted to contact the clients in question and never showed any errors other than ANR2578W, stating that the client had missed its schedule. There were no errors in the client error log and not much to go on in the TSM server activity log. Even though the TSM client backs up over the public network, I switched to polling mode to see whether client-initiated backups would work. They didn't! The client scheduler would receive the schedule when it polled the TSM server but would never execute it. So now what? I added TCPCLIENTADDRESS and TCPCLIENTPORT and switched back to SCHEDMODE PROMPTED; still the scheduler would not run backups.

Now I was getting frustrated. I removed the scheduler service and redefined it using dsmcutil and voila, the schedule ran...ONCE! After that initial run the previous problem returned. Schedules were not running, and the TSM server showed no errors saying it could not contact the client. It just would not run the schedule. That left me no choice but to call support. IBM support's response was to make sure TCPCLIENTADDRESS and TCPCLIENTPORT were defined in the dsm.opt and also to define the client HLADDRESS and LLADDRESS on the TSM server. Define the HL and LL address? TSM gets that when the client connects, doesn't it? Yes and no! It appears that without these optional settings the TSM server can have issues contacting some clients. Why? No idea, but adding the HL and LL address did the trick, and the backups have been running without issue since.
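For anyone who wants to try the same fix, here is a rough sketch of the server side. The node name, IP address, and port are made up for illustration. I'm assuming the LLADDRESS should match whatever TCPCLIENTPORT is set to in the client's dsm.opt (1501 is the client default) and that the HLADDRESS should match TCPCLIENTADDRESS.

```
/* Hypothetical node name, address, and port: substitute your own values. */
UPDATE NODE WINCLIENT01 HLADDRESS=192.0.2.15 LLADDRESS=1501
/* Confirm what the server now has stored for the node. */
QUERY NODE WINCLIENT01 FORMAT=DETAILED
```

The same HLADDRESS and LLADDRESS parameters can be given on REGISTER NODE, which is how you would set them up front for new nodes.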

How many of you define the HLADDRESS and LLADDRESS when registering nodes? I never suspected it was needed until now.

Monday, December 19, 2011

DB2 Doesn't Make A Difference

I've been working with some IBM reps/consultants lately, and I find it kind of funny how they talk about TSM. We were discussing how some queries to the TSM DB are so hard to process that many times they don't return any data, when the IBM rep said, "With DB2 that won't happen." I laughed and said, "DB2 didn't help that much." For example, try something like this and see how long it takes to get a response.


select  cast(sum(b.file_size/1073741824) as decimal(18,2)) AS GB_SIZE from backups a, contents b where a.node_name in ('DEV01_ORA','DEV02_ORA','DEV03_ORA','PRD01_ORA','PROD02_ORA') and a.backup_date < '2011-11-01 00:00:00' and a.object_id=b.object_id

I'm running this query to determine the amount of space I would free up if I deleted old Oracle backup objects that the DBAs never reconciled through RMAN. I ran it over 30 minutes ago.....still waiting! The problem is that the schema of the TSM tables has not changed enough to make some select statements run any better than in pre-DB2 days. Anyone else seen this?


(Yes! I know that if I used a specific NODE_NAME then TSM would probably return some data, but DB2 handles queries 1000 times more complex than this in the non-TSM world.)
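For reference, the single-node version of the query would look something like this. It is only a sketch: it is the same select with the IN list swapped for one of the node names above, and I have not timed it, so whether it actually comes back faster is an assumption.

```sql
/* Same query restricted to one node; repeat per node and add up the results. */
select cast(sum(b.file_size/1073741824) as decimal(18,2)) AS GB_SIZE from backups a, contents b where a.node_name='DEV01_ORA' and a.backup_date < '2011-11-01 00:00:00' and a.object_id=b.object_id
```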

Tuesday, December 6, 2011

TSM 6.x Client Deployment

Has anyone used TSM's client deployment process/function? How well does it work? Is it worth the effort? I have a lot of servers we need to install TSM on and would like to use it if it works.

Wednesday, November 16, 2011

Tivoli Storage Manager Reporting and Monitoring v6.3

This is a query from the TSM v6.3 agent:

select node_name, count(distinct volume_name) from volumeusage a, stgpools b where (a.stgpool_name=b.stgpool_name) and devclass in (select DEVCLASS_NAME from devclasses where devtype in ('3570','3590','3592','4MM','8MM','DLT','DTF','ECARTRIDGE','GENERICTAPE','LTO','QIC')) group by node_name

Could you run it on a TSM v5.x.x.x production system for me?

Do you get any result in 10 minutes?

Tuesday, October 18, 2011

Simple TSM 6.2 Server Restore

I just completed a DR test and we had to restore one of our TSM servers from a Data Domain replicated copy. This was our first time restoring a TSM server from a replicated DD copy and after importing the replicated volumes and defining our initiators we set about restoring the TSM database. Our AIX server had been restored from an image (SysBack) and we had a current volhist and devconfig file so we began our restore. If you think that the restore from a Data Domain is not relevant to your environment because you use tape, think again. The Data Domain mimics an STK library with IBM drives and so we had to follow the same directions as anyone using tape backup.

To restore the TSM 6.x DB from tape you must have your volhist and devconfig files. You will need to modify the devconfig so that the only lines are those defining the devclass, server name, and server password; all other lines should be deleted. Then you need lines defining a manual library, a tape drive, and a line defining a path to the drive (which for us was an LTO3 drive).


DEFINE LIBRARY MANLIB LIBTYPE=MANUAL
DEFINE DRIVE MANLIB DRIVE1 ONLINE=YES 
DEFINE PATH TSMSERV1 DRIVE1 SRCT=SERVER DESTT=DRIVE LIBR=MANLIB DEVICE=/dev/rmt1 ONLINE=YES

Note: Do not define an element address or serial number with the drive; TSM will detect these when you run the DSMSERV RESTORE DB command.
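Put together, a trimmed devconfig might look roughly like the sketch below. The device class name and its parameters are hypothetical, the server password line is a placeholder for the encrypted value already in your file, and I'm assuming the device class's LIBRARY parameter has to point at the manual library for the mount request to go to it.

```
/* Hypothetical example: device class name and parameters are illustrative. */
DEFINE DEVCLASS LTOCLASS DEVTYPE=LTO FORMAT=DRIVE MOUNTLIMIT=DRIVES LIBRARY=MANLIB
SET SERVERNAME TSMSERV1
/* Keep the SET SERVERPASSWORD line exactly as it appears in your original file. */
SET SERVERPASSWORD <encrypted value from the original devconfig>
DEFINE LIBRARY MANLIB LIBTYPE=MANUAL
DEFINE DRIVE MANLIB DRIVE1 ONLINE=YES
DEFINE PATH TSMSERV1 DRIVE1 SRCT=SERVER DESTT=DRIVE LIBR=MANLIB DEVICE=/dev/rmt1 ONLINE=YES
```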

When you run the DSMSERV RESTORE DB command, TSM starts up and reads the devconfig file to retrieve the information on the devclass, drive, library type, server name, and password. Once TSM has successfully queried the tape drive, it reads the volhist file to find the DB backup volume it needs, depending on whether you are restoring to the most current date or to a specific point in time. When TSM has identified the volume to use, it prompts you to mount the tape. When I saw the mount request I went into the Data Domain web-based GUI and moved the DB backup volume from its "virtual slot" to the drive that is /dev/rmt1. Once the tape was mounted, TSM recognized that it had been loaded and began restoring the DB. If more than one tape is required to complete the restore, TSM will prompt you for each tape. With the library web GUI available you can move the tapes as needed and complete the restore. Once the restore completes you can bring TSM back up and audit/fix anything that could be out of sync. With the switch to DB2 I was expecting a little more work to get TSM back up and running, but surprisingly it was quite simple.

Now, if you don't have a SysBack image of your TSM server, the rebuild can take a lot longer and requires you to recreate some of the DB2-dependent files. I might have to do a BRM restore without an image in the near future, and if I do I'll post a step-by-step process for everyone. If anyone has already done this and would like to post the process on TSMAdmin, let me know.



Sunday, October 2, 2011

Thursday, September 15, 2011

Double The Trouble

The TSMDBMGR stanza was accidentally removed from a TSM 6.x server, and the log space filled because DB backups could not run. Upon bringing TSM back up (which is a nightmare with DB2), TSM support told my coworker she had to take two DB backups to clear the logs. I have heard this since 6.x came out, but it turns out the reason is a bug in TSM that has not been fixed. So IBM's solution for now is that two DB backups clear the logs. You'd think that would be fixed by now.....go figure!
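In practice that workaround would be run something like the sketch below. The device class name is hypothetical, and using TYPE=FULL for both backups is my assumption, since support only said "two DB backups".

```
/* Hypothetical device class name; use the one your DB backups normally go to. */
BACKUP DB DEVCLASS=LTOCLASS TYPE=FULL WAIT=YES
BACKUP DB DEVCLASS=LTOCLASS TYPE=FULL WAIT=YES
/* Check log utilization after the second backup completes. */
QUERY LOG FORMAT=DETAILED
```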