Showing posts with label TSM server.
Monday, January 12, 2015
Poll Results
The TSM server usage poll closed January 1st and the results were interesting. See for yourself, but I was struck by how much more use 6.3 and 7.1 get than 6.4. I guess it's not worth going to 6.4 now that 7.1 is out?
Friday, June 10, 2011
TSM Server V6 AIX Install/Upgrade Gotchas
After setting up numerous TSM 6.1 and 6.2 servers, here are a few of the little gotchas we've hit. None of them were upgrade stoppers, but they did cause some headaches while we figured out what was causing the errors. (A quick command-line sanity check follows the list.)
- Check the ATAPE version (recommend 11.x)
- Check your xlC C++ runtime level (recommend 9.0.0.8 or greater)
- On AIX, IOCP must be set to Available; otherwise you'll have to update the OS setting and reboot the server.
- Make sure the user ID that the TSM 6.x instance runs under has its ulimits set to unlimited. It's a real pain to discover this when you go to create disk volumes; that one was my absentminded moment!
- Don't forget that the tsmdbmgr.log file needs to be owned by the new instance ID, not root.
- Also, when using raw disk volumes for TSM disk pools, chown the device file (example: /dev/rtsmdata01) to the new user ID or TSM will report it as unavailable.
- During a recent upgrade we could not get the TSM DB backup to run without an error. It turned out the TSM client's dsmtca file had accidentally been chowned to the TSM server's user ID, and it MUST BE OWNED BY ROOT for the TSM DB backup to succeed.
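Most of these can be checked from the AIX command line before you start. Here's the rough checklist I'd run; note that the fileset names, instance user, and file paths below are only examples and may differ on your system:
lslpp -l Atape.driver                         # Atape driver level (want 11.x or later)
lslpp -l | grep -i xlc                        # xlC C++ runtime level (want 9.0.0.8 or greater)
lsdev -Cc iocp                                # IOCP must show as Available
su - tsminst1 -c "ulimit -a"                  # instance user's ulimits should be unlimited
ls -l /home/tsminst1/tsmdbmgr.log             # should be owned by the instance ID, not root
ls -l /dev/rtsmdata01                         # raw disk pool devices owned by the instance ID
ls -l /usr/tivoli/tsm/client/ba/bin/dsmtca    # must stay owned by root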
Monday, September 1, 2008
Defaults should never be trusted
I've performed a few audits recently and this one keeps popping up, again and again!
Please be aware that the defaults provided when defining a new storage pool will turn on collocation by GROUP. If this default is specified and no collocation groups are defined then the TSM server will collocate data by NODE!
Pleasant little surprise eh?
COLlocate
Specifies whether the server attempts to keep data belonging to a single client node, group of client nodes, or client file space stored on as few volumes as possible. This parameter is optional. The default value is GROUP.
If you specify COLLOCATE=GROUP but do not define any collocation groups or if you specify COLLOCATE=GROUP but do not add nodes to a collocation group, data is collocated by node.
It never hurts to double-check everything. Here are some queries I use all the time to sanity-check collocation (replace the stgpool_name as needed):
- select stgpool_name, access, collocate, maxscratch, reusedelay from stgpools
- select count(distinct node_name) as "Number of Nodes", volume_name from volumeusage where stgpool_name like '%_TAPE' group by volume_name order by "Number of Nodes" desc
- select count(distinct volume_name) as "Number of Tapes", node_name from volumeusage where stgpool_name like '%_TAPE' group by node_name, stgpool_name order by "Number of Tapes" desc
- select count(distinct volume_name) as "Number of Tapes", node_name, stgpool_name from volumeusage group by node_name, stgpool_name order by "Number of Tapes" desc
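If you don't want to rely on the default at all, state collocation explicitly when you define or update the pool. Here's a sketch with made-up pool and device class names (only switch to COLLOCATE=GROUP once collocation groups actually exist):
- define stgpool OFFSITE_TAPE LTOCLASS pooltype=primary collocate=node maxscratch=200 reusedelay=5
- update stgpool OFFSITE_TAPE collocate=group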
Wednesday, April 16, 2008
TSM DB Dump Load Audit
There's nothing quite like the smell of TSM database structure issues in the morning. Only run through this procedure if you are sure you have DB structure issues.
Here are the instructions on running a dump/load/audit.
1. Make a copy of the following files:
- dsmserv.opt
- dsmserv.dsk
- volhist (volume history file)
- devconfig (device configuration file)
2. Set the following options in your dsmserv.opt file:
EXPINTERVAL 0
DISABLESCHEDS YES
NOMIGRRECL
3. Define a file devclass un...
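The rest of step 3 is cut off in the archive, but for reference, a FILE device class definition generally looks something like this (the class name, directory, capacity, and mount limit are placeholder values):
define devclass DUMPFILE devtype=file directory=/tsmdump maxcapacity=2G mountlimit=10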
Labels: database, DRM, Restore, Server Recovery, structure, TSM, TSM server
Thursday, June 22, 2006
The Linux Big But!
“I love Linux, I support Linux, I would love to use Linux, but….” You won't hear this just from me. The problem is a frustrating one. I have a number of Linux TSM servers and they have become somewhat of a problem. For example, we decided to deploy a Linux TSM server and proceeded to set up the site. Problems arose when we attempted to connect the library: as it turned out, the version of SUSE was down-level and the driver would not work. So we upgraded Linux and the library worked fine, but (there it is again) the RAID controller for the server would no longer work, and there was no updated driver for it. This has been and continues to be the problem with Linux. Sure, companies beat their chests and yell, “We support Linux!” The problem is that it's limited support, and we the users who implement it get left with what turns out to be a patchwork solution. I'm sure many will say, “But I have Linux working and it works fine.” Great! The problem is in the hardware choices available to you when architecting the solution. I can be pretty sure that I won't run into too many problems when solutioning for AIX, HP-UX, Solaris, or Windows. With Linux you have to do twice the homework and hope the hardware company keeps its drivers up to date. That's just too risky for the data. Don't get me wrong, it has gotten better, but there is a long way to go before I would commit to using Linux again.
Thursday, May 4, 2006
UPDATE on NDMP Problem!
Support has identified that the NDMP problem exists in TSM 5.3.3 if the ONTAP version is lower than 7.1. If you have NetApp filers, make sure your ONTAP version is at 7.1 or higher before you upgrade your TSM server to 5.3.3 or later.
Tuesday, November 22, 2005
Setting Up A Secondary TSM Instance
Someone requested a post on how to set up a secondary instance of TSM on a UNIX server, so here is the skinny on how it's done:
First create a directory where the config files for the new server will be stored.
mkdir /usr/tivoli/tsm/serverb
mkdir /usr/tivoli/tsm/serverb/bin
Then copy the dsmserv.opt file over and modify the needed settings in it, such as devconfig and volhist, so they point into the new directory. Next, create the DB and log volumes this instance will use. Once those are created, export the following environment variables:
export DSMSERV_CONFIG=/usr/tivoli/tsm/serverb/bin/dsmserv.opt
export DSMSERV_DIR=/usr/tivoli/tsm/serverb/bin
export DSMSERV_ACCOUNTING_DIR=/usr/tivoli/tsm/serverb/bin
Now you can run the dsmserv format command to initialize the DB and log volumes; it will create the dsmserv.dsk file in the serverb directory. Make sure you also run the dsmserv runfile commands to load the scripts and web images (even on TSM 5.3). The final step is to create the startup script so that TSM initializes correctly. Here is our script:
#!/bin/ksh
exec > /tmp/libserv.out 2>&1 #optional -> sends output to out file
ulimit -d unlimited
export DSMSERV_CONFIG=/usr/tivoli/tsm/serverb/bin/dsmserv.opt
export DSMSERV_DIR=/usr/tivoli/tsm/serverb/bin
export DSMSERV_ACCOUNTING_DIR=/usr/tivoli/tsm/serverb/bin
print "$(date '+%D %T') Starting Tivoli Storage Manager Server"
cd /usr/tivoli/tsm/serverb/bin
dsmserv
We use the following command to start TSM so we don’t have to deal with nohup:
echo "/usr/tivoli/tsm/serverb/bin/rc.adsmserv" | at now
This uses the at command to run the script immediately. You can then edit the inittab and add a line to start this instance on boot, or put a script in the run-level startup folders; your choice. You should now be ready to run the second instance. One note to those thinking of doing this and sharing a library: Tivoli recommends that you create a TSM instance that is just a library manager, with no clients and no real work other than handling the library and tape mounts. I agree with this, and it has made library issues a lot easier to manage. Not knowing how large the DB could get, I gave it 2GB; it is currently 3.8% utilized and has been in place for over a year and a half. Swapping a library manager from one system to another is not as hard as it would seem, so consider it, and if anyone wants docs on how to do the switch, let me know and I'll post it.
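For reference, here's a rough sketch of what the volume creation/format step and the boot entry can look like on AIX. The volume file names, sizes, and inittab identifier are made up, so adjust for your environment (and skip dsmfmt if you use raw logical volumes):
cd /usr/tivoli/tsm/serverb/bin                 # with the DSMSERV_* variables above exported in this shell
dsmfmt -m -log /usr/tivoli/tsm/serverb/bin/log01.dsm 1024      # 1GB recovery log volume
dsmfmt -m -db /usr/tivoli/tsm/serverb/bin/db01.dsm 2048        # 2GB database volume
dsmserv format 1 /usr/tivoli/tsm/serverb/bin/log01.dsm 1 /usr/tivoli/tsm/serverb/bin/db01.dsm
mkitab "tsmb:2:once:/usr/tivoli/tsm/serverb/bin/rc.adsmserv >/dev/console 2>&1"   # start on boot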
Wednesday, July 27, 2005
Volume History Issues
I have a very large shared library environment and currently have three TSM instances connecting to a 9-frame 3584. When I first configured this library, one of the three instances acted as the library controller (two of the instances are on the same server; the third is on another system). We ran into numerous problems whenever we needed to do library maintenance, since it adversely affected the server acting as library manager, which was also a production TSM server for backups. So, after a suggestion from Tivoli that we at least create a dedicated library manager instance, we did so. It has helped a lot with handling tape issues, but one issue we were unaware of, and had difficulty resolving, was that the volume history on the old library manager would not release volumes listed as REMOTE. The new library manager could not force the old manager to "let go" of the tapes, so they were never freed back into the scratch pool. We tried deleting all volumes of TYPE=REMOTE, but it said it needed more parameters. Here is an example:
DELete VOLHistory TODate=TODAY Type=REMOTE FORCE=Yes
ANR2022E DELETE VOLHISTORY: One or more parameters are missing.
So no luck using that to free up all the REMOTE tapes. I looked through the documentation on deleting volume history information and found nothing on REMOTE volumes. You'll also notice the documentation says nothing about deleting individual volumes. So we were in a serious jam. I looked up the issue on ADSM.org and found the following command posted by a contributor:
DELete VOLHistory TODate=TODAY Type=REMOTE VOLume=NT1904 FORCE=Yes
This command let us delete the individual volume and freed the tape to return to scratch status. Thank goodness for search.ADSM.org or I'd never find the answer to half my problems. The command worked as advertised: the tapes were deleted from the old library manager's volume history and went back to scratch.
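If you want to see which REMOTE entries are still hanging around before deleting them one at a time, a select against the volume history table helps. A sketch (I'm assuming the VOLHISTORY table's DATE_TIME, TYPE, and VOLUME_NAME columns and a TYPE value of REMOTE; verify against your server with a quick select distinct type from volhistory):
select date_time, volume_name from volhistory where type='REMOTE' order by date_time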
Monday, July 25, 2005
ETA Please!
Well, I just had to perform a restore for a small server, and the speed at which the restore ran was atrocious. I mean, it was slower than rush hour in LA. Let's discuss, and THIS TIME I WANT AND REQUEST FEEDBACK! It turned out a disk went bad on a web server and the system admins requested a number of filesystem restores. The combined amount was about 22-25GB. OK! No problem! Well, that probably would have been the case if the restore request had come in during the day, but it came in at night and the restore was competing with the nightly backups. Over a gigabit Ethernet (fiber) connection I was able to get 1.3MB/s and an aggregate rate of 668KB/s. Do the math and it took a long time. The other thing that didn't help was that it was a web server with TONS of little objects; it's livable, but there were a lot of small files. The problem was everyone and their brother wanted an ETA: "How long? It's small! It should only take a couple hours max!" and so on. Now people want some solution to this situation, but of course the problem will be keeping it somewhat cheap. Even though everyone asks if we can halt the backups while we perform the restores, we all know that's not really a viable option, so I came up with this idea; tell me what you think. Since major restores are few and far between, I am proposing we create a new VLAN and run a single cable to each row of servers in the server room, with enough slack to stretch to any server in the row. If a restore is required, we simply plug in the "restore" connection, set an IP, and away it rips. When finished, we put the system back on its assigned backup network and roll up the excess Ethernet cord into the rack of the server in the middle of the row. I am only thinking of this for major restores, and since I am not requesting that we buy more NICs, I think it's doable. Let me know what restore process you have in place for when the network is saturated. I'd love suggestions!
Tuesday, July 19, 2005
TSM Server Upgrade Issue
Just this last week a co-worker and I were trying to get a TSM 5.3 upgrade working. The server kept saying it needed the dsmserv upgradedb command run against it, and every time we tried, the DB upgrade failed with a TIVGUID error. Since it was eating into backup times we rolled back to 5.2.4. Unfortunately, the failed upgrade left 5.2.4 reporting that the DB was at a higher level and TSM could not work with it, even though the 5.3 upgrade itself had never completed. Since support had not responded at that point, we restored the DB from a full+incremental taken just before the upgrade (WHEW! LUCKY WE HAD THAT!). When Tivoli finally responded, this is what we were told:
This is a known problem with the 5.3.0 upgrade.
From TSM Support:
The problem that you are probably experiencing is a known problem with the 5.3 upgrade. If an admin has an expired password, or if a password is too short for the 5.3 password enforcement, then the upgrade can fail. Here are some steps that usually fix the problem:
1) Disable AES using the hidden option AllowAES No in dsmserv.opt file.
2) Re-initiate the upgrade db
3) Start the server. Preferably lock out all sessions & other activity
4) Use show node to identify admin & node ids that have expired or have passwords that are too short. Fix these.
(4 Alternative) Set the minimum password length to 0
5) Halt the server
6) Remove the AllowAES option
7) Start the server -- this will upgrade the passwords to AES encryption in the background
If you follow these steps then the upgrade db will probably continue successfully.
After this, start the server in the background as usual and run a DB backup.
SO ALL THIS OVER A PASSWORD!
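For reference, the workaround boils down to one temporary line in the dsmserv.opt file (a sketch; remember to remove it again at step 6):
* temporary entry for the 5.3 upgradedb workaround only
ALLOWAES NO
And if you take the alternative to step 4, the admin command for the minimum password length is set minpwlength 0.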
Thursday, July 7, 2005
FYI: Using Mixed Media In An LTO Library
We recently upgraded one of our 3584s at work with some LTO2 drives. This is our first attempt at a mixed-media environment, and according to IBM and the following Redbook technote here, you must work out the issues with MOUNTLIMIT settings or all the LTO2 drives could end up in use when you need them, since they can also read LTO1 media. So we set our mount limits accordingly, but it does not help that we have more LTO1 drives than LTO2: setting the mount limit to the number of LTO1 drives didn't stop TSM from using the LTO2 drives. This they did not explain well, and they left one crucial piece out of the puzzle: when using a mixed-media library, if you do not specifically state which media format to use, TSM will use both LTO1 and LTO2 media in an LTO-designated storage pool. Want me to explain further? OK, here is how it affected us. We added LTO2 drives and an additional two frames to our library and set up the device classes accordingly, setting the FORMAT to DRIVES so each drive would use the highest format it supports. Since we didn't partition the library, TSM grabs whichever drive comes available and mounts the appropriate scratch tape, so if TSM assigns an LTO1 drive then you'll use an LTO1 tape. The only way to force the LTO2 media to be used is to set the FORMAT setting in the device class to ULTRIUM2 or ULTRIUM2C (with compression). We didn't think about that and were bitten by it when we ran out of LTO1 scratch. We didn't catch it because our script only looked for scratch in the library and couldn't distinguish between the two media types (which you can really only do if a different volume series is used for labeling). So without LTO1 scratch we had basically lost two-thirds of the drives and didn't know it. I had to switch our script to monitor both scratch types, and we had to force the device classes to their appropriate formats. Once I realized what was happening, it was a "NO BRAINER" that TSM would work that way. The big problem is that TSM did not separate out the LTO2 media format into a different devclass type like they did with the 3592s. So be aware of how TSM works and make sure you don't make the same mistake I did.
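In other words, don't rely on FORMAT=DRIVES in a mixed library; pin each device class to its format. A sketch with made-up device class and library names:
define devclass LTO1CLASS devtype=lto library=LIB3584 format=ultrium mountlimit=6
update devclass LTO2CLASS format=ultrium2c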
Saturday, June 11, 2005
The Case For Raw Volumes!
If you are serious about TSM server rebuild times and want the quickest way to get up and running, then I suggest you look into raw logical volumes for all your TSM DB, log, and storage pool needs. Of course, if you are running on NT I don't know of any way TSM can use raw devices, but in our AIX shop we live by raw volumes. Creation is quick, and with a little script I can have my volumes created and ready for the DB restore in no time. I have been down the DSMFMT road and know how long large volumes can take to create, and since TSM does not like more than 16 volumes, some older UNIX servers can take a long time to format them. The other nice thing about raw volumes is that if the server crashes, it's rare, except for disk failure, for volume corruption to occur. I have had too many dirty superblocks to deal with in my time, and I don't miss them. Remember, all TSM is really doing with DSMFMT is creating a file and, in a way, converting it back into raw. So why do the extra steps? Save yourself some time if you are ever in a true DR situation.
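For the AIX-minded, it boils down to a couple of commands per volume. A sketch where the volume group, logical volume names, sizes, and pool name are all hypothetical:
mklv -y tsmdb01 tsmvg 64              # raw logical volume for a DB volume (64 LPs in volume group tsmvg)
mklv -y tsmdata01 tsmvg 256           # raw logical volume for a disk storage pool volume
Then, from an administrative session (if the server runs under a non-root ID, chown the /dev/r* character devices to it first):
define dbvolume /dev/rtsmdb01
define volume DISKPOOL /dev/rtsmdata01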