Showing posts with label IBM. Show all posts

Monday, February 16, 2015

TSM Rumor Mill

So I was talking with an IBM source about IBM's recent layoffs, and she said that many groups were being restructured and that TSM was affected. According to her, TSM is possibly going to have a name change. This is no shock for those who remember when TSM was called ADSM. A partial name drop was something along the lines of Specter blah blah blah!! So take this rumor with some reservations, but IBM is going through a major upheaval as it decides what its services path will be in the future.

Wednesday, July 31, 2013

IBM P7 Strange Behaviour

We have a P7 frame with 4 LPARs that are used as TSM storage agents, on which snapshots of our SAP DBs are mounted for backup. They always had great performance until one LPAR had a bad HBA that phoned home and was replaced. After the replacement, backup throughput dropped dramatically, from 800MB/s to 150MB/s, and overall performance of the server would drop drastically during backups. When the DB requiring backup is over 25TB that is a huge hit, and we could not find the root cause.

At first IBM said our Hitachi disk was the problem. We eliminated that right away, so we then replaced the new HBA, checked our fiber, and checked the GBIC, and nothing fixed the situation. During the first week I asked the IBM service technician if we could possibly have a bad drawer or slot, and he emphatically said "No! If you did you would have errors all over the place." So we checked firmware, we moved cards within the frame (again), and we double-checked the fiber. By now we were going into the third week. I kept asking if something could be wrong with the drawer/slots and I kept getting the same answer. I suggested it because of previous experience: I have seen hardware go bad without totally going "out".

After exhausting everything other than replacing the slots, IBM finally replaced the slots. Voilà! Backup speeds went back to normal and the system degradation during backups disappeared. So the slots/drawer was the issue. No errors relating to a slot/drawer hardware problem were ever logged, but something caused the slots to degrade performance. It took almost a month to resolve the issue. I wouldn't say that IBM support was very thorough, and at times they tried to push the problem off onto other vendors (i.e. Hitachi). I can only suggest that in the future you trust your instincts and push the CEs to follow every avenue. My headache is over, but now the RCA begins.
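To put that slowdown in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch: it assumes decimal units (1 TB = 1,000,000 MB) and a sustained rate for the whole run, neither of which is stated above, and the function name is made up for illustration.

```python
def backup_hours(size_tb: float, rate_mb_s: float) -> float:
    """Hours needed to move size_tb terabytes at rate_mb_s megabytes per second."""
    size_mb = size_tb * 1_000_000  # assumption: decimal units, 1 TB = 1,000,000 MB
    seconds = size_mb / rate_mb_s
    return seconds / 3600


if __name__ == "__main__":
    # 25 TB database at the healthy rate and at the degraded rate
    for rate in (800, 150):
        print(f"{rate} MB/s: {backup_hours(25, rate):.1f} hours")
```

Under those assumptions a 25TB backup goes from roughly 8.7 hours to roughly 46.3 hours, which is the difference between fitting in an overnight window and running for almost two days.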

Thursday, August 6, 2009

TSM 5.5 to 6.1 Video

If you don't already subscribe to IBM's TSM Information Update & Storage Newsletter then you might not be aware of the following video IBM has posted to their website. They have provided a video tutorial on upgrading TSM 5.5 to 6.1. Check it out here.

Tuesday, July 7, 2009

Still Looking For TSM Admins In Boulder, CO

Hello,
My name is Arjun and I'm a recruiter at Artech.

Artech has an urgent contract for one of our direct clients:

Job Title: TSM Administrator
Location: BOULDER, CO
Job Description:
Required Skill: TSM support

If you are qualified, available, interested, planning to make a change, or know of a friend who might have the required qualifications and interest, please call me ASAP at (973) 993-9383 Ext.3319, even if we have spoken recently about a different position. If you do respond via e-mail please include a daytime phone number so I can reach you. In considering candidates, time is of the essence, so please respond ASAP.

Artech is a global IT Consulting company with over 30 Fortune 500 customers. You may visit our website at http://www.artechinfo.com/ to learn more about us.

Thank you.
Sincerely yours,
Arjun Dheer
(973) 993-9383 Ext.3319
arjun_dheer@artechinfo.com

Monday, November 27, 2006

GPFS Revisited

Well, I am still having issues with GPFS. It turned out that mmbackup won't work with the filesystem size either, and a chat with IBM support was not encouraging. Here is what one of our System Admins found out:

The problem was eventually resolved by IBM GPFS developers. It turns out they never thought their filesystem would be used in this configuration (i.e. 100,000,000+ inodes on a 200GB filesystem). During the time the filesystem was down, we tried multiple times to copy the data off to a different disk. Due to the sheer number of files on the filesystem, every attempt failed. For instance, I found that the following commands would have taken weeks to complete:

# cd "$src" && find . -depth -print | cpio -pamd "$dest"
# cd "$src" && tar cf - . | (cd "$dest" && tar xf -)


Even with the snapshot, I don't think TSM is going to be able to solve this one. This will probably need to be done at the EMC level, where a bit-level copy can be made.

So GPFS is not all it was thought to be. Pass it along, and make sure you avoid GPFS for applications that will produce large numbers of files.
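The figures quoted above (100,000,000+ inodes on a 200GB filesystem) show why file-by-file copies struggle here. The sketch below works through the arithmetic. It assumes 200GB means 200 × 10^9 bytes and that the filesystem is full, so the average size is an upper bound. The per-file copy rate of 100 files per second is purely hypothetical, picked only to show how the file count dominates the run time.

```python
def avg_file_bytes(fs_bytes: float, n_files: int) -> float:
    """Upper bound on average file size if the filesystem were completely full."""
    return fs_bytes / n_files


def copy_days(n_files: int, files_per_sec: float) -> float:
    """Days needed to copy n_files when per-file overhead limits the rate."""
    return n_files / files_per_sec / 86400  # 86400 seconds per day


if __name__ == "__main__":
    n_files = 100_000_000
    print(f"average file size: at most {avg_file_bytes(200 * 10**9, n_files):.0f} bytes")
    # hypothetical rate: 100 files per second, not a measured value
    print(f"copy time at 100 files/s: {copy_days(n_files, 100):.1f} days")
```

At an average of about 2,000 bytes per file the data volume is trivial. The per-file metadata work is what turns a 200GB copy into a job measured in days or weeks.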

Monday, August 28, 2006

Disk Question

Ok, so in the DS4000 Redbook (page 153) they give a TSM example for setting up disk, and they describe using a RAID 10 configuration. What!?! I've always heard and taught that TSM should use TSM mirroring, since the extra DB copies are used for writing to help with performance and TSM keeps them all in sync. So why would I create two RAID 10 arrays and let the DS4000 handle the mirroring? Why not create two RAID 1 arrays and mirror through TSM? This has implications not only for the DS4000 series but for every high-end enterprise disk out there. So which is it, IBM?

Thursday, August 10, 2006

Leaving IBM

Friday will be my last day with IBM.  It’s been a great 5 years, but I realized that the only way my career would grow was to make a change.  I have learned a lot and am thankful for the opportunity IBM provided.  My new employer will be Infocrossing, and I look forward to the challenges that I will encounter. For one it won’t be a majority IBM hardware user so I will be able to get my hands dirty with ACSLS and STK libraries.  I love IBM equipment so it will be interesting to see what it’s like in the non-IBM world. I will still be doing TSM so this blog is not going anywhere.

Tuesday, July 18, 2006

Have You Checked Your Drive Firmware Lately?

Well, the weekend is over and I thought I would update you on the problem SAP system. In the last update we discussed how this one TSM server was consistently having its mounts go into a RESERVED state and hang. The only thing we could do was cycle the TSM server. After talking to support, they stated it was a known problem listed in APAR IC49066 and fixed in the TSM server 5.3.3.2 release. So we upgraded the TSM server, updated ATAPE on the problem server to the same level as on the controller, and turned on SANDISCOVERY. The system came back up and started to mount tapes, albeit slowly. We let it run and all seemed well for about the first 8 hours, then all hell broke loose. TSM started taking longer and longer to mount tapes until it couldn’t mount anything. We also began receiving the following errors:

7/13/2006 9:38:45 AM ANR8779E Unable to open drive /dev/rmt25, error number=16.
7/13/2006 9:38:45 AM ANR8311E An I/O error occurred while accessing drive DR25 (/dev/rmt25) for SETMODE operation, errno = 9.
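When a pile of these messages builds up, it helps to pull out the message ID, device, and error number so they can be grouped. The following is a minimal sketch that only assumes the line layout shown in the two samples above; the function and field names are invented for illustration. The symbolic errno name is looked up on whatever machine runs the script, so treat it as a hint only, since errno numbering can differ from the TSM server's operating system.

```python
import errno
import re

# Layout assumed from the two sample lines: date, time, AM/PM, message ID, text
LINE_RE = re.compile(
    r"^(?P<date>\d{1,2}/\d{1,2}/\d{4}) (?P<time>\d{1,2}:\d{2}:\d{2} [AP]M) "
    r"(?P<msg_id>ANR\d{4}[A-Z]) (?P<text>.*)$"
)
DEVICE_RE = re.compile(r"/dev/rmt\d+")
ERRNO_RE = re.compile(r"(?:error number|errno)\s*=\s*(\d+)")


def parse_line(line):
    """Return a dict of fields from one activity-log line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    text = m.group("text")
    dev = DEVICE_RE.search(text)
    err = ERRNO_RE.search(text)
    code = int(err.group(1)) if err else None
    return {
        "date": m.group("date"),
        "time": m.group("time"),
        "msg_id": m.group("msg_id"),
        "device": dev.group(0) if dev else None,
        "errno": code,
        # Name as known to the local platform; may not match the server's OS
        "errno_name": errno.errorcode.get(code, "unknown") if code is not None else None,
    }


if __name__ == "__main__":
    sample = "7/13/2006 9:38:45 AM ANR8779E Unable to open drive /dev/rmt25, error number=16."
    print(parse_line(sample))
```

Counting the parsed records per device or per message ID makes it easier to see whether the failures cluster on particular drives or are spread across the whole library.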

We reopened the problem ticket and talked to a number of Tivoli support reps, and almost had to force them to have us run a trace. The original problem fix began Friday and here we were still trying to fix it well into early Sunday morning. After getting the trace to Tivoli they looked it over and seemed perplexed by the errors. It looked as if TSM was trying to mount tapes on drives that the library client had no path for. It also seemed TSM was polling the library for drives but was unable to get a response, so it went into a polling loop until it found a drive. This caused our mount queue to get as high as 100+ tape mounts waiting, and the mounts that did complete sometimes took 30-45 minutes to do so. It was NUTS!

So Sunday morning a new Tivoli rep was assigned (John Wang). As we discussed the problem and Tivoli was trying to get a developer to analyze the trace, John mentioned that he had seen this type of error before in a call about an LTO-1 library. He stated that it took 3 weeks to determine the problem, but the end result was that the firmware on the drives was down-level and causing the mount issues. So I checked my 3584’s web interface (I feel bad for all those people out there without web interfaces on their libraries) and found the drive firmware level at 57F7. This seemed down-level from what little information we had, so I had my on-call person call 1-800-IBM-SERV and place a SEV-1 service call. The CE called me and we discussed the firmware level. When he saw how down-level it was he was surprised, and he lectured me on making sure we always check with the CEs before we make any changes to the environment. The CE then gathered the software he needed and came to the account to update the drives. I brought all the TSM servers down and after 30 minutes the library had dismounted all tapes from the drives. The CE then proceeded to update the firmware, which actually only took 15-20 minutes. I expected longer.
So we went from firmware level 57F7 to 64D0. Huge jump! After the firmware upgrade I audited the library and brought the controller back up. Voilà! It started mounting tapes as soon as the library initialized and the response was back to what it should have been. It’s now Tuesday morning and there have been no problems. So before you upgrade TSM, be sure to check your library's firmware (both library and drives). It could mean the difference between sinking and swimming!

Sunday, October 30, 2005

Managing RAW Volumes

Well, I was recently called by another IBMer and asked how to use RAW volumes. The person called to ask why DSMFMT will sometimes format quite fast on one machine and then take forever on another. One thing you have to understand about DSMFMT is that it's taking that file and making the space within it RAW. If you've ever looked inside an unused TSM volume you'll see it is a text file filled with ADSM over and over (they might have changed the fill, but when I was teaching TSM that's what we saw). So why use RAW instead of actual files (other than files being a redundant process)? FAST! EASY! And when speed in DR is key, it's the only way to go. It's actually easier than one would think, and with a little script you can manage your RAW volumes and hdisks easily. I'll post the script along with a script to query the serial and WWN of your tape drives. These two scripts come courtesy of Hari Patel, my co-worker, who is a Perl mad man. (Download tar/zip)