Western Digital Time-Limited Error Recovery (TLER) with RAID and SE, SE16, GP Models

JakFrost

Just wanted to write up my experiences with my RAID problems.

I have four Western Digital Caviar SE16 500 GB WD5000AAKS 16 MB SATA-II 300 MB/s hard drives. Previously I had them configured in a 2.0 TB (1.5 TB usable) array on my Silicon Image 3114 PCI SATA-I 150 MB/s RAID 0,1,5 Onboard Controller with firmware 5.4.03. One day I came home to find that 3 drives had dropped out of the array, only 1 drive was left, and the array was destroyed. Luckily the array was used mostly for archive storage of videos that I routinely backed up to DVDs, so I didn't lose any data. (I presume that the one drive left in the array was the actual cause of the problem: it took too long doing error recovery on a bad sector, and that caused the other 3 drives to be dropped, instead of the other way around as it should have happened.)
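For anyone wondering where the 2.0 TB raw versus 1.5 TB usable figures come from, here is a minimal sketch of the usual capacity arithmetic. It assumes equal-sized drives and that the array was RAID5; the function name `usable_capacity_gb` is made up for this illustration and is not part of any vendor tool.

```python
def usable_capacity_gb(level: str, drives: int, drive_gb: float) -> float:
    """Usable capacity for a few common RAID levels (equal-sized drives)."""
    if level == "RAID0":
        return drives * drive_gb          # striping only, no redundancy
    if level == "RAID1":
        return drive_gb                   # every drive holds a full copy
    if level == "RAID5":
        if drives < 3:
            raise ValueError("RAID5 needs at least 3 drives")
        return (drives - 1) * drive_gb    # one drive's worth goes to parity
    raise ValueError(f"unsupported level: {level}")


print(usable_capacity_gb("RAID0", 4, 500))  # 2000 (raw total of the four drives)
print(usable_capacity_gb("RAID5", 4, 500))  # 1500 (the 1.5 TB usable figure)
print(usable_capacity_gb("RAID1", 2, 500))  # 500
```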

None of the utilities that Silicon Image provides were able to restore the array from the other 3 drives, even though the drives that dropped should have been able to be brought online as an array and the single drive left in the array should have been able to be brought back in and rebuilt. I e-mailed the vendor with a complete case report and all details but nobody responded. An understandable experience with a second-rate foreign vendor that mostly supplies motherboard manufacturers with crappy software RAID chips. (The array while it was operational for many months was dog slow because it was software RAID5 and the parity calculations were interrupting the CPU constantly on every write making my system slow and unresponsive any time I would perform any extended writes to the array. Learned my lesson to never use Silicon Image PCI based software RAID5 and recommend against these chips now to everyone.)
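To give an idea of the parity work the CPU has to do on every write with software RAID5, here is a rough sketch of XOR parity over one stripe. It is only an illustration of the general technique, not how the Silicon Image driver is actually written; the block contents and the helper name `xor_parity` are invented for the example.

```python
from functools import reduce


def xor_parity(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks of one stripe (a 4-drive RAID5 stores 3 data + 1 parity).
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_parity(data)

# Pretend the drive holding data[1] dropped out: rebuild it from the rest.
rebuilt = xor_parity([data[0], data[2], parity])
print(rebuilt == data[1])  # True
```

The same XOR that produces the parity also rebuilds a missing block, which is why a RAID5 array can survive exactly one missing drive.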

After that experience I took my drives off the cheap Silicon Image controller and made an nVidia RAID1 (mirror) array with 2 x 500 GB, and took the other two drives to use as external on-site and off-site storage drives. A few months back I came home to find one of the drives in the new array with an Error status. After rebooting the system and re-adding the drive to the array everything worked fine without any errors. Last week the same thing happened again and the drive dropped from the array again. I re-added it, no errors were found, and the array is running again.

So today I went to research what was causing this and I came across a mention of the Time-Limited Error Recovery feature on the Western Digital RE models. The explanation of that feature sounded exactly like the problem that I was having. After more research I found the WDTLER utility that runs from a DOS boot disk.

I used this utility to enable TLER on my SE16 drives to prevent them from timing out too long during error recovery on my RAID1 arrays and hopefully this issue with dropped drives will not happen again.

If you have a Western Digital Caviar SE, SE16, GP or Raptor hard disk drive in a RAID array, make sure to Enable TLER using the WDTLER utility. This puts a limit on the time the drive spends recovering from a read or write error caused by a bad sector. Otherwise, any time the drive takes too long to recover, the RAID controller/driver will assume it has failed and drop it from the array.
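The interaction between the drive's recovery time and the controller's patience can be sketched as a toy model. The 7.0 second limit is what WDTLER reports once TLER is enabled; the 8 second controller timeout and the 60 second recovery time are made-up numbers for illustration only, since real controllers and drivers vary.

```python
from typing import Optional


def controller_drops_drive(recovery_needed_s: float,
                           tler_limit_s: Optional[float],
                           controller_timeout_s: float) -> bool:
    """Return True if the controller would give up on the drive.

    tler_limit_s=None models TLER disabled: the drive keeps retrying for as
    long as recovery takes. With a limit set, the drive stops retrying and
    reports the error once the limit is reached.
    """
    if tler_limit_s is None:
        unresponsive_s = recovery_needed_s
    else:
        unresponsive_s = min(recovery_needed_s, tler_limit_s)
    return unresponsive_s > controller_timeout_s


CONTROLLER_TIMEOUT_S = 8.0    # hypothetical; real controllers/drivers vary
BAD_SECTOR_RECOVERY_S = 60.0  # hypothetical deep-recovery time

print(controller_drops_drive(BAD_SECTOR_RECOVERY_S, None, CONTROLLER_TIMEOUT_S))  # True
print(controller_drops_drive(BAD_SECTOR_RECOVERY_S, 7.0, CONTROLLER_TIMEOUT_S))   # False
```

With TLER enabled the drive answers within its limit and leaves the bad sector for the RAID layer to handle from redundancy, so it stays in the array.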

If you have a Western Digital Caviar RE, RE2, RE2-GP or Raptor hard disk drive being used as a stand-alone desktop drive that is not in a RAID array then you should Disable TLER using the WDTLER utility to give the drive more time to recover from read or write errors due to bad sectors.

If your hard disk is dropped from an array it will need to be re-added manually, requiring the entire hard disk to be rebuilt and resynchronized with the array, which causes performance problems while it runs. If two hard disks happen to be dropped from the array at around the same time then your array will be marked as damaged and the data in the array will be destroyed, requiring a complete restore from backup or a manual recreation of the array using the vendor's specialized tools.
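How many dropped drives an array can take depends on the RAID level. A small sketch of that rule follows, assuming the textbook behavior of RAID0, RAID1 and RAID5; the function name `array_state` and the state labels are invented here, and a real controller may report things differently.

```python
def array_state(level: str, drives: int, dropped: int) -> str:
    """Classify an array as 'healthy', 'degraded' or 'failed'."""
    if not 0 <= dropped <= drives:
        raise ValueError("dropped must be between 0 and drives")
    tolerated = {
        "RAID0": 0,            # no redundancy at all
        "RAID1": drives - 1,   # any single surviving mirror is enough
        "RAID5": 1,            # single parity covers one missing drive
    }[level]
    if dropped == 0:
        return "healthy"
    return "degraded" if dropped <= tolerated else "failed"


print(array_state("RAID5", 4, 1))  # degraded
print(array_state("RAID5", 4, 2))  # failed
print(array_state("RAID1", 2, 1))  # degraded
print(array_state("RAID0", 2, 1))  # failed
```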

Update: (2008-09-15) As an answer to some of the posts: enabling or disabling the TLER setting on your hard drives will not decrease speed or performance, it will not cause you any data loss, and you will not have to recreate your arrays or reformat your drives. Changing the setting is invisible to the system and can be done without any changes, damage, or problems.

Update: (2008-09-15) It has now been 6 months since I enabled TLER on my drives. My computer has been running pretty much 14 hours a day, every day, for months, and 24 hours a day for the last month, and there have been no issues at all with the drives and no more hard disks dropped from the arrays.

Before WDTLER - TLER Disabled
Code:
WDTLER Version 1.03
Copyright (C) 2004-2006 Western Digital Corporation
Western Digital Time Limit Error Recovery Utility

Model: WDC WD5000KS-00MNB0 Serial Number: WD-WMANU1234567
   Read TLER is disabled.
   Write TLER is disabled.

Model: WDC WD5000KS-00MNB0 Serial Number: WD-WMANU1234567
   Read TLER is disabled.
   Write TLER is disabled.

Model: WDC WD3200KS-00PFB0 Serial Number: WD-WCAPD1234567
   Read TLER is disabled.
   Write TLER is disabled.

Model: WDC WD3200KS-00PFB0 Serial Number: WD-WCAPD1234567
   Read TLER is disabled.
   Write TLER is disabled.

After WDTLER - TLER Enabled

Code:
WDTLER Version 1.03
Copyright (C) 2004-2006 Western Digital Corporation
Western Digital Time Limit Error Recovery Utility

Model: WDC WD5000KS-00MNB0 Serial Number: WD-WMANU1234567
   Read TLER time is 7.000 seconds.
   Write TLER time is 7.000 seconds.

Model: WDC WD5000KS-00MNB0 Serial Number: WD-WMANU1234567
   Read TLER time is 7.000 seconds.
   Write TLER time is 7.000 seconds.

Model: WDC WD3200KS-00PFB0 Serial Number: WD-WCAPD1234567
   Read TLER time is 7.000 seconds.
   Write TLER time is 7.000 seconds.

Model: WDC WD3200KS-00PFB0 Serial Number: WD-WCAPD1234567
   Read TLER time is 7.000 seconds.
   Write TLER time is 7.000 seconds.
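If you have a lot of drives and want to check the listing automatically, a short parser along these lines could work. It is a sketch that assumes the output format is exactly what is shown in the two listings above (WDTLER 1.03); the sample text stitches one "disabled" entry and one "enabled" entry together for illustration, and `parse_wdtler` is a made-up helper, not part of the utility.

```python
import re

SAMPLE = """\
Model: WDC WD5000KS-00MNB0 Serial Number: WD-WMANU1234567
   Read TLER is disabled.
   Write TLER is disabled.

Model: WDC WD3200KS-00PFB0 Serial Number: WD-WCAPD1234567
   Read TLER time is 7.000 seconds.
   Write TLER time is 7.000 seconds.
"""

MODEL_RE = re.compile(r"^Model:\s*(?P<model>.+?)\s+Serial Number:\s*(?P<serial>\S+)")
TLER_RE = re.compile(
    r"^\s*(?P<kind>Read|Write) TLER (?:is disabled|time is (?P<secs>[\d.]+) seconds)\."
)


def parse_wdtler(text: str) -> list[dict]:
    """Return one dict per drive; a TLER value of None means disabled."""
    drives: list[dict] = []
    for line in text.splitlines():
        if m := MODEL_RE.match(line):
            drives.append({"model": m["model"], "serial": m["serial"],
                           "read": None, "write": None})
        elif (m := TLER_RE.match(line)) and drives:
            secs = m["secs"]
            drives[-1][m["kind"].lower()] = float(secs) if secs else None
    return drives


for drive in parse_wdtler(SAMPLE):
    print(drive)
```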

CPU: AMD Opteron 175 Dual-Core 2.2 GHz 1 MB 90 nm 939 11x CCBWE 0543TPMW - 2,563 MHz (233x11x4), 1.50 V (1.55 V x 100.0 %)
FAN: Zalman CNPS9500 LED 92 mm Fan+Heatsink + Arctic Silver 5 Thermal Paste 99.9 % - 35 C Idle, 45 C Load
MOB: DFI LanParty UT SLI-DR Expert 939 nF4 PCI-e Rev.AA0 - BIOS: 2006-04-06 Modded, LDT 1.20 V, Chipset 1.52 V, 42 C Load
RAM: Mushkin 4x 1 GB XP4000 Redline 991493 DDR500 3-3-2-8 2.6-2.9 V CE-6 - 233 MHz (1:1) 2T 3-3-2-8, 2.80 V
VID: eVGA nVidia 8800 GTS 512 MB G92 670/972 MHz PCI-e 16x 2xDVI 1xSVid 1xHDTV - 750/1100 MHz Overclock, 65% Fan, 58C Load
SAT: Silicon Image 3114 PCI SATA-I 150 MB/s RAID 0,1,5 Onboard Controller FW: 5.3.14 Modded
HDD: Western Digital Caviar SE16 320 GB 16 MB SATA-II 300 MB/s - 2x in nVidia 298 GB Mirror Array (RAID1)
HDD: Western Digital Caviar SE16 500 GB 16 MB SATA-II 300 MB/s - 2x in nVidia 465 GB Mirror Array (RAID1)
NIC: Marvell Yukon 88E8001 Gigabit Onboard PCI NIC
SOU: Creative Labs Sound Blaster X-Fi Xtreme Music 24-bit 128-Voice 109dB SNR
SOU: RealTek ALC850 AC'97 Rev 2.3 8-Channel Onboard Audio
DVD: 2x NEC ND-3540A DVD+-RW SL/DL 16X DVD 48X CDR - FW: 1.06
POW: OCZ PowerStream 520W SLI ADJ ATX2.0 EPS12, +3.3V 28A +5V 40A +12V 33A
CAS: Lian-Li PC-V1200 Plus Mid-ATX Aluminum 4x5.25 6x3.5 2x120mm
 
Actually really glad you pointed this out. For some time I have held off on getting several of these drives and dropping them into a RAID array because of the higher cost of the "raid enabled" drives, when the real difference between the drives is that TLER is enabled on one and disabled on the other.

I'll seriously have to look at the cost again and see about potentially doing this.
 
Best write-up I've seen for the TLER situation yet!

Links to the utility and everything.

Nice job JakFrost! (Big Thumbs Up and this Bud's for you!) :)
 
I can confirm that the WDTLER utility works on the WD10EACS.
 
I really hope that this greed by Western Digital, selling "Raid Edition" drives for much more, comes back and bites them hard in the ass.

I can't even count how many stories I've heard of people losing their arrays due to WD's TLER issues.
 
Well, according to The Tech Report, the RE2-GP drives also have "Rotary Acceleration Feed Forward (RAFF), which detects and compensates for the ambient vibrations typical of multi-drive environments" in addition to the TLER. I could be wrong, but somehow I think if you were the type of person to be going with these GP drives, you're likely not all *THAT* concerned about the performance hit ambient vibrations may cause.
 
Does anyone know if this fix is also needed for RAID 0, since RAID 0 is not exactly RAID?
 
(The array while it was operational for many months was dog slow because it was Software RAID5 and the parity calculations were interrupting the CPU constantly on every write making my system slow and unresponsive any time I would perform any extended writes to the array. Learned my lesson to never use Software RAID5 and recommend against it now to everyone.)

I'm a little disappointed by this conclusion. Software RAID of any level can be done well, and in some cases it's all you need. Shun a particular implementation (SilImage? Shun away!) but don't dismiss the whole concept. Highpoint has a decent implementation, for example, and Linux software raid is very capable and steady.

But all kinds of RAID depend on good drivers, period. If you have bad drivers for your hardware controller, you'll end up with a corrupt array in pretty short order. But if you have good drivers, you'll have no troubles with a software RAID array.

So how do you find good drivers? There are only two ways I know of: word of mouth and try-and-see. I recommend the first, having lost data to the other one. Ask around and see what's recommended before buying.
 
Works on my drive.

04-10-08_1321.jpg

EDIT: Here is the program if you can't get it from that other site: http://mupfc.marshall.edu/~providenti/WDTLER.zip
 
Can I just ask a question here as I am about to buy 8 x 750gb WD drives for my RAID5 array. Do I need to buy the RE drives or can I simply buy the normal desktop drive and enable TLER? I assume I can get some 750gb drives that aren't the stupid GP?
 
Yes, you can, but what's wrong with the GP drives?

If you want performance, get the 640gb drives.
 
Can I just ask a question here as I am about to buy 8 x 750gb WD drives for my RAID5 array. Do I need to buy the RE drives or can I simply buy the normal desktop drive and enable TLER? I assume I can get some 750gb drives that aren't the stupid GP?

I would highly recommend getting the RE versions if you are planning on RAIDing.
 
Nice post, thanks!

If you have a RAID card, I'm guessing you need to put the drivers in the DOS image? How else will the drives (RAID partition) be recognized, unless you connect the drives to the motherboard directly?

Or yes, you can connect the drives directly to the motherboard while you're doing your quick change.

What would be the best procedure?
 
Same thing I was curious about myself when he made this statement. I have been using the SE 640gb editions with not a problem at all. I have four of the drives with an Areca 1210 controller. My RAID5 has been rock solid.
 
I think that the RE drives are supposed to be much more rugged and have a longer life span, plus a 5 year warranty.

SE has only a 3 year warranty.

Edit: 5 year warranty for the RE.
 
i think that RE are supposed to be a much more rugged, and have a longer life span. plus a 3 year warranty.

SE has only a 3 year warranty.


My regular 1tb GP drive has a 3 year warranty on it. From my understanding the RE drives just have TLER enabled and that's the only difference.
 
So is there a way to load the driver for the RAID card when doing this? I have 4 WD drives that have been disabled from running on anything but the RAID controller thanks to the lovely staggered boot issue. I need to be able to do the TLER, but it doesn't see my drives because of them being on the RAID controller.
 
The way I did it with RAID was:
I unplugged the RAID card and plugged the drives into the mobo, then booted with a boot CD
and did the TLER.

Important: physically disconnect the RAID card from the mobo.
 
The problem is the motherboard won't spin up the drives because they are waiting for the signal to spin up from the raid controller. That is the problem with WD not supporting the staggered spinup. If it happens once, they won't work without it again.
 
So that is why my 8x WD10EACS will only work with my RR2320.
 
Yep, WD really disappoints me with this. I like WD but this is upsetting. I will probably be watching to see how the Samsung F1 320 drives do. I like using the smaller single platter high density drives for the RAID setup.
 
The problem is the motherboard won't spin up the drives because they are waiting for the signal to spin up from the raid controller. That is the problem with WD not supporting the staggered spinup. If it happens once, they won't work without it again.

Really?
Goodness, guess it's a good thing mine work with both.
500YS drives with a 5 year warranty...
 
Yep, WD really disappoints me with this. I like WD but this is upsetting. I will probably be watching to see how the Samsung F1 320 drives do. I like using the smaller single platter high density drives for the RAID setup.

Wish I had known about this earlier. I thought they went bad or something, so I RMA'd them. Fortunately, the replacement drives were all new, guess WD didn't have many refurbs at the time.
 
The problem is the motherboard won't spin up the drives because they are waiting for the signal to spin up from the raid controller. That is the problem with WD not supporting the staggered spinup. If it happens once, they won't work without it again.

Could you/anyone expand on that? I don't quite understand what is happening, but it sounds pretty serious. The drive somehow remembers that it usually gets a spin up signal, and after having received a spin up signal just once, it won't ever spin up without it again?

Also, has anyone verified whether "WESTERN DIGITAL WD CAVIAR 640GB SATA2 7200RPM 16MB (WD6400AAKS)" and "WESTERN DIGITAL Digital Caviar GP 750GB SATA2 16MB IntelliPower (WD7500AACS)" work with this utility? The WD10EACS have quite a bit higher price/GB here than those two drives, so I would rather buy them if they can have TLER enabled.
 
^ I would also want to know more of this.
I have 4 WD1000FYPS RE2 drives, but in my case, on my 3ware card, I can't get them to do staggered spinup at boot at all.
Have they disabled this function altogether?
 
I need to get on this. I've been having this problem for some time on my Raptor drives, and my 500GB SE16 drives.

Great find!
 
^ I would also want to know more of this.
I have 4 WD1000FYPS RE2 drives, but in my case, on my 3ware card, I can't get them to do staggered spinup at boot at all.
Have they disabled this function altogether?

Nope, staggered spinup works just fine on my 8xWD10EACS. It takes a full 120 seconds to get past the bios now. The problem is that the drives will no longer function on a non-raid controller.
 
Dew: Do you know if this problem can be avoided if staggered spinup is disabled before the disks are connected for the first time?
 
Not sure. I'm pretty sure I had it disabled when I did this.

Note: On a HighPoint, the drives will only get modified if you initialize them.
 
Could be controller dependent then?
I have a Promise ex8350 for now and 5 WD 500YS drives.
I can plug the SATA into the controller or into the mobo and it doesn't matter. They are always detected.
 
Could you/anyone expand on that? I don't quite understand what is happening, but it sounds pretty serious. The drive somehow remembers that it usually gets a spin up signal, and after having received a spin up signal just once, it won't ever spin up without it again?

Also, has anyone verified whether "WESTERN DIGITAL WD CAVIAR 640GB SATA2 7200RPM 16MB (WD6400AAKS)" and "WESTERN DIGITAL Digital Caviar GP 750GB SATA2 16MB IntelliPower (WD7500AACS)" work with this utility? The WD10EACS have quite a bit higher price/GB here than those two drives, so I would rather buy them if they can have TLER enabled.

I can verify that the WD6400AAKS works with the utility. I have four of the drives, and I started over with my RAID5 and enabled TLER on each of the four drives in turn. The drives have been put back in RAID and they are all working fine.
 
I did this on my Raptors. I'll see how it goes. I'll also use this utility on my SE16 drives and see what happens.
 
You don't have to. My original 36gb Raptors already had TLER set to read 7.5 and write 8.0.

Now, my Western Digital Caviar SE WD3200JD has TLER disabled.
 
Well, I've been having problems with the drives dropping out of the array and showing up as failed, so hopefully this fixes the issue. I also had this same problem on my 500GB SE16 drives in my HTPC, so I'll have to try this out on those drives too.
 
If you type wdtler.exe by itself, it will only display the information.

05-12-08_2036.jpg
 
Well since running this utility none of my drives have failed out of the array.
 
A lot of the first gen RE2 drives have issues with dropping out of arrays, and I discovered this the hard way. If you look at my sig I have four WD5000YS listed. They aren't WD5000YS anymore. They've all been RMA'd to WD5001ABYS, and I received new not refurb drives. After two failed I asked WD to replace the other two as well and they agreed.
 