What RAID can and cannot do (straight talk on RAID's limitations/pitfalls)

DougLite

This document presumes the reader to have a basic understanding of RAID concepts and levels. If you need to get up-to-speed on the mechanics of the different RAID levels, AC&NC's RAID.edu is an excellent place to start. With that out of the way, I will move on to considerations when deploying RAID - RAID may not be necessary, may not be effective for your application, and may actually be less effective than a single disk for a particular application.

What RAID can do
RAID can protect uptime. RAID levels 1, 0+1/10, 5, and 6 (and their variants such as 50 and 51) allow a mechanical hard disk to fail while keeping the data on the array accessible to users. Rather than requiring a time-consuming restore from tape, DVD, or other slow backup media, RAID allows data to be rebuilt onto a replacement disk from the other members of the array while remaining available to users in a degraded state. This is of high value to enterprises, as downtime quickly leads to lost earning power. For home users, it can protect the uptime of large media storage arrays, which would otherwise require a time-consuming restoration from dozens of DVDs or quite a few tapes if a disk without redundancy failed.

http://www.hardforum.com/showpost.php?p=1028684072&postcount=186

RAID can increase performance in certain applications. RAID levels 0, 5, and 6 all use variations on striping, which allows multiple spindles to increase sustained transfer rates when conducting linear transfers. Workstation-type applications that work with large files, such as image and video editing applications, benefit greatly from disk striping. The extra throughput offered by disk striping is also useful in disk-to-disk backup applications.

What RAID cannot do
RAID cannot protect the data on the array. A RAID array has one file system. This creates a single point of failure. A RAID array's file system is vulnerable to a wide variety of hazards other than physical disk failure, and RAID cannot defend against these sources of data loss. RAID will not stop a virus from destroying data. RAID will not prevent corruption. RAID will not save data from accidental modification or deletion by the user. RAID does not protect data from hardware failure of any component besides physical disks. RAID does not protect data from natural or man-made disasters such as fires and floods. To protect data, it must be backed up to removable media, such as DVD, tape, or an external hard drive, and stored in an off-site location. RAID alone will not prevent a disaster, when (not if) it occurs, from turning into data loss. Disaster is not preventable, but backups allow data loss to be prevented.

http://www.hardforum.com/showpost.php?p=1029749378&postcount=1
http://www.hardforum.com/showthread.php?t=976436
http://www.hardforum.com/showthread.php?t=1127622

RAID cannot simplify disaster recovery. A single disk is usually accessible with a generic ATA or SCSI driver built into most operating systems. However, most RAID controllers require specific drivers. Recovery tools that work with single disks on generic controllers will need those drivers to access data on RAID arrays. If a recovery tool is poorly coded and does not allow additional drivers to be loaded, then a RAID array will probably be inaccessible to that tool.

http://www.hardforum.com/showthread.php?t=1077702
http://www.hardforum.com/showthread.php?t=1083933
http://www.hardforum.com/showthread.php?t=1071619
http://www.hardforum.com/showthread.php?t=1088737

RAID cannot provide a performance boost in all applications. This statement is especially true for typical desktop application users and gamers. Most desktop applications and games place performance emphasis on the buffer strategy and seek performance of the disk(s). Increasing raw sustained transfer rate shows little gain for desktop users and gamers, as most files that they access are very small anyway. Disk striping using RAID-0 increases linear transfer performance, not buffer and seek performance. As a result, disk striping using RAID-0 shows little to no performance gain in most desktop applications and games, although there are exceptions. For desktop users and gamers with high performance as a goal, it is better to buy a faster, bigger, and more expensive single disk than it is to run two slower/smaller drives in RAID-0. Even running the latest, greatest, and biggest drives in RAID-0 is unlikely to boost performance more than 10%, and performance may drop in some access patterns, particularly games.

http://www.hardforum.com/showthread.php?t=1001325&highlight=proved+ineffective
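
To put rough numbers on the seek-versus-transfer point, here is a minimal sketch of a toy model where reading a file costs one seek plus a linear transfer, and striping speeds up only the transfer. The 12 ms seek and 60 MB/s sustained rate are assumed figures for illustration, not measurements, and the function names are made up for this example. The model is optimistic for RAID-0, since it ignores stripe size and controller overhead.

```python
def read_time_ms(size_mb, seek_ms, str_mb_s, stripes=1):
    """Estimated time to read one file: one seek plus the linear transfer.

    Striping divides only the transfer term; the seek still has to happen.
    """
    transfer_ms = size_mb / (str_mb_s * stripes) * 1000.0
    return seek_ms + transfer_ms


def raid0_speedup(size_mb, seek_ms=12.0, str_mb_s=60.0, stripes=2):
    """Ratio of single-disk read time to striped read time for one file."""
    single = read_time_ms(size_mb, seek_ms, str_mb_s, 1)
    striped = read_time_ms(size_mb, seek_ms, str_mb_s, stripes)
    return single / striped


# A 64 KB file, a 1 MB file, and a 700 MB file (sizes chosen for illustration).
for size_mb in (0.064, 1.0, 700.0):
    print(f"{size_mb:8.3f} MB file: {raid0_speedup(size_mb):.2f}x")
```

Under these assumptions the 64 KB file comes out only about 4% faster on two striped disks, while the 700 MB file gets close to the full 2x. That is the pattern described above: small-file desktop workloads are dominated by seeks, which striping does not help.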

RAID is not readily moved to a new system. When using a single disk, it is relatively straightforward to move the disk to a new system. Simply connect it to the new system, provided it has the same interface available. However, this is not so easy with a RAID array. A RAID BIOS must be able to read metadata from the array members in order to successfully construct the array and make it accessible to an operating system. Since RAID controller makers use different formats for their metadata (even controllers of different families from the same manufacturer may use incompatible metadata formats), it is virtually impossible to move a RAID array to a different controller. When moving a RAID array to a new system, plans should be made to move the controller as well. With the popularity of motherboard-integrated RAID controllers, this is extremely difficult to accomplish. Generally, it is possible to move the RAID array members and controller as a unit, and software RAID in Linux and Windows Server products can also work around this limitation, but software RAID has other limitations (mostly performance related).

http://www.hardforum.com/showthread.php?t=1083933
 
Good post Doug. You should definitely sticky this.

Hopefully this will help get the message out that RAID cannot replace a backup system. :)

 
I have already added it to ANSWERS, and it is mainly a consolidation of lessons learned from the recent wave of disasters we have had :( Feel free to contribute anything that should be added.
 
DougLite said:
I have already added it to ANSWERS, and it is mainly a consolidation of lessons learned from the recent wave of disasters we have had :(

I honestly think it might be a good idea just to sticky this particular thread for a short while too. Especially for those users that have already read the ANSWERS thread, and aren't as likely to read it again with this new section.

Just my opinion.

 
DougLite said:
RAID can also increase performance in certain applications. RAID levels 0, and 5-6 all use variations on striping, which allows multiple spindles to increase sustained transfer rates when conducting linear transfers. Workstation type applications that work with large files, such as image and video editing applications, benefit greatly from disk striping.

So RAID 0 can benefit programs like DVD Shrink?
 
Not to get too obvious, but here's one you missed:
Combine all your drives into a single volume
One of the essential properties of RAID levels is that they take a bunch of disks and stick them together in some manner. If you've got four 200GB disks to put 500GB of data on, then your options are either to figure out some way to break the data into chunks or to use RAID to combine the disks into a single logical volume. Either RAID 0 or RAID 5 would work in this situation, and with 300GB disks RAID 10 would also be an option.
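
The arithmetic behind that example can be sketched as below. This is a minimal illustration that assumes identical disks and ignores metadata and formatting overhead; the function name is made up for this example.

```python
def usable_capacity_gb(level, disks, disk_gb):
    """Usable capacity of an array of identical disks for common RAID levels."""
    if level == "0":
        return disks * disk_gb            # pure striping, no redundancy
    if level == "1":
        return disk_gb                    # every member holds the same data
    if level == "5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_gb      # one disk's worth of parity
    if level == "6":
        if disks < 4:
            raise ValueError("RAID 6 needs at least 4 disks")
        return (disks - 2) * disk_gb      # two disks' worth of parity
    if level == "10":
        if disks < 4 or disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        return disks // 2 * disk_gb       # half the disks are mirrors
    raise ValueError(f"unsupported RAID level: {level}")


# Four 200GB disks, 500GB of data to store.
for level in ("0", "5", "10"):
    print(level, usable_capacity_gb(level, 4, 200))
```

With four 200GB disks, RAID 0 gives 800GB and RAID 5 gives 600GB, both enough for 500GB of data. RAID 10 gives only 400GB, which is why it needs the 300GB disks (600GB usable).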

 
unhappy_mage said:
Not to get too obvious, but here's one you missed:
Combine all your drives into a single volume
One of the essential properties of RAID levels is that they take a bunch of disks and stick them together in some manner. If you've got four 200GB disks to put 500GB of data on, then your options are either to figure out some way to break the data into chunks or to use RAID to combine the disks into a single logical volume. Either RAID 0 or RAID 5 would work in this situation, and with 300GB disks RAID 10 would also be an option.


"This document presumes the reader to have a basic understanding of RAID concepts "
 
I just wanted to clarify this:
"sustained transfer rates when conducting linear transfers" Is this what happens when you back up a DVD?
 
Theoretically, yes, but in practice DVD drives are way slower than hard drives. My DVD drive (granted, it's an old one) only goes up to about 12x on reads at max, or about 17 MB/s. This is nothing for a hard drive; I can get 50 MB/s onto one drive. Striping drives just so you can back up DVDs faster is like using a chainsaw to swat flies - a powerful approach, but the wrong one.
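
A quick sketch of that arithmetic, assuming the nominal 1x DVD data rate of about 1.385 MB/s and treating the copy as a chain that can go no faster than its slowest stage. The function names are made up for this example, and the 100 MB/s figure for a two-drive stripe is an assumption.

```python
DVD_1X_MB_S = 1.385  # nominal 1x DVD data rate in MB/s (approximate)


def dvd_read_mb_s(speed_multiplier):
    """Peak read rate of a DVD drive at the given x-rating."""
    return speed_multiplier * DVD_1X_MB_S


def pipeline_rate_mb_s(*stage_rates):
    """A copy can go no faster than its slowest stage."""
    return min(stage_rates)


dvd = dvd_read_mb_s(12)                       # about 16.6 MB/s
print(pipeline_rate_mb_s(dvd, 50.0))          # DVD to one hard drive
print(pipeline_rate_mb_s(dvd, 100.0))         # DVD to an assumed 2-drive stripe
```

Both cases come out at the DVD drive's rate, so doubling the hard drive side changes nothing.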

 
FreshPrinceOfBellAir said:
So RAID 0 can benefit programs like DVD Shrink?
My understanding of DVD shrink is that it is primarily CPU limited while breaking encryption and reauthoring, then it is limited by your DVD-RW drive's speed when actually writing to DVD.

If you had a ridiculously fast CPU setup (it would probably take a CPU that is not yet commercially available) then a RAID-0 array would help to feed data for the CPU to crunch as fast as possible. I don't think we're there yet. RAID-0 is more of an advantage for frequent saves and loads when editing individual frames/scenes using workstation applications.
 
DeCSS is actually fairly easy to do; if you're doing a straight copy of a DVD, that's not a limitation. If you just strip out some streams, that's easy too; only when you get into compressing the video do you hit a limitation in CPU power. In any case, that's pretty much the only limitation on the speed of DVD Shrink. Spend the extra money on a faster CPU (or a better DVD-ROM drive!), hard drives aren't the problem here.

 
RAID in itself offers little to no advantage over non-RAID configurations. But I've found a major (and fatal) disadvantage of single-drive systems compared to multiple-drive (non-RAID) systems: Removable media backup is still molasses-slow compared to hard disks, forcing very infrequent backups. And it may take weeks or even months of downtime in the event of a catastrophic failure of a single hard disk without a duplicate image on a second hard disk. And today's ultra-large-capacity hard disks actually contribute to such laziness.
 
I think it should be stressed that virtually all consumer level RAID controllers are software based and offload all of their calculations onto the CPU. In other words, they're no better than software RAID, with the added disadvantage that if the controller or the motherboard that it's fixed to dies, then you have to buy a motherboard with an identical chipset if you want to see your data again.
 
Why is it that SATA drives do not require a jumper to be installed for use?

If you have a RAID 0 array on a RAID controller and plug another drive that isn't part of that array into the same controller, will there be any problems with that disk or the array being recognized separately?

I am assuming you have to add that drive to the array. If that's the case, how does the RAID controller determine which drive has the OS on it (assuming I plugged in a hard drive from another system with an OS on it)?

Just wondering!
 
1) There's only one drive on a channel, so it's not necessary to keep track of master/slave. This is different with port multipliers, but they have different configuration and most people won't run into them any time soon.

2) Nope, but don't add it to the array; it'll be separate from the array.

 
I know this is in the sticky, but I'm gonna bump this thread, and take suggestions on how it can be improved. Links to threads with good info on RAID's limitations are especially welcome.

EDIT: Now that it's at the top of the list, I see that this thread has amassed 3,000 views. A good start.
 
Can I please use this guide on the Wikipedia article for RAID arrays? It would be VERY valuable information. It has cleared up a lot of things for me personally.
 
Awesome write-up :D Where striped RAID arrays (0/10/0+1) tend to really shine is raw transfer speed. If you are ripping HDTV content (off a camera), this will move much, much faster than on a single disk. Something I think everybody should be aware of is that RAID3/5/6 all involve calculating parity. Calculating parity isn't a super intensive task (for example, my Areca RAID card has a 500MHz processor and I doubt it even comes close to fully utilizing it), but it can really impede your write speeds. My 8 disk array gets 220MB/s write speeds or so, and if it were RAID0 I'm sure it would be significantly faster.

Imo, RAID is good for critical data that you'll always want to be available even in the event of a drive failure. This seems to really come down to schoolwork/actual work/the operating system. I had a drive fail on me once right before finals week when a lot of stuff was due. As soon as the quarter was over and I had time, I moved to RAID1 for my OS drive, and my current workstation is a single RAID6 array with hot swapping so I don't even have to worry about that downtime. If you really want to go RAID, go big :D
 
In reply to all this crap about how raid0 doesn't have support for fast backups,


There should really be some info on Intel Matrix RAID, which lets you define different volumes and RAID types across one single array.

IE: I have 2x320GB Seagate .10s hooked to an ICH8R, where 500GB is defined for my RAID0 (stripe) and 48GB is RAID1 (mirror) across the two drives. This way I can run everyday stuff (Windows, games, apps, etc.) on my very fast RAID0, and keep a backup (effectively 2 backups) of important stuff on my RAID1. Both contain a copy of Windows, both are bootable.
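
For anyone wondering how 500GB plus 48GB fits on two 320GB drives, here is a minimal sketch of the per-disk arithmetic. It assumes the volume sizes quoted above are in the binary units an OS typically reports, while the 320GB drive size is the decimal figure on the label; that is an assumption, not something stated in the post. The function name is made up for this example.

```python
def per_disk_usage_gb(volumes, disks):
    """Space each member disk gives up for a list of (level, usable_gb) volumes."""
    used = 0.0
    for level, usable_gb in volumes:
        if level == "0":
            used += usable_gb / disks     # striped: capacity is split across members
        elif level == "1":
            used += usable_gb             # mirrored: each member holds a full copy
        else:
            raise ValueError(f"unsupported RAID level: {level}")
    return used


disk_gib = 320e9 / 2**30                  # a "320GB" drive in binary units, about 298
needed = per_disk_usage_gb([("0", 500), ("1", 48)], disks=2)
print(f"needed per disk: {needed:.1f}, available per disk: {disk_gib:.1f}")
```

The RAID0 volume takes 250 from each drive and the RAID1 volume takes 48 from each, for 298 per drive, which is almost exactly what a 320GB drive offers under that assumption.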

I pull 410MB/s Burst with 160MB/s Seq Read, 140MB/s Avg Read with RAID0. That's roughly TRIPLE (3x) the performance of my RAID1 array on the same physical drives.
 
@hokatichenci: Isn't capping from HDTV camcorders still 1x/realtime speed? Do frames still drop on a regular non-RAID drive?

@este: Intel Matrix RAID...is that handled in software, or hardware by the ICH, or a mixture of both?

Thinking about keeping a RAID 5 + a single Raptor in my next machine. On that note, how badly does RAID 5 kill write speeds?
 
este said:
In reply to all this crap about how raid0 doesn't have support for fast backups,


There should really be some info on Intel Matrix RAID, which lets you define different volumes and RAID types across one single array.

IE: I have 2x320GB Seagate .10s hooked to an ICH8R, where 500GB is defined for my RAID0 (stripe) and 48GB is RAID1 (mirror) across the two drives. This way I can run everyday stuff (Windows, games, apps, etc.) on my very fast RAID0, and keep a backup (effectively 2 backups) of important stuff on my RAID1. Both contain a copy of Windows, both are bootable.

I pull 410MB/s Burst with 160MB/s Seq Read, 140MB/s Avg Read with RAID0. That's roughly TRIPLE (3x) the performance of my RAID1 array on the same physical drives.
I think you misunderstood the article:
"The extra throughput offered by disk striping is also useful in disk-to-disk backups applications."

explains that there are good uses for a RAID-0 system for nearline backup. For example, a second system that temporarily stores the live data before it is written to tape. By using nearline storage, the initial 'grab a copy' time is reduced.

and
"A RAID array has one file system."

You have two virtual RAID-arrays on two disks. You have one operating system that can access that at all times, therefore a simple "format all volumes" virus will make your `backups' disappear rather quickly.

The reason that the RAID-0 array has triple the STR of the RAID-1 array is likely the location of the R0 array on the disk, i.e. at the beginning, where the outer tracks give the highest transfer rates.
 
movax said:
Thinking about keeping a RAID 5 + a single Raptor in my next machine. On that note, how badly does RAID 5 kill write speeds?

It's not too bad. I have a four drive array of DiamondMax 10s in Linux software RAID 5. Reads at 160MB/s, writes at about 75MB/s.
 
movax said:
@hokatichenci: Isn't capping from HDTV camcorders still 1x/realtime speed? Do frames still drop on a regular non-RAID drive?

The DV cameras I've worked with do not copy at 1x realtime; they copy at whatever rate is possible.

movax said:
Thinking about keeping a RAID 5 + a single Raptor in my next machine. On that note, how badly does RAID 5 kill write speeds?

It depends. On software RAID systems (as noted, this is most chipsets), the CPU calculates all the parity, so you lose write speed. In a hardware RAID array you still lose write speed, but the card has an optimized parity accelerator, either on a CPU or a custom-built unit, so it's much quicker in comparison to software. As far as my recent experience tells me, even really fast current-gen processors won't get you to hardware RAID speeds. Maybe some really good implementation can do better, but I haven't found one yet. If you're just storing stuff it probably won't be an issue.
 
Did you realize "raid 3/5/6" are striped ;) And calculating RAID 5 parity is even more trivial than you'd think. Here's some output from my 866 MHz dual P3 machine:
Code:
raid5: automatically using best checksumming function: pIII_sse
   pIII_sse  :  1751.000 MB/sec
raid6: mmxx2      861 MB/s
So I think it's safe to say that CPU speed isn't the limiting factor if you're getting under say 800MB/s writes.

Now, it's true that raid 5 is slower in practice. But there's not much theoretical reason for that, although I haven't really looked at the software implementation from the Linux kernel.
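
To show how little work the parity itself is, here is a minimal sketch of XOR parity over one stripe. It is a toy illustration with made-up block contents and a helper name of my own, not the Linux kernel's implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"AAAA", b"BBBB", b"CCCC"]        # three data blocks in one stripe
parity = xor_blocks(data)                 # what gets written to the parity block

# Lose the second disk: XOR of the survivors and the parity gives it back.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]

# Small write: the parity can be updated from the old parity, the old block,
# and the new block, without reading the rest of the stripe.
new_block = b"ZZZZ"
new_parity = xor_blocks([parity, data[1], new_block])
assert new_parity == xor_blocks([data[0], new_block, data[2]])
```

Note that the small-write case still needs the old data block and the old parity read back before the new ones can be written. That extra disk traffic, rather than the XOR itself, is often cited as where RAID 5 writes lose time.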

 
Good thread. It definitely seems to put a negative spin on RAID0 for home users/gamers...or possibly does not list, in enough detail, the possible benefits of a well-designed RAID0 array.

While RAID0 has its drawbacks . . . some of those can be avoided simply by education. Don't put critical data on a RAID0 array, for example. If you choose to, beef up your backup strategy that much further.

Real-world benefits from RAID0 include:
Faster boot times - this has been proven time and time again. Faster game level load times - again, proven time and time again. Faster file transfers - I perform large disk to disk, or array to array, or array to disk backups/transfers (RAID0 has decreased these times dramatically). Faster defragmenting. Faster Windows installs. Faster game installs, etc.

As "faster" and "dramatically" are relative terms, this will vary from user to user.
Be it 8% or 15% or whatever, RAID0 is faster in many areas.
10% to me is very dramatic - even if it costs an arm and a leg. To another user, 10% is a joke of an increase and isn't worth the money.

I think we can all agree with what I've posted here. Yes, there are drawbacks to RAID0. The major ones are cost, reliability (roughly halving your MTBF with just a 2 HDD RAID0 array, and getting worse with 3 HDDs, etc.), and slightly slower seek times. I always use Raptors, so seek time isn't an issue for me, especially if you set up a small short-stroke boot volume on the outer edge of the disk: my 2 WD740ADFDs get 6.3ms avg. on their boot volume in RAID0, and high 7s/low 8s avg. for the rest of the HDD. Heat, power, and noise are all relative and vary too much in any given scenario.
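
The reliability cost can be sketched with the usual simplifying assumptions: drives fail independently, at a constant rate, and a RAID0 array is lost as soon as any one member fails. The 1,000,000 hour MTBF and 3% annual failure probability below are assumed figures for illustration, not data from any drive, and the function names are made up for this example.

```python
def raid0_mtbf_hours(disk_mtbf_hours, disks):
    """Array MTBF when any single member failure kills the array."""
    return disk_mtbf_hours / disks


def raid0_failure_probability(p_disk, disks):
    """Chance the array is lost in a period, given each disk's chance of failing."""
    return 1.0 - (1.0 - p_disk) ** disks


for n in (1, 2, 3, 4):
    print(n,
          raid0_mtbf_hours(1_000_000, n),
          round(raid0_failure_probability(0.03, n), 4))
```

Under this model a two-drive stripe has half the MTBF of a single drive and a three-drive stripe a third, so the risk grows with every member added.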

If you're willing to deal with those pitfalls and have some $$, RAID0 can be very beneficial and enjoyable-yes, even for a gamer or "home user."



As far as read and write speeds of the different RAID types go, they can vary dramatically. Speaking from an enterprise level here (the same should/does apply to home use), RAID 1 or, if $ allows, RAID 10 will yield the fastest writes, while RAID 5 or similar yields the fastest reads. This is critical on large SQL and/or Exchange database servers. The transaction logs are placed on a RAID1/10 array for extremely fast, constant writing, while the data(base) is placed on the RAID5 array for extremely fast, constant reads/queries. While RAID is certainly not limited to these types, these are the more standard types. My NetApp clustered SAN actually uses its proprietary RAID DP (dual parity), which allows up to 2 HDD failures per RAID group with no data loss. The NetApp unit dismisses RAID1/5/10 as it uses a lot of caching, write optimization, and a sheer number of spindles reading and writing: an average of 14 HDDs/arms/spindles working across a 4Gbit backplane.
 
movax said:
@este: Intel Matrix RAID...is that handled in software, or hardware by the ICH, or a mixture of both?

Intel Matrix RAID (afaik) is handled 100% in hardware. My HDTach reports 0-1% CPU utilization. I compared that to a Raptor RAID0 array on a Via chipset, which showed 7% CPU; who knows if that Via is any good though.

Intel has a decent Windows app that lets you manage, change, verify, or fix RAID stuff as well. It has a tray service that always runs, but it takes 700KB of RAM so I'm ok with that.


I'm not saying RAID0 doesn't have problems (2x the chance of drive failure is obviously one; at the same time, I've never had a single HDD fail in 10 years). But I certainly don't agree with people who say RAID0 has no place on a desktop PC. You just need some form of backup, and Matrix RAID is really the best I've seen for that, or at least the most convenient.

As far as cost of two drives go.... come on! :) My setup cost me $188 for 2 320gb drives, and I'm happy with the speeds I'm getting from them.
 
What I found interesting in the Anandtech articles, and missed before, was that the games were loaded on separate (the tested) drives while the OS stayed on another. Who the hell does that? It may explain why they don't get the improvement most people see when they have the OS/apps/games on the same drive/RAID0 setup.
 
Brahmzy said:
Um... those benchies do nothing to help your cause man. In 95% of the comparisons, they're comparing a single HDD with 2 RAID0 HDDs of a completely DIFFERENT type. Apples n oranges.
Read much?

First of all there are 7 comparisons, so how can there be 95% of anything? That would be 6.65.
Secondly, there is 1 review that compares a single faster drive against 2 and 4 disk RAID-0 configurations of a slower drive and 1 review where the model numbers are not explicitly stated.

So that's 5/7 that directly compare drive A vs 2 x drive A in RAID-0
I've highlighted those 5 in Red so you can find them without having to read through all the words.


Single Seagate 7200.10 750GB vs 2 x Seagate 7200.10 750GB RAID-0
http://www.anandtech.com/storage/showdoc.aspx?i=2760&p=10


Single Seagate 7200.10 750GB vs 2 x Seagate 7200.10 750GB RAID-0
http://www.anandtech.com/storage/showdoc.aspx?i=2760&p=10



Single WD740GD vs 2 x WD740GD RAID-0
http://www.anandtech.com/storage/showdoc.aspx?i=2101&p=10


Single WD1500ADFD vs 2 x WD740GD RAID-0 vs 4 x WD740GD RAID-0
http://forums.storagereview.net/index.php?showtopic=21621&st=0&p=221874&#entry221874

Single WD360GD vs 2 x WD360GD RAID-0
http://www.overclockers.com/articles1063/index02.asp


Model numbers of drives not given, but no reason to believe they aren't the same
http://faqs.ign.com/articles/606/606669p1.html

Single Hitachi Deskstar 250GB vs 2 x Hitachi Deskstar 250GB RAID-0
http://www.amdzone.com/modules.php?op=modload&name=Sections&file=index&req=printpage&artid=66
 