Expandable software Raid-5?

Dew

If I set up a 4x 320GB RAID-5 software array and want to eventually expand to 8x, can I do this without recreating the whole array? I want to add one HDD every few months. Or should I just say screw it, buy another 4 HDDs now, and be done with it? (4 more HDDs = $450, hardware 8-port PCI-X card = $300)

This will be a dedicated fileserver, so I can use BSD or Linux.
 
It depends on the RAID card. Linux software RAID (LSR) does not support it, so if you want to go that route (which I'd recommend) you need all the disks at once. Highpoint 2220 and 2320 cards do support OCE, but I'd recommend taking a backup before you try to expand. It'll be a good deal simpler to buy everything at once.

On the other hand, if you want to try something really new, Solaris 10 with ZFS supports adding disks to a raid-Z pool, which is essentially the same concept as RAID 5: it stores parity alongside your files to protect against loss. My brother tried it and gave up, but maybe you'll have better luck.
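Setting one up is about a one-liner; this is just a sketch, and the Solaris device names are only examples:

Code:
# create a raid-Z pool named "tank" from four disks; ZFS mounts it at /tank automatically
zpool create tank raidz c0t0d0 c1t0d0 c2t0d0 c3t0d0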

 
What kind of controller are you using or planning to buy? This really depends on the type of controller card you have. I've seen some newer ones support adding disks dynamically, but most U320 RAID controllers that I work with (HP NetRAID and Dell PERC-x series) don't support this.

Have you considered just making a second array? This might be a bit more practical for you: you'll lose another disk's worth of space, but it will be easier to maintain, back up, and restore if there's ever a problem.

Also, RAID 6 (or RAID ADG, as HP calls it) is kind of a cool idea, but I don't think it exists for SATA.
 
More than likely a second array will be in order. From all my research, expanding a software RAID 5 is either not possible or very likely to result in catastrophic data loss.

As long as my storage needs don't exceed 1TB on the array I should be fine. My current storage needs are about 450GB growing at between 50GB and 100GB per month. I'll likely be at maximum capacity on this array by the end of the year.

Then I guess I'll just buy another 4 drives (400GB or 500GB) and be good for another year or so.

Thanks for the pointer on ZFS, though. Although it can't expand, it looks like it's higher performance than RAID-5 and has better data integrity built into the filesystem.
 
If you want to go hardware, the Areca cards support online expansion/level migration. I just converted from RAID 5 with 4 drives to RAID 6 with 8 drives and it went very smoothly. You could pick up a 16-port for about a grand and have enough SATA ports to last you an eternity.
 
Dew said:
If I set up a 4x 320GB RAID-5 software array and want to eventually expand to 8x, can I do this without recreating the whole array? I want to add one HDD every few months. Or should I just say screw it, buy another 4 HDDs now, and be done with it? (4 more HDDs = $450, hardware 8-port PCI-X card = $300)

This will be a dedicated fileserver, so I can use BSD or Linux.

I'm doing this exact thing with my file server.

My basic setup:
Highpoint RocketRAID 2220 PCI-x card
Western Digital 250GB drives. I currently have 5, but I'm thinking of getting 3 WD 320GB drives now that prices have dropped and beginning to migrate/replace them.
SUSE Linux 10.0

The OLM (Online RAID Level Migration) is easy to use and can add a drive to the array without losing it. I've done it twice now (I started with 3 drives and have added one at a time) without a problem. It takes roughly 12-16 hours to complete, depending on how full the drives are. I got paranoid and unmounted the array from Linux first.

After that's done, run parted to resize the partition to include the new drive, remount the filesystem, and you're done. Word of advice: use parted to create your ext2 partition, DO NOT USE FDISK! I used fdisk to create the partition when I had 3 drives and couldn't expand it with parted, so I had to move everything off, nuke and pave, and move everything back on.
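Roughly what that step looks like from the shell; the device name, mount point, and end value are just placeholders, and newer parted releases split the resize into resizepart plus a separate filesystem resize:

Code:
umount /mnt/array                  # unmount first, just to be safe
parted /dev/sda print              # note the new size of the "disk" the card presents
parted /dev/sda resize 1 0 <end>   # stretch partition 1 (and its ext2 fs) out to the new end, in MB
mount /dev/sda1 /mnt/array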
 
Another word of advice - don't use plain ext2. Ext3, at a minimum, and consider trying one of the other filesystems; ext3 is looking pretty dated these days. XFS, JFS or Reiser would be my top 3.
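If you already have data sitting on plain ext2, you can add a journal in place to turn it into ext3; just a sketch, with an example device name:

Code:
# adds a journal to an existing ext2 filesystem, converting it to ext3 (then mount it as ext3)
tune2fs -j /dev/sda1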

 
Large Reiser partitions are the devil. They take forever and a day to mount; my dual core takes about a minute to mount my 980GB Reiser partition. XFS mounts immediately.
 
hokatichenci said:
Large Reiser partitions are the devil. They take forever and a day to mount; my dual core takes about a minute to mount my 980GB Reiser partition. XFS mounts immediately.

Reiser also doesn't resize. I originally had ext3, but parted doesn't like to resize that either, and converting to and from ext2 didn't help matters. That's how I stumbled across the whole fdisk/parted problem in the first place.
 
From what you guys have said, I'm now giving serious thought to the RocketRAID 2320, since it's PCIe and $265 shipped from ZZF. My original plan was to use a PCI-X card in a 32-bit PCI slot, but that would have seriously hampered its performance. The PCIe card looks like it's a speed monster.

If I do that, what filesystem do you recommend? I guess I should have mentioned this will be bulk file storage; there should never be a file under 100MB. From reading the RocketRAID performance sheet, it looks like a 512k block size gives the best ratio of read/write speed to CPU utilization.

Edit: Ordered the 2320.
 
XFS is what I'm using, and it's highly recommended - it does everything I've asked it to, and it's plenty fast.
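If you want to squeeze a little more out of it, mkfs.xfs can be told the array's stripe geometry. The numbers below are just an example, assuming a 512k stripe unit and an 8-drive RAID 5 (7 data disks):

Code:
# su = stripe unit written to each disk, sw = number of data-bearing disks in a full stripe
mkfs.xfs -d su=512k,sw=7 /dev/sda1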

 
unhappy_mage said:
XFS is what I'm using, and it's highly recommended - it does everything I've asked it to, and it's plenty fast.

I second that. Go for XFS. It's said that XFS has issues recovering from a power loss mid-write, but supposedly the problem isn't unique to XFS. I suggest that when you first get it running, you test it by dropping drives from the array, cutting the power, etc. That will also give you an accurate idea of rebuild times.

Good luck!
 
So I have the 2320. But I don't have drivers for Gentoo, and it looks like someone has deleted the entire HighPoint site. DOH!

Guess I'll have to wait a few days before I can transfer my files.

Unless someone happens to have this:
rr2320-linux-src-1.0-051109.tar.gz

The RAID is 959-something GB before formatting; it should be ~894GB after I format it with XFS.


On another note, the 2320 is expandable! You can use a pair of them for up to sixteen drives in a single array!

So my RAID 5 array has a maximum capacity of 4470GB formatted, or 4172GB if I use a hot spare (which, with that many drives, is a must).
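The math behind those numbers, since drive makers count decimal GB while the OS counts binary:

Code:
 3 data drives x 320 GB =  960 GB  ->   960e9 / 2^30  ≈  894 GB as the OS sees it
15 data drives x 320 GB = 4800 GB  ->  4800e9 / 2^30  ≈ 4470 GB
14 data drives x 320 GB = 4480 GB  ->  4480e9 / 2^30  ≈ 4172 GB  (one drive kept as a hot spare)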
 
Rocco123 said:
What kind of controller are you using or planning to buy? This really depends on the type of controller card you have. I've seen some newer ones support adding disks dynamically, but most U320 RAID controllers that I work with (HP NetRAID and Dell PERC-x series) don't support this.

Have you considered just making a second array? This might be a bit more practical for you: you'll lose another disk's worth of space, but it will be easier to maintain, back up, and restore if there's ever a problem.

Also, RAID 6 (or RAID ADG, as HP calls it) is kind of a cool idea, but I don't think it exists for SATA.

I know RAID 6 is supported on the Areca...
 
I must say that I'm impressed with this setup. Copying files to the array at around 45MB/sec. Heh, and I thought raid5 write speeds sucked.
 
Dew said:
I must say that I'm impressed with this setup. Copying files to the array at around 45MB/sec. Heh, and I thought raid5 write speeds sucked.
wow nice!
 
"Reiser also doesn't resize. "

Yes it does.

Also, if you use Linux, EVMS is really cool for software RAID. It seems pretty safe to use for expanding RAID arrays (it will even recover if the power goes out while expanding, which I doubt most hardware cards handle, although I'm curious to find out). I've used it to expand RAID 5 (on ReiserFS too :p). Pretty much every filesystem can be expanded, though some cannot be shrunk. It also depends on whether you want to expand/shrink while the filesystem is mounted; most must be unmounted. Actually:

http://evms.sourceforge.net/user_guide/#expandshrink

That sums it up pretty clearly.

Also, I'm not sure how much it affects hardware RAID cards, but play with blockdev; it seemed to really affect my array's performance. Just Google it for a lot more info.
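For example (the device name is just a placeholder; tune the read-ahead value and re-run your benchmarks):

Code:
blockdev --getra /dev/md0        # show the current read-ahead, in 512-byte sectors
blockdev --setra 4096 /dev/md0   # bump it to 2MB and see what it does to sequential throughput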
 
Expanding and shrinking *volumes* is well-supported and safe. Expanding and shrinking regions isn't.

 
What do you mean? Is it not possible, or not safe?

"B.5. Resizing MD regions

RAID regions can be resized in order to expand or shrink the available data space in the region."

--edit--

I'm confused. The link I pointed to was talking about expanding/shrinking volumes, but I was just referring to the table showing which resize operations each filesystem supports. It's easier to resize a volume, but you could still play with cfdisk, fix up the partition info, and then run the expand process. For any array, though, it makes sense to just put a volume group on it...

I think I'm missing something, lol :/
 
Because of the RR2320 controller, my array looks like any normal SATA/SCSI drive to Linux. I'll be adding another 320GB HDD later this week; then I just need to grow my XFS partition to fill the entire "drive".
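Once the partition itself has been stretched, growing XFS is a one-liner, and it works on a mounted filesystem (assuming the array is still mounted at /mnt/terabyte):

Code:
# grow the mounted XFS filesystem to fill the enlarged partition
xfs_growfs /mnt/terabyte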

Btw, here are my performance tests:
Sigh, of course after I do a bunch of work making my own tests, I discover a simple tool that does everything.

Code:
Gentoo5 ~ # bonnie -d /mnt/terabyte -s 4096
File '/mnt/terabyte/Bonnie.9689', size: 4294967296
Writing with putc()...done
Rewriting...done
Writing intelligently...done
Reading with getc()...done
Reading intelligently...done
Seeker 1...Seeker 2...Seeker 3...start 'em...done...done...done...
-------Sequential Output-------- ---Sequential Input-- --Random--
-Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU /sec %CPU
4096 33390 91.1 67527 21.4 30550 9.8 45328 79.9 127054 19.2 224.4 1.2


More Readable:

Code:
     -------Sequential Output--------
     -Per Char- | --Block--- | -Rewrite--
  MB K/sec %CPU | K/sec %CPU | K/sec %CPU
4096 33390 91.1 | 67527 21.4 | 30550  9.8

 ---Sequential Input-- | --Random--
 -Per Char- | --Block---  | --Seeks---
 K/sec %CPU | K/sec %CPU  | /sec %CPU
 45328 79.9 | 127054 19.2 | 224.4  1.2

Versus the 300GB Seagate SATA in the system:
Code:
Gentoo5 ~ # bonnie -d /mnt/tv -s 4095
File '/mnt/tv/Bonnie.9722', size: 4293918720
Writing with putc()...done
Rewriting...done
Writing intelligently...done
Reading with getc()...done
Reading intelligently...done
Seeker 1...Seeker 2...Seeker 3...start 'em...done...done...done...
              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
         4095 40457 82.9 63555 32.3 22334 10.0 37844 84.8 58444 19.4 155.4  1.1

Better in just about every way.
 
Well, this thing in RAID 0 with 5 drives is FAST.
Code:
Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
Fedora5          4G 41235  96 161390  43 46831  15 41161  97 239100  36 314.4   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  4361  26 +++++ +++  5249  28  4866  27 +++++ +++  3702  27
Fedora5,4G,41235,96,161390,43,46831,15,41161,97,239100,36,314.4,0,16,4361,26,+++++,+++,5249,28,4866,27,+++++,+++,3702,27

I'll edit my post later today with RAID 5 results for 5 drives.
 
Two things. First, as a rule of thumb, you'll want to use a size of at least 10x RAM to get accurate results with bonnie++; Linux aggressively caches things in RAM, and that can skew results quite a lot. The smallest amount of RAM I see in your sig is 768MB, so call it an even 8GB.
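Something like this (size is in MB; note that bonnie++ won't run as root unless you also pass -u):

Code:
# 8192MB test file, roughly 10x 768MB of RAM, run against the array's mount point
bonnie++ -d /mnt/terabyte -s 8192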

Second, that's not fast for five drives. I got better write performance with a RAID 0 across my 3 MaXLine IIIs, and read performance nearly that good. But I anxiously await the RAID 5 results; mine are about 50 MB/s in both directions. A 3-disk and a 5-disk RAID perform quite differently, though; there's a special case with 3 disks that can allow quite a bit more speed. I'm hoping to buy 5 more of these disks and then run a bunch of tests.

 
As requested, 10x the RAM:

Raid5, 5 drives
Code:
Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
Fedora5         10G 34637  90 71612  25 38922  15 38857  92 176716  28 193.7   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  6665  62 +++++ +++  5498  51  6996  67 +++++ +++  3825  42
Fedora5,10G,34637,90,71612,25,38922,15,38857,92,176716,28,193.7,0,16,6665,62,+++++,+++,5498,51,6996,67,+++++,+++,3825,42
[dew@Fedora5 root]$ df -B 1M | grep terabyte
/dev/sda1              1220475         9   1220467   1% /mnt/terabyte
 