How to set up disks for ESX?

mike938

Hello,
I am in the process of setting up a home ESX test environment and was wondering what the recommended way to set up my disks would be. I have 12 320GB SATA disks that will live in an external Linux box and be exposed over FC for booting and iSCSI for probably everything else.
I was planning to put 2 of the 320GB disks in a RAID 1 array for the Linux install as well as the FC targets for ESX to boot from.
However, I wasn't sure about the best way to set up the remaining 10 disks: a single 10-disk RAID 10, a 4-disk and a 6-disk RAID 10, a 10-disk RAID 6 carved up so no logical disk hits 2TB, or something else.
If anyone has recommendations, I'd greatly appreciate it.
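
For a rough comparison of those three options, here's some back-of-the-envelope capacity math (a sketch in Python; it takes the 320GB disks at face value in decimal GB, ignores formatting overhead, and assumes the ~2TB-per-LUN VMFS limit that motivates the carving):

Code:
# Usable capacity of the three candidate layouts for the 10 remaining disks.
DISK_GB = 320
VMFS_LUN_LIMIT_GB = 2000  # assumed ~2TB per-LUN limit on ESX-era VMFS

def raid10_usable(n_disks, disk_gb=DISK_GB):
    # RAID 10 mirrors pairs, so half the spindles' capacity is usable.
    return (n_disks // 2) * disk_gb

def raid6_usable(n_disks, disk_gb=DISK_GB):
    # RAID 6 spends two disks' worth of capacity on parity.
    return (n_disks - 2) * disk_gb

layouts = {
    "10-disk RAID 10":         raid10_usable(10),
    "4-disk + 6-disk RAID 10": raid10_usable(4) + raid10_usable(6),
    "10-disk RAID 6":          raid6_usable(10),
}

for name, usable in layouts.items():
    note = " (exceeds the 2TB LUN limit, so it would need carving)" if usable > VMFS_LUN_LIMIT_GB else ""
    print(f"{name}: {usable} GB usable{note}")

That comes out to roughly 1600 GB usable for either RAID 10 layout and 2560 GB for the RAID 6, which is why the RAID 6 option only makes sense split into sub-2TB logical disks.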

Thanks,
Mike
 
I would suggest 2x RAID 10.
Simple process of elimination:
RAID 0 is out for obvious reasons.
RAID 5/6 write performance can't handle VMs.

So you're left with RAID 10, and with two arrays you'll get better access times than with a single big RAID 10 array.

I use 6-disk + 8-disk RAID 10 at work and 6-disk RAID 10 at home for my ESX systems,
but I'm seriously contemplating getting a bunch of SSDs and trying out different configs.
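
To put rough numbers on that elimination (a minimal sketch, assuming ~75 random IOPS per 7200RPM SATA spindle and the textbook write penalties; a controller with a big write cache changes the picture considerably):

Code:
# Back-of-the-envelope random-write throughput for 10 spindles under the
# classic RAID write penalties (RAID 10: 2 back-end I/Os per host write,
# RAID 5: 4, RAID 6: 6). The per-spindle IOPS figure is an assumption.
SPINDLE_IOPS = 75
N_DISKS = 10
WRITE_PENALTY = {"RAID 10": 2, "RAID 5": 4, "RAID 6": 6}

raw_iops = SPINDLE_IOPS * N_DISKS  # ~750 back-end IOPS total

for level, penalty in WRITE_PENALTY.items():
    host_write_iops = raw_iops / penalty
    print(f"{level}: ~{host_write_iops:.0f} host random-write IOPS from {N_DISKS} disks")

Roughly 375 host write IOPS for RAID 10 versus ~190 for RAID 5 and ~125 for RAID 6, which is where the "RAID 5/6 can't handle VMs" argument comes from.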
 
I don't mean to be argumentative, but in a home test environment there is absolutely no problem with RAID 5/6, as there just won't be enough load to create a performance problem.

In fact, I imagine most production VMs also run on RAID 5/6.
 
We run four HP blade chassis fully populated with dual quad-core 2.66 GHz Xeons and 16-24GB of RAM each. The SAN they're connected to on the back end is RAID 5. We have 400+ VMs running on this setup and have yet to see any kind of disk-related I/O contention.

So yes, there is no problem running a home setup on RAID 5.
 
I'm curious how you're going to set up 320GB SATA drives in a Linux box as FC boot targets. I can confirm RAID-5 works for home implementations, but I wouldn't personally recommend it for home use without a good hardware XOR RAID card with a fast processor, plenty of cache, and a BBU. It can be done, just so there's no confusion; I would just plan your implementation carefully. I have 4 VMs on the same 6-disk RAID-5 set, and when they were all pulling Windows Updates or installing software at once, it brought the array to its knees.

At work we use RAID-5 sets, even down to just 3 spindles, but that's on enterprise-level IBM FC RAID controllers with lots of cache in them, and they're 300GB 15k FC drives, not 7200RPM SATA desktop drives.
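
The spindle class matters at least as much as the RAID level here. As a rough sketch (the per-spindle IOPS figures below are assumptions, and real arrays with large write caches will do better):

Code:
# Compare a small 15k FC RAID-5 set against a larger 7200RPM SATA one,
# using assumed per-spindle random IOPS and the RAID-5 write penalty of 4.
PER_SPINDLE_IOPS = {"15k FC": 175, "7200RPM SATA": 75}
SPINDLE_COUNT = {"15k FC": 3, "7200RPM SATA": 6}
RAID5_WRITE_PENALTY = 4

for disk_class, iops in PER_SPINDLE_IOPS.items():
    n = SPINDLE_COUNT[disk_class]
    raw = n * iops
    uncached_writes = raw / RAID5_WRITE_PENALTY
    print(f"{n}x {disk_class}: ~{raw} raw read IOPS, ~{uncached_writes:.0f} uncached RAID-5 write IOPS")

Three 15k FC spindles land roughly on par with six desktop SATA drives before you even count controller cache, which is why the same RAID-5 layout behaves so differently on the two platforms.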
 
Hello,
I am in the process of setting up a home ESX test environment and was wondering what the recommended way to set up my disks would be. I have 12 320GB SATA disks that will live in an external Linux box and be exposed over FC for booting and iSCSI for probably everything else.
I was planning to put 2 of the 320GB disks in a RAID 1 array for the Linux install as well as the FC targets for ESX to boot from.
However, I wasn't sure about the best way to set up the remaining 10 disks: a single 10-disk RAID 10, a 4-disk and a 6-disk RAID 10, a 10-disk RAID 6 carved up so no logical disk hits 2TB, or something else.
If anyone has recommendations, I'd greatly appreciate it.

Thanks,
Mike

What kind of budget do you have to go the FC route? You would need at least 2 HBAs, and those are most likely $1,000 each. I would add that the only way to go with a box of this kind would be OpenSolaris with COMSTAR. It's actually the same software that will be running in future updates of Sun's 7000 series.

You can find a guide to setting up COMSTAR here.

This is an example of the finished product Sun is selling: the Sun Storage 7410.
 
Is there software that can be run to make an FC HBA in a machine act as a target instead of an initiator? If there's readily available software, I'd love to check it out. I have FC gear at my disposal. Someone enlighten me?
 
On Linux you can run SCST as an FC target. There is a guide on the SCST website for it. I'm currently installing and setting it up as I type this (recompiling the kernel).
I am only using 2Gb FC HBAs, so they were very inexpensive: maybe $120 for the dual-port and 2 single-port cards.

And yes, to even consider RAID 5/6 I'd be using a hardware card. Not a chance of software RAID 5/6 for me.

Edit: Got the software FC target running. Wasn't too bad once the kernel compiled correctly.
 
I would suggest 2x RAID 10.
Simple process of elimination:
RAID 0 is out for obvious reasons.
RAID 5/6 write performance can't handle VMs.

So you're left with RAID 10, and with two arrays you'll get better access times than with a single big RAID 10 array.

I use 6-disk + 8-disk RAID 10 at work and 6-disk RAID 10 at home for my ESX systems,
but I'm seriously contemplating getting a bunch of SSDs and trying out different configs.

Bullshit that RAID 5/6 can't handle VMs. Maybe not a massive Exchange server, or on home hardware at times, but in production environments I call shens :p
 
I don't mean to be argumentative, but in a home test environment there is absolutely no problem with RAID 5/6, as there just won't be enough load to create a performance problem.

In fact, I imagine most production VMs also run on RAID 5/6.

Exactly.

I'm curious how you're going to set up 320GB SATA drives in a Linux box as FC boot targets. I can confirm RAID-5 works for home implementations, but I wouldn't personally recommend it for home use without a good hardware XOR RAID card with a fast processor, plenty of cache, and a BBU. It can be done, just so there's no confusion; I would just plan your implementation carefully. I have 4 VMs on the same 6-disk RAID-5 set, and when they were all pulling Windows Updates or installing software at once, it brought the array to its knees.

At work we use RAID-5 sets, even down to just 3 spindles, but that's on enterprise-level IBM FC RAID controllers with lots of cache in them, and they're 300GB 15k FC drives, not 7200RPM SATA desktop drives.

See below :) It works pretty well; not supported, mind you. Solaris also works, IIRC.

FWIW, updates/installs on more than a few VMs will bring even a high-end SAN down. I've seen a company have serious connectivity issues that were finally traced to a virus-definition update running company-wide at the same time every day. It brought a set of clustered DMX-4s down, every night!

What kind of budget do you have to go the FC route? You would need at least 2 HBAs, and those are most likely $1,000 each. I would add that the only way to go with a box of this kind would be OpenSolaris with COMSTAR. It's actually the same software that will be running in future updates of Sun's 7000 series.

You can find a guide to setting up COMSTAR here.

This is an example of the finished product Sun is selling: the Sun Storage 7410.

You're overthinking things, and not thinking about the used market either.

Is there software that can be run to make an FC HBA in a machine act as a target instead of an initiator? If there's readily available software, I'd love to check it out. I have FC gear at my disposal. Someone enlighten me?

Yes. :) Google! Openfiler can do it too, if you politely ask them about it and cough up some moneys.

On Linux you can run SCST as an FC target. There is a guide on the SCST website for it. I'm currently installing and setting it up as I type this (recompiling the kernel).
I am only using 2Gb FC HBAs, so they were very inexpensive: maybe $120 for the dual-port and 2 single-port cards.

And yes, to even consider RAID 5/6 I'd be using a hardware card. Not a chance of software RAID 5/6 for me.

Edit: Got the software FC target running. Wasn't too bad once the kernel compiled correctly.

Bingo! :) Let me know how the performance is; I'm curious. What did you use for the RAID controller?

My only cost issue was with the switches - finding cheap FC switches is hard.
 
Bullshit that RAID 5/6 can't handle VMs. Maybe not a massive Exchange server, or on home hardware at times, but in production environments I call shens :p

Bullshit that it can't handle a massive Exchange server! Or even insane levels of simultaneous updates. I can throw together a configuration that'll not only do that, but beat the pants off much bigger and pricier ones any day of the week. (We really need to hook up with an overstocked lab so I can show you some good stuff, Iopoetve.) ;)
For cheap FC switches, look up the QLogic SANbox 1400s. You can get them for under $2k brand spanking new, and sub-$1k used. There's no licensing, no "recertification" bull. They're literally scaled-down, full-blown enterprise gear. Just amazing switches.

Oh. And:
*pimpslaps sabregen, repeatedly* You should damn well know better than to do a 3-disk RAID 5. That's an amateur-idiot configuration only.
 
Oh. And:
*pimpslaps sabregen, repeatedly* You should damn well know better than to do a 3-disk RAID 5. That's an amateur-idiot configuration only.

Hey, sometimes you only have so many disks available. We're a small company, and money isn't flowing so well these days. You'll be glad to know that it was only used in this configuration for a very short time, and it was grown to a 6-spindle config plus a hot spare within a few weeks.

You know, some of your responses are unnecessarily terse. I don't think it was necessary to refer to a 3-drive RAID-5 as the "amateur idiot configuration." My opinion versus yours.
 
Bullshit that it can't handle a massive Exchange server! Or even insane levels of simultaneous updates. I can throw together a configuration that'll not only do that, but beat the pants off much bigger and pricier ones any day of the week. (We really need to hook up with an overstocked lab so I can show you some good stuff, Iopoetve.) ;)
For cheap FC switches, look up the QLogic SANbox 1400s. You can get them for under $2k brand spanking new, and sub-$1k used. There's no licensing, no "recertification" bull. They're literally scaled-down, full-blown enterprise gear. Just amazing switches.

Oh. And:
*pimpslaps sabregen, repeatedly* You should damn well know better than to do a 3-disk RAID 5. That's an amateur-idiot configuration only.

I've seen stuff that'd make your eyes water :p

I've seen plenty of times where RAID 5 causes limitations for high-I/O applications: simply too many VMs per spindle, especially with default settings, especially when sharing storage between multiple VMs the way people have a habit of doing with RAID 5 LUNs, and on a SAN with other applications running on it. :) Sure, it's fine in an ideal environment, but most of the time there are other cooks in the kitchen and other applications in the picture too. I'd rather see Exchange on RDMs on either RAID 1 or, even better, RAID 10, on fast disks with a dedicated path. Idiot-proof the configuration.
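
One way to make "too many VMs per spindle" concrete is a simple IOPS budget. The per-VM load and read/write mix below are assumptions for illustration only, not measurements:

Code:
# Rough sizing: how many "average" VMs a shared RAID-5 LUN can carry before
# the back-end spindles run out of IOPS. All workload figures are assumed.
SPINDLE_IOPS  = 75    # 7200RPM SATA assumption; use ~175 for 15k FC
SPINDLES      = 8
WRITE_PENALTY = 4     # RAID 5
READ_FRACTION = 0.6   # assumed 60/40 read/write mix
IOPS_PER_VM   = 25    # assumed steady-state load per VM

raw = SPINDLE_IOPS * SPINDLES
# Reads cost one back-end I/O each; writes cost WRITE_PENALTY back-end I/Os.
effective_host_iops = raw / (READ_FRACTION + (1 - READ_FRACTION) * WRITE_PENALTY)
print(f"~{effective_host_iops:.0f} host IOPS -> roughly {int(effective_host_iops // IOPS_PER_VM)} VMs per LUN")

With those numbers you land around 10 VMs per 8-spindle RAID-5 LUN; pile several times that many on and every one of them suffers.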
 
FYI, anyone can be the "idiot" in the "idiot-proof configuration." Sometimes, it might be me.
 
lol. I always prefer suggesting the simple, easy to work with solution first. :)
 
You're saying I'm not simple? When did this turn into a character attack?
 
I've seen stuff that'd make your eyes water :p

Nah. I've been to the happy land of IBM. 8-node SVC + DS8300 Turbo = "help me stuff this into my carry-on."
The only thing I haven't seen that would make me drool doesn't exist: a Hitachi USP V class with user-configured RAID 3, using 300GB 15K data disks and FPGA-based ZBT SRAM with BBU as 600GB parity disks, in an 8+3 configuration. (Estimated throughput? ~1-1.2GB/s per array, with total throughput potential of 224GB/s.)

I've seen plenty of times where RAID 5 causes limitations for high-I/O applications: simply too many VMs per spindle, especially with default settings, especially when sharing storage between multiple VMs the way people have a habit of doing with RAID 5 LUNs, and on a SAN with other applications running on it. :) Sure, it's fine in an ideal environment, but most of the time there are other cooks in the kitchen and other applications in the picture too. I'd rather see Exchange on RDMs on either RAID 1 or, even better, RAID 10, on fast disks with a dedicated path. Idiot-proof the configuration.

RAID 10 is anything but idiot-proof. If anything, RAID 10 is easier to blow up, because of cross-shelf loading or software RAID. As soon as they're out of space, they'll either expand the array across shelves - a major no-no - or they'll create a duplicate array and software-RAID it, and then set up the partition wrong. Really, RAID 5 ends up being the most idiot-proof, simply because it prevents people from doing stupid things like that. Even if you get off optimal multiples, like 6+P, you're not going to take as bad a performance hit. And because you're expanding an existing array, you just enlarge partitions, which keeps them from putting 4k blocks onto a 256k stripe.

But these are the same people who'll tend to try and cram 30 VMs onto a single RAID 5 with, say, 9 spindles, at which point you implement user-correction measures with a baseball bat. Where I was, it took quite some time to convince the VM guy that upgrading to 3.5+ would require unwinding the hack of many large LUNs and breaking out individual LUNs per VM and per application. If it weren't for the SVC, the setup would have collapsed long ago: all applications and OSes shared old (dating back to 2.x!), large LUNs. The only things saving it were the SVC spreading it across 180+ spindles and the major disk eaters not being virtualized. I shudder to think what would have happened if Exchange had been dropped onto that existing disk.
 
virtualization...allowing IT personnel to inadvertently deploy broken systems at a moment's notice since 1968.

Even if you get off optimal multiples, like 6+P

There is no 6+P. It's like that line in the Matrix: "There is no spoon." RAID-6 is one of the following:

1.) Vertical RAID
2.) RAID-5 P+Q
3.) RAID-5 DP
4.) RAID-6

There's no 6+P...that'd be RAID-5 P+Q+P, which makes no sense. Not trying to be a dick, just saying.

I've seen 4k blocks on 512k stripes before. Luckily, it wasn't on my array, so I didn't have to shoot myself. They were clueless as to why their disk performance was hovering around 50MB/sec on a 10-disk RAID-10. I was surprised to see it was that high.

Bottom line: we've all done dumb things, and a lot of places have no idea why they need an admin for each subsystem of their server room, and don't have the budget for it anyway. So the poor schmuck who gets everything lumped onto his plate is a jack of all trades, master of none. I've been that guy. I try not to be him now, but sometimes...
 
virtualization...allowing IT personnel to inadvertently deploy broken systems at a moment's notice since 1968.

Well, FreeBSD is largely idiot proof in that regard, but that's in no small part because it damn near requires kernel development experience to deploy functioning jails. Let's hear it for difficulty of management!

There is no 6+P. It's like that line in the Matrix: "There is no spoon." RAID-6 is one of the following:
There's no 6+P...that'd be RAID-5 P+Q+P, which makes no sense. Not trying to be a dick, just saying.

Oh, I know that. However, everybody's documentation writes it wrong, and everybody thinks of it that way, so I just gave up and write it that way too. 6+2P or 6+P isn't an inaccurate way to write it; you have 6 data disks and 2 parity disks. It doesn't matter what type of parity it is, just that it's parity. Don't get me started on XIV; any array I can force into an immediate offline and shutdown in less than 5 seconds with an expected, normal occurrence isn't an array, it's a joke.
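
For what the "6+2P" shorthand actually means in capacity terms (a quick sketch; the 300GB drive size is just for illustration):

Code:
# 6 data spindles plus 2 parity spindles, regardless of which dual-parity
# scheme the array implements underneath.
DATA_DISKS, PARITY_DISKS, DISK_GB = 6, 2, 300

usable_gb  = DATA_DISKS * DISK_GB
raw_gb     = (DATA_DISKS + PARITY_DISKS) * DISK_GB
efficiency = DATA_DISKS / (DATA_DISKS + PARITY_DISKS)
print(f"6+2P: {usable_gb} GB usable of {raw_gb} GB raw ({efficiency:.0%} space efficiency)")

That's 1800 GB usable of 2400 GB raw, i.e. 75% efficiency; the notation just counts data and parity spindles without saying how the parity is computed.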

I've seen 4k blocks on 512k stripes before. Luckily, it wasn't on my array, so I didn't have to shoot myself. They were clueless as to why their disk performance was hovering around 50MB/sec on a 10-disk RAID-10. I was surprised to see it was that high.

That's almost as bad as the presumption that RAID 10 scales infinitely. RAID 10 is the worst for ESM scaling, period. I won't even permit cross-shelf RAID 10, and bluntly, it should be a totally unsupported configuration. I'm amazed they managed to get 50MB/s out of 4k on 512k, though; that's just... wow. How did they even manage to get a 512k stripe? Sigh.

Bottom line: we've all done dumb things, and a lot of places have no idea why they need an admin for each subsystem of their server room, and don't have the budget for it anyway. So the poor schmuck who gets everything lumped onto his plate is a jack of all trades, master of none. I've been that guy. I try not to be him now, but sometimes...

I decided to really piss them off and end up master of most. The problem, obviously, is that I am honestly bored out of my skull more often than not. Things cease to be interesting when you have nothing relevant to go learn or test or break. Giving people fits because you can accurately and correctly answer questions about network, storage, and systems without having to pester someone else, though? Always priceless. :)
 