Quandary - user input fails on heavy load (SCSI RAID)

LadyJaqie

OK, first off, the system:

Athlon XP barton 2500+
Gigabyte GA-7VA-C Socket A DDR 8x AGP motherboard
2x 512MB el cheapo PC2700 DDR RAM
ATI Radeon 9800 PRO 128MB Video
DFE-530TX+ 10/100 LAN
Sound Blaster Live! 5.1
el cheapo TV card
DPT PM2865U3 2-channel U160 caching SCSI card, 64-bit PCI-X, 64MB cache SODIMM installed (company was bought out by Adaptec)
Sony 4x DVD+-RW Drive
8x Seagate ST39175LC 9.1GB 7200 RPM drives in hardware raid 50
2x SCA-40 racks (drives are running at 80 MB/s [Ultra2 Wide])
1.44 floppy (why even list this? I dunno *shrug*)
Enlight 5u server case
Nspire 250W ATX PSU
Real Power 230W AT PSU (SCSI drives)
Samsung SyncMaster 192N 19" LCD monitor
Windows 2000 Professional (all updates, service packs, etc installed)


I have installed all the newest drivers, firmware, BIOS, everything for all devices in the system. I have checked with everyone I could about this problem. I am at my wit's end about this.

What happens -

When I copy any large file, hard drive activity on the SCSI drives drops to almost nil (a tiny blip per second or so). The system keeps running, any apps already running seem to run fine, and the file copy goes extremely slowly (of course). The mouse will move and the keyboard will let me toggle caps lock etc., but I cannot click on anything, cannot bring up Task Manager, cannot shut down with the power button, or anything. Sometimes I can stop the file operation instantly if it's an actual Windows file copy, but not always.

It's almost always big files, but not always. If I let it continue, when it gets to small files it starts flying again until it hits another big file, then it's back to the same problem! It's happened while burning DVDs, while copying from a 120GB IDE drive I have in the system to the SCSI array, while copying from DVDs, copying from one SCSI partition to another, you name it, almost always with big files. Small files it zooms through.

I have changed the card slot, changed the IRQ the card runs at, low-level formatted the drives, and rebuilt the array. This is a clean install of Win2K Pro. I have reinstalled it, also, with the same results. I've even tried a different power supply. (The current ones give enough juice.)

I've run out of possible answers! Help? :confused:
 
RAID 50... There's your problem: the parity calculation is kicking you in the ass on writes to the array. Try copying a big file FROM the array to IDE; that'll probably NOT stall like the reverse, as parity only needs to be calculated on writes or when a drive in the array has failed.

EDIT: Switch to RAID 0 over 4 sets of RAID 1 arrays. You'll lose 1/3 of your usable space on the array, but you'll have MUCH better write performance AND also have better redundancy.
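To put numbers on that space trade-off, here is a quick sketch of the arithmetic for the 8x 9.1GB drives listed in the first post. The function names are made up for illustration, and it assumes the RAID 50 is built from two 4-drive RAID 5 sets.

```python
DRIVE_GB = 9.1   # Seagate ST39175LC capacity, from the first post
DRIVES = 8


def raid50_usable(drives, drive_gb, sets=2):
    """RAID 0 across `sets` RAID 5 groups: each group gives up one drive to parity."""
    per_set = drives // sets
    return sets * (per_set - 1) * drive_gb


def raid10_usable(drives, drive_gb):
    """RAID 0 across mirrored pairs: half the drives hold copies."""
    return (drives // 2) * drive_gb


r50 = raid50_usable(DRIVES, DRIVE_GB)
r10 = raid10_usable(DRIVES, DRIVE_GB)
lost_fraction = (r50 - r10) / r50
print(f"RAID 50: {r50:.1f} GB, RAID 10: {r10:.1f} GB, lost: {lost_fraction:.0%}")
```

That works out to 6 drives' worth of space (54.6 GB) for RAID 50 against 4 drives' worth (36.4 GB) for RAID 10, which is where the 1/3 figure comes from.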

EDIT2: If that card doesn't do parity calculations onboard, then, not only is the write performance slow, it's also making your CPU bear the load of calculating parity... I'm guessing it doesn't, the way the rest of your system seems to bog down. Host based parity calculation sucks, as all the data to be written needs to pass through the CPU first, which not only puts extra load on the CPU, but bogs down the FSB with a load of extra traffic.
 
But then it wouldn't stop responding to user input. There is even a known issue with the card with another version of Seagate drives where this exact thing happens... *sigh* I don't want to start over with this array. Dangit... :(
 
... and the parity calculations are probably being done at a high priority, which makes it extremely difficult for anything else to happen.

If you want to see a verifiable example of high priority crapping on things, run the Prime95 torture test, and set the process priority to "Realtime"... be prepared for it to bog down hardcore, if you do that... and it takes a while to undo it...
 
you didn't even look at the link I gave you, did you? it has a dedicated processor onboard for parity calculations.

*Intel i960 RN I/O processor ultra-low host CPU utilization
*Dedicated SCSI RISC processor for maximum SCSI performance
 
Hmmm.... That's odd... Just for kicks, open Task Manager and check the CPU usage during a large transfer and see what happens...

Perhaps it might be an issue relating to the cache size... Small files will fit easily, large files won't...

Are you running RAID 5 over 4 pairs of RAID 0, or are you running RAID 0 on a pair of 4 drive RAID 5 arrays? If the second, then you're doing 2 parity calculations instead of one, though the calculations are half the size... don't know how much of a difference, if any, that makes...
 
jaqie said:
you didn't even look at the link I gave you, did you? it has a dedicated processor onboard for parity calculations.

And you didn't check the time of my post vs. yours, did you? You posted while I was writing.
 
oops. sowwies... :eek:

CPU utilization is near nil during SCSI transfers.

It's 2x 4-disk RAID 5 arrays... striped with RAID 0.
 
Because I was under the impression that RAID 50 offered better performance than RAID 5 with the same hardware...
 
The write performance of RAID 50 will suck at least as bad as RAID 5, as it's the parity calculations that are the bottleneck...
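For reference, a minimal sketch of what a RAID 5 small write involves. This is generic textbook RAID 5 with made-up block contents, not anything taken from the DPT firmware. The XOR itself is cheap; the cost is that changing one data block means reading the old data and old parity before writing the new data and new parity.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


# One stripe of a 4-disk RAID 5 set: three data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(*data)

# Small write: replace block 1 only. The old data and old parity must be
# read first, then new data and new parity written (2 reads + 2 writes).
new_block = b"ZZZZ"
new_parity = xor_blocks(parity, data[1], new_block)
data[1] = new_block

# Sanity check: the shortcut matches a full recomputation over the stripe.
assert new_parity == xor_blocks(*data)
```

A write that covers a whole stripe can skip the reads and compute parity from the new data alone, which is one reason write behaviour can differ with transfer size.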
 
well thanks for your advice, thanks tons :) imma go to raid 10 when I get to it and see how things pan out.
 
RAID5 performance only sucks if you're doing it in software.
Using a controller that does RAID5 internally, or an external
storage array of some sort, is fast.
I haven't looked into it too deeply, but I think RAID50 is
only faster if you're maxing out multiple RAID5 controllers,
then striping across the controllers.
If you have hardware support, RAID5 is preferable to just
about anything else.
The i960 is a very fast processor.
 
shieldforyoureyes said:
Sounds like a firmware/driver/os bug to me.

I would have said Windows exploder first, but since it's working fine for transfers off the ATA units, this is not the case. Check for new firmware and drivers. It also could be a resource conflict of some sort, but it sounds a little fishy for that.
 
As I said in my original post, I already got the newest firmware and everything for everything in my system. I'm a bit of a stickler for that, especially when things don't go right like now ;)
 
Have you tried copying the large files using command line to see if there is any performance difference?
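One way to get numbers out of that kind of test, rather than eyeballing it, is a small script that copies in chunks and records which chunks took suspiciously long. A rough sketch, assuming Python is available on the box; `timed_copy`, the 1 MB chunk size, and the 1-second stall threshold are all arbitrary choices for illustration.

```python
import time


def timed_copy(src, dst, chunk_size=1024 * 1024, stall_seconds=1.0):
    """Copy src to dst in chunks; return (total_bytes, stalls).

    stalls is a list of (byte_offset, seconds) for every chunk whose
    read + write took longer than stall_seconds.
    """
    stalls = []
    total = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            start = time.perf_counter()
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            elapsed = time.perf_counter() - start
            if elapsed > stall_seconds:
                stalls.append((total, elapsed))
            total += len(chunk)
    return total, stalls


# Hypothetical usage (paths are placeholders):
# total, stalls = timed_copy(r"D:\big.iso", r"E:\big.iso")
# for offset, secs in stalls:
#     print(f"stall at byte {offset}: {secs:.1f}s")
```

Keep in mind the OS write cache can absorb writes and hide or shift where a stall shows up, so the offsets are only a rough guide.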
 
I got my answer. One of the drives just failed. Recovering my data right now, then rebuilding afterwards as a 6-disk array. grr. :mad: :mad: :mad:
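A failed member matters for a parity array because every block that lived on it has to be rebuilt from the surviving blocks in the same stripe. A minimal sketch of that XOR reconstruction, with made-up block contents and nothing specific to the DPT card:

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


# One stripe of a 4-disk RAID 5 set: three data blocks plus parity.
stripe = [b"disk0---", b"disk1---", b"disk2---"]
parity = xor_blocks(*stripe)

# Pretend the drive holding block 1 is gone: rebuilding it means reading
# the parity block and every surviving data block in the stripe.
failed = 1
survivors = [blk for i, blk in enumerate(stripe) if i != failed]
rebuilt = xor_blocks(parity, *survivors)

assert rebuilt == stripe[failed]
```

So a single read aimed at the missing drive turns into reads on all the remaining members of that RAID 5 set.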
 
Here are a few semi random shots in the dark - because I'm fairly baffled by this.

Could the SODIMM in the card be failing?
Does the card have a battery? Does said battery need to be replaced/reconditioned?
Is a cable going bad and causing a drive to re-request data? Termination issue?
Power issue?
How does a straight RAID 5/1/0 array work? (Pick one, not two ;)) Does the problem happen with two drives? Three? Four? (RAID 5 for the three drives, before someone attempts asshattery.)

That's all I can really think of. Good luck with it.
 