SAS topology error: SMP function failed. LSI SAS2, WD RE2

simplesam

Has anyone seen this error after running for a month or so?
Controller ID: 0 SAS topology error: SMP function failed.

Then ALL virtual disks drop off the RAID controller. I've had this happen twice,
28 days apart, using two different SAS expanders. After a reboot, the RAID disks appear to be there and seem to be working properly.

Configuration:
Windows Server 2008 R2.
LSI 9280-8e (no battery), LSI SASX36-based expander.
VDs are all configured as RAID 1 or RAID 10.
Disks are Western Digital RE2 500 GB and 750 GB (WD5000ABYS-01TNA0 and WD7500AYYS-01RCA0).

MegaCli64 -PDList -aALL reports 0 errors.
MegaCli64 -LDInfo -Lall -aALL reports 0 errors.

There is a firmware update on the WD site that talks about an error that could pop up every 1-4 weeks, but it does not appear to apply to these drives.

Any insight would be appreciated.

MegaCli64 -PDList -aALL output looks like this for all drives:
Enclosure Device ID: N/A
Slot Number: 124
Device Id: 124
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 465.761 GB [0x3a386030 Sectors]
Non Coerced Size: 465.261 GB [0x3a286030 Sectors]
Coerced Size: 464.729 GB [0x3a175800 Sectors]
Firmware state: Online, Spun Up
SAS Address(0): 0x500605b000028006
Connected Port Number: 0(path0)
Inquiry Data: WD-WCAPW3033847WDC WD5000ABYS-01TNA0 12.01C01
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 3.0Gb/s
Link Speed: 3.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
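
For anyone who wants to check the counters on every drive without reading the whole dump by eye, here is a minimal Python sketch. It assumes the output has been saved as text and that each drive record starts with an "Enclosure Device ID" line, as in the sample above. The function names (parse_pd_list, drives_with_errors) are made up for illustration and are not part of MegaCli.

```python
# Hypothetical helper: scan saved "MegaCli64 -PDList -aALL" text for non-zero
# error counters. Assumes each drive record begins with "Enclosure Device ID".
ERROR_FIELDS = ("Media Error Count", "Other Error Count", "Predictive Failure Count")


def parse_pd_list(text):
    """Return one dict of 'key: value' fields per physical drive."""
    drives, current = [], None
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        if not sep:
            continue  # skip blank lines and lines without a field
        key, value = key.strip(), value.strip()
        if key == "Enclosure Device ID":
            current = {}  # start of a new drive record
            drives.append(current)
        if current is not None:
            current[key] = value
    return drives


def drives_with_errors(drives):
    """Return (slot, counters) for every drive with a non-zero counter."""
    flagged = []
    for d in drives:
        counts = {f: int(d.get(f, "0")) for f in ERROR_FIELDS}
        if any(counts.values()):
            flagged.append((d.get("Slot Number"), counts))
    return flagged


# A few fields copied from the output above, used as sample input.
SAMPLE = """Enclosure Device ID: N/A
Slot Number: 124
Device Id: 124
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
PD Type: SATA
Firmware state: Online, Spun Up
Inquiry Data: WD-WCAPW3033847WDC WD5000ABYS-01TNA0 12.01C01
"""

if __name__ == "__main__":
    print(drives_with_errors(parse_pd_list(SAMPLE)))
```

Saving the output on a schedule and diffing the flagged list between runs might show whether any counter moves before the virtual disks drop, though nothing here confirms that it would.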

MegaCli64 -LDInfo -Lall -aALL output looks like this for all virtual disks:
Virtual Drive: 3 (Target Id: 3)
Name :VD_D6_D7
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 464.729 GB
State : Optimal
Strip Size : 128 KB
Number Of Drives : 2
Span Depth : 1
Default Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Access Policy : Read/Write
Disk Cache Policy : Enabled
Encryption Type : None
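
To spot the moment a virtual disk leaves the Optimal state, the same kind of text scan can pull out just the State field for each VD. This is only a sketch that assumes the layout shown above ("Virtual Drive: N ..." followed later by "State : ..."). The function names are hypothetical, not part of MegaCli.

```python
# Hypothetical helper: map each virtual drive number to its reported State,
# based on saved "MegaCli64 -LDInfo -Lall -aALL" text in the layout shown above.
def virtual_drive_states(text):
    states, current = {}, None
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Virtual Drive":
            # value looks like "3 (Target Id: 3)"; keep the leading number
            current = value.split()[0]
        elif key == "State" and current is not None:
            states[current] = value
    return states


def non_optimal(states):
    """Return only the virtual drives whose state is not 'Optimal'."""
    return {vd: s for vd, s in states.items() if s != "Optimal"}


# A few fields copied from the output above, used as sample input.
LD_SAMPLE = """Virtual Drive: 3 (Target Id: 3)
Name :VD_D6_D7
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 464.729 GB
State : Optimal
Number Of Drives : 2
"""

if __name__ == "__main__":
    print(non_optimal(virtual_drive_states(LD_SAMPLE)))
```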
 
Are all disks attached to the expander?
If yes, maybe it's the expander that is having the issues?

The two different expanders, same brand/model?

The 9280 has new firmware/drivers that are about a week old; maybe they have some fixes related to this issue?
 
Are all disks attached to the expander?
...
The two different expanders, same brand/model?

Yes, and same brand/model.

LSI 9280-8e --> cable --> LSI SASX36 expander --> cables --> backplane (BPN-SAS-846A) --> disks.

I have tried to make it fail by copying a bunch of large files from disk to disk and running IOPS benchmarks, but it doesn't fail when I try to stress it. It just seems to fail at random times.
 