IBM Firmware Bug Could Cause Data Loss

IBM today announced that a firmware bug affecting hard drives in IBM-based servers could cause data loss after a power cycle. After the power cycle, the SATA drive is no longer available and becomes unresponsive. "Data may become inaccessible due to the drive not responding," the bulletin by IBM stated. The affected firmware is level BB10 on the following models:

ST31000340NS
ST3250310NS
ST3500320NS
ST3750330NS

To determine your drive model and firmware level, use IBM ServeRAID Manager, MegaRAID Storage Manager, the hard disk drive update utility, the BIOS utility of the hard disk controller, or the label on the hard disk drive.
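
Once you have noted each drive's model and firmware level, the comparison against the list above can be scripted. The following is a minimal sketch in Python, not an IBM tool: the function name is_affected and the sample inventory are hypothetical, and it assumes you have already collected the model and firmware strings by hand from one of the utilities mentioned above.

```python
# Affected models and firmware level, taken from the list in this article.
AFFECTED_MODELS = {
    "ST31000340NS",
    "ST3250310NS",
    "ST3500320NS",
    "ST3750330NS",
}
AFFECTED_FIRMWARE = "BB10"


def is_affected(model: str, firmware: str) -> bool:
    """Return True if the model/firmware pair matches the affected list.

    Hypothetical helper: comparison ignores case and surrounding whitespace,
    on the assumption that the strings were typed in by hand.
    """
    return (
        model.strip().upper() in AFFECTED_MODELS
        and firmware.strip().upper() == AFFECTED_FIRMWARE
    )


if __name__ == "__main__":
    # Hypothetical inventory; "XX00" and "EXAMPLE-MODEL" are placeholders,
    # not real firmware levels or drive models.
    inventory = [
        ("ST3500320NS", "BB10"),
        ("ST3500320NS", "XX00"),
        ("EXAMPLE-MODEL", "BB10"),
    ]
    for model, firmware in inventory:
        status = "AFFECTED" if is_affected(model, firmware) else "not listed"
        print(f"{model} (firmware {firmware}): {status}")
```

A drive reported as "not listed" only means it does not match the four models and the BB10 level named here; the IBM bulletin remains the authoritative source.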

The issue is caused by a rare condition in the firmware that allows the drive's event log pointer to be set to an invalid location.

This condition is detected by the drive during power-up, and the drive goes into failsafe mode to prevent inadvertent corruption or loss of user data. As a result, once the failure has occurred, user data becomes inaccessible.

The condition occurs only after a power cycle, not during runtime. Therefore, avoiding or minimizing power cycles will greatly reduce the chance of a SATA drive becoming inoperable.

To correct the issue, you will need to install updated firmware that includes the fix. IBM says the target date for this release is the first quarter of 2009. To stay on top of this issue, refer to IBM Support Page H194623.