Symptom:
Drive failure results in the Controller shutting down. Typically, you will first see drive failure events followed by a Controller shutdown event. Below is an example:
- Error 0x21081282 Enclosure Slot: 40 Drive failed. (reported by slot A)
- Error 0x22081682 Enclosure Slot: 40 Drive error detected (reported by slot A)
- Critical Error 0x32011102 Controller A shutdown due to an unexpected condition. (reported by slot B)
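To check an exported event log for this pattern, a small script along the following lines may help. It is only a sketch: it assumes every event is a single line of text in the same layout as the example above (severity, hexadecimal event code, message, reporting slot), and the function names are hypothetical, not part of any Infortrend tool.

```python
import re

# Assumed line layout, taken from the example events above:
#   <severity> <0x event code> <message> (reported by slot <A|B>)
EVENT_RE = re.compile(
    r"^-?\s*(?P<severity>Critical Error|Error)\s+"
    r"(?P<code>0x[0-9A-Fa-f]{8})\s+"
    r"(?P<message>.*?)\s*"
    r"\(reported by slot (?P<reporter>[AB])\)\s*$"
)
SLOT_RE = re.compile(r"Enclosure Slot:\s*(\d+)\s+Drive failed")
SHUTDOWN_RE = re.compile(r"Controller ([AB]) shutdown")


def parse_event(line):
    """Return the fields of one event line as a dict, or None if it does not match."""
    match = EVENT_RE.match(line.strip())
    return match.groupdict() if match else None


def find_shutdown_after_drive_failure(lines):
    """Return (failed_slots, controller) for the first Controller shutdown
    that follows at least one 'Drive failed' event, or None if there is none."""
    failed_slots = []
    for line in lines:
        event = parse_event(line)
        if event is None:
            continue  # skip lines that are not in the assumed layout
        slot = SLOT_RE.search(event["message"])
        if slot:
            failed_slots.append(int(slot.group(1)))
            continue
        shutdown = SHUTDOWN_RE.search(event["message"])
        if shutdown and failed_slots:
            return failed_slots, shutdown.group(1)
    return None


sample_log = [
    "- Error 0x21081282 Enclosure Slot: 40 Drive failed. (reported by slot A)",
    "- Error 0x22081682 Enclosure Slot: 40 Drive error detected (reported by slot A)",
    "- Critical Error 0x32011102 Controller A shutdown due to an unexpected condition. (reported by slot B)",
]
result = find_shutdown_after_drive_failure(sample_log)
print(result)
```

For the three example events, the script reports slot 40 as the failed drive and Controller A as the one that shut down. A shutdown event with no preceding drive failure is deliberately not reported, since it would not match the symptom described here.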
Diagnosis:
When a drive fails, the Input/Output (IO) operations directed to that drive stall. The resulting IO congestion can cause significant delays or timeouts in the system. The storage system's firmware attempts to handle the stalled IO operations, but a conflict within the firmware during this handling can cause the Controller to shut down to prevent further issues.
Solution:
- Replace the Failed Drive:
- Identify and replace the drive in the enclosure slot where the failure is reported.
- Reinsert the Failed Controller:
- Remove the failed Controller from the system.
- Wait until all LEDs on the Controller are off.
- Reinsert the Controller into the system.
- Contact Technical Support:
- Contact Infortrend technical support to obtain the firmware fix that addresses this issue.
Following these steps should resolve the Controller shutdown caused by the drive failure.