Ferhat's Blog

There will be only one database


What happens when Exadata loses a disk?

Posted by fsengonul on February 5, 2011

We lost a disk in our Exadata this morning and replaced it without any problems or manual commands.
We received an alert and an email stating “Hard disk status changed to predictive failure: critical”. The email also included a diagram showing the location of the failed disk.
The cell and ASM logs show that the grid disks on the failed disk were dropped and a rebalance was started to make sure that all the data has two copies again.
We did not wait for the Oracle/Sun engineer to come and replace the disk. Our system admins replaced it, and Exadata automatically recognized the new disk and started another rebalance, again without any manual commands.
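
The "two copies" idea can be pictured with a toy model. This is only a sketch in Python and not how ASM is implemented: the disk names, the extent count, and the functions `place_extents` and `drop_disk` are made up for illustration, and real ASM also places mirror copies in different failure groups, which this model ignores.

```python
import random

def place_extents(n_extents, disks, rng):
    """Give every extent two copies on two distinct disks."""
    return {e: set(rng.sample(disks, 2)) for e in range(n_extents)}

def drop_disk(layout, failed, disks, rng):
    """Remove `failed` and re-create each copy it held on a surviving disk."""
    survivors = [d for d in disks if d != failed]
    moved = 0
    for copies in layout.values():
        if failed in copies:
            copies.discard(failed)
            # Choose a survivor that does not already hold the other copy.
            candidates = [d for d in survivors if d not in copies]
            copies.add(rng.choice(candidates))
            moved += 1
    return moved

rng = random.Random(0)                          # fixed seed, repeatable run
disks = [f"disk{i:02d}" for i in range(12)]     # hypothetical 12-disk cell
layout = place_extents(1000, disks, rng)

# Extents that lose one of their two copies when disk08 fails.
affected = sum("disk08" in copies for copies in layout.values())
moved = drop_disk(layout, "disk08", disks, rng)

print(f"{affected} extents lost a copy, {moved} copies were re-created")
print(all(len(copies) == 2 for copies in layout.values()))
```

After the drop every extent has two copies again and none of them sits on the failed disk, which is the state the first rebalance is meant to reach before the replacement disk arrives.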


/* the cell triggers the drop operation */
Sat Feb 05 11:50:31 2011
Received subopcode 6 in publish ASM Query on 3 guids.
NOTE: Initiating ASM Instance operation: ASM DROP critical disk on 3 disks
DATA_CD_08_cel11 [00000xxxx-yyyy-zzzz-0000-000000000000]
RECO_CD_08_cel11 [00000xxxx-yyyy-zzzz-0000-000000000000]
SYSTEMDG_CD_08_cel11 [00000xxxx-yyyy-zzzz-0000-000000000000]


/* the corrupt disk has been replaced with the spare one */
Sat Feb 05 16:40:44 2011
Drop celldisk CD_08_cel11 (options: force, from memory only) - begin
Drop celldisk CD_08_cel11 - end
Sat Feb 05 16:40:44 2011
Open received invalid device name SYSTEMDG_CD_08_cel11
Sat Feb 05 16:40:44 2011
Open received invalid device name SYSTEMDG_CD_08_cel11
Sat Feb 05 16:42:44 2011
create CELLDISK CD_08_cel11 on device /dev/sdi
Sat Feb 05 16:42:44 2011
create GRIDDISK DATA_CD_08_cel11 on CELLDISK CD_08_cel11
Griddisk DATA_CD_08_cel11  - number is (248)
NOTE: Initiating ASM instance operation:
Operation: DROP and ADD of ASM disk for Grid disk guid=00000xxxx-yyyy-zzzz-0000-000000000000
Received subopcode 4 in publish ASM Query on 1 guids.
NOTE: Initiating ASM Instance operation: ASM DROP ADD disk on 1 disks
DATA_CD_08_cel11 [00000xxxx-yyyy-zzzz-0000-000000000000]

Storage Index Allocation for GridDisk DATA_CD_08_cel11 successful

Sat Feb 05 16:42:44 2011
create GRIDDISK RECO_CD_08_cel11 on CELLDISK CD_08_cel11
Griddisk RECO_CD_08_cel11  - number is (252)
NOTE: Initiating ASM instance operation:
Operation: DROP and ADD of ASM disk for Grid disk guid=00000xxxx-yyyy-zzzz-0000-000000000000
Received subopcode 4 in publish ASM Query on 1 guids.
NOTE: Initiating ASM Instance operation: ASM DROP ADD disk on 1 disks
RECO_CD_08_cel11 [00000xxxx-yyyy-zzzz-0000-000000000000]

Storage Index Allocation for GridDisk RECO_CD_08_cel11 successful

Sat Feb 05 16:42:44 2011
create GRIDDISK SYSTEMDG_CD_08_cel11 on CELLDISK CD_08_cel11
Griddisk SYSTEMDG_CD_08_cel11  - number is (256)
NOTE: Initiating ASM instance operation:
Operation: DROP and ADD of ASM disk for Grid disk guid=00000xxxx-yyyy-zzzz-0000-000000000000
Received subopcode 4 in publish ASM Query on 1 guids.
NOTE: Initiating ASM Instance operation: ASM DROP ADD disk on 1 disks
SYSTEMDG_CD_08_cel11 [00000xxxx-yyyy-zzzz-0000-000000000000]
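
To read the timeline out of excerpts like the ones above, a small parser can help. The following is a minimal sketch in Python that works on a few lines copied from the log above; the helper names `parse_events` and `first_time` are my own, and it assumes every timestamp sits on its own line in the format shown.

```python
import re
from datetime import datetime

# A few lines copied from the cell log excerpts above.
LOG = """\
Sat Feb 05 11:50:31 2011
NOTE: Initiating ASM Instance operation: ASM DROP critical disk on 3 disks
Sat Feb 05 16:40:44 2011
Drop celldisk CD_08_cel11 (options: force, from memory only) - begin
Sat Feb 05 16:42:44 2011
create CELLDISK CD_08_cel11 on device /dev/sdi
Sat Feb 05 16:42:44 2011
create GRIDDISK DATA_CD_08_cel11 on CELLDISK CD_08_cel11
Sat Feb 05 16:42:44 2011
create GRIDDISK RECO_CD_08_cel11 on CELLDISK CD_08_cel11
Sat Feb 05 16:42:44 2011
create GRIDDISK SYSTEMDG_CD_08_cel11 on CELLDISK CD_08_cel11
"""

TS_FORMAT = "%a %b %d %H:%M:%S %Y"

def parse_events(text):
    """Pair every message line with the most recent timestamp line."""
    events, current = [], None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            current = datetime.strptime(line, TS_FORMAT)
        except ValueError:
            # Not a timestamp, so it is a message under the current one.
            events.append((current, line))
    return events

def first_time(events, needle):
    """Timestamp of the first message containing `needle`."""
    return next(ts for ts, msg in events if needle in msg)

events = parse_events(LOG)
dropped_at = first_time(events, "ASM DROP critical disk")
recreated_at = first_time(events, "create CELLDISK")
elapsed = recreated_at - dropped_at

# Grid disks re-created on the replacement disk, in log order.
grid_disks = re.findall(r"create GRIDDISK (\S+) on CELLDISK \S+", LOG)

print(elapsed)
print(grid_disks)
```

For these lines the gap between the drop request and the creation of the new cell disk comes out at a little under five hours, and all three grid disks (DATA, RECO and SYSTEMDG) are re-created in the same second.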
