Replacing a failed disk with DiskSuite
Example one
In this example, the boot disk mirror c0t1d0 failed. All submirrors on c0t1d0
were placed in "maintenance" state, so no reads or writes were
occurring on the disk. The replacement disk has the same geometry as
boot disk c0t0d0.
1. Delete any state database replicas from the failed disk. A "W" in metadb output indicates replica device write errors.
# metadb
        flags           first blk       block count
     a m  p  luo        16              8192            /dev/dsk/c0t0d0s6
     a    p  luo        8208            8192            /dev/dsk/c0t0d0s6
      W   p  l          16              8192            /dev/dsk/c0t1d0s6
      W   p  l          8208            8192            /dev/dsk/c0t1d0s6
# metadb -d /dev/dsk/c0t1d0s6
# metadb
        flags           first blk       block count
     a m  p  luo        16              8192            /dev/dsk/c0t0d0s6
     a    p  luo        8208            8192            /dev/dsk/c0t0d0s6
2. Replace the failed disk.
3. Copy the partition table from the good disk to the replacement disk.
# prtvtoc /dev/rdsk/c0t0d0s2 | fmthard -s - /dev/rdsk/c0t1d0s2
4. Create state database replicas on the replacement disk.
# metadb -a -c 2 /dev/rdsk/c0t1d0s6
5. Determine which submirrors need to be resynchronized.
# metastat | grep 'Invoke: metareplace'
Invoke: metareplace d30 c0t1d0s0 <new device>
Invoke: metareplace d31 c0t1d0s1 <new device>
Invoke: metareplace d33 c0t1d0s3 <new device>
Invoke: metareplace d34 c0t1d0s4 <new device>
Invoke: metareplace d35 c0t1d0s5 <new device>
6. Resynchronize the submirrors.
# metareplace -e d30 c0t1d0s0
d30: device c0t1d0s0 is enabled
# metareplace -e d31 c0t1d0s1
d31: device c0t1d0s1 is enabled
# metareplace -e d33 c0t1d0s3
d33: device c0t1d0s3 is enabled
# metareplace -e d34 c0t1d0s4
d34: device c0t1d0s4 is enabled
# metareplace -e d35 c0t1d0s5
d35: device c0t1d0s5 is enabled
You can monitor the resynchronization progress with metastat.
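Step 1 relies on spotting the "W" flag by eye. On a system with many replicas, a small filter can list the affected devices instead. The following is only a sketch: the function name replicas_with_write_errors is hypothetical, and it assumes that the device path is the last field on each line and that the flag letters appear before the first numeric column, as in the output shown above.

```shell
# Read metadb output on stdin and print each device that has at least one
# replica flagged "W" (write errors).
# Assumptions: the device path is the last field, and all flag letters
# appear before the first purely numeric field ("first blk").
replicas_with_write_errors() {
    awk '
        $NF ~ /^\/dev\// {
            for (i = 1; i < NF; i++) {
                if ($i ~ /^[0-9]+$/) break          # reached the "first blk" column
                if ($i ~ /W/) { print $NF; break }  # flag field containing W
            }
        }
    ' | sort -u
}

# Possible usage on the affected host:
#   metadb | replicas_with_write_errors
```

The function only prints device names. Review the list before passing anything to metadb -d.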
Example two
In this example, two of the four slices on c1t1d0
have been placed in "maintenance" state. Because of the large number of
transport errors, I chose to replace the disk. Since two of the slices
are in "okay" state--reads and writes are occurring on the slices--they
have to be taken offline before replacing the disk.
# iostat -En
...
c1t1d0 Soft Errors: 1 Hard Errors: 1 Transport Errors: 28
Vendor: SEAGATE Product: ST373405FSUN72G Revision: 0638 Serial No: 0202K0ZZ0F
Size: 73.40GB <73400057856 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 1 Recoverable: 1
# metastat d20 d21 d23 d25
d20: Concat/Stripe
    Size: 3154560 blocks
    Stripe 0:
        Device      Start Block  Dbase  State        Hot Spare
        c1t1d0s0           0     No     Maintenance
d21: Concat/Stripe
    Size: 8395200 blocks
    Stripe 0:
        Device      Start Block  Dbase  State        Hot Spare
        c1t1d0s1           0     No     Okay
d23: Concat/Stripe
    Size: 2106432 blocks
    Stripe 0:
        Device      Start Block  Dbase  State        Hot Spare
        c1t1d0s3           0     No     Maintenance
d25: Concat/Stripe
    Size: 129672768 blocks
    Stripe 0:
        Device      Start Block  Dbase  State        Hot Spare
        c1t1d0s5           0     No     Okay
Before replacing the disk, run metaoffline on the submirrors that are in "okay" state.
# metaoffline d1 d21
d1: submirror d21 is offlined
# metaoffline d5 d25
d5: submirror d25 is offlined
Delete the state database replicas on c1t1d0.
# metadb
        flags           first blk       block count
     a m  p  luo        16              1034            /dev/dsk/c1t0d0s4
     a    p  luo        16              1034            /dev/dsk/c1t0d0s6
     a    p  luo        16              1034            /dev/dsk/c1t1d0s4
     a    p  luo        16              1034            /dev/dsk/c1t1d0s6
# metadb -d /dev/dsk/c1t1d0s4 /dev/dsk/c1t1d0s6
# metadb
        flags           first blk       block count
     a m  p  luo        16              1034            /dev/dsk/c1t0d0s4
     a    p  luo        16              1034            /dev/dsk/c1t0d0s6
Replace the disk with a disk of identical geometry, and copy the partition table from the good disk to the replacement disk.
# prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t1d0s2
fmthard: New volume table of contents now in place.
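Before re-creating the replicas, it can be worth confirming that both disks now carry the same slice layout. This is a sketch rather than a required step. The helper names vtoc_table and same_vtoc and the /tmp file names are hypothetical, and it assumes that prtvtoc comment lines begin with "*" and that the first six columns of each slice line describe the layout, with an optional mount directory after them.

```shell
# Print the slice table from prtvtoc output on stdin: drop the "*" comment
# lines and keep the first six columns, ignoring any trailing mount
# directory (assumed column layout: partition, tag, flags, first sector,
# sector count, last sector).
vtoc_table() {
    awk '!/^\*/ && NF >= 6 { print $1, $2, $3, $4, $5, $6 }'
}

# Succeed if two saved prtvtoc outputs describe the same slice layout.
same_vtoc() {
    [ "$(vtoc_table < "$1")" = "$(vtoc_table < "$2")" ]
}

# Possible usage on the affected host:
#   prtvtoc /dev/rdsk/c1t0d0s2 > /tmp/good.vtoc
#   prtvtoc /dev/rdsk/c1t1d0s2 > /tmp/new.vtoc
#   same_vtoc /tmp/good.vtoc /tmp/new.vtoc && echo "slice layouts match"
```

The mount directory column is ignored because the good disk has mounted file systems and the freshly labeled disk does not.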
Re-create the state database replicas on the replacement disk.
# metadb -a -c 1 /dev/dsk/c1t1d0s4 /dev/dsk/c1t1d0s6
Run metareplace to replace the failed submirrors that were in "maintenance" state and metaonline to re-enable the submirrors that were in "okay" state.
# metastat | grep metareplace
Invoke: metareplace d0 c1t1d0s0 <new device>
Invoke: metareplace d3 c1t1d0s3 <new device>
# metareplace -e d0 c1t1d0s0
d0: device c1t1d0s0 is replaced with c1t1d0s0
# metareplace -e d3 c1t1d0s3
d3: device c1t1d0s3 is replaced with c1t1d0s3
# metaonline d1 d21
d1: submirror d21 is onlined
# metaonline d5 d25
d5: submirror d25 is onlined
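When many slices are in "maintenance" state, the "Invoke:" hints from metastat can be turned into the matching metareplace -e commands instead of being retyped. The sketch below only prints the commands so they can be reviewed first. The function name metareplace_commands is hypothetical, and it assumes the hint lines have the form shown above (Invoke: metareplace, then the metadevice, then the slice).

```shell
# Read metastat output on stdin and print one "metareplace -e" command for
# each "Invoke: metareplace <metadevice> <slice> <new device>" hint line.
# Nothing is executed; the commands are printed for review.
metareplace_commands() {
    awk '$1 == "Invoke:" && $2 == "metareplace" {
        print "metareplace -e", $3, $4
    }'
}

# Possible usage on the affected host:
#   metastat | metareplace_commands
```

The -e form is appropriate here because the replacement disk takes the same device name as the failed one.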
Monitor the output of metastat as the submirrors resynchronize. If the submirrors re-enabled with metaonline are placed in "maintenance" state, use metareplace on these submirrors.
# metareplace -e d5 c1t1d0s5
d5: device c1t1d0s5 is replaced with c1t1d0s5
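Rather than rerunning metastat by hand, a small loop can wait for the resynchronization to finish and then report anything that still needs attention. This is a sketch under assumptions: the function name wait_for_resync is hypothetical, and it assumes metastat reports an active resync with a line containing "Resync in progress" and flags failed components with the "Invoke: metareplace" hint seen earlier.

```shell
# Poll metastat until no mirror reports a resync in progress, then check
# for components that still need metareplace.
# $1: seconds between polls (default 60).
# Returns 0 if no "Invoke: metareplace" hints remain, 1 otherwise.
# Assumption: an active resync shows up as a "Resync in progress" line.
wait_for_resync() {
    interval=${1:-60}
    while metastat | grep 'Resync in progress'; do
        sleep "$interval"
    done
    if metastat | grep 'Invoke: metareplace'; then
        return 1    # at least one component is still in maintenance state
    fi
    return 0
}

# Possible usage on the affected host:
#   wait_for_resync 120 || echo "some submirrors still need metareplace"
```

Each poll prints the current progress lines, so the output doubles as a simple progress log.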
Last modified: 2007/03/20