Wise people learn when they can; fools learn when they must - Arthur Wellesley

Thursday, 25 September 2014

Veritas Volume Manager -6



                                                                                
VXVM-6 (Volumes)

What we will learn in the next few pages:
1.)     Recover a striped volume if one of its disks fails
2.)     Recover a mirrored volume if one of its disks fails
3.)     Recover a RAID-5 volume if one of its disks fails


OK… Now consider a situation where you have created a striped volume and one of its disks fails. First, the setup:

root@pr-01:>/# vxdisk list
DEVICE       TYPE            DISK         GROUP        STATUS
c1t0d0s2     auto:none       -            -            online invalid
c2t2d0s2     auto:sliced     -            -            online
c2t3d0s2     auto:sliced     -            -            online
c2t4d0s2     auto:sliced     -            -            online
c2t5d0s2     auto:sliced     -            -            online
c2t6d0s2     auto:sliced     -            -            online
c2t7d0s2     auto:none       -            -            online invalid
c2t8d0s2     auto:none       -            -            online invalid
c2t9d0s2     auto:none       -            -            online invalid

root@pr-01:>/# vxdg init mydg cds=off d1=c2t2d0

root@pr-01:>/# vxdg -g mydg adddisk d2=c2t3d0


root@pr-01:>/# vxassist -g mydg make stvol maxsize layout=striped d1 d2

root@pr-01:>/# vxprint -htq
Disk group: mydg

dg mydg         default      default  23000    1411540596.55.pr-01

dm d1           c2t2d0s2     auto     69372    7778304  -
dm d2           c2t3d0s2     auto     67324    1890304  -

v  stvol        -            ENABLED  ACTIVE   3780608  SELECT    stvol-01 fsgen
pl stvol-01     stvol        ENABLED  ACTIVE   3780608  STRIPE    2/128    RW
sd d1-01        stvol-01     d1       0        1890304  0/0       c2t2d0   ENA
sd d2-01        stvol-01     d2       0        1890304  1/0       c2t3d0   ENA


root@pr-01:>/# mkfs -F vxfs /dev/vx/rdsk/mydg/stvol
    version 9 layout
    3780608 sectors, 1890304 blocks of size 1024, log size 16384 blocks
    rcq size 1024 blocks
    largefiles supported
root@pr-01:>/# mkdir /stvol-test

root@pr-01:>/# mount -F vxfs /dev/vx/dsk/mydg/stvol /stvol-test

OK… Now I removed c2t3d0s2 (disk d2) from Openfiler.

root@pr-01:>/# vxdisk list
DEVICE       TYPE            DISK         GROUP        STATUS
c1t0d0s2     auto:none       -            -            online invalid
c2t2d0s2     auto:sliced     d1           mydg         online
c2t3d0s2     auto:sliced     -            -            online
c2t4d0s2     auto:sliced     -            -            online
c2t5d0s2     auto:none       -            -            online invalid
c2t6d0s2     auto:none       -            -            online invalid
c2t7d0s2     auto:none       -            -            online invalid
-            -         d2           mydg         failed was:c2t3d0s2


Check whether the volume is startable using the command below:

root@pr-01:>/# vxinfo -g mydg
stvol          fsgen    Unstartable
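
With more than one volume in the disk group, it helps to filter the vxinfo output down to the volumes that need attention. The sketch below is only an illustration: the function name unhealthy_volumes is made up here, it assumes the three-column layout (volume, usage type, state) shown above, and the mrvol sample line is hypothetical.

```shell
#!/bin/sh
# Print "volume state" for every volume that vxinfo does not report
# as plain "Started". Reads `vxinfo -g <dg>` output on stdin.
unhealthy_volumes() {
    awk 'NF >= 3 {
        # The state can be more than one word, e.g. "Started Degraded".
        state = $3
        for (i = 4; i <= NF; i++) state = state " " $i
        if (state != "Started") print $1, state
    }'
}

# Sample lines in the format vxinfo prints in this post.
printf '%s\n' \
    'stvol          fsgen    Unstartable' \
    'r5vol          raid5    Started Degraded' \
    'mrvol          fsgen    Started' | unhealthy_volumes
# prints:
#   stvol Unstartable
#   r5vol Started Degraded
```

On a real system the input would come from the command itself, for example `vxinfo -g mydg | unhealthy_volumes`.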

Also check the status of the plex and subdisks using the command below:

root@pr-01:>/# vxprint -g mydg -htq
dg mydg         default      default  23000    1411540596.55.pr-01

dm d1           c2t2d0s2     auto     69372    7778304  -
dm d2           -            -        -        -        NODEVICE

v  stvol        -            DISABLED ACTIVE   3780608  SELECT    -        fsgen
pl stvol-01     stvol        DISABLED NODEVICE 3780608  STRIPE    2/128    RW
sd d1-01        stvol-01     d1       0        1890304  0/0       c2t2d0   ENA
sd d2-01        stvol-01     d2       0        1890304  1/0       -        NDEV
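
The failed objects are easy to spot in a small listing like this one, but in a bigger disk group a filter saves time. A minimal sketch, assuming the vxprint -htq column layout shown above; the function name failed_objects is my own, not a VxVM command.

```shell
#!/bin/sh
# Print disk media (dm) records and subdisks (sd) that have lost their
# device. Reads `vxprint -g <dg> -htq` output on stdin.
failed_objects() {
    awk '
        $1 == "dm" && $NF == "NODEVICE" { print "disk", $2 }
        $1 == "sd" && $NF == "NDEV"     { print "subdisk", $2, "plex", $3 }
    '
}

# Sample input copied from the listing above.
failed_objects <<'EOF'
dm d1           c2t2d0s2     auto     69372    7778304  -
dm d2           -            -        -        -        NODEVICE
pl stvol-01     stvol        DISABLED NODEVICE 3780608  STRIPE    2/128    RW
sd d1-01        stvol-01     d1       0        1890304  0/0       c2t2d0   ENA
sd d2-01        stvol-01     d2       0        1890304  1/0       -        NDEV
EOF
# prints:
#   disk d2
#   subdisk d2-01 plex stvol-01
```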

root@pr-01:>/# umount /stvol-test/

Ask the concerned team to replace the faulty LUN, or, if you are lucky enough to have access, do it yourself.

Suppose we got c2t5d0s2 as the replacement:

c2t5d0s2     auto:none       -            -            online invalid

root@pr-01:>/# vxdisksetup -i c2t5d0 format=sliced

c2t5d0s2     auto:sliced     -            -            online




root@pr-01:>/# vxdiskadm

Volume Manager Support Operations
Menu: VolumeManager/Disk

 1      Add or initialize one or more disks
 2      Encapsulate one or more disks
 3      Remove a disk
 4      Remove a disk for replacement
 5      Replace a failed or removed disk
 ==============================
 23     Dynamic Reconfiguration Operations
 list   List disk information


 ?      Display help about menu
 ??     Display help about the menuing system
 q      Exit from menus

Select an operation to perform: 5

Replace a failed or removed disk
Menu: VolumeManager/Disk/ReplaceDisk
  Use this menu operation to specify a replacement disk for a disk
  that you removed with the "Remove a disk for replacement" menu
  operation, or that failed during use.  You will be prompted for
  a disk name to replace and a disk device to use as a replacement.
  You can choose an uninitialized disk, in which case the disk will
  be initialized, or you can choose a disk that you have already
  initialized using the Add or initialize a disk menu operation.

Select a removed or failed disk [<disk>,list,q,?] list

Disk group: mydg

DM NAME         DEVICE       TYPE     PRIVLEN  PUBLEN   STATE

dm d2           -            -        -        -        NODEVICE


Select a removed or failed disk [<disk>,list,q,?] d2
  The following devices are available as replacements:

        c2t3d0 c2t4d0 c2t5d0

  You can choose one of these devices to replace d2.
  Choose "none" to initialize another device to replace d2.

Choose a device, or select none
[<device>,none,q,?]  (default: c2t3d0) c2t5d0
  VxVM  INFO V-5-2-382
The requested operation is to use the initialized device c2t5d0
  to replace the removed or failed disk d2 in disk group mydg.

Continue with operation? [y,n,q,?]  (default: y) y

Use FMR for plex resync? [y,n,q,?]  (default: n) n
  VxVM  INFO V-5-2-282
Replacement of disk d2 in group mydg with disk device
  c2t5d0 completed successfully.

Replace another disk? [y,n,q,?]  (default: n) n

Volume Manager Support Operations
Menu: VolumeManager/Disk

 1      Add or initialize one or more disks
 2      Encapsulate one or more disks
 ===================================
 23     Dynamic Reconfiguration Operations
 list   List disk information


 ?      Display help about menu
 ??     Display help about the menuing system
 q      Exit from menus

Select an operation to perform: q

Goodbye.
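
The same replacement can also be done without the menus. As far as I know, the usual command-line equivalent of option 5 is vxdg with -k adddisk, which re-attaches an already initialized device to the existing disk media name. The helper below is only a sketch: replace_cmd is a made-up name, and it prints the command instead of running it so that it can be reviewed first.

```shell
#!/bin/sh
# replace_cmd DG DISKNAME DEVICE
# Print the vxdg command that attaches DEVICE to the failed disk media
# record DISKNAME in disk group DG. Nothing is executed here.
replace_cmd() {
    [ $# -eq 3 ] || { echo "usage: replace_cmd DG DISKNAME DEVICE" >&2; return 1; }
    echo "vxdg -g $1 -k adddisk $2=$3"
}

replace_cmd mydg d2 c2t5d0
# prints: vxdg -g mydg -k adddisk d2=c2t5d0
```

The device is assumed to be initialized already (vxdisksetup -i, as done above). Once the printed command looks right, it can be pasted at the root prompt.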

Check the status of the volume and plex:


root@pr-01:>/# vxprint -g mydg -htq
dg mydg         default      default  23000    1411540596.55.pr-01

dm d1           c2t2d0s2     auto     69372    7778304  -
dm d2           c2t5d0s2     auto     67324    1890304  -

v  stvol        -            DISABLED ACTIVE   3780608  SELECT    -        fsgen
pl stvol-01     stvol        DISABLED RECOVER  3780608  STRIPE    2/128    RW
sd d1-01        stvol-01     d1       0        1890304  0/0       c2t2d0   ENA
sd d2-01        stvol-01     d2       0        1890304  1/0       c2t5d0   ENA




root@pr-01:>/# vxdisk list
DEVICE       TYPE            DISK         GROUP        STATUS
c1t0d0s2     auto:none       -            -            online invalid
c2t2d0s2     auto:sliced     d1           mydg         online
c2t3d0s2     auto:sliced     -            -            online
c2t4d0s2     auto:sliced     -            -            online
c2t5d0s2     auto:sliced     d2           mydg         online
c2t6d0s2     auto:none       -            -            online invalid
c2t7d0s2     auto:none       -            -            online invalid

The next step is to bring the plex into the STALE state. To do that, we take the plex offline and then bring it back online.

The command below takes the plex offline:

root@pr-01:>/# vxmend -g mydg -o force off stvol-01

Check that the plex is now offline:

root@pr-01:>/# vxprint -htq
Disk group: mydg

dg mydg         default      default  23000    1411540596.55.pr-01

dm d1           c2t2d0s2     auto     69372    7778304  -
dm d2           c2t5d0s2     auto     67324    1890304  -

v  stvol        -            DISABLED ACTIVE   3780608  SELECT    -        fsgen
pl stvol-01     stvol        DISABLED OFFLINE  3780608  STRIPE    2/128    RW
sd d1-01        stvol-01     d1       0        1890304  0/0       c2t2d0   ENA
sd d2-01        stvol-01     d2       0        1890304  1/0       c2t5d0   ENA

Now bring the plex back online using the command below:

root@pr-01:>/# vxmend -g mydg on stvol-01

Check whether the plex is now in the STALE state:

root@pr-01:>/# vxprint -htq
Disk group: mydg

dg mydg         default      default  23000    1411540596.55.pr-01

dm d1           c2t2d0s2     auto     69372    7778304  -
dm d2           c2t5d0s2     auto     67324    1890304  -

v  stvol        -            DISABLED ACTIVE   3780608  SELECT    -        fsgen
pl stvol-01     stvol        DISABLED STALE    3780608  STRIPE    2/128    RW
sd d1-01        stvol-01     d1       0        1890304  0/0       c2t2d0   ENA
sd d2-01        stvol-01     d2       0        1890304  1/0       c2t5d0   ENA


Now we set the plex to the CLEAN state so that the volume can be started:

root@pr-01:>/# vxmend -g mydg fix clean stvol-01

root@pr-01:>/# vxprint -htq |grep stvol-01
pl stvol-01     stvol        DISABLED CLEAN    3780608  STRIPE    2/128    RW
sd d1-01        stvol-01     d1       0        1890304  0/0       c2t2d0   ENA
sd d2-01        stvol-01     d2       0        1890304  1/0       c2t5d0   ENA


Start the volume to complete the recovery:

root@pr-01:>/# vxvol -g mydg start stvol

root@pr-01:>/# vxprint -htq
Disk group: mydg

dg mydg         default      default  23000    1411540596.55.pr-01

dm d1           c2t2d0s2     auto     69372    7778304  -
dm d2           c2t5d0s2     auto     67324    1890304  -

v  stvol        -            ENABLED  ACTIVE   3780608  SELECT    stvol-01 fsgen
pl stvol-01     stvol        ENABLED  ACTIVE   3780608  STRIPE    2/128    RW
sd d1-01        stvol-01     d1       0        1890304  0/0       c2t2d0   ENA
sd d2-01        stvol-01     d2       0        1890304  1/0       c2t5d0   ENA


root@pr-01:>/# vxinfo -g mydg
stvol          fsgen    Started
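
The plex went through RECOVER, OFFLINE, STALE and CLEAN before the volume started. That walk can be written down as a small lookup from the current plex state to the next command. This is only a sketch of the steps used in this post: next_step and plex_state are made-up helper names, and the mapping is meant for a single-plex striped volume like stvol, not for a mirror, where forcing a plex clean could bring stale data online.

```shell
#!/bin/sh
# plex_state PLEX
# Print the state column (field 5) of the "pl" record for PLEX.
# Reads `vxprint -htq` output on stdin.
plex_state() {
    awk -v p="$1" '$1 == "pl" && $2 == p { print $5 }'
}

# next_step DG VOLUME PLEX STATE
# Print the next command from this walkthrough for the given plex state.
next_step() {
    dg=$1 vol=$2 plex=$3 state=$4
    case $state in
        NODEVICE) echo "replace the failed disk (vxdiskadm option 5)" ;;
        RECOVER)  echo "vxmend -g $dg -o force off $plex" ;;
        OFFLINE)  echo "vxmend -g $dg on $plex" ;;
        STALE)    echo "vxmend -g $dg fix clean $plex" ;;
        CLEAN)    echo "vxvol -g $dg start $vol" ;;
        ACTIVE)   echo "nothing to do" ;;
        *)        echo "unhandled plex state: $state" >&2; return 1 ;;
    esac
}

# Sample plex record copied from the listing taken right after the
# disk replacement.
state=$(printf '%s\n' \
    'pl stvol-01     stvol        DISABLED RECOVER  3780608  STRIPE    2/128    RW' |
    plex_state stvol-01)
next_step mydg stvol stvol-01 "$state"
# prints: vxmend -g mydg -o force off stvol-01
```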

Then we need to run fsck. If we are lucky, fsck will repair the file system:

root@pr-01:>/# fsck -F vxfs -o full -y /dev/vx/rdsk/mydg/stvol

If fsck is unsuccessful, remember mkfs?

Create a new file system with mkfs and mount it; the data on the old file system is lost.


Because this is a striped volume, the failure of either of its two disks means data loss. Mirroring was introduced to avoid this kind of scenario.


MIRRORED VOLUME RECOVERY

root@pr-01:>/# vxassist -g mydg make mrvol maxsize d6 d7 layout=mirror

root@pr-01:>/# mkfs -F vxfs /dev/vx/rdsk/mydg/mrvol

root@pr-01:>/# mkdir /mrvol-test

root@pr-01:>/# mount -F vxfs /dev/vx/dsk/mydg/mrvol /mrvol-test/

root@pr-01:>/# cd /mrvol-test/

root@pr-01:>/mrvol-test# mkfile 10m file1
root@pr-01:>/mrvol-test# mkfile 10m file2
root@pr-01:>/mrvol-test# mkfile 10m file3

root@pr-01:>/mrvol-test# cd /

All set and looking good.


Let’s make a mess.

Unmap one disk from Openfiler.

Well… how do we know which disk to remove?

root@pr-01:>/# format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
       0. c1t0d0 <DEFAULT cyl 1563 alt 2 hd 255 sec 63>
          /pci@0,0/pci15ad,1976@10/sd@0,0
       1. c2t2d0 <DEFAULT cyl 1917 alt 2 hd 128 sec 32>
          /iscsi/disk@0000iqn.2006-01.com.openfiler%3Atsn.aeed9c1a441f0001,0
       2. c2t6d0 <DEFAULT cyl 957 alt 2 hd 64 sec 32>
          /iscsi/disk@0000iqn.2006-01.com.openfiler%3Atsn.aeed9c1a441f0001,6
       3. c2t7d0 <DEFAULT cyl 957 alt 2 hd 64 sec 32>
          /iscsi/disk@0000iqn.2006-01.com.openfiler%3Atsn.aeed9c1a441f0001,7
       4. c2t8d0 <DEFAULT cyl 957 alt 2 hd 64 sec 32>
          /iscsi/disk@0000iqn.2006-01.com.openfiler%3Atsn.aeed9c1a441f0001,8
       5. c2t9d0 <DEFAULT cyl 957 alt 2 hd 64 sec 32>
          /iscsi/disk@0000iqn.2006-01.com.openfiler%3Atsn.aeed9c1a441f0001,9
Specify disk (enter its number): ^C


See… the trailing 0, 6, 7, 8 and 9 in the iSCSI device paths are the LUN IDs… you will find the same IDs in Openfiler.
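
Reading the LUN ID off each path by eye gets tedious with many disks. Here is a minimal sketch that pairs each device name with the number after the last comma of its iSCSI path. The function name lun_map is made up, and it assumes the two-line-per-disk layout that format prints above.

```shell
#!/bin/sh
# Map each disk in `format` output to the LUN ID at the end of its
# iSCSI device path. Non-iSCSI disks (such as the local c1t0d0) are skipped.
lun_map() {
    awk '
        $1 ~ /^[0-9]+\.$/ { dev = $2; next }      # e.g. "2. c2t6d0 <DEFAULT ...>"
        $1 ~ /^\/iscsi\// && dev != "" {          # path line that follows it
            n = split($1, parts, ",")
            print dev, "LUN", parts[n]
            dev = ""
        }
    '
}

# Sample input copied from the format listing above.
lun_map <<'EOF'
       0. c1t0d0 <DEFAULT cyl 1563 alt 2 hd 255 sec 63>
          /pci@0,0/pci15ad,1976@10/sd@0,0
       1. c2t2d0 <DEFAULT cyl 1917 alt 2 hd 128 sec 32>
          /iscsi/disk@0000iqn.2006-01.com.openfiler%3Atsn.aeed9c1a441f0001,0
       2. c2t6d0 <DEFAULT cyl 957 alt 2 hd 64 sec 32>
          /iscsi/disk@0000iqn.2006-01.com.openfiler%3Atsn.aeed9c1a441f0001,6
EOF
# prints:
#   c2t2d0 LUN 0
#   c2t6d0 LUN 6
```

On Solaris, `echo | format | lun_map` is a common way to feed it the listing without stopping at the interactive prompt.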


root@pr-01:>/# vxprint -htq
Disk group: mydg

dg mydg         default      default  23000    1411540596.55.pr-01

dm d1           c2t2d0s2     auto     69372    7778304  -
dm d6           -            -        -        -        NODEVICE
dm d7           c2t7d0s2     auto     67324    1890304  -

v  mrvol        -            ENABLED  ACTIVE   1890304  SELECT    -        fsgen
pl mrvol-01     mrvol        DISABLED NODEVICE 1890304  CONCAT    -        RW
sd d6-01        mrvol-01     d6       0        1890304  0         -        RLOC
pl mrvol-02     mrvol        ENABLED  ACTIVE   1890304  CONCAT    -        RW
sd d7-01        mrvol-02     d7       0        1890304  0         c2t7d0   ENA

Gone… but this is a mirrored volume. Great, isn't it?

The other plex is still working.

Well, I got a disk for the replacement:

root@pr-01:>/# vxdisksetup -i c2t8d0 format=sliced

root@pr-01:>/# vxdiskadm


Select an operation to perform: 5

Select a removed or failed disk [<disk>,list,q,?] list

Disk group: mydg

DM NAME         DEVICE       TYPE     PRIVLEN  PUBLEN   STATE

dm d6           -            -        -        -        NODEVICE


Select a removed or failed disk [<disk>,list,q,?] d6
  The following devices are available as replacements:

        c2t8d0

  You can choose one of these devices to replace d6.
  Choose "none" to initialize another device to replace d6.

Choose a device, or select none
[<device>,none,q,?]  (default: c2t8d0) c2t8d0
  VxVM  INFO V-5-2-382
The requested operation is to use the initialized device c2t8d0
  to replace the removed or failed disk d6 in disk group mydg.

Continue with operation? [y,n,q,?]  (default: y) y

Use FMR for plex resync? [y,n,q,?]  (default: n) n
  VxVM  INFO V-5-2-282
Replacement of disk d6 in group mydg with disk device
  c2t8d0 completed successfully.

Replace another disk? [y,n,q,?]  (default: n) n



root@pr-01:>/# vxprint -htq
Disk group: mydg

dg mydg         default      default  23000    1411540596.55.pr-01

dm d1           c2t2d0s2     auto     69372    7778304  -
dm d6           c2t8d0s2     auto     67324    1890304  -
dm d7           c2t7d0s2     auto     67324    1890304  -

v  mrvol        -            ENABLED  ACTIVE   1890304  SELECT    -        fsgen
pl mrvol-01     mrvol        ENABLED  ACTIVE   1890304  CONCAT    -        RW
sd d1-01        mrvol-01     d1       0        1890304  0         c2t2d0   ENA
pl mrvol-02     mrvol        ENABLED  ACTIVE   1890304  CONCAT    -        RW
sd d7-01        mrvol-02     d7       0        1890304  0         c2t7d0   ENA


Great, no need to take the plexes offline and online…

Well… the volume was already started, so when we replaced the disk the plex was recovered automatically.
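
To confirm that both halves of the mirror are back, count the plexes of the volume that are ENABLED and ACTIVE. This is a rough sketch with a made-up function name (healthy_plexes); it assumes the vxprint -htq column layout shown in this post.

```shell
#!/bin/sh
# healthy_plexes VOLUME
# Count the plexes of VOLUME whose kernel state is ENABLED and whose
# state is ACTIVE. Reads `vxprint -htq` output on stdin.
healthy_plexes() {
    awk -v v="$1" '
        $1 == "pl" && $3 == v && $4 == "ENABLED" && $5 == "ACTIVE" { n++ }
        END { print n + 0 }
    '
}

# Plex records copied from the listing taken while d6 was missing.
healthy_plexes mrvol <<'EOF'
pl mrvol-01     mrvol        DISABLED NODEVICE 1890304  CONCAT    -        RW
pl mrvol-02     mrvol        ENABLED  ACTIVE   1890304  CONCAT    -        RW
EOF
# prints: 1
```

A result of 2 for mrvol means both plexes are healthy; 1 means the volume is running without redundancy.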

RAID5 VOLUME RECOVERY

root@pr-01:>/# vxassist -g mydg make r5vol maxsize layout=raid5

root@pr-01:>/# mkfs -F vxfs /dev/vx/rdsk/mydg/r5vol
    version 9 layout
    3780608 sectors, 1890304 blocks of size 1024, log size 16384 blocks
    rcq size 1024 blocks
    largefiles supported
root@pr-01:>/# mkdir /r5vol-test
root@pr-01:>/# mount -F vxfs /dev/vx/dsk/mydg/r5vol /r5vol-test/
root@pr-01:>/# cd /r5vol-test/
root@pr-01:>/r5vol-test# mkfile 10m file1
root@pr-01:>/r5vol-test# mkfile 10m file2
root@pr-01:>/r5vol-test# cd /

root@pr-01:>/r5vol-test# vxprint -htq
Disk group: mydg

dg mydg         default      default  11000    1411564143.21.pr-01

dm d2           c2t2d0s2     auto     69372    7778304  -
dm d6           c2t6d0s2     auto     67324    1890304  -
dm d7           c2t7d0s2     auto     67324    1890304  -
dm d8           c2t8d0s2     auto     67324    1890304  -

v  r5vol        -            ENABLED  ACTIVE   3780608  RAID      -        raid5
pl r5vol-01     r5vol        ENABLED  ACTIVE   3780608  RAID      3/32     RW
sd d2-01        r5vol-01     d2       0        1890304  0/0       c2t2d0   ENA
sd d6-01        r5vol-01     d6       0        1890304  1/0       c2t6d0   ENA
sd d7-01        r5vol-01     d7       0        1890304  2/0       c2t7d0   ENA
pl r5vol-02     r5vol        ENABLED  LOG      2880     CONCAT    -        RW
sd d8-01        r5vol-02     d8       0        2880     0         c2t8d0   ENA
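
The volume sizes in the vxprint listings of this post follow directly from the layout: a stripe uses every column for data, a mirror holds one copy's worth, and RAID-5 gives up one column's worth of space to parity. A small sketch of that arithmetic (usable_sectors is a made-up helper, and it assumes equal-sized subdisks as in these examples):

```shell
#!/bin/sh
# usable_sectors LAYOUT COLUMNS SUBDISK_SECTORS
# Print the usable volume size in sectors for equal-sized subdisks.
usable_sectors() {
    layout=$1 n=$2 len=$3
    case $layout in
        stripe) echo $(( n * len )) ;;          # all columns hold data
        mirror) echo "$len" ;;                  # every plex is a full copy
        raid5)  echo $(( (n - 1) * len )) ;;    # one column's worth is parity
        *) echo "unknown layout: $layout" >&2; return 1 ;;
    esac
}

usable_sectors stripe 2 1890304   # 3780608, the size of stvol
usable_sectors mirror 2 1890304   # 1890304, the size of mrvol
usable_sectors raid5  3 1890304   # 3780608, the size of r5vol
```

The fourth disk in the RAID-5 listing (d8) carries only the small log plex, so it does not count as a column.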

I removed one disk, and:

root@pr-01:>/# vxinfo -g mydg
r5vol          raid5    Started Degraded

root@pr-01:>/# vxprint -htq
v  r5vol        -            ENABLED  ACTIVE   3780608  RAID      -        raid5
pl r5vol-01     r5vol        ENABLED  ACTIVE   3780608  RAID      3/32     RW
sd d2-01        r5vol-01     d2       0        1890304  0/0       c2t2d0   ENA
sd d6-01        r5vol-01     d6       0        1890304  1/0       -        NDEV
sd d7-01        r5vol-01     d7       0        1890304  2/0       c2t7d0   ENA
pl r5vol-02     r5vol        ENABLED  LOG      2880     CONCAT    -        RW
sd d8-01        r5vol-02     d8       0        2880     0         c2t8d0   ENA





Well, fortunately, I got an error…

root@pr-01:>/# vxdisk list
DEVICE       TYPE            DISK         GROUP        STATUS
c1t0d0s2     auto:none       -            -            online invalid
c2t2d0s2     auto:sliced     d1           mydg         online dgdisabled
c2t7d0s2     auto:sliced     d7           mydg         online dgdisabled
c2t8d11s2    auto:sliced     -            -            online
c2t9d0s2     auto:sliced     d8           mydg         online dgdisabled
c2t10d0s2    auto:sliced     d6           mydg         online dgdisabled
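
The dgdisabled flag in the STATUS column is the hint that the whole disk group has been disabled, not just one disk. A minimal sketch for pulling the affected group names out of the listing (disabled_groups is a made-up name; it assumes the five-column vxdisk list layout shown above):

```shell
#!/bin/sh
# Print the name of every disk group that has at least one disk flagged
# "dgdisabled". Reads `vxdisk list` output on stdin.
disabled_groups() {
    awk '$NF == "dgdisabled" { print $4 }' | sort -u
}

# Sample lines copied from the listing above.
disabled_groups <<'EOF'
DEVICE       TYPE            DISK         GROUP        STATUS
c1t0d0s2     auto:none       -            -            online invalid
c2t2d0s2     auto:sliced     d1           mydg         online dgdisabled
c2t7d0s2     auto:sliced     d7           mydg         online dgdisabled
EOF
# prints: mydg
```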

root@pr-01:>/# vxprint -htq
(no output)

root@pr-01:>/# vxinfo -g mydg
VxVM vxinfo ERROR V-5-1-607 Diskgroup mydg not found

root@pr-01:>/# vxdg deport mydg
VxVM vxdg ERROR V-5-1-584 Disk group mydg: Some volumes in the disk group are in use

root@pr-01:>/# df -kh
df: cannot statvfs /r5vol-test: I/O error

root@pr-01:>/# umount /r5vol-test

Deport the disk group using the following command:
# vxdg deport diskgroup

Set the disk naming scheme to enclosure-based names:
# vxddladm set namingscheme=ebn

Use the vxdarestore command to restore the failed disks, and to recover the
objects on those disks:
# /etc/vx/bin/vxdarestore

Re-import the disk group using the following command:
# vxdg import diskgroup


root@pr-01:>/# vxdiskadm

Select an operation to perform: 5

Select a removed or failed disk [<disk>,list,q,?] list

Select a removed or failed disk [<disk>,list,q,?] d6

[<device>,none,q,?]  (default: c2t9d0) c2t9d0

Continue with operation? [y,n,q,?]  (default: y) y

Use FMR for plex resync? [y,n,q,?]  (default: n) n

Replace another disk? [y,n,q,?]  (default: n) n

Select an operation to perform: q

root@pr-01:>/# vxinfo -g mydg
r5vol          raid5    Started Degraded
root@pr-01:>/# vxprint -htq
Disk group: mydg

dg mydg         default      default  11000    1411564143.21.pr-01

dm d2           c2t2d0s2     auto     69372    7778304  -
dm d6           c2t9d0s2     auto     67324    1890304  -
dm d7           c2t7d0s2     auto     67324    1890304  -
dm d8           c2t8d0s2     auto     67324    1890304  -

v  r5vol        -            ENABLED  ACTIVE   3780608  RAID      -        raid5
pl r5vol-01     r5vol        ENABLED  ACTIVE   3780608  RAID      3/32     RW
sd d2-01        r5vol-01     d2       0        1890304  0/0       c2t2d0   ENA
sd d6-01        r5vol-01     d6       0        1890304  1/0       c2t9d0   RCOV
sd d7-01        r5vol-01     d7       0        1890304  2/0       c2t7d0   ENA
pl r5vol-02     r5vol        ENABLED  LOG      2880     CONCAT    -        RW
sd d8-01        r5vol-02     d8       0        2880     0         c2t8d0   ENA

