IBM Link: Resolving LVM and Hard Disk PVID Issues
Question
Technote that discusses various issues with PVIDs and how to resolve them.
Answer
The Logical Volume Manager (LVM) on AIX uses a Physical Volume Identifier (PVID) to keep track of disk drives that are part of an LVM volume group. The PVID is a system-generated 16-digit hexadecimal number that is stored physically on the hard drive and read into the ODM. LVM also stores this PVID in the Volume Group Descriptor Area (VGDA) for the volume group. Each PVID on a system uniquely identifies one physical volume.
Problems with PVIDs
Duplicate PVID
The same PVID shows up on more than one disk.
Problem Diagnosis
1. Verify that you have a duplicate PVID on multiple disks.
$ lspv
or, if you know the name of an imported volume group that has the problem, use:
$ lspv | grep VGNAME
The lspv command reads the PVID from the ODM; it does not read it from the disk. The next step is to check whether the PVID listed in the ODM is the same as the PVID on disk.
2. To read the PVID off the disk, log in as root (or su to root) and use:
# lquerypv -h /dev/hdisk7 80 10
Example output:
00000080 00021AC8 E561D594 00000000 00000000 |.....a..........|
The PVID of the disk is in columns 2 and 3 of the output. In this example it is 00021ac8e561d594.
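Reading columns 2 and 3 by eye is error-prone when many disks have to be checked. The following is a minimal sketch, not an IBM-supplied tool, that extracts the PVID from one line of lquerypv output. The function name pvid_from_lquerypv is hypothetical, and the sample line is the example output shown above.

```sh
# Sketch: pull the 16-hex-digit PVID out of "lquerypv -h /dev/hdiskN 80 10"
# output read from stdin. pvid_from_lquerypv is a hypothetical helper name,
# not an AIX command.
pvid_from_lquerypv() {
    # Field 1 is the offset (00000080); fields 2 and 3 together hold the PVID.
    # Lowercase the result so it can be compared with lspv output.
    awk '$1 == "00000080" { print tolower($2 $3); exit }'
}

# Sample line copied from the example output above.
sample='00000080 00021AC8 E561D594 00000000 00000000 |.....a..........|'
printf '%s\n' "$sample" | pvid_from_lquerypv
```

On a live system the function would be fed directly, as root, with something like: lquerypv -h /dev/hdisk7 80 10 | pvid_from_lquerypv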
Possible Cause and Solution 1: ODM and PVID on Disk Do Not Match
If the PVID on disk does not match the ODM output from lspv, use the chdev command to re-read the proper PVID in from disk and repopulate the ODM.
Example:
# lspv | grep hdisk24
hdisk24 deadbeefdeadbeef None
Check to see if the PVID written to the disk is correct:
# lquerypv -h /dev/hdisk24 80 10
00000080 00050A85 577D8A61 00000000 00000000 |....W}.a........|
So the PVID in the ODM (deadbeefdeadbeef) does not match the one written on the disk (00050a85577d8a61). Another way to verify this is to query the ODM directly:
# odmget -q "name=hdisk24 and attribute=pvid" CuAt
CuAt:
name = "hdisk24"
attribute = "pvid"
value = "deadbeefdeadbeef0000000000000000"
type = "R"
generic = "D"
rep = "s"
nls_index = 2
Force the system to re-read the PVID off disk and repopulate the ODM:
# chdev -a pv=yes -l hdisk24
hdisk24 changed
Now check that lspv shows the correct PVID:
# lspv | grep hdisk24
hdisk24 00050a85577d8a61 None
You can also double-check the ODM directly:
# odmget -q "name=hdisk24 and attribute=pvid" CuAt
CuAt:
name = "hdisk24"
attribute = "pvid"
value = "00050a85577d8a610000000000000000"
type = "R"
generic = "D"
rep = "s"
nls_index = 2
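The comparison in this section (ODM value versus PVID on disk) can be scripted. The sketch below works on captured command output rather than calling odmget or lquerypv itself, so the helper names (odm_pvid, disk_pvid, compare_pvids) are hypothetical. It assumes the ODM value is the 16-digit PVID followed by padding zeros, as in the odmget output above. The sample data reproduces the state before chdev was run.

```sh
# Sketch: compare the PVID stored in the ODM with the PVID on disk.
# All function names here are hypothetical helpers, not AIX commands.

# Read "odmget ... CuAt" output on stdin and print the first 16 digits
# of the value field (the remaining 16 digits are zero padding).
odm_pvid() {
    awk -F'"' '/value =/ { print substr($2, 1, 16); exit }'
}

# Read "lquerypv -h /dev/hdiskN 80 10" output on stdin and print the PVID.
disk_pvid() {
    awk '$1 == "00000080" { print tolower($2 $3); exit }'
}

# $1 = PVID from the ODM, $2 = PVID from disk
compare_pvids() {
    if [ "$1" = "$2" ]; then
        echo "match"
    else
        echo "mismatch: ODM=$1 disk=$2"
    fi
}

# Sample data copied from the hdisk24 example above (before the chdev).
odm_sample='CuAt:
name = "hdisk24"
attribute = "pvid"
value = "deadbeefdeadbeef0000000000000000"'
disk_sample='00000080 00050A85 577D8A61 00000000 00000000 |....W}.a........|'

compare_pvids "$(printf '%s\n' "$odm_sample" | odm_pvid)" \
              "$(printf '%s\n' "$disk_sample" | disk_pvid)"
```

A mismatch reported this way corresponds to the case where chdev -a pv=yes is used to repopulate the ODM.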
Possible Cause and Solution 2: The Disks Are Copies of Each Other
In recent years, SAN storage has become a popular way to attach new disks to systems over a Fibre Channel network. Some SAN storage systems can make a bit-for-bit copy of a disk (or LUN) so that it can be taken to another site or computer. The copy is exact, down to the VGDA and PVID areas of the LUN. If both disks, the original and the copy, are zoned to the same host and seen by cfgmgr, the system will show duplicate PVIDs in commands such as lspv and lsvg, and in the ODM.
varyonvg (whether run directly or called from importvg) will fail because of the duplicate PVIDs:
# importvg -y datavg hdisk20
0516-1775 varyonvg: Physical volumes hdisk21 and hdisk20 have identical PVIDs (00050a85470c2eeb).
0516-780 importvg: Unable to import volume group from hdisk20.
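Duplicate PVIDs can be spotted in lspv output before an import is attempted. The following sketch groups disks by PVID and prints any PVID that appears more than once. The function name find_duplicate_pvids is hypothetical. The sample lspv lines are assumed for illustration, built from the disk names and PVIDs that appear in this document.

```sh
# Sketch: list PVIDs that appear on more than one disk in lspv output.
# find_duplicate_pvids is a hypothetical helper name, not an AIX command.
find_duplicate_pvids() {
    # lspv columns: disk name, PVID, volume group, [state].
    # Disks without a PVID show "none" and are skipped.
    awk '$2 != "none" {
             count[$2]++
             disks[$2] = disks[$2] " " $1
         }
         END {
             for (p in count)
                 if (count[p] > 1) print p ":" disks[p]
         }'
}

# Assumed sample: hdisk20 and hdisk21 share the PVID from the error above.
lspv_sample='hdisk20 00050a85470c2eeb None
hdisk21 00050a85470c2eeb None
hdisk24 00050a85577d8a61 None'

printf '%s\n' "$lspv_sample" | find_duplicate_pvids
```

On a live system the input would come from lspv itself: lspv | find_duplicate_pvids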
A. Find out if these disks are indeed copies of each other. If a copy of the volume group on the second disk (or set of disks) is not needed, the VGDA and PVID areas can be wiped out, allowing the user to create a new volume group on the disks.
Warning: this is a permanent change to the disks.
First, wipe the PVID off the drive (if you are using another path manager, the device name may be different):
# chdev -a pv=clear -l hdiskX
Next clear the VGDA off the drive:
# chpv -C hdiskX
Then you may use the mkvg command to create a new volume group on the disks, or add them to an existing volume group with extendvg.
B. If, however, the disks are copies of each other for use in a backup strategy, and BOTH the source and backup (or target) volume group need to exist on the same machine, then the target volume group will need to be imported using recreatevg. The benefit of using recreatevg is that it can change the PVIDs of the disks as it imports the volume group, and update the VGDA with those new PVIDs.
Note that AIX does not allow two logical volumes to have the same name, or two filesystems to have the same mount point. Because recreatevg is meant for recreating a volume group that already exists, by default it changes the logical volume names and mount points.
By default recreatevg prefixes the existing logical volume names with the string “fs” and prefixes the filesystem mount point paths with “/fs”. These defaults can be changed with the -Y flag (logical volume name prefix) and the -L flag (mount point prefix) to recreatevg.
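To see what names to expect before running the command, the prefixing rule just described can be modelled in a few lines. This is only a sketch of the naming rule as stated here, not a reimplementation of recreatevg. The function name preview_recreatevg_names is hypothetical, and the actual result should always be confirmed with lsvg -l afterwards.

```sh
# Sketch: preview the logical volume names and mount points that the
# prefixing rule described above would produce.
# preview_recreatevg_names is a hypothetical helper, not an AIX command.
# $1 = logical volume prefix (stands in for the -Y value, default "fs")
# $2 = mount point prefix   (stands in for the -L value, default "/fs")
# stdin: one "LVNAME MOUNTPOINT" pair per line
preview_recreatevg_names() {
    lv_prefix=${1:-fs}
    mnt_prefix=${2:-/fs}
    while read -r lv mount; do
        if [ "$mount" = "N/A" ]; then
            # Log devices have no mount point to rename.
            new_mount="N/A"
        else
            new_mount="${mnt_prefix}${mount}"
        fi
        printf '%s %s\n' "${lv_prefix}${lv}" "$new_mount"
    done
}

# Logical volume names taken from the origvg example in this section.
printf '%s\n' 'loglv01 N/A' 'fslv02 /data' | preview_recreatevg_names
```

With the defaults, loglv01 becomes fsloglv01 and fslv02 becomes fsfslv02, which matches the logical volume names recreatevg produces in the example in this section.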
For example, if I have a volume group “origvg” with a filesystem and log device, and a SAN copy of it on hdisk6:
# lsvg -l origvg
origvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
loglv01 jfs2log 1 1 1 closed/syncd N/A
fslv02 jfs2 256 256 1 closed/syncd /data
# lspv | grep origvg
hdisk5 00050a85ef3356ec origvg active
hdisk6 00050a85ef3356ec origvg active
I can clear the PVID off hdisk6 and then run recreatevg against it:
# chdev -a pv=clear -l hdisk6
hdisk6 changed
# lspv
hdisk5 00050a85ef3356ec origvg active
hdisk6 none None
# recreatevg -y copyvg hdisk6
copyvg
# lspv
hdisk5 00050a85ef3356ec origvg active
hdisk6 00050a85ee6d1446 copyvg active
# lsvg -l copyvg
copyvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
fsloglv01 jfs2log 1 1 1 closed/syncd N/A
fsfslv02 jfs2 256 256 1 closed/syncd N/A
Possible Cause and Solution 3: PVIDs Exist on Individual Disk Paths Rather Than on Multipath Device
Example using SDD multipathing:
vpath34 Available Data Path Optimizer Pseudo Device Driver
This vpath is made up of the following hdisks:
hdisk138 Available 01-01-02
U789D.001.DQD35BK-P1-C2-T2-W5005076801402F1D-L22000000000000 FC 2145
hdisk139 Available 01-01-02
U789D.001.DQD35BK-P1-C2-T2-W5005076801402F2D-L22000000000000 FC 2145
hdisk140 Available 03-00-02
U789D.001.DQD35PG-P1-C6-T1-W5005076801302F2D-L22000000000000 FC 2145
hdisk141 Available 03-00-02
U789D.001.DQD35PG-P1-C6-T1-W5005076801302F1D-L22000000000000 FC 2145
———————————————————————
From lspv output:
hdisk138 none None
hdisk139 none None
hdisk140 none None
hdisk141 none None
vpath34 00cb3e924851eb12 oravg1
This is the correct configuration for a vpath, where the PVID is associated with the vpath device in the ODM.
If the PVID appears on the individual paths (hdisks), it can be moved using one of two SDD commands:
hd2vp     The SDD script that converts an hdisk device volume group to an SDD vpath device volume group.
dpovgfix  The command that fixes an SDD volume group that has mixed vpath and hdisk physical volumes.
This can occur with any multipath driver that creates a pseudo-device to be used for access to the LUN. Another example is EMC PowerPath, which uses an “hdiskpowerX” pseudo-device. The PVID should be on the pseudo-device, while the individual paths are listed as “hdiskX” with no PVID or volume group association.
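Checking that no individual path carries a PVID can be automated when a pseudo-device has many paths. The sketch below takes the list of path hdisks and reports any that show a PVID in lspv output. The function name paths_with_pvid is hypothetical. The first sample is the correct vpath34 configuration shown above. The second is an assumed broken configuration made up for illustration.

```sh
# Sketch: report path hdisks that carry a PVID when only the multipath
# pseudo-device should have one.
# paths_with_pvid is a hypothetical helper name, not an SDD or AIX command.
# $1 = space-separated list of path hdisk names; stdin = lspv output
paths_with_pvid() {
    awk -v paths="$1" '
        BEGIN {
            n = split(paths, p, " ")
            for (i = 1; i <= n; i++) is_path[p[i]] = 1
        }
        # Print any listed path whose PVID column is not "none".
        ($1 in is_path) && $2 != "none" { print $1, $2 }'
}

# Correct configuration, copied from the lspv output above.
good='hdisk138 none None
hdisk139 none None
hdisk140 none None
hdisk141 none None
vpath34 00cb3e924851eb12 oravg1'

# Assumed broken configuration (illustration only): PVID on a path.
bad='hdisk138 00cb3e924851eb12 oravg1
hdisk139 none None
vpath34 none None'

printf '%s\n' "$good" | paths_with_pvid "hdisk138 hdisk139 hdisk140 hdisk141"
printf '%s\n' "$bad"  | paths_with_pvid "hdisk138 hdisk139"
```

No output for a set of paths means the PVID is where it belongs. Any line printed names a path that would need hd2vp or dpovgfix in an SDD environment.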
Missing PVID
The PVID from disk does not show up in the ODM or system commands.
Problem Diagnosis
If the disk does not show a PVID in “lspv”:
Solution
Wrong PVID
The PVID from disk shows up in “lspv”, but it is not the right PVID. “lspv” may also fail to show the drive as associated with a volume group.
Problem Diagnosis
If the disk shows a PVID but no VG:
lspv output:
hdisk11 00c7a967e5b40149 None
Check the PVID on the disk itself to determine if this matches the lspv output:
# lquerypv -h /dev/hdisk11 80 10
00000080 00C7A967 E5B40149 00000000 00000000 |...g...I........|
Now check the ODM to see what PVID is there. This will usually match the lspv output, because that is where lspv gets its data from.
# odmget -q "name=hdisk11 and attribute=pvid" CuAt
CuAt:
name = "hdisk11"
attribute = "pvid"
value = "00c7a967e5b401490000000000000000"
type = "R"
generic = "D"
rep = "s"
nls_index = 2
So the ODM and PVID on disk match.
Now check the PVIDs listed in the VGDA on the disk:
# lqueryvg -Ptp hdisk11
Physical: 00c7a967cc88015c 2 0
Possible Cause and Solution 1: PVID on disk has changed
If the PVIDs in the VGDA on disk do not include the PVID returned by the lquerypv command above, then someone may have changed the PVID on the disk. This may also put the disk in a “missing” state. Another symptom is a volume group that cannot be imported:
# importvg -y datavg hdisk11
0516-1939 : PV identifier not found in VGDA.
0516-780 importvg: Unable to import volume group from hdisk11.
Problem Diagnosis:
Check the PVID on disk:
# lquerypv -h /dev/hdisk11 80 10
00000080 00C7A967 E5B40149 00000000 00000000 |...g...I........|
Compare that with the PVID in the VGDA:
# lqueryvg -Ptp hdisk11
Physical: 00c7a967cc88015c 2 0
So here the PVID written to disk is not what it should be.
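The check just performed, whether the PVID on disk is listed in the VGDA, can be expressed as a small test. This is a sketch under the assumption that each PVID appears as a whitespace-separated field in the lqueryvg -Ptp output, as in the example above. The function name pvid_in_vgda is hypothetical.

```sh
# Sketch: succeed (exit 0) if the given PVID appears in
# "lqueryvg -Ptp hdiskN" output read from stdin.
# pvid_in_vgda is a hypothetical helper name, not an AIX command.
pvid_in_vgda() {
    awk -v want="$1" '
        # Scan every field so lines with or without the "Physical:"
        # label are both handled.
        { for (i = 1; i <= NF; i++) if ($i == want) found = 1 }
        END { exit !found }'
}

# Sample data copied from the hdisk11 example above.
vgda_sample='Physical: 00c7a967cc88015c 2 0'
disk_pvid=00c7a967e5b40149

if printf '%s\n' "$vgda_sample" | pvid_in_vgda "$disk_pvid"; then
    echo "PVID $disk_pvid is listed in the VGDA"
else
    echo "PVID $disk_pvid is NOT listed in the VGDA"
fi
```

For the hdisk11 data the function reports that the PVID is not listed, which is the condition that leads to the recreatevg fix.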
Solution
This can be fixed using the recreatevg command, which rewrites the PVID of each disk in the volume group and updates the VGDAs with those PVIDs so that everything matches again.
Use the flags “-Y NA” and “-L /” so that the logical volume names and filesystem mount points are not changed. Without these options recreatevg takes the default action of changing the logical volume names and adding a prefix to each mount point.
The names of all disks in the volume group must be included on the command line. This is different from the behavior of importvg, which requires the name of only one disk.
# recreatevg -y datavg -Y NA -L / hdisk11
datavg
Now check that everything is there.
# lspv | grep hdisk11
hdisk11 00c7a967e61887e7 datavg active
# lsvg -p datavg
datavg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk11 active 639 639 128..128..127..128..128
Possible Cause and Solution 2: PVID in ODM is wrong or gone
Problem Diagnosis:
In this case lspv shows no PVID for the disk:
# lspv
hdisk4 none None
or it may show a PVID but no volume group associated:
hdisk4 abcdabcdabcdabcd None
Another symptom is that lsvg shows only the PVID for a disk, not its name:
# lsvg -p testvg
0516-304 : Unable to find device id 00c7a967e601c226 in the Device
Configuration Database.
testvg:
PV ID PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
00c7a967e601c226 active 639 639 128..128..127..128..128
If the PVIDs in the VGDA on disk match the PVID from the lquerypv command, but not the lspv output or ODM, then the correct PVID can be re-read into the ODM.
# lquerypv -h /dev/hdisk4 80 10
00000080 00C7A967 E601C226 00000000 00000000 |...g...&........|
This is the same PVID we see listed in the lsvg -p output above.
# lqueryvg -Ptp hdisk4
Physical: 00c7a967e601c226 2 0
So the PVID on disk is good, and matches the one in the VGDA, but the ODM is wrong.
Solution
We can fix the ODM using chdev to re-read the correct PVID in again:
# chdev -a pv=yes -l hdisk4
hdisk4 changed
Now verify that it’s fixed:
# lspv | grep hdisk4
hdisk4 00c7a967e601c226 testvg active
# lsvg -p testvg
testvg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk4 active 639 639 128..128..127..128..128
# odmget -q "name=hdisk4 and attribute=pvid" CuAt
CuAt:
name = "hdisk4"
attribute = "pvid"
value = "00c7a967e601c2260000000000000000"
type = "R"
generic = "D"
rep = "s"
nls_index = 2
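The two “Wrong PVID” cases differ only in which of the three copies of the PVID (on disk, in the ODM, in the VGDA) disagrees with the others. The following sketch encodes that decision. It is an illustration of the reasoning in this technote rather than an IBM tool. The function name suggest_fix and its output strings are hypothetical.

```sh
# Sketch: decide which fix from this technote applies, given the three
# places a PVID is recorded. suggest_fix is a hypothetical helper.
# $1 = PVID read from disk (lquerypv)
# $2 = PVID in the ODM (lspv or odmget), "none" if absent
# $3 = space-separated PVIDs listed in the VGDA (lqueryvg -Ptp)
suggest_fix() {
    disk=$1
    odm=$2
    vgda=$3
    case " $vgda " in
        *" $disk "*)
            # Disk and VGDA agree, so only the ODM can be wrong.
            if [ "$disk" = "$odm" ]; then
                echo "consistent: no PVID repair needed"
            else
                echo "ODM is wrong: re-read with chdev -a pv=yes"
            fi
            ;;
        *)
            # The PVID on disk is not known to the VGDA.
            echo "PVID on disk has changed: rebuild with recreatevg"
            ;;
    esac
}

# hdisk4 example: disk and VGDA match, ODM shows abcdabcdabcdabcd.
suggest_fix 00c7a967e601c226 abcdabcdabcdabcd 00c7a967e601c226
# hdisk11 example: disk and ODM match, VGDA lists a different PVID.
suggest_fix 00c7a967e5b40149 00c7a967e5b40149 00c7a967cc88015c
```

The suggestion is only a starting point. The output of lquerypv, odmget, and lqueryvg should still be reviewed by hand before running chdev or recreatevg.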