Recovery from Configuration Errors
Configuration errors occur when the configuration of the Primary and Secondary RVGs is not identical. Each data volume in the Primary RVG must have a corresponding data volume in the Secondary RVG of exactly the same size; otherwise, replication will not proceed.
Errors in configuration are detected in two ways:
- When an RLINK is attached for the first time, VVR checks the configuration of the Secondary. If any errors are found, the attach command fails and prints error messages indicating the problem.
- Changes that affect the configuration on the Primary or Secondary may cause the Secondary to enter the PAUSE state with the secondary_config_err flag set. The problem is fixed by correcting the configuration error, and then resuming the RLINK.
Errors During an RLINK Attach
When an RLINK is attached, VVR checks that, for each data volume associated with the Primary RVG, the Secondary RVG has an associated data volume of the same size that is mapped to it. The following example shows an attach attempt that runs into each of these problems (a size mismatch, a misnamed volume, and a missing volume) and how to fix them. Before the attach, the Primary has this configuration:
| TY | Name | Assoc | KSTATE | LENGTH | | STATE |
|----|------|-------|--------|--------|---|-------|
| rv | hr_rvg | - | DISABLED | - | | EMPTY |
| rl | rlk_london_hr_rvg | hr_rvg | DETACHED | - | | STALE |
| v | hr_dv01 | hr_rvg | ENABLED | 12800 | | ACTIVE |
| pl | hr_dv01-01 | hr_dv01 | ENABLED | 12800 | | ACTIVE |
| sd | disk01-05 | hr_dv01-01 | ENABLED | 12800 | | - |
| v | hr_dv02 | hr_rvg | ENABLED | 12800 | | ACTIVE |
| pl | hr_dv02-01 | hr_dv02 | ENABLED | 12880 | | ACTIVE |
| sd | disk01-06 | hr_dv02-01 | ENABLED | 12880 | | - |
| v | hr_dv03 | hr_rvg | ENABLED | 12880 | | ACTIVE |
| pl | hr_dv03-01 | hr_dv03 | ENABLED | 12880 | | ACTIVE |
| sd | disk01-07 | hr_dv03-01 | ENABLED | 12880 | | - |
| v | hr_srl | hr_rvg | ENABLED | 12880 | | ACTIVE |
| pl | hr_srl-01 | hr_srl | ENABLED | 12880 | | ACTIVE |
| sd | disk01-08 | hr_srl-01 | ENABLED | 12880 | 0 | - |
The Secondary has the following configuration:
| TY | Name | Assoc | KSTATE | LENGTH | | STATE |
|----|------|-------|--------|--------|---|-------|
| rv | hr_rvg | - | ENABLED | - | - | ACTIVE |
| rl | rlk_seattle_hr_rvg | hr_rvg | ENABLED | - | - | ACTIVE |
| v | hr_dv01 | hr_rvg | ENABLED | 12700 | - | ACTIVE |
| pl | hr_dv01-01 | hr_dv01 | ENABLED | 13005 | - | ACTIVE |
| sd | disk01-17 | hr_dv01-01 | ENABLED | 13005 | 0 | - |
| v | hr_dv2 | hr_rvg | ENABLED | 12880 | - | ACTIVE |
| pl | hr_dv02-01 | vol2 | ENABLED | 13005 | - | ACTIVE |
| sd | disk01-18 | hr_dv02-01 | ENABLED | 13005 | 0 | - |
| v | hr_srl | hr_rvg | ENABLED | 12880 | - | ACTIVE |
| pl | hr_srl-01 | hr_srl | ENABLED | 13005 | - | ACTIVE |
| sd | disk01-19 | hr_srl-01 | ENABLED | 13005 | 0 | - |
Note that on the Secondary, the volume hr_dv01 is too small (12700 instead of 12800), hr_dv2 is misnamed (it must be hr_dv02), and hr_dv03 is missing. An attempt to attach the Primary RLINK to this Secondary using the attach command fails:
# vxrlink -g hrdg -f att rlk_london_hr_rvg
The following messages display:
VxVM VVR vxrlink INFO V-5-1-3614 Secondary data volumes detected with rvg hr_rvg as parent:
VxVM VVR vxrlink ERROR V-5-1-0 Size of secondary datavol hr_dv01 (len=12700) does not match size of primary (len=12800)
VxVM VVR vxrlink ERROR V-5-1-3504 primary datavol hr_dv02 is not mapped on secondary, yet
VxVM VVR vxrlink ERROR V-5-1-3504 primary datavol hr_dv03 is not mapped on secondary, yet
To fix the problem, issue the following commands on the Secondary:
- Resize the data volume hr_dv01:
# vradmin -g hrdg resizevol hr_rvg hr_dv01 12800
- Rename the data volume hr_dv2 to hr_dv02:
# vxedit -g hrdg rename hr_dv2 hr_dv02
- Create a new volume, hr_dv03, of the same size as the Primary data volume hr_dv03, and associate it with the RVG:
# vxassist -g hrdg make hr_dv03 12800
# vxvol -g hrdg assoc hr_rvg hr_dv03
Alternatively, the problem can be fixed by altering the Primary to match the Secondary, or by any combination of changes on the two hosts. When the Primary and the Secondary match, retry the attach.
On the Primary:
# vxrlink -g hrdg -f att rlk_london_hr_rvg
VxVM VVR vxrlink INFO V-5-1-3614 Secondary data volumes detected with rvg hr_rvg as parent:
VxVM VVR vxrlink INFO V-5-1-0 hr_dv01: len=12800 primary_datavol=hr_dv01
VxVM VVR vxrlink INFO V-5-1-0 hr_dv02: len=12800 primary_datavol=hr_dv02
VxVM VVR vxrlink INFO V-5-1-0 hr_dv03: len=12800 primary_datavol=hr_dv03
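The check performed at attach time can be pictured with a small model. The following Python sketch is illustrative only and is not VVR code: the `SecondaryVolume` class and the `check_attach` function are hypothetical names, the sizes are taken from the successful attach output above, and the sketch assumes that a Secondary volume maps to the Primary volume of the same name unless its primary_datavol field says otherwise.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SecondaryVolume:
    name: str
    length: int
    # Name of the Primary volume this volume is mapped to. Assumption:
    # when unset, the mapping falls back to the volume's own name.
    primary_datavol: Optional[str] = None

    def mapped_to(self) -> str:
        return self.primary_datavol or self.name


def check_attach(primary: Dict[str, int],
                 secondary: List[SecondaryVolume]) -> List[str]:
    """Return one message per Primary data volume without a usable counterpart."""
    by_primary_name = {vol.mapped_to(): vol for vol in secondary}
    problems = []
    for name, length in primary.items():
        vol = by_primary_name.get(name)
        if vol is None:
            problems.append(f"primary datavol {name} is not mapped on secondary")
        elif vol.length != length:
            problems.append(
                f"size of secondary datavol {vol.name} (len={vol.length}) "
                f"does not match size of primary (len={length})"
            )
    return problems


# Hypothetical data mirroring the example: Secondary before and after the fix.
primary = {"hr_dv01": 12800, "hr_dv02": 12800, "hr_dv03": 12800}
before = [SecondaryVolume("hr_dv01", 12700), SecondaryVolume("hr_dv2", 12880)]
after = [
    SecondaryVolume("hr_dv01", 12800),
    SecondaryVolume("hr_dv02", 12800),
    SecondaryVolume("hr_dv03", 12800),
]

for message in check_attach(primary, before):
    print(message)
print(check_attach(primary, after))
```

With the `before` list the sketch reports one size mismatch and two unmapped volumes, matching the three errors in the failed attach; with the `after` list it reports none.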
Errors During Modification of an RVG
After the initial setup and attach of a Secondary RLINK, incorrect modifications such as adding, resizing, and renaming volumes can cause the affected RLINK to be PAUSED with the secondary_config_err flag set. This prevents replication to the Secondary until the problem is corrected.
To check whether this has occurred, run the vxrlink verify rlink command on either the Primary or the Secondary. After the configuration error has been corrected, the affected RLINK can be resumed.
Missing Data Volume Error
If a data volume is added to the Primary RVG and the Secondary has no corresponding data volume, the RLINK state changes to PAUSED with the secondary_config_err flag set. Executing the vxrlink verify command produces the following:
On the Primary:
# vxrlink -g hrdg verify rlk_london_hr_rvg
RLINK               REMOTE HOST   LOCAL_HOST   STATUS   STATE
rlk_london_hr_rvg   london        seattle      ERROR    PAUSE
ERROR: hr_dv04 does not exist on Secondary (london)
On the Secondary:
# vxrlink -g hrdg verify rlk_seattle_hr_rvg
RLINK                REMOTE HOST   LOCAL_HOST   STATUS   STATE
rlk_seattle_hr_rvg   seattle       london       ERROR    PAUSE
ERROR: hr_dv04 does not exist on Secondary (local host)
To correct the problem, either create and associate hr_dv04 on the Secondary or dissociate hr_dv04 from the Primary, and then resume the Secondary RLINK. To resume the Secondary RLINK, use the vradmin resumerep rvg_name command.
If hr_dv04 on the Primary contains valid data, copy its contents to hr_dv04 on the Secondary before associating the volume to the Secondary RVG.
Data Volume Mismatch Error
If a Primary data volume is increased in size, but the Secondary data volume is not, a configuration error results.
On the Primary:
# vxassist -g hrdg growby hr_dv04 100
# vxrlink -g hrdg verify rlk_london_hr_rvg
RLINK               REMOTE HOST   LOCAL_HOST   STATUS   STATE
rlk_london_hr_rvg   london        seattle      ERROR    PAUSE
ERROR: hr_dv04 too small (12800 blocks). primary is 12900
On the Secondary:
# vxrlink -g hrdg verify rlk_seattle_hr_rvg
RLINK                REMOTE HOST   LOCAL_HOST   STATUS   STATE
rlk_seattle_hr_rvg   seattle       london       ERROR    PAUSE
ERROR: hr_dv04 too small (12800 blocks). primary is 12900
To correct the problem, either increase the size of the Secondary data volume or shrink the Primary data volume so that the two sizes match:
# vradmin -g hrdg resizevol hr_rvg hr_dv04 12900
After resizing a data volume, resume the Secondary RLINK by issuing the following command on any host in the RDS:
# vradmin -g hrdg resumerep hr_rvg
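The target size in the resize command comes directly from the verify message. As a rough illustration, the following Python sketch extracts the two sizes from a line shaped like the sample error above and computes the gap. The regular expression is modelled on that one sample line and is an assumption; the actual message wording may differ between releases. The `size_gap` function is a hypothetical helper, not part of VVR.

```python
import re

# Pattern modelled on the sample "too small" line in this section (assumption).
TOO_SMALL = re.compile(
    r"ERROR: (?P<volume>\S+) too small \((?P<secondary>\d+) blocks\)\. "
    r"primary is (?P<primary>\d+)"
)


def size_gap(line: str):
    """Return (volume, target_size, blocks_to_add), or None if no match."""
    match = TOO_SMALL.search(line)
    if match is None:
        return None
    primary = int(match["primary"])
    secondary = int(match["secondary"])
    return match["volume"], primary, primary - secondary


sample = "ERROR: hr_dv04 too small (12800 blocks). primary is 12900"
print(size_gap(sample))
```

For the sample line, the target size is 12900 and the gap is 100 blocks, which agrees with the growby amount used on the Primary.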
Data Volume Name Mismatch Error
If a volume is renamed on the Primary but not on the Secondary, a configuration error results and the RLINK will be disconnected. Use the vxprint -lP command to view the RLINK flags. If the secondary_config_err flag is set, run one of the following commands to determine whether the cause is a data volume name mismatch.
On the Primary:
# vxrlink -g hrdg verify rlk_london_hr_rvg
RLINK               REMOTE HOST   LOCAL_HOST   STATUS   STATE
rlk_london_hr_rvg   london        seattle      ERROR    PAUSE
ERROR: hr_dv04 on secondary has wrong primary_datavol name (hr_dv04, should be hr_dv05)
On the Secondary:
# vxrlink -g hrdg verify rlk_seattle_hr_rvg
RLINK                REMOTE HOST   LOCAL_HOST   STATUS   STATE
rlk_seattle_hr_rvg   seattle       london       ERROR    PAUSE
ERROR: hr_dv04 on secondary has wrong primary_datavol name (hr_dv04, should be hr_dv05)
To fix this error, do one of the following:
- Rename either the Primary or the Secondary data volume so that the names match, and then resume the RLINK using the vradmin resumerep rvg_name command.
OR
- Set the primary_datavol field of the Secondary data volume to the new name of the Primary data volume, as shown below, and then resume the RLINK using the vradmin resumerep rvg_name command.
On the Secondary:
# vxedit -g hrdg set primary_datavol=hr_dv05 hr_dv04
where hr_dv05 is the new name of the data volume on the Primary.
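When checking several RLINKs, it can help to sort verify output into the three error types described in this section. The following Python sketch is one possible way to do that. The patterns are modelled only on the sample messages shown here and are an assumption about the wording; `PATTERNS` and `classify` are hypothetical names, not part of VVR.

```python
import re

# One pattern per error type, modelled on this section's sample messages.
PATTERNS = {
    "missing_volume": re.compile(r"ERROR: (\S+) does not exist on Secondary"),
    "size_mismatch": re.compile(r"ERROR: (\S+) too small"),
    "name_mismatch": re.compile(
        r"ERROR: (\S+) on secondary has wrong primary_datavol name"
    ),
}


def classify(output: str):
    """Return a list of (kind, volume) pairs for each recognised ERROR line."""
    findings = []
    for line in output.splitlines():
        for kind, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                findings.append((kind, match.group(1)))
                break
    return findings


sample = """\
RLINK REMOTE HOST LOCAL_HOST STATUS STATE
rlk_london_hr_rvg london seattle ERROR PAUSE
ERROR: hr_dv04 on secondary has wrong primary_datavol name (hr_dv04, should be hr_dv05)
"""
print(classify(sample))
```

The status row contains the word ERROR but no "ERROR:" prefix, so only the final line is reported. Each kind corresponds to one of the fixes above: create or dissociate the volume, resize it, or rename it or set primary_datavol.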