The purpose of this blog post is to show how to create and configure an SAP HANA HA/DR Hook. It should help you ensure that your SAP HANA system replication is in sync before performing a takeover process using SAP Landscape Management. This lets you avoid data loss during a takeover operation.
- SAP Landscape Management 3.0 SP07 PL2
- SAP Host Agent Release 7.21 Patch Number 33
- SAP Adaptive Extension Package 42
Creating SAP HANA Replication Status Repositories
Hint: To establish HA/DR capabilities using SAP HANA HA/DR provider hooks, both SAP HANA sites need to be aware of each other’s status. Because the purpose of HA/DR is to provide continuity when one SAP HANA site is “out of order”, the sites cannot rely on each other for this, and a third party is required. In this process an NFS share plays that role. When finished, the SAP HANA HA/DR hooks on both the primary and the secondary site will be capable of posting their status to this commonly accessible NFS share.
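To make the idea of "posting status to a commonly accessible share" more concrete, here is a minimal sketch of how a site could write its replication status to a shared directory. It is not the hook delivered with SAP Note 2638848: the function names, the file name `<site>.status.json`, and the JSON fields are all hypothetical and chosen only for illustration.

```python
import json
import os
import tempfile
import time


def post_replication_status(share_dir, site, status, now=None):
    """Write one site's replication status to the shared directory.

    The record is written to a temporary file first and then renamed,
    so a reader never sees a half-written status file.
    File name and JSON layout are assumptions, not the real hook format.
    """
    record = {
        "site": site,
        "status": status,
        "timestamp": time.time() if now is None else now,
    }
    target = os.path.join(share_dir, f"{site}.status.json")
    fd, tmp_path = tempfile.mkstemp(dir=share_dir, prefix=".tmp_")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(record, handle)
        # Rename within the same directory replaces the old file in one step.
        os.replace(tmp_path, target)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return target


def read_replication_status(share_dir, site):
    """Read back the record written by post_replication_status."""
    with open(os.path.join(share_dir, f"{site}.status.json")) as handle:
        return json.load(handle)
```

Storing a timestamp next to the status is what later allows a reader to decide whether the information is still fresh enough to trust.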
- You have identified a host that you want to use as an SAP HANA Replication Status Repository.
You have to create an NFS share for the scripts of the SAP HANA HA/DR Hook, which will be installed with the SAP HANA Replication Status Repository configuration. In case you do not have a central NFS server or you only use SAN volumes, you should consider SAP Note 2628497 – Mount SAN volumes via NFS in SAP Landscape Management.
The NFS share is mountable by all SAP HANA nodes configured for the hook.
The NFS share is writable by the SAP HANA sidadm.
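The two share requirements above can be checked on a Linux host with a small script. The following is a sketch under assumptions: it expects mount information in the `/proc/mounts` format, treats the file system types `nfs` and `nfs4` as NFS, and does not handle mount points containing escaped spaces. The function names are hypothetical. The writability check only tells you something useful when the script runs as the SAP HANA sidadm user.

```python
import os


def find_mount(path, mounts_text):
    """Return (mount_point, fs_type) of the longest mount point containing path.

    mounts_text is expected in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    path = os.path.abspath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) > len(best[0]):
                best = (mount_point, fs_type)
    return best


def is_nfs_and_writable(path, mounts_text):
    """True if path lives on an NFS mount and the current user may write to it."""
    mount = find_mount(path, mounts_text)
    on_nfs = mount is not None and mount[1] in ("nfs", "nfs4")
    return on_nfs and os.access(path, os.W_OK)
```

On a real host you would pass the content of `/proc/mounts`, for example `is_nfs_and_writable("/mnt/hana_status", open("/proc/mounts").read())`, where `/mnt/hana_status` stands in for your own mount point.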
Prepare the central host as described in SAP Note 2638848 – Installation of SAP Landscape Management HA/DR Provider Hook. Recommendation: Ensure that the central host is accessible by using different networks than the SAP HANA nodes.
Mount the NFS share to the central host. Recommendation: Mount the NFS share by using a different network than the SAP HANA system replication. This is for security, performance and availability reasons.
Go To: Infrastructure > Repositories > Add Repository
LaMa manages the SAP HANA Replication Status Repository as a repository within the LaMa configuration, so that it can mount and unmount it and install the SAP HANA HA/DR Hook scripts within it.
You have to fill the following parameters:
- Name: Enter the Name of your SAP HANA Replication Status Repository.
- Host Name: Enter the Host Name, which you prepared for the SAP HANA Replication Status Repository.
- Repository Type: Select SAP HANA Replication Status
Click on Next.
Click on Retrieve Mount List.
Select the Mount Point you want to use for the Repository. It has to be an NFS share.
Click on Next.
Check the data shown on the summary screen. If the field Storage Host Name is not filled on the summary screen, you did not choose a correct NFS share: an error message is shown and you won’t be able to save the repository.
Click on Save
Assigning Replication Status Repository
Before performing a takeover process using SAP Landscape Management, you can ensure that SAP HANA system replication is in sync.
- A replication status repository has been configured.
- You have permission Service, Resource(CriticalCustomOperation).
- SAP Adaptive Extensions have been installed with the minimum patch level 41.
- The primary master node status is Not Running or Running.
Go To: Operations and Maintenance > Operations, then choose Systems from the tabs.
Select the system to which you want to assign the repository.
From the Operations dropdown for the primary master node, choose SAP HANA Processes > Assign Replication Status Repository.
All steps included in the process are displayed in a graphical form.
Note: To view more details or to change parameters and settings such as the log and operation mode for an individual step, select a process step.
You have to enter the following parameters:
- Specify the Maximum Accepted Age of Replication Status in seconds.
- Choose the replication status repository.
- Optional: Change default installation directory of the HA/DR provider scripts. The default directory should be available on all nodes of a scale-out cluster. Ensure that the scripts are installed and available on all nodes in this directory.
- Optional: To stop processing on the primary instance in case the replication status share is not available and replication is no longer in sync, set Strict Handling to True. This avoids data loss, because you can then no longer write to the primary instance.
- Optional: Enter the timeout while writing to the replication status share. This is the maximum amount of time the primary instance will wait for the hook in case of a stale NFS share.
- Optional: Enter the interval for the file system check in seconds. Ensure that the interval for the file system check is smaller than the accepted age of the replication status.
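The relationship between these parameters can be sanity-checked before you enter them. The sketch below is only an illustration: the class and field names are hypothetical and do not correspond to LaMa parameter identifiers. The one rule taken from the text above is that the file system check interval must be smaller than the maximum accepted age. The requirement that all values are positive is an added assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HookSettings:
    # Field names are illustrative, not LaMa parameter names.
    max_accepted_age_s: int
    fs_check_interval_s: int
    write_timeout_s: int
    strict_handling: bool = False


def validate_settings(settings):
    """Return a list of problems; an empty list means the values look plausible."""
    problems = []
    # Assumption: all durations must be positive numbers of seconds.
    if settings.max_accepted_age_s <= 0:
        problems.append("maximum accepted age must be positive")
    if settings.fs_check_interval_s <= 0:
        problems.append("file system check interval must be positive")
    if settings.write_timeout_s <= 0:
        problems.append("write timeout must be positive")
    # Rule from the text: the check must run more often than the status may age.
    if settings.fs_check_interval_s >= settings.max_accepted_age_s:
        problems.append(
            "file system check interval must be smaller than the maximum accepted age"
        )
    return problems
```

For example, an interval of 30 seconds with a maximum accepted age of 30 seconds is rejected, because the status could already be too old by the time the next check runs.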
If the Pre-execution Log shows no error you can choose Execute.
Confirm the popup “Are you sure that you want to execute this custom process?” to execute the operation.
In the Monitoring you can see the progress of your activity. Once it is completed you can check the target status of your system. The result is that the SAP HANA Replication Status is now configured.
The Primary Database Instance is now in status Not Running. You have to start the Primary Database Instance.
After the primary instance is started it should run without any issues.
Before triggering a takeover operation, SAP Landscape Management now checks on a defined central share (repository) if the replication between the systems is in sync.
How the Replication Status Repository works
Whether you are impacted by data loss is discovered at run time, when trying to execute the Near Zero Downtime procedure. More on the actual execution (also on error) will be shown later on.
The status of the SAP HANA System Replication can be checked on the primary and on the secondary instance.
The primary instance shows the validation “ReplicationStatus”, which provides the current status of the System Replication.
The secondary instance shows two validations “ReplicationStatus” and “HDBRepStatusPrim”. The validation “HDBRepStatusPrim” is responsible for the SAP HANA Replication Status Repository information.
In case the replication status is not in sync you can see it in the validation of both instances. The primary database instance shows a validation error for the validation “ReplicationStatus”, e.g. “The replication status for SiteA on host (syncmem) is error: Communication channel closed.”.
The secondary database instance shows two validation errors. The first is for the validation “ReplicationStatus”, e.g. “The replication status for SiteB on host (primary) is error: Communication channel closed.”.
The second is for the validation “HDBRepStatusPrim”, e.g. “Exception raised in validation. Cause:[…] Error occured when trying to LaMa_HADR the instance on host 'host:1128'. […] Takeover might result in dataloss Exit Code 1 Replication status for SiteB on host(syncmem) retrieved from primary site on 19-07 12:40:36 is error: Communication channel closed.”
A Near Zero Downtime Takeover Operation is now no longer possible without data loss. SAP Landscape Management prevents you from executing a Near Zero Downtime Takeover by showing an error in the Pre-Execution Log for the process step “Ensure Replication Sync Status”. If you do want to execute a Near Zero Downtime Takeover anyway, you have to set the parameter “Accept Potential Data Loss” to True.
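The decision described here can be summarized as a small function. This is a sketch of the logic as this post describes it, not SAP Landscape Management code: the function name, its arguments, and the assumption that an in-sync replication is reported as `ACTIVE` are illustrative. The sketch also assumes that a status older than the Maximum Accepted Age is treated like a replication that is not in sync.

```python
def check_takeover(status, status_timestamp, now, max_accepted_age_s,
                   accept_potential_data_loss=False):
    """Decide whether a Near Zero Downtime Takeover may proceed.

    Returns a tuple (allowed, reason). Timestamps are seconds since the epoch.
    """
    # The explicit override wins over every other check.
    if accept_potential_data_loss:
        return True, "potential data loss accepted"
    age = now - status_timestamp
    # Assumption: a stale status cannot prove that replication is in sync.
    if age > max_accepted_age_s:
        return False, f"replication status is {age:.0f}s old, limit is {max_accepted_age_s}s"
    # Assumption: "ACTIVE" is the value that means in sync.
    if status != "ACTIVE":
        return False, f"replication status is {status}"
    return True, "replication in sync"
```

With a fresh `ACTIVE` status the function allows the takeover. With a status of `ERROR`, or a status that is older than the accepted age, it refuses unless `accept_potential_data_loss` is set to `True`.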
In case of a disaster you are still able to trigger a normal Takeover Operation without any restrictions.