Library with common methods and default values for High Availability Extension (HA or HAE) tests.
$default_timeout: default scaled timeout for most operations with SUT
$join_timeout: default scaled timeout for ha-cluster-join calls
$softdog_timeout: default scaled timeout for the softdog watchdog
$crm_mon_cmd: crm_mon (crm monitoring) command
exec_csync();
Runs csync2 -vxF in the SUT to sync files from the SUT to the other nodes in the cluster. The first call to csync2 -vxF is sometimes expected to fail, so this method runs the command twice.
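The "run it twice" behaviour can be illustrated with a minimal sketch in plain Perl. It is not the library's code: run_twice and the simulated command are hypothetical names, and the sketch assumes that only the result of the second run matters.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command (passed as a code reference) twice.
# The first result is ignored on purpose, the second one decides success.
sub run_twice {
    my ($cmd) = @_;
    $cmd->();
    return $cmd->() ? 1 : 0;
}

# Simulated command that fails on its first call and succeeds afterwards.
my $calls      = 0;
my $fake_csync = sub { return ++$calls > 1 };

my $ok = run_twice($fake_csync);
print "success=$ok after $calls calls\n";    # success=1 after 2 calls
```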
add_file_in_csync( value => '/path/to/file', [ conf_file => '/path/to/csync2.cfg' ] );
Adds /path/to/file to a csync2 configuration file in SUT. Path to add must be passed with the named argument value, while csync2 configuration file can be passed on the named argument conf_file (defaults to /etc/csync2/csync2.cfg). Returns true on success or croaks if command execution fails in SUT.
get_cluster_name();
Returns the cluster name, as defined in the CLUSTER_NAME setting. Croaks if the setting is not defined, as it is a mandatory setting for HA tests.
get_hostname();
Returns the hostname, as defined in the HOSTNAME setting. Croaks if the setting is not defined, as it is a mandatory setting for HA tests.
get_node_to_join();
Returns the hostname of the node to join, as defined in the HA_CLUSTER_JOIN setting. Croaks if the setting is not defined, as this setting is mandatory for all nodes that run ha-cluster-join. As such, avoid scheduling tests that call this method on nodes that would run ha-cluster-init instead.
get_ip( $node_hostname );
Returns the IP address of a node given its hostname, either by calling the host command in SUT (which in turn does a DNS query on tests using a support server), or by searching for the host entry in SUT's /etc/hosts. Returns 0 on failure.
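The /etc/hosts fallback can be sketched as follows. This is only an illustration under stated assumptions: ip_from_hosts is a hypothetical helper that works on hosts-file text passed as a string, whereas the library queries the SUT, and the addresses are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: find the address for $hostname in hosts-file text.
# Returns 0 when no entry matches, mirroring the documented failure value.
sub ip_from_hosts {
    my ($hosts_text, $hostname) = @_;
    for my $line (split /\n/, $hosts_text) {
        $line =~ s/#.*//;                       # drop comments
        my ($ip, @names) = split ' ', $line;    # address, then names and aliases
        next unless defined $ip;                # skip blank lines
        return $ip if grep { $_ eq $hostname } @names;
    }
    return 0;
}

# Made-up example content
my $hosts = <<'EOF';
127.0.0.1   localhost
# cluster nodes
10.0.2.15   alpha-node01
10.0.2.16   alpha-node02 node02
EOF

print ip_from_hosts($hosts, 'alpha-node02'), "\n";    # 10.0.2.16
print ip_from_hosts($hosts, 'alpha-node09'), "\n";    # 0
```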
get_my_ip();
Returns the IP address of SUT or 0 if the address cannot be determined. Special case of get_ip().
get_node_number();
Returns the number of nodes configured in the cluster.
get_node_index();
Returns the index number of the SUT. This information is taken from the node hostnames, so be sure to define proper hostnames in the test settings, for example alpha-node01, alpha-node02, etc.
is_node( $node_number );
Checks whether SUT is the node identified by $node_number. Returns true or false. This information is matched against the node hostname, so be sure to define proper hostnames in the test settings, for example alpha-node01, alpha-node02, etc.
add_to_known_hosts( $host );
Adds $host to the .ssh/known_hosts file of the current user in SUT. Croaks if any of the commands to do so fail.
choose_node( $node_number );
Returns the hostname of the node identified by $node_number. This information relies on the node hostnames, so be sure to define proper hostnames in the test settings, for example alpha-node01, alpha-node02, etc.
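The hostname convention that get_node_index(), is_node() and choose_node() rely on can be sketched in plain Perl. The helpers below are hypothetical and take the hostname as an argument instead of reading the HOSTNAME setting. They assume hostnames end in a two-digit node number, as in the alpha-node01 example.

```perl
use strict;
use warnings;

# Hypothetical helper: numeric index taken from the trailing digits.
sub node_index_of {
    my ($hostname) = @_;
    return $hostname =~ /(\d+)$/ ? $1 + 0 : undef;
}

# Hypothetical helper: true if $hostname belongs to node $node_number.
sub hostname_is_node {
    my ($hostname, $node_number) = @_;
    my $index = node_index_of($hostname);
    return (defined($index) && $index == $node_number) ? 1 : 0;
}

# Hypothetical helper: hostname of another node, built from the local one.
sub hostname_of_node {
    my ($hostname, $node_number) = @_;
    (my $prefix = $hostname) =~ s/\d+$//;
    return sprintf('%s%02d', $prefix, $node_number);
}

print node_index_of('alpha-node02'), "\n";          # 2
print hostname_is_node('alpha-node02', 1), "\n";    # 0
print hostname_of_node('alpha-node02', 1), "\n";    # alpha-node01
```

A hostname without trailing digits yields no index in this sketch, which is why the settings need to follow the naming scheme.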
save_state();
Prints the cluster configuration and cluster status in SUT, and saves the screenshot.
is_package_installed( $package );
Checks if $package is installed in SUT. Returns true or false.
check_rsc( $resource );
Checks if cluster resource $resource is configured in the cluster. Returns true or false.
ensure_process_running( $process );
Checks for up to $default_timeout seconds whether process $process is running in SUT. Returns 0 if process is running or croaks on timeout.
ensure_resource_running( $resource, $regexp );
Checks for up to $default_timeout seconds in the output of crm resource status $resource if a resource $resource is configured in the cluster; uses $regexp to check. Returns 0 on success or croaks on timeout.
ensure_dlm_running();
Checks that the dlm resource is running in the cluster, and that its associated process (dlm_controld) is running in SUT. Returns 0 if process is running or croaks on error.
write_tag( $tag );
Create a cluster-specific file in /tmp/ of the SUT with $tag as its content. Returns 0 on success or croaks on failure.
read_tag();
Read the content of the cluster-specific file created in /tmp/ with write_tag(). Returns the content of the file or croaks on failure.
block_device_real_path( $device );
Returns the real path of the block device specified by $device as shown by realpath -ePL, or croaks on failure.
lvm_add_filter( $type, $filter );
Add filter $filter of type $type to /etc/lvm/lvm.conf.
lvm_remove_filter( $filter );
Remove filter $filter from /etc/lvm/lvm.conf.
rsc_cleanup( $resource );
Execute a crm resource cleanup on the resource identified by $resource.
ha_export_logs();
Upload HA-relevant logs from SUT. These include: crm configuration, cluster bootstrap log, corosync configuration, hb_report, list of installed packages, list of iSCSI devices, /etc/mdadm.conf, support config and y2logs. If available, logs from the HAWK test, from CTS and from HANA are also included.
check_cluster_state( [ proceed_on_failure => 1 ] );
Check state of the cluster. This will call $crm_mon_cmd to check the current status of the cluster, check for inactive resources and for partition with quorum in the output of $crm_mon_cmd, check that the number of nodes reported by crm node list and by $crm_mon_cmd is the same, and run crm_verify -LV.
With the named argument proceed_on_failure set to 1, the method will use script_run() and attempt to run all commands in SUT without checking for errors. Without it, the method uses assert_script_run() and will croak on failure.
wait_until_resources_stopped( [ timeout => $timeout, minchecks => $tries ] );
Wait for resources to be stopped. Runs $crm_mon_cmd until there are no resources in stopping state or up to $timeout seconds. Timeout must be specified by the named argument timeout (defaults to 120 seconds). This timeout is scaled by the factor specified in the TIMEOUT_SCALE setting. The named argument minchecks (defaults to 3, can be disabled with 0) provides a minimum number of times to check independently of the return status; this helps avoid race conditions where the method checks before the HA stack starts to stop the resources. Croaks on timeout.
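The effect of minchecks can be seen in a small simulation. This is a sketch and not the library's implementation: wait_until_quiet is a hypothetical helper, the probe is a code reference standing in for the $crm_mon_cmd check, and the timeout is counted in polling rounds rather than seconds so that the example runs instantly.

```perl
use strict;
use warnings;

# Hypothetical helper. $is_stopping returns true while a resource is stopping.
# Returns the number of checks performed, or dies on timeout.
sub wait_until_quiet {
    my ($is_stopping, %args) = @_;
    my $timeout   = $args{timeout}   // 120;
    my $minchecks = $args{minchecks} // 3;

    my $checks = 0;
    while ($checks < $timeout) {
        my $busy = $is_stopping->();
        $checks++;
        # A quiet cluster only counts once the minimum number of checks is reached.
        return $checks if !$busy && $checks >= $minchecks;
        # A real implementation would sleep between checks.
    }
    die "Timed out waiting for resources to stop\n";
}

# Simulated states: quiet (stop not yet begun), stopping, stopping, quiet, quiet.
my @states = (0, 1, 1, 0, 0);
my $i      = 0;
my $probe  = sub { return $states[$i++] // 0 };

print wait_until_quiet($probe, timeout => 10), "\n";                    # 4
$i = 0;
print wait_until_quiet($probe, timeout => 10, minchecks => 0), "\n";    # 1
```

With minchecks disabled, the simulated call returns after the very first check, before any resource has started to stop, which is the race the default of 3 guards against.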
wait_until_resources_started( [ timeout => $timeout ] );
Wait for resources to be started. Runs crm cluster wait_for_startup in SUT as well as other verifications on newer versions of SLES (12-SP3+), for up to $timeout seconds for each command. Timeout must be specified by the named argument timeout (defaults to 120 seconds). This timeout is scaled by the factor specified in the TIMEOUT_SCALE setting. Croaks on timeout.
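As a rough illustration of what scaling by TIMEOUT_SCALE means, the sketch below multiplies a base timeout by a factor. The helper name scaled_timeout is hypothetical, and the factor is passed as an argument here instead of being read from the test settings.

```perl
use strict;
use warnings;

# Hypothetical helper: scale a base timeout (in seconds) by a factor.
# An undefined factor is treated as 1, i.e. no scaling.
sub scaled_timeout {
    my ($base, $scale) = @_;
    $scale //= 1;
    return $base * $scale;
}

print scaled_timeout(120, 3), "\n";    # 360
print scaled_timeout(120), "\n";       # 120
```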
get_lun( [ use_once => $bool ] );
Returns a LUN from the LUN list file stored in the support server or in the support NFS share in scenarios without support server. If the named argument use_once is passed and set to true (defaults to true), the returned LUN will be removed from the file, so it will not be selected again. Croaks on failure.
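The use_once semantics can be sketched on an in-memory list. This is an assumption-laden illustration: take_lun is a hypothetical helper, the LUN names are placeholders, and the library works on a shared file rather than on a Perl array.

```perl
use strict;
use warnings;

# Hypothetical helper: return a LUN from the list. With use_once (default),
# the LUN is also removed so that it cannot be selected again.
sub take_lun {
    my ($luns, %args) = @_;
    my $use_once = $args{use_once} // 1;
    die "No LUN available\n" unless @$luns;
    return $use_once ? shift @$luns : $luns->[0];
}

my @luns = ('lun-a', 'lun-b');    # placeholder names

print take_lun(\@luns, use_once => 0), "\n";    # lun-a (list unchanged)
print take_lun(\@luns), "\n";                   # lun-a (now removed)
print take_lun(\@luns), "\n";                   # lun-b
```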
check_device_available( $device, [ $timeout ] );
Checks for the presence of a device in the SUT for up to a defined timeout (defaults to 20 seconds). Returns 0 on success, or croaks on failure.
set_lvm_config( $lvm_config_file, [ use_lvmetad => $val1, locking_type => $val2, use_lvmlockd => $val3, ... ] );
Configures the LVM parameters/values pairs passed as a HASH into the LVM configuration file specified by the first argument $lvm_config_file. These LVM parameters are usually use_lvmetad, locking_type and use_lvmlockd but any other existing parameter from the LVM configuration file is also valid. Parameters that do not exist in the LVM configuration file in SUT will be ignored. Returns 0 on success or croaks on failure.
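The documented behaviour (update existing parameters, ignore unknown ones) can be sketched on configuration text held in a string. update_lvm_text is a hypothetical helper and the sample content is made up. The library edits the file in the SUT instead.

```perl
use strict;
use warnings;

# Hypothetical helper: update "name = value" lines in LVM-style config text.
# Parameters that do not appear in the text are left out, not appended.
sub update_lvm_text {
    my ($text, %params) = @_;
    for my $name (sort keys %params) {
        $text =~ s/^([ \t]*)\Q$name\E[ \t]*=.*$/$1$name = $params{$name}/mg;
    }
    return $text;
}

# Made-up example content
my $conf = <<'EOF';
global {
    use_lvmetad = 1
    locking_type = 1
}
EOF

print update_lvm_text($conf, use_lvmetad => 0, locking_type => 3, no_such_option => 1);
```

In this sketch, commented-out lines such as "# use_lvmetad = 1" are not touched, because only whitespace is allowed before the parameter name.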
add_lock_mgr( $lock_manager );
Configures a $lock_manager resource in the cluster configuration on SUT. $lock_manager is usually either clvmd or lvmlockd, but any other cluster primitive could work as well.
is_not_maintenance_update( $package );
Checks if the package specified in $package is not targeted by a maintenance update. Returns true if the package is not targeted, i.e., package name does not appear in the BUILD setting and the MAINTENANCE setting is active, or false in all other cases.
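The truth table described above can be written down as a short sketch. not_targeted_by_update is a hypothetical helper that receives the settings as a hash rather than reading them from the test environment, and the BUILD values are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: true only if MAINTENANCE is active and the package
# name does not appear in BUILD; false in every other case.
sub not_targeted_by_update {
    my ($package, %settings) = @_;
    return 0 unless $settings{MAINTENANCE};
    my $build = $settings{BUILD} // '';
    return index($build, $package) == -1 ? 1 : 0;
}

# Invented BUILD values for illustration only
print not_targeted_by_update('drbd', MAINTENANCE => 1, BUILD => ':1234:pacemaker'), "\n";    # 1
print not_targeted_by_update('drbd', MAINTENANCE => 1, BUILD => ':1234:drbd'), "\n";         # 0
print not_targeted_by_update('drbd', MAINTENANCE => 0, BUILD => ':1234:pacemaker'), "\n";    # 0
```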
activate_ntp();
Enables NTP service in SUT.
calculate_sbd_start_delay();
Calculates the start delay to apply after a node is fenced, which prevents cluster failure if the fenced node restarts too quickly. The delay is either the value specified in the sbd config variable "SBD_DELAY_START" or is calculated as "corosync_token + corosync_consensus + SBD_WATCHDOG_TIMEOUT * 2". The variables 'corosync_token' and 'corosync_consensus' are converted to seconds.
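The formula can be worked through with a small sketch. sbd_start_delay and its argument names are hypothetical, and the sketch assumes that the corosync values are given in milliseconds, that the watchdog timeout is given in seconds, and that an explicit delay is a plain number of seconds.

```perl
use strict;
use warnings;

# Hypothetical helper: start delay in seconds.
# Uses an explicit delay if one is given, otherwise applies the formula
# corosync_token + corosync_consensus + 2 * SBD_WATCHDOG_TIMEOUT.
sub sbd_start_delay {
    my (%p) = @_;
    return $p{sbd_delay_start} if defined $p{sbd_delay_start};

    my $token_s     = $p{corosync_token_ms} / 1000;        # ms to seconds
    my $consensus_s = $p{corosync_consensus_ms} / 1000;    # ms to seconds
    return $token_s + $consensus_s + 2 * $p{sbd_watchdog_timeout};
}

# Made-up example values: 5 s token, 6 s consensus, 15 s watchdog timeout
print sbd_start_delay(
    corosync_token_ms     => 5000,
    corosync_consensus_ms => 6000,
    sbd_watchdog_timeout  => 15,
), "\n";                                                   # 41

print sbd_start_delay(sbd_delay_start => 30), "\n";        # 30
```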