Redundancy Hot Switching

This test measures the failover switching time for 1,000 to 100,000 points in a redundant setup and compares system behavior with and without Hot Switching enabled. The primary metric is the time from the failover event until every point has received its first update.

Scenario

Product / Version
- GENESIS 11.04 (11.4.625)
Environment
- Redundant configuration running on two virtual machines (VMs) with a common SQL server on one of them.
- Operating System: Windows 11
- VM Configuration (each node):
  - CPU: 6 cores – Intel Xeon Silver 4314 @ 2.4 GHz
  - Memory: 20 GB RAM
- Cluster configuration: default (cluster timeout: 5 seconds)
Failover Trigger
- Failover was initiated by manual disconnection of the network adapter on the active node.

Methodology

A custom test utility was developed specifically for this test.
The utility connects to a defined number of points coming from the internal simulator located at:

\cluster\..
Each point represents a random integer value. Example:

\clusterA\svrsim:random int 10 999999
Test procedure:
1. Manual disconnection of the network adapter on the active node
2. Manual start of the time measurement
3. Measurement of the time until the first update of all points is received
In addition to the utility, two points were also observed in TrendWorX Viewer (GraphWorX) to verify data updates.
For each configuration:
- Two measurements were performed
- Average switching time was calculated
CPU and RAM usage were monitored during the tests.
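
The core of the measurement (steps 2 and 3 of the procedure) can be sketched as follows. This is only an illustration of the idea, not the actual test utility: the class name `FirstUpdateTimer` and its methods are hypothetical, and the sketch deliberately contains no GENESIS client calls. It assumes that whatever client library is used can invoke a callback with the point name each time a point updates.

```python
import time
from typing import Callable, Iterable, Optional


class FirstUpdateTimer:
    """Hypothetical helper: time from start() until every point has updated once."""

    def __init__(self, point_names: Iterable[str],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._pending = set(point_names)      # points still waiting for a first update
        self._clock = clock                   # injectable so the logic can be tested
        self._started_at: Optional[float] = None
        self.elapsed: Optional[float] = None  # set once all points have updated

    def start(self) -> None:
        """Call when the time measurement is started (step 2)."""
        self._started_at = self._clock()

    def on_update(self, point_name: str) -> None:
        """Call from the data-change callback of the client library (assumed)."""
        if self._started_at is None or self.elapsed is not None:
            return  # not measuring yet, or the measurement is already complete
        self._pending.discard(point_name)
        if not self._pending:
            self.elapsed = self._clock() - self._started_at


# Small demonstration with a fake clock instead of real point subscriptions.
fake_ticks = iter([0.0, 1.5])
demo = FirstUpdateTimer(["p1", "p2"], clock=lambda: next(fake_ticks))
demo.start()
demo.on_update("p1")   # "p2" is still pending, so nothing is recorded yet
demo.on_update("p2")   # last pending point: elapsed time is recorded
print(demo.elapsed)
```

Using a monotonic clock avoids errors from wall-clock adjustments during the measurement. Because the start is triggered manually in the procedure above, the measured value also includes the operator's reaction time.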

Results

# points	Hot Switching	time 1 (s)	time 2 (s)	avg time (s)	CPU avg %	CPU max %	RAM avg (MB)	RAM max (MB)
1000	✖	4.31	5.6	4.96	2.95	11.35	916	1248
1000	✔	4.05	3.5	3.78	2.62	66.17	1122	6599
5000	✖	5.43	5.32	5.38	2.95	11.35	916	1248
5000	✔	4.49	4.75	4.62	2.62	66.17	1122	6599
10000	✖	6.33	6.16	6.25	2.95	11.35	916	1248
10000	✔	5.07	4.61	4.84	2.62	66.17	1122	6599
30000	✖	10.17	10.38	10.28	27.60	63.15	3224	5752
30000	✔	4.36	4.33	4.35	2.62	66.17	1122	6599
50000	✖	13.75	11.6	12.68	27.60	63.15	3224	5752
50000	✔	4.5	6.17	5.34	2.62	66.17	1122	6599
100000	✖	26.62	27.81	27.22	27.60	63.15	3224	5752
100000	✔	6.21	6.25	6.23	2.62	66.17	1122	6599

Summary

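With Hot Switching enabled, failover timing is fast and stable across all tested point counts: the average stays between 3.78 s and 6.23 s, close to the 5-second cluster timeout.
Without Hot Switching, the failover switching time increases noticeably with the number of points, from 4.96 s on average at 1,000 points to 27.22 s at 100,000 points.
- The largest difference is observed at higher loads (30,000 to 100,000 points), where Hot Switching significantly reduces recovery time; at 100,000 points the average drops from 27.22 s to 6.23 s.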
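
To make the comparison concrete, the averages and the gain from Hot Switching can be recomputed from the two measurements per configuration. The sketch below uses only the values from the Results table; the names `measurements` and `summarize` are illustrative and not part of any product API.

```python
# (time 1, time 2) in seconds, copied from the Results table.
# "off" = Hot Switching disabled, "on" = Hot Switching enabled.
measurements = {
    1_000:   {"off": (4.31, 5.60),   "on": (4.05, 3.50)},
    5_000:   {"off": (5.43, 5.32),   "on": (4.49, 4.75)},
    10_000:  {"off": (6.33, 6.16),   "on": (5.07, 4.61)},
    30_000:  {"off": (10.17, 10.38), "on": (4.36, 4.33)},
    50_000:  {"off": (13.75, 11.60), "on": (4.50, 6.17)},
    100_000: {"off": (26.62, 27.81), "on": (6.21, 6.25)},
}


def mean(values):
    """Arithmetic mean of the measurements for one configuration."""
    return sum(values) / len(values)


def summarize(data):
    """Return (points, avg_off, avg_on, seconds_saved, ratio) per point count."""
    rows = []
    for points, runs in sorted(data.items()):
        avg_off = mean(runs["off"])
        avg_on = mean(runs["on"])
        rows.append((points, avg_off, avg_on, avg_off - avg_on, avg_off / avg_on))
    return rows


print(f"{'points':>7} {'off (s)':>8} {'on (s)':>8} {'saved (s)':>10} {'ratio':>6}")
for points, avg_off, avg_on, saved, ratio in summarize(measurements):
    print(f"{points:>7} {avg_off:8.2f} {avg_on:8.2f} {saved:10.2f} {ratio:6.2f}")
```

With only two runs per configuration, the resulting ratios should be read as indicative rather than statistically robust.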
CPU and memory values are included for reference only. All measurements with Hot Switching were captured in a single performance log file, while measurements without Hot Switching used two separate log files, so the values repeat across rows and the averages in some test cases reflect the whole log rather than an individual test. Peak loads were very similar.

Conclusion

Hot Switching significantly improves failover switching time, especially for higher point counts.

Without Hot Switching, the switching time grows roughly linearly with the number of points, from about 5 s at 1,000 points to about 27 s at 100,000 points. With Hot Switching enabled, the recovery time stays within a narrow range (close to the 5-second cluster timeout) even at 100,000 points.
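
One way to check how the switching time scales is a simple least-squares line through the average times from the Results table. This is a rough sketch under the assumption that a straight line is an adequate model; six data points with two runs each are too few for a firm conclusion. The function name `fit_line` is illustrative.

```python
# Point counts in thousands and average switching times (s) from the Results table.
points_k = [1, 5, 10, 30, 50, 100]
avg_off = [4.96, 5.38, 6.25, 10.28, 12.68, 27.22]   # Hot Switching disabled
avg_on = [3.78, 4.62, 4.84, 4.35, 5.34, 6.23]       # Hot Switching enabled


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope_off, intercept_off = fit_line(points_k, avg_off)
slope_on, intercept_on = fit_line(points_k, avg_on)

print(f"without Hot Switching: {intercept_off:.2f} s + {slope_off:.3f} s per 1,000 points")
print(f"with Hot Switching:    {intercept_on:.2f} s + {slope_on:.3f} s per 1,000 points")
```

The slope gives the additional switching time per 1,000 points, and the intercept approximates the fixed cost of a failover regardless of point count.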