Optimal configuration and synchronization interval
Posted: Thu Apr 13, 2017 4:00 pm
What time intervals should I use for my sensors and synchronizations? Is there a best practice?
I assume that I should make the sync intervals as short as possible to keep the peers identical (e.g. 5 seconds).
Re: Optimal configuration and synchronization interval
Posted: Thu Apr 13, 2017 4:15 pm
By design HAAst does not use disk mirroring (e.g. DRBD) or disk sharing (e.g. NFS, SMB, iSCSI). The reason is that file corruption caused by a failing peer will immediately corrupt files on the other peer (if using mirroring or shared disk), and bring down the entire cluster.
Instead, HAAst brings the peers into sync at regular intervals. These intervals should leave enough time for HAAst to detect if a peer is failing, and then prevent synchronization if a peer is unhealthy. Do not use short sync intervals to simulate a mirrored disk (that defeats the benefit of this design).
So your sensor intervals and sync intervals should work hand in hand. Keep in mind that HAAst's internal sensors run at 0.5-second intervals, but external sensors (which you define) can run anywhere from seconds to hours apart. If you are only using internal sensors, then synchronizing no more often than every 15 seconds is usually sufficient (and 15 seconds is unusually short). If you are using external sensors, then set your synchronization intervals to no less than half the sensor interval. As well, avoid sync intervals that are multiples of one another: for example, 1, 2, and 4 second intervals are sub-optimal (as syncs will regularly start at the same moment), while 2, 5, and 7 second intervals are better (less chance of sync overlap).
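To see why intervals that are not multiples of one another overlap less, here is a rough Python sketch. It is not part of HAAst and uses no HAAst API; the function names are made up for illustration. It assumes whole-second intervals, that all syncs start counting from the same moment, and that "overlap" simply means two or more syncs starting in the same second (it ignores how long each sync takes).

```python
from itertools import combinations
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def pairwise_overlap_periods(intervals):
    """For each pair of sync intervals (in seconds), return how many
    seconds pass between moments when both syncs start together."""
    return {(a, b): lcm(a, b) for a, b in combinations(intervals, 2)}


def overlap_count(intervals, window):
    """Count the seconds in 1..window at which two or more syncs
    start at the same time (hypothetical, simplified model)."""
    return sum(
        1
        for t in range(1, window + 1)
        if sum(t % i == 0 for i in intervals) >= 2
    )


if __name__ == "__main__":
    for candidate in ([1, 2, 4], [2, 5, 7]):
        print(candidate,
              pairwise_overlap_periods(candidate),
              overlap_count(candidate, 70))
```

Under these assumptions, over a 70 second window the 1, 2, 4 set has syncs colliding in 35 of the 70 seconds, while the 2, 5, 7 set collides in only 12. In practice you would be working in minutes rather than seconds, but the same reasoning applies.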
As well, set your synchronization intervals to match the value of the file/data being synchronized. For example, it doesn't make sense to synchronize a MySQL database every 10 seconds if it only holds configuration data that might be changed once per day.
Your interval settings need to balance the benefit of keeping both peers in sync quickly against the risk of cluster failure in the event of file corruption. There are of course exceptions to every rule, but the above should serve as a guideline.
The bottom line is NO - don't make the interval as short as possible. When Telium is engaged to set up a cluster we usually set intervals in minutes (not seconds), except in unusual circumstances.