AMI connection issue at startup

Installing HAAst, prerequisites, getting HAAst started, and troubleshooting performance and technical issues. As well, upgrades to HAAst and any underlying software.
Post Reply
fabryxy
Posts: 3
Joined: Fri Aug 10, 2018 7:10 am

AMI connection issue at startup

Post by fabryxy » Fri Aug 10, 2018 7:32 am

I installed FreePBX 14 a few weeks ago and so far it is working great. I am now looking into the HA part of the setup and I picked HAAst.

I am having a problem at the system startup, where HAAst takes some time to switch on Asterisk and in the HAAst log I see the following:

Code: Select all

Fri Aug 10 06:46:55 2018, 00001258, I, Asterisk Controller, Starting
Fri Aug 10 06:46:55 2018, 00010012, I, Cluster Controller, Setting initial state of local peer to standby due to local peer haast service start
Fri Aug 10 06:46:55 2018, 00000107, I, Cluster Controller, Local peer HAAst state changing to standby
Fri Aug 10 06:46:56 2018, 00010013, I, Cluster Controller, Initial state of local peer sucessfully set to standby
Fri Aug 10 06:46:56 2018, 00001901, I, Peer Link Server, PeerLinkServer listening on 12.34.56.78:3002
Fri Aug 10 06:46:56 2018, 00001800, E, Peer Link, Connection attempt to remote node failed with error 0.  Attempting reconnection in 5 seconds
Fri Aug 10 06:46:56 2018, 00001836, W, Peer Link, Connection to remote node failed.  Not requesting local node promotion due to local node snoozing
Fri Aug 10 06:47:00 2018, 00001836, W, Peer Link, Connection to remote node failed.  Not requesting local node promotion due to local node snoozing
Fri Aug 10 06:47:05 2018, 00001836, W, Peer Link, Connection to remote node failed.  Not requesting local node promotion due to local node snoozing
Fri Aug 10 06:47:10 2018, 00001805, W, Peer Link, Connection to remote node not established within 15 seconds of first attempt.  Requesting local node promotion
Fri Aug 10 06:47:10 2018, 00001836, W, Peer Link, Connection to remote node failed.  Not requesting local node promotion due to local node snoozing
Fri Aug 10 06:47:10 2018, 00010018, I, Cluster Controller, Promotion of local peer is beginning due to local peer PeerLink connection failure
Fri Aug 10 06:47:10 2018, 00000108, I, Cluster Controller, Local peer HAAst state changing to active
Fri Aug 10 06:47:10 2018, 00003327, I, Sensors, Resetting local peer sensor scores and state
Fri Aug 10 06:47:10 2018, 00010106, I, NIC, IP address '12.34.56.78' added to physical NIC device 'eth0' as new virtual NIC device 'eth0:haast'
Fri Aug 10 06:47:15 2018, 00001836, W, Peer Link, Connection to remote node failed.  Not requesting local node promotion due to local node snoozing
Fri Aug 10 06:47:22 2018, 00001836, W, Peer Link, Connection to remote node failed.  Not requesting local node promotion due to local node snoozing
Fri Aug 10 06:47:27 2018, 00001836, W, Peer Link, Connection to remote node failed.  Not requesting local node promotion due to local node snoozing
Fri Aug 10 06:47:32 2018, 00001836, W, Peer Link, Connection to remote node failed.  Not requesting local node promotion due to local node snoozing
Fri Aug 10 06:47:37 2018, 00001836, W, Peer Link, Connection to remote node failed.  Not requesting local node promotion due to local node snoozing
Fri Aug 10 06:47:42 2018, 00000721, E, System Command, Failed to start Asterisk (start) wrapper.  Run result 3; exitcode 0
Fri Aug 10 06:47:42 2018, 00010019, I, Cluster Controller, Promotion of local peer is complete
Fri Aug 10 06:47:42 2018, 00001256, W, Asterisk Controller, Connection attempt to AMI failed with error 0.  Requesting local peer demotion
Fri Aug 10 06:47:42 2018, 00010012, I, Cluster Controller, Demotion of local peer is beginning due to local peer lost/failed AMI connection
So I am sure that the password for AMI is correct, I cannot understand why there is such error and I cannot find any reference in the guide about the "error 0" at the same time I don;t get why the system is "snoozing" for so long.
Another thing I did is to add some delay to firewalld at the startup as sometimes I saw a "firewall rules corrupted" but in the log the only thing wrong I saw was that asterisk was not running.
About the peer link error, I didn't setup the backup server yet, I am going to do it after I figure this AMI error
User avatar
Telium Support
Posts: 173
Joined: Sun Nov 27, 2016 3:27 pm

Re: AMI connection issue at startup

Post by Telium Support » Sat Aug 11, 2018 1:46 am

You have a number of issues:

1. The following error in your event log indicates that HAAst is unable to successfully start Asterisk (and/or FreePBX services).

Code: Select all

Fri Aug 10 06:47:42 2018, 00000721, E, System Command, Failed to start Asterisk (start) wrapper. Run result 3; exitcode 0
What have you set your 'distribution' key to in the [asterisk] stanza of your haast.conf file? Are you running the fully open source version of FreePBX or proprietary (FreePBX 'distro')? If open source, on what Linux distro and architecture?
This error indicates Asterisk is not being started (properly), and also explains why you aren't getting an AMI connection error. (Error 0 means HAAst could not establish a connection to the AMI)

2. The follow errors in your event log suggest your PeerLinkServer is listening on the same IP that you are using for the dynamic HAAst created interface:

Code: Select all

Fri Aug 10 06:46:56 2018, 00001901, I, Peer Link Server, PeerLinkServer listening on 12.34.56.78:3002
Fri Aug 10 06:47:10 2018, 00010106, I, NIC, IP address '12.34.56.78' added to physical NIC device 'eth0' as new virtual NIC device 'eth0:haast'
Have you obfuscated the IP addresses in your posting? If not you will need to fix your IP addresses (which may also prevent Asterisk from Starting depending on the FreePBX settings).

3. The reason you see snoozing messages relates to the promotesnoozetime and demotesnoozetime keys in the [cluster] stanza of haast.conf. This has nothing to do with your problem - in fact I suggest you leave those settings at default until you understand HAAst a little better. As background, these settings prevent the cluster from flipping back and forth in the event that it takes Asterisk some time to stabilize. (See http://www.telium.ca/pages/forums/viewt ... p?f=7&t=17 for more info).

4. HAAst does not interact with firewalld or iptables, so that symptom isn't directly related to HAAst. However, beware that Songoma calls fail2ban a 'security system' - when in fact it's not. Actually Digium warns companies NOT to use fail2ban as a security system (see http://www.telium.ca/pages/forums/viewt ... p?f=9&t=65 for more info). I suspect FreePBX is trying to start fail2ban and that is giving you the error messages. If you are anything but a small/SOHO business have a look at the SecAst pages on this site to understand the different between SecAst and fail2ban.

With more info we can provide more suggestions but hopefully that clarifies the meaning of the messages. I suspect all of the above relate to starting FreePBX.
fabryxy
Posts: 3
Joined: Fri Aug 10, 2018 7:10 am

Re: AMI connection issue at startup

Post by fabryxy » Sat Aug 11, 2018 6:09 am

1. I am using the proprietary "FreePBX Distro" version 14. In HAAst I am using option 11. I tried option 8 with the same result, option 2 doesn't generate any error but obviously I have other kind of problem with my distro choice.
2. I have obfuscated the ip address ;), I am using two different ip for Management and voip.
3. This is clear now. I didn't realize that "snooze" was related to the [cluster] stanza.
4.I will look into this as soon as I have HAAst running smoothly.

At this point I am not sure why there is this error

Code: Select all

Fri Aug 10 06:47:42 2018, 00000721, E, System Command, Failed to start Asterisk (start) wrapper. Run result 3; exitcode 0
As per the install guide the correct option is 11.
User avatar
Telium Support
Posts: 173
Joined: Sun Nov 27, 2016 3:27 pm

Re: AMI connection issue at startup

Post by Telium Support » Sat Aug 11, 2018 10:18 am

Distribution type 11 would be correct for your setup, so that's not the problem. I would focus solely on ensuring HAAst can start your FreePBX instance properly. The problem could even be that FreePBX is starting but shutting down with an error code, or is getting severely delayed (see Sangoma notes on very slow FreePBX startup) causing HAAst to treat the startup as failed.

I would suggest you look at:
  1. Try to start/stop the FreePBX service using systemctl from the commandline. Does Asterisk start/stop properly with it?
  2. Confirm permissions on FreePBX and Asterisk resources, relative to who you are logged in as (and relative to root)
  3. Start FreePBX using the commands within the systemd service script directly to look for errors
  4. Look for FreePBX startup errors
  5. Look for severe FreePBX startup delays (and follow Sangoma wiki instructions on correcting very slow startup if necessary)
fabryxy
Posts: 3
Joined: Fri Aug 10, 2018 7:10 am

Re: AMI connection issue at startup

Post by fabryxy » Sun Aug 12, 2018 6:24 am

The problem was a custom rule that I had in the firewall. It was creating a long delay at the startup of FreePBX, and HAAst gave up waiting for FreePBX.

I fixed my firewall and now all starts perfectly.

Time to look into SecAst for security

Thank you
Post Reply