Cisco ASA protected SSH-connection hangs - [Fixed]
Thursday, March 13. 2014
A couple of my users were complaining that their SSH-connections die after idling for a while. The client does not notice that the server is gone. It cannot regain communications, and the session dies only after a transmission is attempted, fails and times out.
My initial reaction was that a firewall disconnects any "non-used" TCP-connections. The connection may or may not really be unused, but the firewall thinks it is, and since it gets to make the decision, it disconnects the socket. There is one catch: if the TCP-socket were truly disconnected, both the server and the client should notice that and properly end the SSH-session. In this case they don't. For readers not familiar with the details of TCP/IP: see the state transition diagram and think of the half-closed connection as being ESTABLISHED, but unable to move into FIN_WAIT_1 because the firewall is blocking all communications.
Googling got me to read a discussion thread on Cisco's support forums titled SSH connections through asa hanging. There Mr. Julio Carvaja asks the original poster a question: "Can you check the Timeout configuration on your firewall and also the MPF setup. What's the Idle time you have configured for a TCP session?" So I did the same. I went to the box and on the ASA CLI ran the classic show running-config, which contained the timeout values:
timeout conn 1:00:00 half-closed 0:10:00 udp 0:02:00 icmp 0:00:02
From that I deduce that any TCP-connection is dropped after one hour of idling. It is moved into half-closed state after 10 minutes of idling. The 10 minutes matches the time range of my user complaints; one hour does not. So essentially the Cisco ASA renders the TCP-connection unusable and unable to continue transmitting data.
In the discussion forum there is a suggestion to either prolong the timeout or enable SSH keepalive. I also found a way to define a policy for SSH in the ASA. There is an article titled ASA 8.3 and Later: Set SSH/Telnet/HTTP Connection Timeout using MPF Configuration Example, which describes the procedure in detail.
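For reference, a rough sketch of what such an MPF policy could look like on ASA 8.3 or later. I did not apply this on my box. The names SSH_TRAFFIC and SSH_CLASS and the 2-hour idle value are my own placeholders, and I am assuming the default global_policy policy-map is already attached with a service-policy. Check the exact syntax against the article before using it.

```text
! Illustrative only: object names and the 2:00:00 value are assumptions
! Match SSH traffic passing through the firewall
access-list SSH_TRAFFIC extended permit tcp any any eq ssh
class-map SSH_CLASS
 match access-list SSH_TRAFFIC
! Give matched connections a longer idle timeout than the global default
policy-map global_policy
 class SSH_CLASS
  set connection timeout idle 2:00:00
```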
However, I chose not to do that, but to employ keepalive-packets on my OpenSSH server instead. I studied my default configuration at /etc/ssh/sshd_config and deduced that keepalive is not in use. In the man-page of sshd_config(5) I can find 3 relevant configuration directives:
- TCPKeepAlive: Controls whether TCP-level keepalive messages are sent to the other side.
  - This is on by default, but this alone does not dictate whether the client alive messages below will be used or not. Those are sent through the encrypted SSH channel and work independently of this setting.
- ClientAliveInterval: The interval [in seconds] without data from the client, after which sshd sends a keepalive message and requests a response
  - As default, this is 0 seconds, meaning that no messages will be sent.
- ClientAliveCountMax: The number of consecutive keepalive messages a client may leave unanswered before the connection is declared dead
  - As default this is 3. Still, with the default interval no messages are ever sent, thus a client is never declared M.I.A. based on this criterion.
So to fix the failing SSH-session problem, the only thing I changed was to set a client alive interval. Since the Cisco ASA will mess up the connection after 10 minutes (600 seconds) of idling, I chose half of that, 300 seconds.
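The change described above amounts to something like this excerpt of /etc/ssh/sshd_config. The comments are mine, and the commented-out line only spells out the default for clarity:

```text
# /etc/ssh/sshd_config (excerpt)
# Request a response from the client through the encrypted channel
# after 300 seconds without any data from it.
ClientAliveInterval 300
# Left at its default: disconnect after 3 unanswered messages.
#ClientAliveCountMax 3
```

With these values an unresponsive client should get disconnected after roughly 3 x 300 = 900 seconds, while a live but idle session produces traffic every 5 minutes.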
After restarting the sshd, opening a connection and idling for 5 minutes while snooping the transmission with Wireshark, I found out that my SSH server and client exchanged data every 300 seconds. The best thing about the fix is that it works! It solves the problem, and the SSH-connection stays functional after a long period of idling.