SSH client not working after update to 21.0.4 (Ornara)

emartin · 17 May 2021 16:43

Hi everyone,

After updating my OS from version 20.2.1 (Nibia) to 21.0.4 (Ornara), my ssh client has stopped working. I cannot connect to any ssh server; I always get a “Connection timed out” message. I suspect it is an OS-related problem because:

I can use the ssh client from other computers on the network, so this is not the ISP blocking port 22 or a router-related issue.
I dual boot this computer, and the Windows ssh client works perfectly. This just double-checks the previous point, but I wanted to be sure, since all the related topics I can find seem to point to this type of problem.
ufw is inactive.
I had the same problem a couple of weeks ago, but as I had no time to investigate and I needed ssh working on this machine, I restored a Timeshift backup from March and it went back to normal. I had also installed some other applications along with the update, so I blamed them for the problem, but it seems I was wrong. I’m pretty sure that if I go back to the same backup it will work again, but I would like to be able to keep my OS updated.

ssh -vvv output is:

OpenSSH_8.6p1, OpenSSL 1.1.1k  25 Mar 2021
debug1: Reading configuration data /etc/ssh/ssh_config
debug2: resolve_canonicalize: hostname xx.xx.xx.xx is address
debug3: expanded UserKnownHostsFile '~/.ssh/known_hosts' -> '/home/myuser/.ssh/known_hosts'
debug3: expanded UserKnownHostsFile '~/.ssh/known_hosts2' -> '/home/myuser/.ssh/known_hosts2'
debug3: ssh_connect_direct: entering
debug1: Connecting to xx.xx.xx.xx [xx.xx.xx.xx] port 22.
debug3: set_sock_tos: set socket 3 IP_TOS 0x48
debug1: connect to address xx.xx.xx.xx port 22: Connection timed out
ssh: connect to host xx.xx.xx.xx port 22: Connection timed out

(I’ve removed my server IP address and username)
I would say that the request is not even “leaving” my machine, but I don’t know how to verify it. Should I install Wireshark or any similar tool?
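One way to check whether the SYN actually leaves the machine, without installing Wireshark, is a command-line capture. The sketch below assumes tcpdump is installed and can be run through sudo; capture_syn and decode_tos are hypothetical helper names, not existing tools. The second helper decodes the IP_TOS value from the log above: 0x48 is DSCP 18 (AF21), the marking OpenSSH applies by default to interactive sessions. Whether anything on the path treats packets with that marking differently is only a guess, not something established in this thread; running ssh with -o IPQoS=none would be one way to test it.

```shell
# Hypothetical helper: watch TCP traffic to one host/port on all interfaces.
# Assumes tcpdump is installed; -v makes it print the IP header's tos field.
# Run it in one terminal, then start ssh in another.
capture_syn() {
    host=$1
    port=${2:-22}
    sudo tcpdump -n -v -i any -c 20 "host $host and tcp port $port"
}

# Split a TOS byte (e.g. the 0x48 from set_sock_tos) into DSCP and ECN bits.
decode_tos() {
    tos=$(( $1 ))
    printf 'dscp=%d ecn=%d\n' "$(( tos >> 2 ))" "$(( tos & 3 ))"
}

# Usage:
#   capture_syn xx.xx.xx.xx 22
#   decode_tos 0x48      # prints: dscp=18 ecn=0
```

If the capture shows outgoing SYN packets and nothing coming back, the request does leave the machine and is lost or dropped somewhere further along.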

Thanks in advance, and please let me know if I have to provide more information. Cheers!

omano · 17 May 2021 16:51

Is this the default SSH client? It doesn’t seem to be to me. ssh -vvv doesn’t work for me. I can SSH to remote servers without issue.

[omano@omano-nvme ~]$ ssh -V  
OpenSSH_8.6p1, OpenSSL 1.1.1k  25 Mar 2021

Last similar thread I replied to, user was banned from remote server (fail2ban).

xabbu · 17 May 2021 16:52

At this point the SSH server needs to send the SYN-ACK back, but your client never receives anything from the SSH server.

You can check with Wireshark which packets your client sends.

xabbu · 17 May 2021 16:54

It is

ssh -vvv username@servername

Replace username with your username and servername with the IP or Domain.

emartin · 17 May 2021 18:31

Yes, it is. As xabbu pointed out, the full command is ssh -vvv username@server; sorry for not including it in my question.

emartin · 17 May 2021 18:34

I don’t think I’m banned from the server, since my public IP is the same when I connect from macOS and Windows, and even from Manjaro pre-update. Also, I can’t connect via ssh to any server, not just one.

emartin · 17 May 2021 18:36

I will try tomorrow with Wireshark and will get back to you guys with the info

Thanks both for your help!

PS: sorry about the multiple replies, I just read the suggestion about answering several questions in the same reply. Will do that in the future.

emartin · 18 May 2021 06:52

I captured the sequence and filtered it by the server address with the ip.addr filter, so both outgoing and incoming packets should show up. It’s been a LONG time since I used Wireshark, but what I got is the SYN packet and its 6 retransmissions (sorry, I cannot paste images or share links yet).

As @xabbu predicted, I never get the SYN-ACK back, but from this capture, I’m not even sure that this is actually being sent. The only weird thing that I can think of, and probably it isn’t even wrong, is that both the sequence number and length of the SYN packet are 0. I barely remember my ‘doing-network-stuff’ days, but is that correct?

Riggs · 18 May 2021 19:28

Erase the stored host key so a new one gets generated (known_hosts)
Open /home/youruser/.ssh/known_hosts
And delete the old entry for the server

freggel.doe · 18 May 2021 19:59

Check if access to $host:22 (or $ip:22) works in principle:

$ telnet $host 22

or

$ telnet $ip 22
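If telnet is not installed, the same check can be sketched with bash alone. This is a minimal sketch, assuming bash with /dev/tcp support and the coreutils timeout command; probe_tcp is a hypothetical helper name. Like telnet, it only tests whether a plain TCP connection can be opened, so it does not reproduce whatever socket options the ssh client sets on its own connection.

```shell
# Hypothetical helper: succeed if a TCP connection to host:port opens in time.
# Relies on bash's /dev/tcp pseudo-device and on timeout from coreutils.
probe_tcp() {
    host=$1
    port=$2
    wait_s=${3:-5}
    timeout "$wait_s" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Usage:
#   probe_tcp "$host" 22 && echo "port reachable" || echo "no TCP connection"
```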

edit: you’re not using a BCM4331 with b43 by any chance?

emartin · 19 May 2021 06:59

Thanks for the answers. Let’s see:

Riggs
Erase the stored host key so a new one gets generated (known_hosts)
Open /home/youruser/.ssh/known_hosts
And delete the old entry for the server

I’ve deleted the known_hosts file, but same result: “Connection timed out”.

freggel.doe
Check if access to $host:22 (or $ip:22) works in principle:

It’s not working either, the terminal output is as follows:

~ >>> telnet xx.xx.xx.xx 22
Trying xx.xx.xx.xx...
Connected to xx.xx.xx.xx.
Escape character is '^]'.
SSH-2.0-OpenSSH_7.6p1 Ubuntu-4ubuntu0.3
Connection closed by foreign host.

I know it looks like it’s the server that closes the connection, but it happens with every server. I’ve noticed, though, that the 3 servers I’m trying have OpenSSH version 7.x and my machine has 8.x. I can’t find any known incompatibilities after some searching on Google, but could that be the root of the problem? Is it safe/wise to try to downgrade my OpenSSH?

freggel.doe
edit: you’re not using a BCM4331 with b43 by any chance?

I’m on a wired network, using the motherboard’s (MSI B450M PRO) integrated network card. I’m also going to try to:

find another network card and try with it
install proprietary drivers for this network card, if they exist

Thanks again for your help!

linux-aarhus · 19 May 2021 08:32

On successful connection you would normally get something like this

debug2: resolving "ssh.domain.tld" port 22
debug3: ssh_connect_direct: entering
debug1: Connecting to ssh.domain.tld [IPv4_ADDR] port 22.
debug3: set_sock_tos: set socket 3 IP_TOS 0x48
debug1: Connection established.

The way I read your above log snippet - the ssh daemon on the remote system does not answer the call. My systems are using the exact same version of openssh so I’d say SSH is working as expected, this gives me reason to believe this is a local issue on your system - and most likely not created by an update.

I must assume you have done the obligatory network troubleshooting - if not, please use this article to lay the groundwork - [root tip] [How To] Basic network know-how and troubleshooting

That will most likely give you a better idea of where to look.

I recall an issue on my systems - no timeframe remembered - but this issue manifested in a message stating I had too many failed logins. Now this was strange because I didn’t even try to login.

Long story short - I have a large number of ssh-identities and apparently my system was trying all those identities to login to the specific server even though I had a specific identity file associated with the server in my local config - resulting in a failed connection.

After some research I found a way to stop the behavior - by adding the following at the top of my ~/.ssh/config

Host *
  IdentityAgent none

emartin · 19 May 2021 09:28

I think I’ve done most of the basics, and just to clarify, I’m connected to the Internet and haven’t detected any other issues (I’m writing this from the poor broken machine). I cannot get a ping response from my servers, but I can traceroute them. As they are DigitalOcean droplets, I think ping may be disabled by default. Remember that their configuration hasn’t changed, and they are accessible via ssh from apparently any other computer in the world.

It sounds like it may be related, but is that path correct? I don’t have a file named config in my .ssh folder. I naively tried to create the file with that content, but to no avail.

EDIT: also, I forgot to mention that I usually connect using -i parameter and indicating a private key file, but I’ve also tried without it and it doesn’t make a difference, I got “Connection timed out” without even asking for username and password

linux-aarhus · 19 May 2021 10:17

Just scrolled down the topic one more time - to absorb whatever I missed - sum it up so to speak.

Fact 1: Your DO instances respond to SSH on port 22 from all other machines in your local network, except one Manjaro system which worked normally back in March.

Fact 2: The issue is not reproducible on other systems running the same version of SSH, so it is reasonable to assume this is local to your Manjaro instance.

Fact 3: It is evident from your logging on the local Manjaro system - the connection attempt times out - and since the same connection works otherwise - it is safe to assume the service never receives the connection attempt.

Fact 4: A local software firewall’s default setting - if enabled - only blocks incoming traffic - never outgoing.

Using a process of elimination

does nslookup, host or a similar command return the correct combination of hostname and IP address?
did you previously use ufw or gufw to configure iptables, possibly in an attempt to troubleshoot the issue?

does the command iptables list return any result other than this

$ sudo iptables --list
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

is the iptables.service disabled?
have you tried changing the SSH port on your remote service? e.g. port 11000? then restart the service and connect using the -p11000 argument?

emartin · 19 May 2021 13:30

Thanks! I will clarify some of the Facts, though (English is not my native language, so I’m sorry I wasn’t as clear as I should have been from the beginning).

Fact 1: All correct, except that the Manjaro system worked normally until this Monday the 17th, when I updated it. My last Timeshift image is from March (yeah, that’s on me too…).

Fact 2: I hadn’t checked the other systems’ ssh versions, but I just did on the laptop I’ve been using for the task since Monday, and it’s using OpenSSH_7.9p1, which kind of reinforces my previous idea of a problem with the OpenSSH_8.6p1 version on my Manjaro system.

EDIT: a coworker has version OpenSSH_8.1p1 and he can connect, so my theory may not be that correct…

Fact 3: This is also my conclusion, yes.

Fact 4: Very true. On top of that I checked again and it’s inactive.

nslookup: I have a web application running on the remote machine, and nslookup returns the remote IP address correctly. In my previous attempts I was trying to ssh in by IP, not hostname, but I’ve tried just now by hostname with the same result. It resolves the address correctly, though.

ufw/gufw: No, I didn’t.

iptables: I get a very different output because this is a development machine and I have Docker installed, which seems to do a lot of things with iptables:

Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain FORWARD (policy DROP)
target     prot opt source               destination         
DOCKER-USER  all  --  anywhere             anywhere            
DOCKER-ISOLATION-STAGE-1  all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED
DOCKER     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED
DOCKER     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED
DOCKER     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED
DOCKER     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere            

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         

Chain DOCKER (4 references)
target     prot opt source               destination         

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
target     prot opt source               destination         
DOCKER-ISOLATION-STAGE-2  all  --  anywhere             anywhere            
DOCKER-ISOLATION-STAGE-2  all  --  anywhere             anywhere            
DOCKER-ISOLATION-STAGE-2  all  --  anywhere             anywhere            
DOCKER-ISOLATION-STAGE-2  all  --  anywhere             anywhere            
RETURN     all  --  anywhere             anywhere            

Chain DOCKER-ISOLATION-STAGE-2 (4 references)
target     prot opt source               destination         
DROP       all  --  anywhere             anywhere            
DROP       all  --  anywhere             anywhere            
DROP       all  --  anywhere             anywhere            
DROP       all  --  anywhere             anywhere            
RETURN     all  --  anywhere             anywhere            

Chain DOCKER-USER (1 references)
target     prot opt source               destination         
RETURN     all  --  anywhere             anywhere

I believe if I restore my Timeshift backup, and I probably will during the weekend, I will get the same output, though.
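A note on reading that listing: packets generated on the machine itself (such as an outgoing ssh connection) pass through the OUTPUT chain, not FORWARD, so Docker’s DROP policy on FORWARD should not touch them. Also, iptables --list only shows the filter table; the nat and mangle tables need -t nat or -t mangle. The sketch below just pulls the default policy of each built-in chain out of such a listing; chain_policies is a hypothetical helper name.

```shell
# Hypothetical helper: read `iptables --list` output on stdin and print
# "CHAIN=POLICY" for each built-in chain (user-defined chains have no policy).
chain_policies() {
    awk '/^Chain .*\(policy / { gsub(/[()]/, ""); print $2 "=" $4 }'
}

# Usage:
#   sudo iptables --list | chain_policies
#   sudo iptables -t mangle --list | chain_policies
```

For the listing above this would report INPUT=ACCEPT, FORWARD=DROP and OUTPUT=ACCEPT, and the OUTPUT chain has no rules of its own.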

Changing the SSH port: I can’t do that, at least not on the servers I’m trying to reach now. I can try to find another server to run tests on.

Thank you very much for your help!

linux-aarhus · 19 May 2021 13:58

To get an idea of what pacman updated or installed

grep '\[2021-05-17' < /var/log/pacman.log
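To narrow that output down to package upgrades only, something like the following could be used. It is a sketch that assumes pacman’s log marks upgrades with the text "[ALPM] upgraded"; upgrades_on is a hypothetical helper name, and the second argument is there only so a copy of the log can be inspected instead of the live file.

```shell
# Hypothetical helper: list packages upgraded on a given day.
# $1 = date as YYYY-MM-DD, $2 = log file (defaults to pacman's log).
upgrades_on() {
    grep -F "[$1" "${2:-/var/log/pacman.log}" | grep -F '[ALPM] upgraded'
}

# Usage:
#   upgrades_on 2021-05-17
#   upgrades_on 2021-05-17 | grep -i -E 'openssh|linux|iptables'
```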

As outgoing traffic destined for port 22 is getting derailed - I suspect, that something is happening with respect to the iptables and the docker container.

Does your container have port 22 defined in the docker config?
Does your container have any relation to the services you are trying to connect to?

A couple of things stand out to me compared with my test systems - I don’t run docker - I don’t run iptables - so I am not able to create the same environment - which could affect my results.

You could - temporarily - disable the iptables.service - thus checking if this gives a hint.

If ssh then can connect to your remote service we are pointed to iptables and possibly your docker instance - if not - we must think harder.

Disabling any docker related service may also help in tracking down the issue.

The issue is not always located where you expect it - where you encounter the issue may be a mere symptom of another issue.

emartin · 20 May 2021 07:54

Well, yesterday I finished work early and jumped directly into a fresh install of Manjaro 21.0.5. So there is no docker and there are no other possible conflicts. And it’s still not working, same old “Connection timed out” message.

This leaves me with even more doubts. Maybe the issue comes from an incompatibility with my network card in this version. Not necessarily with OpenSSH_8.6p1, but with something included in this Manjaro version.

omano · 20 May 2021 08:10

Try an old ISO to confirm your theory.

emartin · 20 May 2021 09:00

I took the easier route of using another network card (an ASUS USB network card). Manjaro detected it flawlessly and connected to the Internet flawlessly, but ssh is still not working. I feel kind of clueless now…

I will definitely try to go back to the Nibia backup this weekend, but any suggestion is welcome.

Thanks to all of you for your support!

omano · 20 May 2021 09:03

Kinda lazy to read the thread again, but did you try to look into your router? Maybe you blocked something from there?