Analyzing SSH Login Attempts with Splunk

Tim Wick
10 min read · Dec 20, 2020

I left an internet-facing server with an open SSH port up to see what kinds of login attempts it would receive. I modified OpenSSH to log passwords and imported the logs into Splunk for some analysis. I was looking for a personal project to get more practice with a common tool in the infosec industry and to start putting something back out into the world. This was a good starter project, and I enjoyed seeing firsthand what an SSH port open to the internet sees. The setup is below if you want to try it yourself!
I realize this is nothing groundbreaking and has been done before but we all have to start somewhere!

After around two weeks my server saw over 229,000 SSH login attempts. Yikes. I did some light analysis with Splunk and looked at the most common usernames, passwords, and IPs.
Unsurprisingly, ‘root’ was the most common username with 159,607 login attempts, followed by ‘test’ and ‘admin’ with a little over a thousand each. Thinking like an attacker: if I am scanning the entire internet, ‘root’ has the highest chance of existing on the greatest number of systems, so it offers the best potential bang for my CPU cycles. For passwords, ‘123456’ was the most common with 4,999 attempts, followed by other classics like ‘password’ and ‘admin.’ Again, not surprising, as these are typical default credentials with the highest chance of success. There were 9,809 different usernames and 48,129 different passwords used. What I find interesting is the disparity between the number of times ‘root’ was used versus the most common password of ‘123456’. I think there is an opportunity here to increase the difficulty of guessing login credentials by choosing a unique username in addition to a unique password. It is easy to do and adds significant difficulty to brute forcing your login to an internet-facing service.
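For reference, numbers like these come from simple searches along these lines (a sketch; it assumes the username and password field extractions described in the setup section below, and the "SSHLog" tag from the modified OpenSSH code):

```
source="/var/log/auth.log" SSHLog | top limit=10 username
source="/var/log/auth.log" SSHLog | top limit=10 password
source="/var/log/auth.log" SSHLog | stats dc(username), dc(password)
```

The first two return the top-ten tables; the `stats dc()` search gives the distinct-value counts.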

Sorting by IP there are a few top hits that constitute many of the login attempts and the IPs were heavily concentrated in China and the US. Granted the geographic distribution doesn’t mean much with proxies, TOR, etc so it is more of the last hop location. There were 2,646 different IP addresses that attempted logins. I find it interesting but not surprising that there are machines seemingly dedicated to guessing usernames and passwords on exposed systems. I’m curious how the targets get identified and how long each one will receive login attempts for.
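The per-IP counts and countries can be pulled with a search roughly like this (a sketch; `iplocation` is a built-in Splunk command, and `ip` is the field extracted in the setup section below):

```
source="/var/log/auth.log" SSHLog
| iplocation ip
| stats count by ip, Country
| sort - count
```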

Combine all this together and we have a nice little dashboard displaying our top login attempts and where they are coming from. Unfortunately the skew toward the US/China is so heavy that other sources don’t even show up in the Splunk map.
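The map panel can be driven by `geostats`, which buckets events by the coordinates that `iplocation` adds (again a sketch assuming the `ip` field extraction from the setup section):

```
source="/var/log/auth.log" SSHLog
| iplocation ip
| geostats count
```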

I wanted to dive a bit deeper into something more specific. I was curious whether certain IPs were trying to log in repeatedly with the same credentials or whether they varied. The top IP, 143.110.185.170, which came back as a Digital Ocean server in India, had 22,886 hits over a five-day period. Presumably this is a VM someone spun up to scan with and then destroy a few days later. Kind of like what I used my server for, but the opposite. This IP used only root for its login attempts and repeated some passwords, but the majority of its 22,886 attempts were unique values. The second most common IP, 49.88.112.111, was active over the entire period, again used only the username root, and behaved very similarly, with most of its passwords being unique values. Both appear to be simple dictionary attacks using the most common username and as many passwords as possible. On the opposite end of the spectrum there are many IPs with only one or a few login attempts. The difference seems similar to brute forcing one account versus password spraying one set of credentials across many systems. It could also be that these attackers are only interested in one type of system with a known set of default credentials and aren’t interested in trying any other combinations. This also ties back into there being far more passwords attempted than usernames: a few IPs trying thousands of passwords with one username made up the majority of the login attempts.
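Per-IP behavior like this can be surfaced with a search along these lines (a sketch, again assuming the field extractions from the setup section):

```
source="/var/log/auth.log" SSHLog
| stats count, dc(username) as unique_usernames, dc(password) as unique_passwords by ip
| sort - count
```

An IP with a high `count` and a `unique_passwords` value close to it is running a dictionary attack; a count of one looks more like spraying.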

The next question is how to defend against this. These are low-hanging-fruit attacks (excuse my management speak), and there are low-hanging-fruit defenses. First and foremost, we need strong passwords, whether for the root account or any other account. The attempts I saw were essentially dictionary attacks, and simply increasing the number of guesses needed to brute force the password to a ridiculous level will mitigate most issues. Second, we should create a user other than root to log in as and disable root login in sshd_config. This alone stops the more than 50% of attempts that try to log in as root. Lastly, we can disable password login altogether in sshd_config so that only SSH keys can be used. Assuming the device we’re logging in from isn’t compromised, logins will be secure. Optionally, we can also change the SSH port to something other than 22. Daniel Miessler has a blog post on this subject, and it is something I did personally on the server hosting my website, timjwick.com. It’s not perfect, but it adds another layer of effort: a potential attacker has to figure out which port my server is accepting SSH connections on before trying anything. It avoids the internet-wide scanning of port 22 and saves my server some CPU cycles and network bandwidth. On a limited basis this works great, and it is easy to put an alias in .bashrc so you don’t even have to remember which port you put SSH on. Obviously, changing SSH ports in an enterprise would take far more effort, but in those cases port 22 should be blocked at the edge anyway.
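In sshd_config terms, those changes look like this (a sketch; the port number is just an example value):

```
# /etc/ssh/sshd_config
# Stop the >50% of attempts that target root
PermitRootLogin no
# Keys only; no password guessing possible
PasswordAuthentication no
# Optional: move off the default port (example value)
Port 2222
```

After editing, restart with `systemctl restart ssh`, and something like `alias myserver='ssh -p 2222 user@example.com'` (hypothetical host and user) in .bashrc means you never have to remember the port.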

One last thing I wanted to do was apply something I learned recently in John Strand’s SOC Core Skills class and see whether there were any suspicious network connections on the server that might indicate it was actually compromised. I could check the logs for successful logins and be pretty sure, but this is more fun. Using lsof -i I can see any open network connections. Everything seems run of the mill: NTP, localhost connections for Splunk, and my own SSH connection. The one interesting connection is python3.7 listening on localhost:8065; since it is bound to localhost it is probably nothing suspicious, but my curiosity was piqued. Using lsof -p 962 I can see it is one of many Splunk-related processes, so nothing actually suspicious. A super simple example, but I like being able to apply things I’m learning wherever I can.

The Setup:

The following is the setup I went through if this is something you want to do yourself!

There are options for creating fully fledged honeypots such as Cowrie, however I wanted to focus solely on username, password, and IP address. Since auth.log does not have the actual password used in login attempts (good practice) I needed to modify OpenSSH to log passwords (bad practice). Making these code changes causes legitimate login passwords to be logged in plaintext and could introduce other vulnerabilities so this should not be done on any live system used for more than the purpose of capturing said passwords. Since this is a server spun up in the cloud (I use Digital Ocean) that will be destroyed after this experiment, the risk is low.
Thanks to a blog post from 2011 that did exactly what I was trying to do, I was able to figure this out relatively quickly. I first attempted this on Ubuntu but had issues getting the .deb package to build and install, so I eventually switched to a Debian 10 installation, which made things smoother.

The first step is to spin up a server with your cloud provider of choice and gain access to it, whether through a built-in console like Digital Ocean’s or via SSH. Next we make a directory for the OpenSSH source code and cd into it. There are a few prerequisites for building the source later on, so now is a good time to install them. Then we get the source code, cd into the main OpenSSH folder (in my case openssh-7.9p1), and open auth-passwd.c in vi to start making changes. Side note: I was logged in as root; if you are logged in as a regular user, you will need sudo for some of these commands.

apt install dpkg-dev
apt-get build-dep openssh-server
apt-get source openssh-server
cd openssh-7.9p1
vi auth-passwd.c

PAM is the default authentication method in Debian, so we need to edit the portion of the code that deals with PAM authentication. Starting on line 114 (enter :set number in vi to get line numbers), between the #ifdef USE_PAM and #endif, we can add code to log the username, password, and IP address and then return as it normally would. The code I used is as follows; luckily this .c file already includes everything we need.

#ifdef USE_PAM
	if (options.use_pam) {
		logit("SSHLog: Attempted login from IP: %s with username: %s and password %s",
		    ssh_remote_ipaddr(ssh), authctxt->user, password);
		return (sshpam_auth_passwd(authctxt, password) && ok);
	}
#endif

Next we build the package, install it, and restart SSH. The dpkg-buildpackage command should be run within the openssh-7.9p1 directory, and it builds to the parent directory one level up. cd back to the parent directory and install the .deb file that was created, in my case openssh-server_7.9p1-10+deb10u2_amd64.deb. Then restart the ssh service to make sure the new build is being used.

dpkg-buildpackage -rfakeroot -uc -b
dpkg -i openssh-server_7.9p1-10+deb10u2_amd64.deb
systemctl restart ssh

If everything goes to plan, we can cat /var/log/auth.log and start seeing login attempts like the one below. One thing to remember: if you’ve set up a firewall and are logged in through a console, make sure it allows traffic to port 22.
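An entry produced by the modified code looks roughly like this (an illustrative line with made-up values; the syslog prefix depends on your hostname and sshd’s PID, and 203.0.113.5 is a documentation address):

```
Dec 20 12:34:56 debian sshd[1042]: SSHLog: Attempted login from IP: 203.0.113.5 with username: root and password 123456
```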

When trying this initially on Ubuntu I ran into a few issues. First, when trying to download the source code I got the error message “E: You must put some ‘deb-src’ URIs in your sources.list”. To resolve this I un-commented the first deb-src line in /etc/apt/sources.list (“deb-src http://us.archive.ubuntu.com/ubuntu/ focal main restricted”) and ran apt-get update, which enabled fetching the source files.
Secondly, when downloading the source files in Ubuntu they did not all unzip, and I got the error “cannot open ‘debian/changelog’” when trying to build the package. Unzipping all the files allowed the build to work. Then installing the .deb file caused issues because the openssh-server version didn’t match the openssh-agent version. At this point I tried Debian, everything worked the first time, and I didn’t look back.

Alright! Now it is time to set up Splunk. A Medium user by the name of SmUrF3R5 made a very straightforward tutorial on installing Splunk on Ubuntu Server, and it worked the same on Debian. Initially I had installed Splunk on a separate VM with the intent of forwarding data between the servers. However, I had issues with network connectivity on the VPC in Digital Ocean. Ping tests between the nodes would fail randomly and I could not figure out why. The ARP tables pointed correctly to each other’s MAC addresses, and I even tried static routes, but communication would still fail a few minutes after the nodes booted and not recover. Not wanting to waste too much time on troubleshooting I moved on, but it is something I want to come back to and figure out.

After the install I changed the Splunk web port, assuming the default port would get hammered similarly to port 22. Splunk commands can be entered in the /opt/splunk/bin directory with ./splunk, and the command to change the port is ./splunk set web-port xxxx. Now we can log in to the GUI at https://IP:port and add our /var/log/auth.log file to be monitored. Again, we need to make sure there is a firewall rule allowing communication to this port. To add data: Settings -> Add Data -> Monitor -> Files & Directories, add “/var/log/auth.log”, click “Next”, and go through the rest of the options; I left everything else as default. The last thing we need to do for setup is extract the somewhat custom fields so we can search on and manipulate the data in them. From a search: left column -> Extract New Fields -> “I prefer to write the regular expression myself”, and add the following regexes one at a time. These work if you use the exact same log format I did above; otherwise your regex will be different. Splunk has a great guide on writing regex.

IP: (?<ip>\d+\.\d+\.\d+\.\d+)
Username: username:\s(?<username>\w+)
Password: password\s(?<password>\w+)
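A quick way to sanity-check regexes like these before pasting them into Splunk is Python’s re module (a sketch with made-up sample values; note that Python spells named groups (?P<name>...) where Splunk’s PCRE accepts the shorter (?<name>...)):

```python
import re

# Sample line in the format produced by the modified auth-passwd.c
# (hypothetical values; 203.0.113.5 is a documentation address).
line = ("Dec 20 12:34:56 debian sshd[1042]: SSHLog: Attempted login from "
        "IP: 203.0.113.5 with username: root and password 123456")

# Same patterns as above, with Python's (?P<name>...) group syntax.
ip = re.search(r"(?P<ip>\d+\.\d+\.\d+\.\d+)", line).group("ip")
username = re.search(r"username:\s(?P<username>\w+)", line).group("username")
password = re.search(r"password\s(?P<password>\w+)", line).group("password")

print(ip, username, password)  # 203.0.113.5 root 123456
```

One caveat: \w+ stops at the first non-word character, so a password like “p@ss!” would only be captured up to “p”; something like \S+ captures more, at the cost of matching trailing punctuation.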

That’s it! Now it’s time to let it sit for a while and then analyze the data.
