Adventures in being a bandwidthaholic

I've been sharing a remotely hosted server at Rackshack.net (which became EV1) with friends for over a year now and it's run amazingly well. The account started with 700GB of monthly bandwidth and after the unfortunate SCO license flap, we got upped to 1 terabyte of monthly bandwidth, with seemingly no network speed cap. For the past year, the server's pushed out a couple of GB a day, tops, from all the sites it hosts. Even when I put a bunch of music online last spring, it hardly made a dent.

This month I figured I'd see just how much a terabyte was. It started when I offered to host the Beatallica songs. After a day the bandwidth jumped to 10-15GB and it was humming along nicely. Then it hit Pitchfork's news page, and the bandwidth skyrocketed. The box was pushing out 20Mbit/sec and after a couple of days I had to tell the gang to de-link songs, as my monthly bandwidth total reached 100GB just a few days into April.

I was pretty impressed that the box held up OK (after Chris limited the site to 1 download per user) and was amazed at the traffic a site like Pitchfork could generate from a tiny news blurb. I thought to myself, "Wow, aside from Slashdot I couldn't imagine a blog ever generating this kind of traffic and demand for files."

Then Cory linked my 66MB file of a Jon Stewart interview over at BoingBoing, and it completely blew away the previous bandwidth numbers. In the roughly 12 hours the link pointed at the box, network throughput jumped to almost 60Mbit/sec and it pushed out 131GB of data. The box served up all the other sites fine, but as I watched my monthly bandwidth allotment reach 40% of the total before the first half of the month was even over, I took the file offline. Andy put it up on his tracker, where it is being downloaded like crazy, with the load spread across the personal connections of everyone sharing it.
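For a sense of scale, here's a rough back-of-the-envelope sketch of what those numbers work out to. It assumes decimal units (1 gigabyte = 10^9 bytes) and takes "half a day" as exactly 12 hours; the function name is just something I made up for illustration.

```python
# Rough arithmetic only: decimal units assumed (1 GB = 1e9 bytes),
# and "half a day" treated as exactly 12 hours.

def average_mbit_per_sec(gigabytes, hours):
    """Average throughput in Mbit/sec for a given transfer over a given time."""
    bits = gigabytes * 1e9 * 8        # bytes -> bits
    seconds = hours * 3600
    return bits / seconds / 1e6       # bits/sec -> Mbit/sec

avg = average_mbit_per_sec(131, 12)   # the BoingBoing half-day
share = 131 / 1000                    # fraction of a 1000 GB monthly allotment

print(f"average: {avg:.1f} Mbit/sec")
print(f"share of monthly allotment: {share:.1%}")
```

Under those assumptions the half-day average lands somewhere around 24Mbit/sec, well under the near-60Mbit/sec peak, and that single file would account for roughly 13% of the monthly allotment by itself.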

Here's a cool graph of the network utilization on a weekly, 30-minute moving average (click to see the full image):

You can see the initial rise from a bunch of blogs linking to Beatallica, then the peak is the Pitchfork hit, which subsided after the song links were eliminated. Then come a few days of relative calm, and BoingBoing is the huge peak, which only lasted half a day. I grabbed this right after I started redirecting folks to the torrent.

I've learned a few things from these large bandwidth experiments:

- Ridiculous amounts of bandwidth are out there for a cheap price (the server is only $100/month, shared among the people using it). If you're paying $30 a month and getting hit with bandwidth overage bills that go into the hundreds of dollars, find a friend who knows some Linux server administration, get one of these leased boxes, and never worry about bandwidth again.

- A thousand gigabytes is a ton of bandwidth and it's nice to have around when you want to share large files with friends or the general public. I host my ten years site there and don't really care about the size of photos or the number of people pulling down the RSS feeds with large images embedded.

- That said, when you get hit with a huge amount of traffic, bandwidth is still going to be a problem. Most colocation hosts cap your line at 10Mbit/sec and I was surprised to see the box creeping up near 60Mbit/sec yesterday. It's still a problem to host one giant file for a ton of people, even with an absurd amount of bandwidth available to you. BitTorrent is the savior here. Andy tells me that even though he seeds all the files on his server (which means the original file's still on his server being downloaded if no one else is sharing it), his bandwidth is a fraction of what it'd be if it was just a direct download. The best part is that the more popular the file (like the BoingBoing traffic hit), the more people download it from each other instead of your server.

- Setting up your own BitTorrent server is still a pain in the butt. It needs to be no more difficult than setting up Apache on a Windows desktop. I want to see a BT server exe I click, install, then seed files easily using a web or desktop front-end (yay! Andy sent this and this). Or make an Apache module. Also, build BT support into Mozilla, right now. BT is a great technology that solves a fundamental problem we all face every day, but we have to walk people through how to download the clients first. In some of the data I saw on the Lessig book downloads, only about 5% of users opted to use BT to download; the rest just got it off the server directly. We need more regular folks using BT, by having it built into browsers.
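
To put the third lesson in numbers, here's a quick sketch of how long a monthly allotment lasts if the box is pinned at a given rate the whole time. It's a simplification: it assumes decimal units (1 gigabyte = 10^9 bytes) and a perfectly constant rate, which real traffic never is, and the function name is made up for the example.

```python
# Simplified model: constant throughput, decimal units (1 GB = 1e9 bytes).

def hours_to_exhaust(quota_gb, mbit_per_sec):
    """Hours until a transfer quota is used up at a sustained rate."""
    bytes_per_sec = mbit_per_sec * 1e6 / 8    # Mbit/sec -> bytes/sec
    return quota_gb * 1e9 / bytes_per_sec / 3600

at_peak = hours_to_exhaust(1000, 60)   # near the peak I saw on the box
at_cap = hours_to_exhaust(1000, 10)    # a typical 10Mbit/sec line cap

print(f"at 60 Mbit/sec: {at_peak:.0f} hours")
print(f"at 10 Mbit/sec: {at_cap / 24:.1f} days")
```

By this rough math a terabyte is gone in about a day and a half at 60Mbit/sec, and in a bit over nine days even at a 10Mbit/sec cap, which is why one popular giant file can eat a month's worth of transfer so quickly.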