Archive for amazon s3

Solving The Scaling Riddle

// September 27th, 2008 // 1 Comment » // amazon ec2, amazon s3, chesscube, Openfire, programming, XMPP

Monday is a very big day for ChessCube. We are launching version 3 of our online chess playing client. Along with a redesigned interface, we have made some significant changes under the hood.

Version 2 of ChessCube was suffering under the immense load created by the increasing popularity of the site. Under this load we were experiencing lost messages, very long login times, client interface crashes and, worst of all, the dreaded lag spikes. Lag is the amount of time a message takes to travel from the Flash client sitting in the browser to the server. A lag spike is when the overall lag jumps for all users on the system at once. After hunting down the cause of these lag spikes, we came to the conclusion that the system was unable to cope with the sheer number of messages being sent and received between clients and the server. Putting our server software on a larger server would, in light of the steady growth of the site, just buy us a few more months. So the decision was taken to cluster the server software.

The chess-playing component of ChessCube, which we call Chat internally, uses a protocol called XMPP. Now before your eyes glaze over and you go back to checking your mail, let me explain very simply what XMPP is. If you’ve ever used Google Talk or Facebook Chat, you would have used XMPP. It can simply be explained as a set of rules that allow people connected to a central server to chat with each other. For ChessCube, we chose Openfire as our XMPP server because it is Java-based (a language the team predominantly uses), easily extensible thanks to its comprehensive plugin framework, and open source.

Openfire is great for supporting a small chat community, but as soon as you need to scale above 5000 simultaneously online users it becomes very slow. This is where clustering comes in. Clustering refers to a group of computers working together concurrently to spread load across them, thereby improving performance and, in some architectures, removing a single point of failure. The company that supports Openfire does offer a clustering plugin, but it’s prohibitively expensive – charges are on a per-user basis rather than a per-server basis.

In comes our trusty homemade architecture. Since each game of chess in Chat is played in a separate room, we could move these rooms off the main Openfire server. So now we have a main Openfire server to handle all the stuff related to presence and chat, and smaller game servers that handle games. We can now have multiple game servers, all communicating with the Openfire server about the status of games being played on them, and the Openfire server in turn can distribute the games evenly over the game servers.

Distributing the load across game servers is handled with Amazon S3. Each game server writes its status to S3, and the Openfire server polls S3 to see which game servers are available and how much load each one is under. The Openfire server can then send clients to whichever server is under the least load. We can also do cool tricks like routing clients to the servers nearest to them geographically. For example, if two players from Europe want to play a game, we can put them on a server in our German data centre. Lag is minimized and everybody is happy.
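To make the selection step concrete, here is a minimal sketch of how a main server might pick a game server from polled status records. It is not our actual code: the `ServerStatus` fields, the use of active game count as the measure of load, and the region labels are all assumptions made for illustration, and the polling of S3 itself is left out (the records are just an in-memory list).

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ServerStatus:
    """One game server's self-reported status (all fields are hypothetical)."""
    name: str
    region: str         # e.g. "eu" or "us"
    active_games: int   # stand-in for "load"
    available: bool


def pick_game_server(statuses: Iterable[ServerStatus],
                     preferred_region: Optional[str] = None) -> Optional[ServerStatus]:
    """Return the least-loaded available server, preferring a region if one is given."""
    candidates = [s for s in statuses if s.available]
    if not candidates:
        return None
    if preferred_region is not None:
        nearby = [s for s in candidates if s.region == preferred_region]
        if nearby:
            # Only narrow the choice when the preferred region has a usable server.
            candidates = nearby
    return min(candidates, key=lambda s: s.active_games)


# Status records as the main server might see them after one polling pass.
polled = [
    ServerStatus("game-1", "us", active_games=120, available=True),
    ServerStatus("game-2", "eu", active_games=200, available=True),
    ServerStatus("game-3", "eu", active_games=90, available=False),
    ServerStatus("game-4", "us", active_games=45, available=True),
]

print(pick_game_server(polled).name)                         # game-4
print(pick_game_server(polled, preferred_region="eu").name)  # game-2
```

Note the trade-off in the second call: the European players land on the busier `game-2` because it is the only available server near them. Whether proximity should always beat load is a policy choice this sketch does not try to settle.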

We have also created a customized instance image for game servers on Amazon EC2. Under extraordinary load we are able to bring new game servers online and running games in a matter of minutes.

This version of Chat goes live on Monday with four game servers, running on Amazon EC2. Hope to see you there.

Working Remotely – A Programmer’s Guide

// September 18th, 2008 // 2 Comments » // amazon s3, fourhourworkweek, programming, rsync, scp, ssh, working remotely

Being confined to working in an office is not ideal, especially if you want to be more creative and have more freedom to spread your time out during the day. I have been working away from the office on Thursdays and Fridays. By “away from the office”, I mean I have been working from home.

The reasons for working from home or remotely are pretty obvious. Firstly, you don’t have to commute to work. This saves me around 1 hour of driving, not to mention a whole bunch of money every month in fuel costs. The other cool thing about working from home is that you never feel rushed to finish something; you can sit back, relax and get it done. Don’t get me wrong though – it’s not about shirking your work or being lazy. It just means you don’t feel as though you have to finish whatever you are doing so that you can hop in your car and rush home at the end of the day.

There are other good things that come from being away from the office. I have found new energy and creativity in my work because I feel more relaxed. I can also go for a run/cycle during the day if I feel I need to clear the cobwebs or if I am stuck on a problem. I also get to spend more time with my girlfriend – who works from home as well.

Today I decided to go one step further and got on a plane to Johannesburg to visit my parents for my dad’s birthday. After convincing the boss that my work would not be affected, I had to make sure I had the right tools and environment so that I was not left stranded without an Internet connection – the most dreaded of scenarios in my line of work. Here are some tips for working remotely – especially if you have a job where you rely on the Internet to get your work done.

1. Travelling

If you are travelling somewhere, try to make sure you waste as little of the middle of the day as possible. I flew at 6:15am which meant I arrived in Johannesburg at 8:10am. I could then have a shower when I got to my mom and dad’s place and start working. If you are travelling further, try to fly in the evenings.

2. Equipment

Equipment is important. If you are not prepared, you are asking for trouble. I use a Mac – but I also recommend using Linux. Why not Microsoft Windows, you ask? Well, I do a lot of work on remote servers: uploading and downloading files, copying between remote servers, and zipping and unzipping files. This is a real pain if you don’t have a bash terminal. You can use a combination of PuTTY and WinSCP on Windows, but if you take the time to learn bash commands you will save time in the long run. More on this later.

The other really essential device is a 3G-equipped cellphone with a medium-sized data plan. (This obviously depends on how long you are going away for and how much data you expect to use.) Don’t be fooled: even if you are going somewhere that has Internet connectivity, you may find that it’s either too slow to be usable or just downright unreliable. HSDPA cellphones are preferable as they have faster connection speeds.

Other things I like to take with me are a good set of earphones and a mouse. The earphones are a good idea in case you have to work in a noisy environment – airports, hotel lobbies etc. I take a mouse because I tend to get neck pain if I have to use the touchpad on my Mac for prolonged periods. I like the Logitech VX Revolution mouse. It is very comfortable and has a little compartment in which you can store the cordless receiver.

3. Software, Tools & Tips

Obviously the software you use depends on your job. I am a programmer, so I have to be able to get updates to the codebase from my teammates and be able to update them with my changes. We use Subversion for this, and there is a nifty little feature in the Subclipse plugin for Flex Builder and Eclipse whereby you can relocate your repository. This means that I can use the external Subversion URL when I am away from the office and the internal Subversion URL when I am in the office.

I also have to be able to move files around, for example copying new plugins for our server software to our test server. To copy files I use the scp command in the terminal. So if I want to copy a file to our German server, I type something like:

scp filename.txt

Copying large files repeatedly with scp can be onerous, however, not to mention expensive, so I use another command called rsync. Rsync works in much the same way as scp:

rsync filename.txt

The cool thing about rsync is that it synchronizes the files or directories you are overwriting on the remote machine and therefore copies only the differences between the two versions, making the transfer faster and smaller.
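To illustrate the idea of sending only the differences, here is a deliberately simplified sketch. It is not rsync's real algorithm, which uses rolling checksums so that it can also match data that has shifted position; this toy version only compares fixed-size blocks at the same offsets. The function names and the tiny block size are assumptions chosen to keep the example readable.

```python
import hashlib
from typing import Dict, List


def block_digests(data: bytes, block_size: int) -> List[str]:
    """Checksum each fixed-size block of data."""
    return [
        hashlib.sha256(data[start:start + block_size]).hexdigest()
        for start in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = 4) -> Dict[int, bytes]:
    """Return {block index: block bytes} for blocks of `new` that differ from `old`."""
    old_digests = block_digests(old, block_size)
    new_digests = block_digests(new, block_size)
    delta = {}
    for index, digest in enumerate(new_digests):
        # A block is sent if the old copy has no such block or its checksum differs.
        if index >= len(old_digests) or digest != old_digests[index]:
            start = index * block_size
            delta[index] = new[start:start + block_size]
    return delta


remote_copy = b"AAAABBBBCCCC"    # what the server already has
local_copy = b"AAAAXXXXCCCCDD"   # the edited file on the laptop

delta = changed_blocks(remote_copy, local_copy)
print(delta)  # {1: b'XXXX', 3: b'DD'}
print(sum(len(block) for block in delta.values()), "of", len(local_copy), "bytes sent")  # 6 of 14 bytes sent
```

Only the modified block and the newly appended tail are transferred; the two unchanged blocks never leave the laptop, which is where the savings on a slow 3G link come from.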

These are obviously very simple examples. If you check out the links, you can learn how to do more complicated tasks, like copying a file to a specific directory or getting more detailed information while copying.

How to use rsync.
How to use scp.
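As a rough sketch of those two "more complicated tasks", the helper below assembles scp and rsync command lines that copy to a specific remote directory and optionally print more detail. The user name, host and directory are placeholders, not our real servers, and the helper functions themselves are hypothetical. The flags are standard ones: `-v` for verbose output in both tools, and `-a` (archive mode), `-z` (compress in transit) and `--progress` for rsync.

```python
import shlex
from typing import List


def remote_target(user: str, host: str, directory: str) -> str:
    """Build the user@host:directory form that both scp and rsync accept."""
    return f"{user}@{host}:{directory}"


def scp_command(local_path: str, user: str, host: str, directory: str,
                verbose: bool = False) -> List[str]:
    cmd = ["scp"]
    if verbose:
        cmd.append("-v")
    return cmd + [local_path, remote_target(user, host, directory)]


def rsync_command(local_path: str, user: str, host: str, directory: str,
                  verbose: bool = False) -> List[str]:
    cmd = ["rsync", "-az"]
    if verbose:
        cmd += ["-v", "--progress"]
    return cmd + [local_path, remote_target(user, host, directory)]


# Placeholder values for illustration only.
args = ("plugin.jar", "deploy", "example.com", "/opt/plugins/")

print(shlex.join(scp_command(*args, verbose=True)))
# scp -v plugin.jar deploy@example.com:/opt/plugins/
print(shlex.join(rsync_command(*args, verbose=True)))
# rsync -az -v --progress plugin.jar deploy@example.com:/opt/plugins/
```

The lists returned here can be passed straight to `subprocess.run` if you want to script the copy; printing them first is a cheap way to check the command before it touches a server.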

Another cool tip, if you need to get a large file to your boss or a client or whoever, is to use Amazon’s S3 Service. If you have an Amazon account, this is very simple. Visit and sign up for their S3 service. S3 stands for Simple Storage Service and it’s a very easy way to upload and store files online, as well as make them accessible from the Internet. Here is a whirlwind guide on how to get going:

  1. Sign up for the Amazon S3 Service
  2. Install a Firefox plugin called S3 Organizer – if you don’t have Firefox you can get it here
  3. Open S3 Organizer from the Tools menu in Firefox and click on Manage Accounts
  4. Take the access key and secret key you received from Amazon after registration and put them in the corresponding fields in S3 Organizer
  5. On the right-hand side of S3 Organizer, click on the Create Directory button. The directory (what S3 itself calls a bucket) must have a name that is unique across the whole of Amazon S3, not just for you, so use something obscure if you need to.
  6. Select the directory you have just created and click Edit ACL. You can then share the directory you have just created based on the options provided. Sharing by e-mail address is the easiest.
  7. Now you can copy files from the left-hand window, which is your local computer, to the directory you just created in Amazon S3.
  8. Finally, for your boss, client or colleague to access the file you just uploaded, right-click the file in the right-hand window and select Copy URL to Clipboard. You can then paste the URL into an e-mail and they will be able to download it from their web browser.
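For the curious, the link you end up e-mailing generally follows a predictable pattern. The sketch below builds a URL in S3's virtual-hosted style from a bucket name and a file name. I am assuming, not asserting, that this matches what S3 Organizer copies to the clipboard; the bucket and file names are made up, and the link only works if the recipient has been given read access as in step 6.

```python
from urllib.parse import quote


def s3_object_url(bucket: str, key: str) -> str:
    """Build a virtual-hosted-style S3 URL for an object.

    `bucket` is the globally unique directory name from step 5 and `key`
    is the path of the uploaded file inside it. Characters such as spaces
    are percent-encoded; slashes in the key are kept as path separators.
    """
    return f"https://{bucket}.s3.amazonaws.com/{quote(key)}"


# Hypothetical bucket and file names, for illustration only.
print(s3_object_url("my-obscure-share-2008", "reports/build log.zip"))
# https://my-obscure-share-2008.s3.amazonaws.com/reports/build%20log.zip
```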

Bear in mind that Amazon does charge you for the amount of data you upload to and download from S3, but it’s minimal: around US$0.12 per gigabyte.

4. Instant Messenger Advice

Using an Instant Messenger is great if you need to interact with your colleagues. There are, however, downsides. It takes a lot longer to explain a concept or ask a complicated question if you have to type it out. I try to minimize the number of questions I ask and really try to solve the problem myself before asking a colleague.

Another downside is that it can be distracting if you are grappling with a difficult problem and your Instant Messenger starts going berserk with incoming messages and pinging noises. A simple fix for this is not to stay logged in all the time and to log in only at specific periods during the day.

All in all, working remotely seems to be working out well. It is taking some getting used to, but I think the benefits are worth it in the end. I have managed to find time to start this blog and also have a more regular exercise regimen. I also don’t feel stressed after spending time in traffic. If you are having trouble working out a strategy to get your boss to allow you to work from home, I highly recommend reading the chapter on it in the book The 4-Hour Workweek.