The Golden Rule
Scaling is the art of designing an infrastructure that's able to grow along with usage. Premature optimization, however, can waste valuable cycles and cripple a project's success.
What This Tutorial Doesn't Cover
All apps are different. This tutorial doesn't cover edge cases, and assumes that your app is on a LAMP (Linux + Apache + MySQL + Python/Perl/PHP) architecture. Many of these concepts will work in other environments, but I can only speak for LAMP.
If you haven't tweaked your code yet...
Head First Design Patterns
A great introduction to code architecture and common design patterns. The book's examples are in Java, but its lessons apply easily to your language of choice.
Refactoring: Improving the Design of Existing Code
If you're working from an existing code base and aren't sure where to start, Refactoring is the perfect guide. It gets a little dry in places, but even a quick glance over the book can help a great deal.
When to Scale
It is crucial that you always think ahead, so that at any given time you know, and are prepared for, the next step in scaling your app. But don't pull the trigger until you need to. Once you start noticing trouble, it's time to scale to the next level as quickly as possible. Don't wait, because scaling problems have a tendency to grow exponentially. So here's the rule: don't scale until the first signs of trouble, but then scale to the next level as quickly as possible. Always stay one step ahead (not three or four).
Step 1: Dedicated Server
If you're new to server administration, it would be wise to select a host who will help you troubleshoot web or database server configuration problems. Obviously, you will have to pay more for a host that does (this is commonly called Managed Hosting).
Another route is to use a managed grid system like Mosso or MediaTemple, but because of the lack of flexibility I'm going to assume that this is not an option for your app. If it is, by all means consider it.
For your first server, don't worry about getting one that has tons of RAM (2 GB should be enough). If data integrity is a priority, a RAID 1 hard drive config will give you added protection against hard drive failure (albeit at the cost of somewhat slower writes). More on this in a minute.
Backups
You'll want to back up everything required to rebuild your server from scratch. If you haven't started already, consider maintaining a folder on each server with archives of every software package you've installed, along with notes on how they were compiled.
Back up this folder along with any config files needed to run your system. Most hosting providers offer an off-site backup service, but if yours doesn't, consider using Amazon S3 or one of the other third-party storage providers.
Most importantly, don't forget to back up your database. If you're using InnoDB tables in MySQL, a binary file backup of your data directory is not enough. Luckily, there's an excellent Perl script to make MySQL backups completely painless. I've used this script for years and it has never let me down.
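The Perl script mentioned above isn't named here, so as a stand-in, here is a minimal Python sketch of a logical (SQL dump) backup using the standard mysqldump tool. It assumes mysqldump is installed and that credentials come from ~/.my.cnf; the backup directory and the function names are hypothetical.

```python
import datetime
import subprocess


def build_dump_command():
    """Return the mysqldump argument list for a logical backup of every database."""
    return [
        "mysqldump",
        "--single-transaction",  # consistent InnoDB snapshot without locking tables
        "--all-databases",
    ]


def backup_path(backup_dir, today):
    """Build a dated file name such as /var/backups/mysql/all-2008-01-31.sql."""
    return f"{backup_dir}/all-{today.isoformat()}.sql"


def run_backup(backup_dir="/var/backups/mysql"):  # hypothetical directory
    """Run mysqldump and write its output to a dated file.

    Credentials are assumed to come from ~/.my.cnf, not the command line.
    """
    path = backup_path(backup_dir, datetime.date.today())
    with open(path, "w") as out:
        subprocess.run(build_dump_command(), stdout=out, check=True)
    return path
```

Calling run_backup() from a nightly cron job, and then copying the resulting file off-site, would cover the database part of the backup plan described above.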
A Quick Intro to RAID
At Squidoo, we've found that the sweet spot is RAID 5 for our application servers and RAID 10 for our database servers.
- RAID 0 is all about performance. Data is striped, or partitioned, across two or more drives. Because each drive holds only part of the data, reads and writes are spread across the drives in parallel. Available storage is 100% of total hard drive capacity, but there is no redundancy: if one drive fails, the whole array is lost.
- RAID 1 is about redundancy. No matter what your situation, you should be taking data backups daily. But what happens when your hard disk fails an hour before your next backup is scheduled? With RAID 1, your data is mirrored on a second drive. Although you only get the storage capacity of a single drive, RAID 1 is crucial for data integrity. RAID 1 can slow down writes because every write must go to both drives.
- RAID 1 + 0, or RAID 10 as it is commonly known, is a combination of the above configurations. It offers the redundancy of mirroring without the performance penalty. RAID 10 is also the most expensive to implement because it requires at least 4 drives. Available storage is only 50% of total capacity.
- When you can't afford the four drives required for RAID 10, a RAID 5 configuration (which needs at least three drives) affords you limited fault tolerance and decent performance: the array survives the loss of any one drive. RAID 5 gives you (size of smallest drive * (number of drives - 1)) of usable storage.
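The storage figures in the list above can be worked out with a few lines of arithmetic. This is a rough sketch only: the function name is made up for illustration, and every level is sized from the smallest drive, which matches the percentages above when all drives are the same size.

```python
def usable_capacity(level, drive_sizes_gb):
    """Return usable storage in GB for a RAID level and a list of drive sizes."""
    count = len(drive_sizes_gb)
    smallest = min(drive_sizes_gb)

    if level == 0:    # striping: every drive contributes, no redundancy
        minimum, usable = 2, smallest * count
    elif level == 1:  # mirroring: capacity of a single drive
        minimum, usable = 2, smallest
    elif level == 10: # striped mirrors: half of the drives hold copies
        minimum, usable = 4, smallest * (count // 2)
    elif level == 5:  # one drive's worth of space goes to parity
        minimum, usable = 3, smallest * (count - 1)
    else:
        raise ValueError(f"unsupported RAID level: {level}")

    if count < minimum:
        raise ValueError(f"RAID {level} needs at least {minimum} drives")
    return usable


# Four 500 GB drives under each layout that supports them
for level in (0, 10, 5):
    print(level, usable_capacity(level, [500, 500, 500, 500]))
```

With four 500 GB drives, RAID 0 exposes all 2000 GB, RAID 10 exposes 1000 GB, and RAID 5 exposes 1500 GB.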
Step 2: Database Server
Once your dedicated server starts seeing performance bottlenecks, it might be time to configure a separate database server. This will allow you to tune each of your servers for their respective tasks.
The database server should be more powerful than your web server, as it is one of the more difficult elements to change later on, requiring you to take your entire site down. Lots of RAM and a RAID 10 hard drive configuration are desirable.
RAID 10 gives you the data integrity benefits without as much performance sacrifice as a RAID 1 by itself. See the RAID section above for more details.
No matter what hard disk configuration you choose, make sure you give yourself enough storage capacity to grow for a while. Migrating to a new array of hard disks is no fun.
Step 3: Server Tuning
Now that you've got two servers, it's time to tweak them for the specific tasks they were born to do.
Run the command 'ps aux' and pay attention to any non-essential applications lurking on your servers. For your web server, this should be anything not related to basic system function, security, Apache, or mail. For the database server, it should be anything not related to basic system function, security, or MySQL.
The startup scripts for these applications are generally located in the /etc/init.d directory. Stop the program by running '/etc/init.d/[appname] stop'. Then delete the symbolic link to it in the /etc/rc.d/rc3.d directory to prevent it from starting up again the next time the system boots. You can quickly identify which is the symbolic link by running 'ls -al /etc/rc.d/rc3.d'.
Disabling unused applications frees up RAM (which we'll need in just a second) and might even make your server more secure.
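If the 'ps aux' listing is long, a small script can do the first pass of spotting candidates. The sketch below is illustrative only: the allowlist of essential program names is hypothetical and needs adjusting for your own system, and anything the script flags should be checked by hand before it is stopped.

```python
import subprocess

# Hypothetical allowlist for a web server; adjust it to your own system.
ESSENTIAL = {"init", "sshd", "syslogd", "crond", "httpd", "sendmail"}


def non_essential(ps_output, essential=ESSENTIAL):
    """Return the sorted program names in `ps aux` output that are not allowlisted."""
    found = set()
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 10)         # the 11th column is the full command
        if len(parts) < 11:
            continue
        program = parts[10].split()[0]       # drop the arguments
        if program.startswith("["):          # kernel threads, not applications
            continue
        name = program.rsplit("/", 1)[-1].rstrip(":")  # drop the directory part
        if name not in essential:
            found.add(name)
    return sorted(found)


def scan_this_machine():
    """Run `ps aux` locally and report programs that are not on the allowlist."""
    result = subprocess.run(["ps", "aux"], capture_output=True, text=True, check=True)
    return non_essential(result.stdout)
```

Calling scan_this_machine() on each server gives a short list of names to look up in /etc/init.d.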
Next, tune Apache on your web server (psst, by now Apache shouldn't be running on your database server at all!). Begin by opening the httpd.conf file and commenting out all non-essential Apache modules. If you're unfamiliar with a particular module, try Googling it. If you're still unsure, disable modules one at a time rather than all at once. Since your web server is now left alone to perform one primary task, raise the number of Apache servers started (StartServers) and the minimum number of spare servers (MinSpareServers). This will significantly increase RAM usage, but hopefully you were able to free some up above.
You can often make your web site quicker by enabling gzip compression. Just about all modern browsers accept compressed responses, so the web server can compress its output and send the smaller version to the browser. This results in faster download times for your users, and saves bandwidth on your end. There's an excellent article on enabling gzip compression with mod_deflate on HowToForge.
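To get a feel for why this helps, here is a small Python sketch that gzips a chunk of repetitive markup and compares sizes. It only illustrates the compression itself; in production, Apache's mod_deflate does this work for you. The sample markup and the function name are made up for the example.

```python
import gzip


def compression_savings(text):
    """Return (original size, gzipped size) in bytes for a piece of text."""
    raw = text.encode("utf-8")
    packed = gzip.compress(raw)
    return len(raw), len(packed)


# Markup tends to be repetitive, which is exactly what gzip is good at.
page = "<li class='item'><a href='/lens/example'>Example lens</a></li>\n" * 200
raw_size, gzip_size = compression_savings(page)
print(f"{raw_size} bytes before, {gzip_size} bytes after gzip")
```

Text-based files such as HTML, CSS, and JavaScript usually shrink a lot; images that are already compressed (JPEG, PNG, GIF) gain little, which is why compression is normally enabled for text types only.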
Step 4: Static File Hosting
The Ins and Outs of KeepAlives
Another thing to keep in mind is Apache KeepAlives. The Apache KeepAlive setting spares Apache servers by keeping a single connection open for a browser while it downloads all the external images, CSS, JavaScript, and other static content associated with a page. Ordinarily this is great for performance, but in a high volume environment it can become tricky. Here's the situation.
When a surfer visits your app in a web browser, an Apache instance is started to fulfill the request. While processing the request, the Apache instance's memory usage grows to the size demanded by your application (let's say 300 virtual MB, for example). Once the main app is rendered, the instance's memory usage does not shrink, however. It will stay the same size, or even grow, as long as the connection is open, which results in Apache allocating 10 times (or more) the amount of memory required to serve the miscellaneous static content associated with your app.
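A back-of-the-envelope calculation shows why this matters. The sketch below reuses the figures from this guide (a 2 GB server and 300 MB per Apache instance) and, as a simplifying assumption, treats that 300 MB as memory each instance really occupies. The 30 MB figure for a static-only process is an assumption derived from the "10 times" estimate above, not a measurement.

```python
def max_processes(available_mb, mb_per_process):
    """Return how many server processes fit in the available memory."""
    return available_mb // mb_per_process


RAM_MB = 2048            # the 2 GB server suggested earlier
APP_PROCESS_MB = 300     # an Apache instance grown to application size
STATIC_PROCESS_MB = 30   # assumed: one tenth of that for static files only

print(max_processes(RAM_MB, APP_PROCESS_MB))     # processes that fit at app size
print(max_processes(RAM_MB, STATIC_PROCESS_MB))  # processes that fit at static size
```

With these numbers only 6 application-sized instances fit in memory, against 68 static-sized ones, so every full-size instance tied up serving images is an expensive one.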
The best way around this is to use a special lightweight web server dedicated to hosting only static content. lighttpd (pronounced "lighty") works great for this.
Configure the static server to run on a different port, or on another physical machine altogether. Create a subdomain like static.yourdomain.com and use it to host all your CSS includes, JavaScript source files, and images. Make sure this new server is configured to use gzip compression for all text-based files.
Finally, edit your Apache's httpd.conf file and disable KeepAlives for your primary web server. Then monitor the logs to ensure that Apache is no longer serving up static files.
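One way to do that log check is a short script that scans the Apache access log for requests ending in static file extensions. This is a sketch under assumptions: it expects the common or combined log format, the extension list is only a starting point, and the log path in the usage note is hypothetical.

```python
import re
from urllib.parse import urlsplit

# Assumed list of static extensions; extend it to match your own site.
STATIC_EXTENSIONS = (".css", ".js", ".png", ".jpg", ".jpeg", ".gif", ".ico")

# Matches the quoted request in a common/combined format log line.
REQUEST = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+"')


def static_hits(log_lines):
    """Return the paths of static files that Apache is still serving."""
    hits = []
    for line in log_lines:
        match = REQUEST.search(line)
        if not match:
            continue
        path = urlsplit(match.group(1)).path.lower()  # ignore any query string
        if path.endswith(STATIC_EXTENSIONS):
            hits.append(path)
    return hits
```

Because static_hits accepts any iterable of lines, an open file works directly, for example `with open("/var/log/httpd/access_log") as log: print(static_hits(log))`. An empty list means the primary server is no longer handing out static files.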
Step 5: Caching
From now on, almost all of your scaling problems will be database-related. Since most applications have a high volume of reads compared to writes, you can usually relieve most of the strain on your database using an object cache like Memcache. Once integrated with your app, Memcache can keep your most commonly accessed data right in memory, where it can be retrieved much faster and with less overhead.
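The usual way to put an object cache in front of a database is to check the cache first and only query the database on a miss. The sketch below shows that pattern in Python. TinyCache is a hypothetical in-process stand-in for a real Memcache client, and cached_fetch and the "user:7" key are made up for illustration.

```python
import time


class TinyCache:
    """In-process stand-in for a Memcache client, with get/set and expiry."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:  # entry is stale, drop it
            del self._items[key]
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._items[key] = (value, time.monotonic() + ttl_seconds)


def cached_fetch(cache, key, load_from_db, ttl_seconds=300):
    """Return the cached value, querying the database only on a cache miss."""
    value = cache.get(key)
    if value is None:
        value = load_from_db()              # the expensive read
        cache.set(key, value, ttl_seconds)  # keep it for later requests
    return value


# Usage: the loader runs once, and the second call is served from memory.
calls = []


def load_user():
    calls.append("query")
    return {"id": 7, "name": "example"}


cache = TinyCache()
first = cached_fetch(cache, "user:7", load_user)
second = cached_fetch(cache, "user:7", load_user)
```

The expiry time is the main design choice here: a longer one takes more load off the database but lets readers see older data, so writes that must show up immediately should also update or delete the cached key.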
All Apologies
This lens isn't quite done yet, but I promise to finish it soon. Thanks for taking the time to read it so far, and please let me know if you have any questions.
Comments
- awilensky, Jan 31, 2008 @ 9:17 pm
- You hardly touched upon MySql write saturation, which is a vexing problem that has never been solved, even for clusters.
- YourBuddy, Jan 29, 2008 @ 12:00 pm
- Great info written in simple words. I like such style:)
Concerning the growing of S, it seems that in the next two years you should look into the area of distributed maps like amazon s3:)
Thank you for the story. It's very pleasant to read.
- pacsafe, Jan 24, 2008 @ 1:56 am
- Thanks that was a good read and will come in very useful
- Music-Resource, Jan 21, 2008 @ 9:22 pm
- Hi Gil, I'm really into systems analysis and design. I'm also into project management. I love to plan things - trees and forest - carefully and in detail ahead of coding. I like best coding practices and similar. Very kewl Scalability lens. Excellent read. ~Music Resource~
- charlino, Oct 10, 2007 @ 1:02 pm
- You get 5 stars from me. Wonderful teaching guide.
- jackclee, May 17, 2007 @ 2:55 pm
- Great info here into the inner workings of Squidoo. Perhaps you could write a squidbook on this topic and other Squidoo technicals. Come and join our group -
Squidbook group
