How We Engineered Squidoo

About Squidoo

Founded in mid 2005, Squidoo launched in early 2006 and quickly gained traction as a platform for sharing ideas online. On Squidoo, everyday experts (yes, that means you) can handcraft individual web pages (just to be difficult, we call them lenses) about hobbies and interests and stuff for sale and ideas worth spreading.

Compete recently reported that Squidoo was the 14th fastest growing website during 2007 in the entire world. According to Alexa, Squidoo has ranked in the top 1,000 most visited sites on the web for at least the past year. Quantcast currently ranks us in the mid 300s.

As of January 2008 Squidoo has over 150,000 users authoring almost 400,000 pages. We currently see over 7 million monthly visits, and at least twice as many pageviews. Factoring in Ajax, RSS, and API calls, our servers currently process about 45 application requests per second.

With just a tiny full-time team of five, not to mention all our volunteers and Citizens, we think it's pretty incredible to keep such a high-traffic site online while also managing to release new features and bug fixes on a regular basis. And so I thought it'd be interesting to share a little background on our engineering efforts, the technical challenges we've faced, and how we got to where we are today. Enjoy!

The Starting Point 

Summer, 2005

Here we were—sitting in a small, sparsely furnished room, discussing how we were going to build a new platform (and ultimately, an ecosystem, a community) which we dreamed would change the way people find things online.

From the beginning we knew the traditional approach wouldn't work. We had a tight schedule, yet we had no VC money or plans for hiring a massive staff. How would we accomplish such lofty goals with such a tiny budget?

Where It All Started 

Our hip, modern office is located just a few feet away from the mighty Hudson River, 10 short miles from New York City in a 100% authentic town called Irvington, NY.

It's named after Washington Irving, although I try not to stick around late enough at night to find out if there really is a headless horseman.

Choosing a Platform 

Squidoo's first stage of development began in the fall of 2005. At the time, Ruby on Rails was starting to emerge as a dominant force in what was only recently being dubbed Web 2.0. Although we considered using Rails, it was still a relatively immature and unproven technology platform (boy, a lot has changed since then), and we chose PHP in its place. Since PHP goes hand in hand with Linux, Apache, and MySQL, adopting the entire LAMP platform was a no-brainer.

Choosing PHP—with its extensive documentation, ubiquitous server support, abundance of third party libraries and API wrappers, and optimizable speed—was a reasonable decision. PHP has its detractors, but with the right architecture and proper discipline in place, it has no shortage of potential for driving high-powered web applications.

Speaking of architecture, ours might appear very familiar to a Rails developer—featuring a clear distinction between actions, models, helpers, and so on. We even have our own dispatcher, which came in handy when one of our main requirements was designing a human-friendly URL structure.
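
To make the dispatcher idea concrete, here is a minimal sketch of how human-friendly URLs can be mapped to a controller and an action. It is written in Python rather than PHP, and it is not Squidoo's actual code: the route patterns, controller names, and the `dispatch` function are all assumptions made up for illustration.

```python
import re

# Hypothetical route table: (pattern, controller, action).
# None of these names come from the real Squidoo codebase.
ROUTES = [
    (re.compile(r"^/$"), "home", "index"),
    (re.compile(r"^/lensmasters/(?P<username>[\w-]+)$"), "profile", "show"),
    (re.compile(r"^/(?P<lens>[\w-]+)$"), "lens", "show"),
]


def dispatch(path):
    """Return (controller, action, params) for a URL path, or None if nothing matches."""
    for pattern, controller, action in ROUTES:
        match = pattern.match(path)
        if match:
            # Named groups in the pattern become the parameters for the action.
            return controller, action, match.groupdict()
    return None


print(dispatch("/knitting-basics"))
print(dispatch("/lensmasters/gil"))
```

Routes are checked in order, so more specific patterns go first and the catch-all lens pattern goes last. That ordering is what lets a short URL with nothing but a lens name in it work.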

Techies Take Note

This lens is just a story. If you're looking for a more detailed account of the technical stuff, check out my Scaling a Web App lens. Sorry, it's not quite done yet, but I promise to finish it soon.

Beginning Development 

Setting up a team

To keep our staff as lean as possible, we considered several options, ultimately leading to a partnership with Viget Labs, who helped us build version 1.0 of Squidoo.

Choosing between in-house employees and an outside consulting firm can be difficult for any startup. Hiring employees can be a great (and sometimes inexpensive) way to ensure dedication to your project, but the security of fixed-cost development with no long-term employee commitments proved to be the right choice for us.

Working closely with Viget's designers and developers, we created and managed a constantly evolving wiki that served as our team's development guidelines. The wiki actually turned out to be a great asset during development, allowing our team to iterate through new design ideas without wasting valuable time reworking a more formal specifications document. The same wiki system is still in use today, and remains a hub for our project planning and software specs.

We quickly established a habit of two-week development iterations. On every third week our aim was to lock down a feature list for the next iteration as quickly as possible, reserving the latter part of the week for deploying updates, testing code, and fixing bugs. This approach allowed us to prototype new features and release them into the wild with lightning speed.

Servers, servers, servers! 

We couldn't afford to hire a dedicated server administrator (or two). In startup mode most employees wear multiple hats, and my role included not only development, but designing and configuring our server installation as well.

While I had maintained my own servers for years, I had relatively little experience with high traffic, high availability server clusters. To be on the safe side, we figured it wise to find a partner who could help fill in the gaps when it came to tough technical issues. That partner became Rackspace, a managed hosting provider well-respected for their focus on "fanatical support". Although their services come at a significant premium, Rackspace has since helped us navigate effortlessly through a number of technical hurdles.

With several high profile launch announcements, we expected traffic to grow quickly. To make sure we were properly prepared, our initial setup included a four server configuration: 1 staging server, 2 web/app servers, and 1 database server. Although a software-based load balancer was considerably cheaper, we chose a Cisco hardware appliance to split the load between our web servers—wary of the downtime required during the inevitable upgrade from a software to hardware-based solution once traffic demanded it.

In retrospect, configuring two web servers may have been borderline premature optimization. Although our usage quickly grew to full capacity on both servers, synchronizing the two systems was tedious and error-prone.

With two web servers, we began running into all sorts of filesystem-related issues. We installed Unison (a two-way rsync) to make sure the avatars our forum users uploaded were available on both servers, and we invested in Zend Platform for multi-server persistent sessions, code acceleration, and aggregated error logs. Unfortunately, in a totally dynamic, user-centric application, the content caching feature of this commercial product isn't quite as useful.

Launching Our Beta 

December 2005

I screwed up. After our first few encounters with the blogosphere, we managed to compile a list of enough interested participants to begin a limited beta. Sure, our product was still a little buggy, and definitely needed some improvement, but it was finally ready to be tested by real, live humans. The mailing list was compiled, the email copy was written, and the beta invitations were sent....three times!

It was as public a statement as we could make that what we were offering was truly a beta product. And although we quickly learned from the mistake and added "take more care" as our mantra for potentially dangerous tasks, to this day I still haven't recovered from the anxiety of emailing large groups of people. If you've ever wished you could take back an email you wrote, you know what I mean. Daily Candy and Photojojo, I don't know how you do it.

A full page spread in the local paper!

Making the Transition 

Finally, in March 2006, our beta was complete. The world was ours—to infinity and beyond we marched!

Our contract with Viget had just ended, and it was a difficult time for us. Suddenly it was just down to one designer (Corey) and developer (myself). Our platform wasn't anywhere near complete, and we had ginormous goals to extend it with a plethora of new features.

Although I worked closely with Viget while developing our beta, there was still a significant learning curve in getting up to speed with the system (at that point it was approaching 50,000 lines of code). Luckily, their team was very supportive during this process.

Hat tip to Ben for putting up with my hourly IM conversations, which usually went something like "So I'm building an Ajax call for saving tags from the profile page. Which controller do I put that in again?" Frustrating for him, but it didn't take me long to get the hang of things.

The larger issue was how to balance new development with resolving bug reports, responding to our community, recruiting freelance developers (for smaller projects), and, last but not least, marketing!

We struggled initially, but over time we were able to find a development process that worked for us, and things were just peachy...for a while.

Server Gremlins 

One day, out of the blue, all of our servers just crashed. I couldn't log in from the command prompt, so we called Rackspace to manually reboot them. Once rebooted, the site ran normally, and I could find no trace of what happened.

Everything was fine for a few days, but then it happened again. And again. And again. It seemed we could only keep our servers online for a few hours at a time without requiring a hard reboot. At this point, it's safe to say I was seriously considering the benefits and lack of responsibility that come along with delivering pizzas for a living.

We soldiered on, and with a few days of effort and the help of a guru from Rackspace, we were able to trace the problem. A few new Squidooers had unwittingly added RSS modules which linked to the RSS feeds of their own lenses. When a surfer visited one of these corrupt lenses, the RSS module would update by visiting the lens, which would update the RSS module, which would visit the lens, which would update the RSS module... OK, I'm going to stop now, because all this recursion is making me dizzy.

Once we found the culprit it was an easy fix. The RSS module now carefully screens for self-referencing feeds, preventing the same disaster and keeping the servers online so Squidooers can do their thing.
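
A check for self-referencing feeds can be sketched in a few lines. The version below is in Python rather than PHP and is only an illustration, not the screening code Squidoo runs. The `normalize` and `is_self_referencing` helpers and the example URLs are assumptions.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to (host, path) so trivial variations compare equal."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host, parts.path.rstrip("/").lower()


def is_self_referencing(feed_url, own_feed_urls):
    """True if feed_url points at one of the lens's own feeds."""
    target = normalize(feed_url)
    return any(normalize(own) == target for own in own_feed_urls)


# Hypothetical feed URLs, for illustration only.
own_feeds = ["http://example.com/my-lens/feed"]
print(is_self_referencing("http://WWW.example.com/my-lens/feed/", own_feeds))
print(is_self_referencing("http://example.com/other-lens/feed", own_feeds))
```

A comparison like this only catches a lens pointing directly at itself. Two lenses that subscribe to each other would still form a loop, so a real fix would likely also serve the feed without triggering module updates, or cap how deep a chain of updates can go.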

Like a Slug 

Our luck was turning, and the Squidoo platform was picking up steam. We'd managed to develop quite an extensive list of features (considering the size of our team), and traffic was growing exponentially. But with more features and more users came more congestion. Our database, which was growing at about a gigabyte every week, soon started to lag.

Although we had already spent some time optimizing the database layer (by being picky about the data we selected, creating indexes, etc.), the database just couldn't keep up. Squidoo was crawling. We weren't database experts, but once again Rackspace came to the rescue. It turns out that even though we had a dedicated MySQL server, only a small portion of its available resources was being used.

With a few quick tweaks (to the key buffer, InnoDB buffer, table cache, and a few others), Squidoo was flying again. Even better, our database server was a good sport and didn't even complain about the extra load. Thanks, db1!

Squidoo accepts an award at SXSW 2007

Spam: The Internet's Worst Nightmare 

By June 2007, fresh from winning an award for "best community site" at SXSW, Squidoo was in great shape. Our traffic was continuing to grow, and, if I got lucky, at least one person at every party I went to had heard of us (note to new startup founders: perfect your elevator pitch, because you're going to be using it a lot).

Things were going so well, in fact, that spammers started to notice. Within the course of a few short weeks, an enterprising spammer began massively polluting the blogosphere with links pointing to their Squidoo lenses. The spammer exploited the fact that we allowed <iframe> tags for more flexibility. In this case it proved too flexible, and the effect was that clicking on a link from a blog would bring you to a Squidoo lens, but immediately redirect you to a site that hosted—you guessed it—pornography.

The method used by the spammer made it appear that Squidoo was somehow affiliated, and shockwaves rippled throughout the blogosphere. We got a fair amount of heat, but did our best to remedy the situation as quickly as possible. Soon after, I posted followup notes explaining our position on spam and the new steps we were taking to keep Squidoo spam free. But the damage was done.

Lesson learned. Spam is a major problem on the internet, and in an age where reputation is everything, organizations can't afford not to have an aggressive anti-spam policy.
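
For the <iframe> hole described above, one way to close it is to strip the tag from user-supplied HTML before it is saved. The sketch below uses Python's standard `html.parser` module. It is an illustration under assumptions, not the filter Squidoo deployed, and the `IframeStripper` class and `strip_iframes` function are hypothetical names.

```python
from html.parser import HTMLParser

BLOCKED_TAGS = {"iframe"}  # assumption: only the tag named in the story


class IframeStripper(HTMLParser):
    """Rebuild the input HTML, dropping blocked tags and everything inside them."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.blocked_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in BLOCKED_TAGS:
            self.blocked_depth += 1
        elif self.blocked_depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> never open a blocked region.
        if tag not in BLOCKED_TAGS and self.blocked_depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in BLOCKED_TAGS:
            self.blocked_depth = max(0, self.blocked_depth - 1)
        elif self.blocked_depth == 0:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if self.blocked_depth == 0:
            self.out.append(data)

    def handle_entityref(self, name):
        if self.blocked_depth == 0:
            self.out.append("&%s;" % name)

    def handle_charref(self, name):
        if self.blocked_depth == 0:
            self.out.append("&#%s;" % name)


def strip_iframes(html_text):
    parser = IframeStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


print(strip_iframes('<p>Hello</p><iframe src="http://spam.example/"></iframe><b>World</b>'))
```

An unclosed <iframe> causes everything after it to be dropped, which errs on the side of safety. Blocking a single tag is still a weak defense. A production filter would more likely allow only a known-good list of tags and attributes.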

How We Handle Customer Service 

With so many users, it's only natural that customer service is going to be a big issue.

On the technical end, more users means more opportunities for running into obscure bugs, and when major ones hit, responding can take a long time.

Our "release early and often" philosophy means that we get plenty of feedback from users on new features (sometimes within minutes of deployment).

In addition, keeping spam at bay is a major concern. Although we have a number of automated tools designed to block spam before it starts, we also rely on the community to report suspect activity.

All this leads to lots and lots of feedback. To save our sanity, all feedback is filtered through FogBugz. There, our wonderfully dedicated customer service supervisor handles common customer service issues, then filters and prioritizes the remaining bug reports before sending them to our development team. We also have a dedicated anti-spam editor who reviews spam reports (and other suspicious activity) and takes action as necessary to keep Squidoo a safe, well-lighted community.

Breaking It Down 

The key to creating a scalable application is designing an architecture that can be broken down into bite-sized chunks.

When our application servers started working overtime, we offloaded images and other static content to a lightweight web server (lighttpd) and optimized it for caching. No longer having to serve up a dozen or more static files with each request, our app servers gained a little breathing room.

When our JavaScript codebase kept growing, we minified it into one tiny, compressed file. Now there are fewer web requests and less code to download.

When our database started slowing down, we installed memcached and scrutinized our model classes to cache everything possible. The result was a faster Squidoo, and a happier db1!
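
The caching step follows a common pattern: check the cache first, fall back to the database on a miss, and throw away the cached copy whenever the record changes. Here is a minimal Python sketch of that pattern. It is not Squidoo's model code. `InMemoryCache` is a stand-in for a real memcached client, and `LensModel`, the key format, and the 300-second lifetime are all assumptions.

```python
import time


class InMemoryCache:
    """Stand-in for a memcached client: get, set, and delete with an optional TTL."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._store[key]  # expired, so treat it as a miss
            return None
        return value

    def set(self, key, value, ttl=None):
        expires_at = time.monotonic() + ttl if ttl else None
        self._store[key] = (value, expires_at)

    def delete(self, key):
        self._store.pop(key, None)


class LensModel:
    """Hypothetical model class; a plain dict plays the role of the database."""

    def __init__(self, cache, db):
        self.cache = cache
        self.db = db
        self.db_reads = 0  # counts how often we had to hit the "database"

    def find(self, lens_id):
        key = "lens:%s" % lens_id
        cached = self.cache.get(key)
        if cached is not None:
            return cached
        self.db_reads += 1
        row = self.db.get(lens_id)
        if row is not None:
            self.cache.set(key, row, ttl=300)
        return row

    def update(self, lens_id, fields):
        self.db[lens_id] = {**self.db.get(lens_id, {}), **fields}
        # Invalidate so the next read picks up the fresh row.
        self.cache.delete("lens:%s" % lens_id)


model = LensModel(InMemoryCache(), {1: {"title": "Knitting Basics"}})
model.find(1)
model.find(1)
print(model.db_reads)
```

The second `find` call is served from the cache, so the database is read only once. Deleting the key on update, rather than rewriting it, keeps the write path simple at the cost of one extra database read afterward.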

Where We're At Today 

Although some of the challenges chronicled above seem quite scary, we've learned a lot along the way and have had virtually no downtime in the past 6 months. We continue to release new features on a regular basis, and do our best to extinguish bugs as soon as we hear about them.

Our sister site, SquidU, features a lively discussion forum where lensmasters share tips, get help with common issues, and keep up to date with the newest features. The forum has over 70,000 posts and grows by a few hundred posts each day.

We have a number of exciting new projects just ahead, and we're currently looking for a new developer to help us make things happen.

Now Accepting Your Feedback 

Did I miss something? If you have any questions or comments please feel free to comment below.

by giltotherescue

Gil Hildebrand, Jr. is an experienced software developer based in New York City. He is currently running things as the Chief Engineer of Squidoo, and...