The Amazing Adventures of Building Squidoo

About Squidoo

Founded in mid-2005, Squidoo quickly gained traction as a platform for sharing ideas online. Squidoo makes it easy to share your passions: hobbies, interests, stuff for sale, and ideas worth spreading. As of January 2012 we're a Quantcast Top 100 site reaching over 1.5 million people every day.

With just a tiny full-time engineering team of three, we think it's pretty incredible to keep such a high-traffic site online while also managing to release new features and bug fixes on a regular basis. And so I thought it'd be interesting to share a little background on our engineering efforts, the technical challenges we've faced, and how we got to where we are today. Enjoy!

The Starting Point

Summer, 2005

Here we were—sitting in a small, sparsely furnished room, discussing how we were going to build a new platform (and ultimately, an ecosystem, a community) which we dreamed would change the way people find things online.

From the beginning we knew the traditional approach wouldn't work. We had a tight schedule, yet we had no VC money or plans for hiring a massive staff. How would we accomplish such lofty goals with such a tiny budget?

Choosing a Platform

Squidoo's first stage of development began in the fall of 2005. At the time, Ruby on Rails was starting to emerge as a dominant force in what was only recently being dubbed Web 2.0. Although we considered using Rails, it was still a relatively immature and unproven technology platform (boy, a lot has changed since then). We chose PHP in its place. Since PHP goes hand in hand with Linux, Apache, and MySQL, adopting the entire LAMP platform was a no-brainer.

PHP was a reasonable choice: it has extensive documentation, ubiquitous server support, an abundance of third-party libraries and API wrappers, and optimizable speed. PHP has its detractors, but with the right architecture and proper discipline in place, it has no shortage of potential for driving high-powered web applications (just ask Facebook).

CakePHP, Zend, and other PHP MVC frameworks were just a glimmer in a developer's eye when we began building Squidoo, so we had to roll our own. Today our custom framework supports all of the features a modern MVC framework should: routing, logging, caching, APIs, robust templating, etc.

Diagram of MVC (Model View Controller) architecture

Beginning Development

Setting up a team

To keep our staff as lean as possible, we considered several options, ultimately leading to a partnership with Viget Labs, who helped us build version 1.0 of Squidoo.

Choosing between hiring in-house employees and engaging an outside consulting firm can be difficult for any startup. On one hand, hiring employees can be a great (and sometimes inexpensive) way to ensure dedication to your project. On the other, the security of fixed-cost development and no long-term employee commitments proved to be the right choice for us.

Working closely with Viget's designers and developers, we created and managed a constantly evolving wiki that served as our team's development guidelines. The wiki technology actually turned out to be a great asset during development, allowing our team to iterate through new design ideas without wasting valuable time reconstructing a more formal specifications document. The same wiki system is still in use today, and remains a hub for our project planning and software specs.

We quickly established a habit of two-week development iterations. On every third week our aim was to lock down a feature list for the next iteration as quickly as possible, reserving the latter part of the week for deploying updates, testing code, and fixing bugs. This approach allowed us to prototype new features and release them into the wild with lightning speed.

Diagram of what a two-week iteration looks like

Servers, servers, servers

In startup mode most employees wear multiple hats, and my role included not only development, but designing and configuring our server installation as well.

While I had maintained my own servers for years, I had relatively little experience with high-traffic, high-availability server clusters. To be on the safe side, we figured it wise to find a partner who could help fill in the gaps when it came to tough technical issues. That partner became Rackspace, a managed hosting provider well respected for their focus on "fanatical support". Although their services come at a significant premium, Rackspace helped us navigate effortlessly through a number of technical hurdles.

With several high-profile launch announcements, we expected traffic to grow quickly. To make sure we were properly prepared, our initial setup was a four-server configuration: 1 staging server, 2 web/app servers, and 1 database server. Although a software-based load balancer was considerably cheaper, we chose a Cisco hardware appliance to split the load between our web servers, wary of the downtime required during the inevitable upgrade from a software-based to a hardware-based solution once traffic demanded it.

In retrospect, configuring two web servers may have been premature optimization. Although our usage quickly grew to full capacity on both servers, the effort involved in synchronizing the two systems was tedious and error-prone.

With two web servers, we began running into all sorts of filesystem-related issues. We installed Unison (a two-way rsync) to synchronize our session data and lensmaster-uploaded images.

Launching Our Beta

December 2005

I screwed up. After our first few encounters with the blogosphere, we managed to compile a list of enough interested participants to begin a limited beta. Sure, our product was still a little buggy, and definitely needed some improvement, but it was finally ready to be tested by real, live humans. The mailing list was compiled, the email copy was written, and the beta invitations were sent... three times.

It was as public a statement as we could make that what we were offering was truly a beta product. And although we quickly learned from the mistake and added "take more care" as our mantra for potentially dangerous tasks, to this day I still haven't recovered from the anxiety of emailing large groups of people. If you've ever wished you could take back an email you wrote, you know what I mean. Daily Candy and Photojojo, I don't know how you do it.

A full-page spread in the local paper

Making the Transition

Finally, in March 2006, our beta was complete. The world was ours.

Our contract with Viget had just ended, and it was a difficult time for us. Suddenly it was just down to one designer (Corey) and one developer (me). Our platform wasn't anywhere near complete, and we had enormous goals to extend it with a plethora of new features.

Although I worked closely with Viget while developing our beta, there was still a significant learning curve in getting up to speed with the system (at that point it was approaching 50,000 lines of code). Luckily, their team was very supportive during this process.

Hat tip to Ben for putting up with my hourly IM conversations, which usually went something like "So I'm building an Ajax call for saving tags from the profile page. Which controller do I put that in again?" Frustrating for him, but it didn't take me long to get the hang of things.

The larger issue was how to balance new development with resolving bug reports, responding to our community, recruiting freelance developers (for smaller projects), and, last but not least, marketing!

We struggled initially, but over time we were able to find a development process that worked for us, and things were just peachy...for a while.

Server Gremlins

One day, out of the blue, all of our servers just crashed. I couldn't log in from the command prompt, and we called Rackspace to manually reboot them. Once rebooted, the site ran normally, and I could find no trace of what happened.

Everything was fine for a few days, but then it happened again. And again. And again. It seemed we could only keep our servers online for a few hours at a time without requiring a hard reboot. At this point, it's safe to say I was seriously considering the benefits and lack of responsibility that come along with delivering pizzas for a living.

We soldiered on, and with a few days of effort and the help of a guru from Rackspace, we were able to trace the problem. A few new Squidooers had unwittingly added RSS modules which linked to the RSS feeds of their own lenses. When a surfer visited one of these corrupt lenses, the RSS module would update by visiting the lens, which would update the RSS module, which would visit the lens, which would update the RSS module... OK, I'm going to stop now, because all this recursion is making me dizzy.

Once we found the culprit it was an easy fix. The RSS module now carefully screens for self-referencing feeds, preventing the same disaster and keeping the servers online so lensmasters can do their thing.
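To make the idea of screening for self-referencing feeds concrete, here is a minimal sketch in Python (not Squidoo's actual code, which is PHP). The URL layout, with a lens's feed living underneath the lens's own path, is an assumption made for illustration, and so are the function names.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to (host, path) so trivially different spellings compare equal."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/").lower()
    return host, path


def is_self_referencing(feed_url, lens_url):
    """Return True if the feed appears to belong to the lens that embeds it.

    Assumption: a lens's own feed lives at the lens URL or somewhere below it
    (for example /mylens/rss). A real site would match its own URL scheme.
    """
    feed_host, feed_path = normalize(feed_url)
    lens_host, lens_path = normalize(lens_url)
    if feed_host != lens_host:
        return False
    return feed_path == lens_path or feed_path.startswith(lens_path + "/")
```

A check like this only catches a lens pointing directly at itself. Two lenses that embed each other's feeds would still form a loop, so a fetch-depth limit or a marker on feed-originated requests would be a sensible second guard.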

Like a Slug

Our luck was turning, and the Squidoo platform was picking up steam. We'd managed to develop quite an extensive list of features, especially considering the size of our team, and traffic was growing exponentially. But with more features and more users came more congestion. Our database, which was growing at about a gigabyte every week, soon started to lag.

Although we had already spent some time optimizing the database layer (by being picky about the data we selected, creating indexes, etc.), the database just couldn't keep up. Squidoo was crawling. We weren't database experts, but once again Rackspace came to the rescue. It turns out that even though we had a dedicated MySQL server, only small portions of its available resources were being used.

With a few quick tweaks (to the key buffer, InnoDB buffer pool, table cache, and a few others), Squidoo was flying again. Even better, our database server was a good sport and didn't even complain about the extra load. Thanks, db1.

Squidoo accepts an award at SXSW 2007 

Spam: The Internet's Worst Nightmare

By June 2007, fresh from winning an award for "best community site" at SXSW, Squidoo was in great shape. Our traffic was continuing to grow, and, if I got lucky, at least one person at every party I went to had at least heard of us (note to new startup founders: perfect your elevator pitch, because you're going to be using it a lot).

Things were going so well, in fact, that spammers started to notice. Within the course of a few short weeks, an enterprising spammer began massively polluting the blogosphere with links pointing to their Squidoo lenses. The spammer exploited the fact that we allowed <iframe> tags for more flexibility. In this case it proved too flexible, and the effect was that clicking on a link from a blog would bring you to a Squidoo lens, but immediately redirect you to a site that hosted—you guessed it—pornography.

The method used by the spammer made it appear that Squidoo was somehow affiliated, and shockwaves rippled throughout the blogosphere. We got a fair amount of heat, but did our best to remedy the situation as quickly as possible. Soon after, I posted follow-up notes explaining our position on spam and the new steps we were taking to keep Squidoo spam-free. But the damage was done.

Lesson learned. Spam is a major problem on the internet, and in an age where reputation is everything, organizations can't afford not to have an aggressive anti-spam policy.

How We Fight Spam Today

A Fused Approach

Over the years we've discovered a number of signals that indicate a high likelihood of spam. The catch is that any one signal can also include legitimate content. To raise our confidence level for identifying spam, we've developed a complex system called FUSE that combines all of these signals into a spam likelihood score.

Content believed to contain spam is never allowed to publish at all. While the algorithmic approach is highly effective, we also have a team of human reviewers scouring Squidoo daily.
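The internals of FUSE aren't described here, so the following is only a sketch of one common way to fuse weak signals into a single score: a noisy-OR combination, written in Python. The signal names, per-signal probabilities, and the threshold are all hypothetical values chosen for illustration.

```python
from math import prod

# Hypothetical signals. Each value is a rough guess at the probability that
# content showing this signal is spam. None is high enough to convict alone.
SIGNAL_WEIGHTS = {
    "many_outbound_links": 0.40,
    "new_account": 0.30,
    "blacklisted_domain": 0.60,
    "duplicate_text": 0.50,
}

# Hypothetical cutoff above which content is held back from publishing.
SPAM_THRESHOLD = 0.75


def spam_score(signals):
    """Noisy-OR: 1 - product of (1 - p_i) over the signals that fired."""
    return 1.0 - prod(1.0 - SIGNAL_WEIGHTS[name] for name in signals)


def is_probable_spam(signals):
    """True when the combined score reaches the threshold."""
    return spam_score(signals) >= SPAM_THRESHOLD
```

With these made-up numbers, a blacklisted domain alone scores 0.60 and stays under the threshold, while a blacklisted domain plus duplicate text scores 0.80 and is flagged. That matches the point above: one signal can be legitimate, but several together raise confidence.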

How We Handle Customer Service

With so many users, it's only natural that customer service is going to be a big issue.

On the technical end, more users means more opportunities for running into obscure bugs, and when major ones hit, responding can take a long time.

Our "release early and often" philosophy means that we get plenty of feedback from users on new features (sometimes within minutes of deployment).

In addition, keeping spam at bay is a major concern. Although we have a number of automated tools designed to block spam before it starts, we also rely on the community to report suspect activity.

All this leads to lots and lots of feedback. To save our sanity, all feedback is filtered through FogBugz, where our wonderfully dedicated customer service supervisor handles common customer service issues, then filters and prioritizes any remaining bug reports before sending them to our development team. We also have a dedicated anti-spam editor who reviews spam reports (and other suspicious activity) and takes action as necessary to keep Squidoo a safe, well-lighted community.

Breaking It Down

The key to creating a scalable application is designing an architecture that can be broken down into bite-sized chunks.

When our application servers started working overtime, we offloaded images and other static content to a lightweight web server (lighttpd) and optimized it for caching. No longer having to serve up a dozen or more static files with each request, our app servers gained a little breathing room.

When our JavaScript codebase kept growing, we minified it into one tiny, compressed file. Now there are fewer web requests and less code to download.

When our database started slowing down, we installed memcached and scrutinized our model classes to cache everything possible. The result was a faster Squidoo.
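Caching in a model class usually follows the cache-aside pattern: check the cache, fall back to the database on a miss, then store the result. The Python sketch below illustrates that pattern under stated assumptions. DictCache is an in-memory stand-in for a memcached client, and LensModel, load_from_db, and the key format are hypothetical names, not Squidoo's code.

```python
class DictCache:
    """In-memory stand-in for a memcached client (get/set/delete only)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, value):
        self._store[key] = value

    def delete(self, key):
        self._store.pop(key, None)


class LensModel:
    """Hypothetical model class that reads through a cache before the database."""

    def __init__(self, cache, load_from_db):
        self.cache = cache
        self.load_from_db = load_from_db  # hypothetical callable: lens_id -> dict

    def find(self, lens_id):
        key = f"lens:{lens_id}"
        lens = self.cache.get(key)
        if lens is None:
            # Cache miss: do the expensive query once, then remember the result.
            lens = self.load_from_db(lens_id)
            self.cache.set(key, lens)
        return lens

    def invalidate(self, lens_id):
        """Drop the cached copy, for example after the lens is edited."""
        self.cache.delete(f"lens:{lens_id}")
```

The trade-off is staleness: every write path has to call something like invalidate, or readers keep seeing the old copy until the entry expires.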

Batch Processing & Queues

With our growing data size, some computations are just too slow to be performed in real time. Tasks like updating our search index, recalculating LensRank, and processing payments are now performed on independent servers with more resources to spare.

A queueing system allows us to perform slower tasks (like purging our caching servers after publish) in the background without slowing down lensmasters.
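The queueing system itself isn't detailed here, so this is only a small in-process sketch of the idea in Python: the publish step drops a job on a queue and returns right away, while a background worker does the slow part. The names publish, purge_cache, and start_worker are hypothetical, and a production setup would more likely use a separate queue server and worker machines.

```python
import queue
import threading

jobs = queue.Queue()
purged = []  # records which lens caches were purged, for illustration only


def purge_cache(lens_id):
    """Placeholder for slow work such as purging caching servers."""
    purged.append(lens_id)


def worker():
    """Run queued jobs one at a time until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        func, arg = job
        try:
            func(arg)
        finally:
            jobs.task_done()


def start_worker():
    """Start the background worker thread and return it."""
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def publish(lens_id):
    """Queue the slow purge and return to the lensmaster immediately."""
    jobs.put((purge_cache, lens_id))
    return f"lens {lens_id} published"
```

Calling start_worker() once and then publish(1) returns at once, and jobs.join() blocks until the queued purge has actually run.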

Now Accepting Your Feedback

Did I miss something? If you have any questions or comments please feel free to comment below.

by giltotherescue

I'm Gil, the Chief Engineer at Squidoo.