Ajax Crawling
Yes, Bing does the "ajax crawling"
Almost two years ago Google proposed the "hash-bang" convention to make JavaScript-rich websites crawlable. This was great news for websites that rely heavily on JavaScript to enhance the user experience. Bing is late to the party, but it is here now. Say "hello" to the little spider!
Hard core JavaScript sites
A modern approach that leverages the user agent's processing power
Pulling the content in data-only form (JSON, XML) and using JavaScript to build the HTML representation brings many advantages:
- the amount of information transferred over the wire is minimal,
- new content can be added to the existing content without reloading the page,
- different representations can be rendered from the same data without hitting the server over and over, and
- user interaction with the page can be refined far beyond what is possible in plain or modestly JavaScript-assisted HTML,
to mention just a few.
There are downsides too. Besides some technical challenges, there was an obstacle of epic proportions: JavaScript-generated content was invisible to search engines, condemning that shiny new website to isolation and oblivion.
The problem lies in the way a JavaScript-rich site fakes URL history and in-site links. Until HTML5, the only way to change the URL without making the browser reload the whole page was to put the application state after the "#", a part of the URL originally intended for pointing to a position within a page. This part of the URL is called "the fragment" and is never sent to the server. For example,
http://si.draagle.com/#!browse/group/root/
and
http://si.draagle.com/#!source/est/drug=esu&fact=est_cyy
point to the same page from the server's (and the crawler's) perspective.
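To make this concrete, here is a minimal Python sketch of what the server actually receives for the two URLs above. The helper name `request_target` is made up for this illustration; it simply rebuilds the path-and-query part of a URL, which is the only part a browser puts on the HTTP request line.

```python
from urllib.parse import urlsplit


def request_target(url: str) -> str:
    """Return the path and query of a URL, i.e. the part sent to the server.

    Hypothetical helper for illustration; the fragment is deliberately left out
    because user agents keep it on the client side.
    """
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


urls = [
    "http://si.draagle.com/#!browse/group/root/",
    "http://si.draagle.com/#!source/est/drug=esu&fact=est_cyy",
]

for url in urls:
    # Both URLs produce the same request target; only the fragment differs.
    print(request_target(url), "| fragment:", urlsplit(url).fragment)
```

Both URLs collapse to the same request for the site root, so a crawler that does not run JavaScript has no way to tell them apart.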
The obvious solution would be to build a separate tree of pages just for search engines and serve them a stripped-down version of the content. Unfortunately this tactic, known as cloaking, is commonly used by spammers and is brutally penalised by most search engines. Don't try this at home!
Do ya speak #!?
Google's solution
Google proposed a different deal. Site owners should modify their Ajax links to use #! instead of just #. Whenever Googlebot sees #! in a URL, it treats the link as Ajax-crawlable and, for the request it sends, temporarily rewrites #! to ?_escaped_fragment_= (note the trailing underscore), thus:
(1) http://si.draagle.com/#!drug/kxi/?sub=10
becomes
(2) http://si.draagle.com/?_escaped_fragment_=drug%2fkxi%2f%3fsub%3d10
Note how the fragment part got URL-escaped.
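The rewrite from (1) to (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `to_crawler_url` is made up, the parameter name `_escaped_fragment_` (with a trailing underscore) follows Google's published proposal, and the sketch escapes every reserved character, as the example above does, whereas a real crawler may escape only a subset. Python emits uppercase hex digits (%2F rather than %2f), which is equivalent in percent-encoding.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_crawler_url(pretty_url: str) -> str:
    """Map a #! URL to its _escaped_fragment_ form (illustrative sketch only)."""
    parts = urlsplit(pretty_url)
    if not parts.fragment.startswith("!"):
        # Not a hash-bang URL: leave it untouched.
        return pretty_url
    # Drop the "!" and percent-encode everything that is not unreserved.
    escaped = quote(parts.fragment[1:], safe="")
    param = "_escaped_fragment_=" + escaped
    # Append to an existing query string if there is one.
    query = parts.query + "&" + param if parts.query else param
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


print(to_crawler_url("http://si.draagle.com/#!drug/kxi/?sub=10"))
```

A plain "#" link without the "!" is returned unchanged, which mirrors the idea that only hash-bang links are treated as crawlable.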
Your part of the contract is to generate exactly the same content when Googlebot asks for (2) as the browser would generate for (1).
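On the server, honouring this contract starts with recovering the original fragment from the crawler's request. The sketch below is one possible shape, not the way any particular site does it: `crawler_state`, `render_state` and `handle_request` are hypothetical names, `app.js` is a placeholder script, `render_state` merely stands in for whatever produces the same HTML your JavaScript would build in the browser, and the parameter name `_escaped_fragment_` (with a trailing underscore) follows Google's published proposal.

```python
import html
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def crawler_state(request_url: str) -> Optional[str]:
    """Return the unescaped fragment a crawler asked for, or None for normal visits."""
    query = urlsplit(request_url).query
    # parse_qs percent-decodes values; keep_blank_values keeps an empty fragment.
    values = parse_qs(query, keep_blank_values=True).get("_escaped_fragment_")
    return values[0] if values else None


def render_state(state: str) -> str:
    """Hypothetical stand-in for server-side rendering of one application state."""
    return "<html><body><h1>" + html.escape(state) + "</h1></body></html>"


def handle_request(request_url: str) -> str:
    state = crawler_state(request_url)
    if state is None:
        # Regular browser: ship the JavaScript application shell.
        return "<html><body><script src='app.js'></script></body></html>"
    # Crawler: serve the HTML snapshot for that state.
    return render_state(state)


print(crawler_state("http://si.draagle.com/?_escaped_fragment_=drug%2fkxi%2f%3fsub%3d10"))
```

Because the server decodes every %XX sequence, it does not matter how many characters the crawler chose to escape; the recovered state is the same string the browser sees after "#!".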
Bing me too
Copycat was shameful too long
For almost two years Google's was the only big search engine able to search Ajax content, and considering its market dominance that wasn't such a big deal. Still, we must admit that Bing is a great search tool, in many ways comparable to the big G, and it offers a sort of second opinion. Its market share has grown noticeably in the last year, and new trends such as the smartphone invasion and social search might shuffle the search engines' market shares even more.
A few days ago I noticed an important change in Bing's Webmaster Tools, namely the "Configure your site to have bingbot crawl escaped fragmented URLs containing #!" check box.
The dilemma between having your site in most search engines' indexes and providing a superior user experience with the heavy artillery of JavaScript is fading fast. And this is good news for users and web builders alike.
Explore more
- Google's proposal for Ajax crawling
- Concise instructions on how to make your site Ajax-crawlable.
- draagle.com the hash-bang pioneer
- draagle.com has used #! to expose its content to search engine bots for ages ;-)
by aleskotnik
Passionate about computers for as long as I can remember, heavily engaged in the early tribal wars known as C64 vs Spectrum and Amiga vs Atari. Religious s...