Find or Create a Custom Search Engine

www and web portals

Search engines help to minimize the time required to find information and the amount of information which must be consulted, akin to other techniques for managing information overload. They build on the Web's basic pieces: the stateless hypertext protocol (HTTP) and the URLs that address hypermedia resources.

A web portal (e.g. iGoogle, Live.com) is a site that functions as a point of access to information. Instead of reaching each application separately, users get a single access point to all of them over the Internet. Portals reach out to audiences spread across the world.

Other kinds of search engines include enterprise search engines (for intranets), personal search engines, and mobile search engines. Google says that about 500 million of its 2 billion records are unindexed Web documents. These are URLs for Web pages (or for other file types such as PDF or PowerPoint files) which Google has not crawled and indexed.

Rewrite Engines and Dynamic Content 

A URL is easier to use if it is short but descriptive

A rewrite engine is a piece of web server software used to modify URLs before fetching the requested item, for a variety of purposes. This technique is known as URL rewriting. In Java, the term "URL rewriting" is sometimes used to describe a Web Application Server adding a session id to a URL when cookies are not supported.
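
As a rough illustration of URL rewriting, the sketch below maps short, descriptive public paths to internal script URLs. The rules, the script names (article.php, search.php) and the function name rewrite are hypothetical; a real rewrite engine is configured inside the web server rather than written like this.

```python
import re

# Hypothetical rewrite rules: (pattern for the public path, internal target template).
REWRITE_RULES = [
    (re.compile(r"^/articles/(\d+)/?$"), r"/article.php?id=\1"),
    (re.compile(r"^/tags/([a-z0-9-]+)/?$"), r"/search.php?tag=\1"),
]


def rewrite(path: str) -> str:
    """Return the internal path for a public path, or the path unchanged if no rule matches."""
    for pattern, template in REWRITE_RULES:
        if pattern.match(path):
            # Substitute the captured groups into the internal URL template.
            return pattern.sub(template, path)
    return path


if __name__ == "__main__":
    for public_path in ("/articles/42", "/tags/search-engines/", "/about"):
        print(public_path, "->", rewrite(public_path))
```

The first matching rule wins, so more specific rules would need to come before more general ones.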

A URL (Uniform Resource Locator) is an address that points to a particular document or other resource on the Internet, used most frequently on the World Wide Web (WWW).

Keeping URIs stable so that they will still be around in 2, 20 or 200 years is clearly not as simple as it sounds. Web crawlers are bots that perform URL normalization in an automated manner. This lets them avoid crawling the same resource, hyperlink, or seed URL more than once when it is reachable through endless combinations of relatively minor scripted changes, so that they retrieve only unique content.
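
A minimal sketch of URL normalization, assuming a crawler that only wants to detect obviously equivalent URLs. The function name normalize_url and the particular set of steps are illustrative choices, not a description of how any specific crawler works. Sorting query parameters in particular is an assumption that does not hold for every site.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Return a canonical form of url so that equivalent URLs compare equal."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    # urlsplit already lowercases hostname; user info is ignored in this sketch.
    host = parts.hostname or ""
    netloc = host
    # Keep the port only when it is not the default for the scheme.
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = f"{host}:{parts.port}"
    # An empty path and "/" refer to the same resource.
    path = parts.path or "/"
    # Sort query parameters so a=1&b=2 and b=2&a=1 produce the same string.
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    # Drop the fragment: it is never sent to the server.
    return urlunsplit((scheme, netloc, path, query, ""))


if __name__ == "__main__":
    seen = set()
    for candidate in (
        "HTTP://Example.COM:80/index.html?b=2&a=1#top",
        "http://example.com/index.html?a=1&b=2",
    ):
        seen.add(normalize_url(candidate))
    print(seen)  # both inputs collapse to one entry
```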

Basics List 

Web cache - Wikipedia
Web caching is the caching of web documents (e.g., HTML pages, images) in order to reduce bandwidth usage, server load, and perceived "lag". A web cache stores copies of documents passing through it; subsequent request
List of HTTP status codes (404 Not Found) - Wikipedia
HTTP response status codes and their standard associated phrases, intended to give a short textual description of the status. Others are unstandardised but commonly used. The list covers codes 100 through 510, for example 1xx (request received, continuing process) and 404 Not Found.
Search Engines by Search Features
This page compares Internet search engines by their search features and provides a listing of search engines by the search features.
Search Engine Preferences
Preferences: use the options to customize your default search settings. Sorting by Search Engine (Web Search) or Source (Images, Audio and Video Search) will display results grouped by provider.
Alphabetical List of SearchTools Product Reports
Search Filter: Results pages will display a message noting that the Search Filter is On. You can also limit to entries added within one of the following specified periods of time. Complete listing of search tools for creating indexes and search engines for Web sites, Intranets and topical portals. Includes products for Unix, Mac, and Windows servers; Perl and Java; commercial and open source; remote search services; code libraries, SDKs, APIs and toolkits.

More about URL and HTML 

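What are those "%20" codes in URLs?

Only alphanumerics [0-9a-zA-Z], the special characters "$-_.+!*'()," (not including the quotes) and reserved characters used for their reserved purposes may be used unencoded within a URL. Any other character must be written as a percent sign followed by its two-digit hexadecimal code; "%20" is the encoding of a space.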
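
The encoding rule above can be tried out with Python's standard urllib.parse module. This is only a sketch: the helper name encode_path is made up for illustration, and the safe list is copied from the passage above rather than from any current specification. By default, quote is stricter and leaves only letters, digits, and the characters _.-~ and / unencoded.

```python
from urllib.parse import quote, unquote

# Special characters the passage above lists as usable unencoded (taken from the text).
UNENCODED_SPECIALS = "$-_.+!*'(),"


def encode_path(text: str) -> str:
    """Percent-encode text, leaving '/' and the listed special characters alone."""
    return quote(text, safe="/" + UNENCODED_SPECIALS)


if __name__ == "__main__":
    print(encode_path("my file (v2).html"))  # spaces become %20
    print(unquote("my%20file.html"))         # decoding reverses the process
    print(quote("caf\u00e9"))                # non-ASCII text is encoded as UTF-8 bytes
```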

HTML, on the other hand, allows the entire range of the ISO-8859-1 (ISO-Latin) character set to be used in documents, and HTML4 expands the allowable range to include all of the Unicode character set as well. Characters outside ISO-8859-1 (those above FF hex/255 decimal in the Unicode set) cannot be used directly in URLs under RFC 2396, because it provides no safe way to specify character set information in the URL content.

URLs should be encoded everywhere in an HTML document that a URL is referenced, including the A, APPLET, AREA, BASE, BGSOUND, BODY, EMBED, FORM, FRAME, IFRAME, ILAYER, IMG, ISINDEX, INPUT, LAYER, LINK, OBJECT, SCRIPT, SOUND, TABLE, TD, TH, and TR elements.

Bookmark, Favorites, and Favicons 

Making a urlicon (favicon)

First, design an icon and reduce it to a square 16 x 16 pixel size. Second, download an icon editor such as IconEdit Pro. Third, open your file in the icon editor and tweak it until it is perfect. Save it with 16 colors as favicon.ico and upload it to each of the directories in your website. Now every time an MSIE 5.0+ user bookmarks a webpage in a directory containing favicon.ico, your icon will show up in their bookmark file.

A favicon, or urlicon, is an icon associated with a particular website or webpage. A bookmark is a way of marking a page so it can be referenced at a later time without having to remember the address. Browsers that support favicons may display them in the browser's URL bar and next to the site's name in lists of bookmarks. A common problem is that favicons may disappear if the browser's cache is emptied.

Bookmark List 

Eureka - The Language and Translation Search Engine
Language: Include results only written in the selected language. Yahoo! has no case sensitive searching. Eureka is a search engine for the foreign word, classifying language and translation related web sites, software and useful information.
Keotag generator
Keotag - tag search multiple engines, tag generator and social bookmark links generator
Rollyo: Roll your own search engine
The RollBar Bookmarklet lets you search whatever site you're on, or use any of your Searchrolls from anywhere. You can also add sites to existing Searchrolls, and create new ones on the fly. You can take it with you...

Bookmarking 

Bookmarks and Favorites save Web addresses so you can return to them quickly, without having to retype them. Whether you are using Mozilla Firefox, Netscape Navigator or Internet Explorer, the procedure is similar.

In your Netscape window, click on Bookmarks and select Edit Bookmarks. A window should appear; the items in this window represent your bookmarks. If you use Internet Explorer 5.0 or a later release, click on the Favorites button on the toolbar to open the Favorites window.

Del.icio.us was purchased by Yahoo! and is probably the most popular social bookmarking tool online. magnolia.com takes you to Exxon-Mobil. Furl is one of the oldest social bookmarking tools online and is owned by the LookSmart company; many librarians use Furl, and the system's users are very loyal. Markaboo is open source, under a Creative Commons license. You can take photos from a mobile device and SMS them...

Traditional means of organizing information elements have generally relied on well-defined and pre-declared schemas, ranging from simple controlled vocabularies to taxonomies to thesauri to full-blown ontologies.

Primary Search Engines 

Internet search engines can be found through searchable directories of general and specialty search engines, categorized by topic. These directories also list resources and tools for exploring the deep web, performing advanced research, and learning about search engine tools and technology.

Categories and Web search don't often go together. No search engine contains everything available. Google Directory results appear incorporated into result sets from the main database. A primary crawl accesses only HTML material; the Google spider then retrieves the URLs that appear on those HTML pages in a second crawl. AllTheWeb uses a specialized crawl of the Web to build catalogs that add to its primary search result sets.

Real Time 

Instead of ranking potentially useful pages against all other pages in the database, Teoma ranks results against other pages in the same "community" of pages. These "communities" are built dynamically, in real time.

Advanced site search code, sitemap, and navigation technology can be added to your website. In some cases, a site's own search content updates more frequently than the general engines, which should increase precision and lower recall.

Finding Primary Search Engines 

Google has language, domain, date, filetype, and adult content limits. Multiple search terms are processed as an AND operation by default. Phrase matches are ranked higher.
Archives of Dead Web Pages: Wayback, Cache, and More
Lists and compares Internet current awareness services.
Search Engine Bookmarklets
Shortcuts to search with these search engines bookmarklets.
Custom Search Engines Reviewed
Search Engine Showdown reviews of custom search engines and build your own search engine options.
The WWW Virtual Library
The WWW Virtual Library; Quick search: If you maintain a superlative guide to a specialised area of the Web, the Virtual Library would be pleased to consider a request to add your 'library' to the WWW Virtual Library.
Live Search
Sometimes called just Live.com or Windows Live Search, this is the Microsoft Web search engine, launched in September 2006. It uses its own, unique Web database and also has separate News, Images, Questions and Answers (QnA), Local, Video, Feeds, and Academic databases. Before the new MSN Search launched with its own database on Feb. 1, 2005, it used an Inktomi database from Yahoo!.
AOL Search with Google
The AOL Search engine delivers great search results, enhanced by Google, plus relevant multimedia results delivered on a single page, so you can search less and discover more.
Google Mini Search Appliance
The Google Mini's relevant search results and customizable user interface make it a perfect solution for website search. Integration with Google Sitemaps also makes it easy to submit your website for inclusion in Google.com search results. The Google Mini works with over 220 different file formats.

Directories 

Subject directories include human-selected Internet resources and are arranged and classified in hierarchical topics. Most search engines and portals have a subject directory component or partner.
Yahoo!
The Yahoo! directory results display the site title, description, URL, and the category name. Yahoo! provides results in six categories. The first listed results under Web are from the search engine, currently Google, and are sorted in Google's relevance order. Second is a link to Google's image database.
LookSmart Vertical Search
LookSmart vertical search makes it easy for you to find what you need. User history: save what you like and share what you want. Entries are all selected by the editorial team of almost 200 editors. Common words are ignored. According to the company, LookSmart and LookSmart Live have over 2,300,000 unique URLs in 250,000 categories. No phrase, proximity, or field searching is available. No limits are available.
Encyclopaedia Britannica
Encyclopedia Britannica Online, featuring the complete Encyclopedia Britannica, Britannica Concise Encyclopedia, Merriam-Webster's Dictionary & Thesaurus, videos, web sites, and magazines. Britannica had over 125,000 entries in 1999. In the advanced search, a title search is available.

SEO (Search Engine Optimization) 

It helps a lot to pay attention to detail

Once a keyword relating to the products or services offered is keyed into a search engine, a direct link to your web site should emerge on top of the list as soon as the user hits the Enter button or clicks on 'Search.'

A person might find it too difficult to use the scroll down option, so you should also provide hyperlinks which are accessible to them. In a URL, the domain name may be followed by the path, a list of additional names that identify subdirectories within that domain. The path may be followed by a specific document name. The path and document name may be case-sensitive, depending on the server, which means that uppercase and lowercase letters are considered different letters; the scheme and domain name are not case-sensitive.
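
The parts of a URL named above (domain name, path of subdirectories, document name) can be pulled apart with Python's standard library. This is a small sketch; the function name describe_url and the example address are made up for illustration. Note that urlsplit reports the hostname in lowercase, because domain names are not case-sensitive, while the case of the path is preserved.

```python
import posixpath
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Split a URL into scheme, domain, subdirectories, and document name."""
    parts = urlsplit(url)
    # Separate the directory portion of the path from the final document name.
    directory, document = posixpath.split(parts.path)
    return {
        "scheme": parts.scheme,
        "domain": parts.hostname,
        "directories": [name for name in directory.split("/") if name],
        "document": document,
    }


if __name__ == "__main__":
    print(describe_url("http://www.Example.com/Docs/Guides/Intro.html"))
```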

More about URL 

Make your website more search engine and user friendly to help it appear in the top listings in search engines.

Spiders can read the following items on a web page, thus making the item highly relevant to search engines: text, page titles, meta tags, meta descriptions, code, tables.

HTML hyperlinks are life-giving food for Search Engines' hungry spiders. They lead to major inside pages or web site sections.
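
To make the idea of what a spider reads more concrete, here is a minimal sketch that pulls the page title, meta description, and hyperlinks out of an HTML string using Python's built-in html.parser. The class name PageSummary and the sample page are invented for illustration; real search engine spiders are far more elaborate.

```python
from html.parser import HTMLParser


class PageSummary(HTMLParser):
    """Collect the title, meta description, and link targets of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attributes.get("name") or "").lower() == "description":
            self.description = attributes.get("content") or ""
        elif tag == "a" and attributes.get("href"):
            # Each hyperlink is a candidate page for the spider to visit next.
            self.links.append(attributes["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def summarize(html: str) -> PageSummary:
    parser = PageSummary()
    parser.feed(html)
    parser.close()
    return parser


SAMPLE_PAGE = (
    "<html><head><title>Search Tips</title>"
    '<meta name="description" content="How to search.">'
    '</head><body><a href="/about.html">About</a>'
    '<a href="/products/">Products</a></body></html>'
)

if __name__ == "__main__":
    page = summarize(SAMPLE_PAGE)
    print(page.title, page.description, page.links)
```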

More about Unicode 

Unicode is the successor to ISO 8859-1 as the base character set used in HTML, and a number of operating systems and applications understand it.

Query results, the matches a search engine finds for the keywords entered by the user, are important because they are the most effective tools through which users access information on the Internet.

There is a shift occurring in computer text representation. Traditionally, each character of text is represented by a single byte (8 bits) at its lowest level. This allows for 256 possible distinct characters. In languages where the entire character set exceeds this range (such as Far East languages), two bytes are used to represent a single character.

The Unicode standard was developed to greatly reduce this fracturing of languages into conflicting character sets. All major character sets of the world can be represented using a total of only about 35,000 Unicode code points. The most popular encoding of these code points into bytes is UTF-8.
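
The difference between a one-byte character set and UTF-8 can be seen in a few lines of Python. This is only a sketch; the helper names utf8_length and fits_in_latin1 are made up for illustration.

```python
def utf8_length(character: str) -> int:
    """Number of bytes UTF-8 uses to store one character."""
    return len(character.encode("utf-8"))


def fits_in_latin1(character: str) -> bool:
    """True if the character exists in the single-byte ISO 8859-1 set."""
    try:
        character.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    # "A" (U+0041), e with acute accent (U+00E9), and a CJK character (U+65E5).
    for character in ("A", "\u00e9", "\u65e5"):
        print(
            f"U+{ord(character):04X}",
            "UTF-8 bytes:", utf8_length(character),
            "in ISO 8859-1:", fits_in_latin1(character),
        )
```

UTF-8 keeps plain ASCII text at one byte per character and only spends extra bytes on characters outside that range, which is one reason it coexists well with older software.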

Query Search Bots 

You can build a personalized search robot, such as an instant messaging chatterbot, and send it out to search the web by tag, colour, mood, photo or location. Ask it questions and watch it grow up and learn about the world as you do.

On the Internet, the most ubiquitous bots are the programs, also called spiders or crawlers, that access Web sites and gather their content for search engine indexes. The term bot is also used for a piece of malicious software that compromises a computer. To guard against these, keep your operating systems and applications fully patched, and check your browser preferences, downloads and JavaScript settings. The backdoors installed by bot software aren't easily removed and require that a computer be rebuilt.

Spambot 

The largest use of bots is in web spidering, in which an automated script fetches, analyses and files information from web servers at many times the speed of a human. Each server can have a file called robots.txt, containing rules for the spidering of that server that the bot is supposed to obey.
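
As a sketch of how a well-behaved bot might consult robots.txt, the example below uses Python's standard urllib.robotparser on an in-memory rules file, so it needs no network access. The rules, the bot name ExampleBot, the example.com URLs, and the helper name is_allowed are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt: every bot is asked to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the rules in robots_txt let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines directly, so nothing is downloaded here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://example.com/index.html",
        "http://example.com/private/data.html",
    ):
        print(url, "allowed:", is_allowed("ExampleBot", url))
```

Nothing enforces these rules on the server side; obeying them is up to the bot, which is why the text says the bot is "supposed to" follow them.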

A spambot is an internet bot that attempts to spam large amounts of content on the Internet. Bots are also used to buy up good seats for concerts, particularly by ticket brokers who resell the tickets.

Bots are often used to farm for resources that would otherwise take significant time or effort to obtain. Another learning experience is trying to get your bot to grasp and understand natural human language and respond to it appropriately. The combined work of the entire community may prove to exceed what has already been done.

Search Engine Blog Posts 

Bing is Now the 13th Most Visited Website
So while we are all recovering from Google's announcement of the Chrome OS, let's get back to one of...
Google Launches My Location for Google Maps
If you've ever used Google Maps on your iPhone or other mobile device, you're probably familiar with...
Google Product Search for Mobile Updated for More Languages and ...
When Google Product Search for Mobile was launched, it was only for the iPhone and Android phones. N...
.tel search engine Qwista « AltSearchEngines
Qwista is a search engine that opens up new opportunities to search through global online-catalogues...
