Monday, October 17, 2011

Chapter 8

How Search Engines Work
The term "search engine" is often used generically to describe both crawler-based search engines and human-powered directories. These two types of search engines gather their listings in radically different ways.

Crawler-Based Search Engines

Crawler-based search engines, such as Google, create their listings automatically. They "crawl" or "spider" the web, then people search through what they have found.

If you change your web pages, crawler-based search engines eventually find these changes, and that can affect how you are listed. Page titles, body copy and other elements all play a role.

Human-Powered Directories

A human-powered directory, such as the Open Directory, depends on humans for its listings. You submit a short description to the directory for your entire site, or editors write one for sites they review. A search looks for matches only in the descriptions submitted.

Changing your web pages has no effect on your listing. Things that are useful for improving a listing with a search engine have nothing to do with improving a listing in a directory. The only exception is that a good site, with good content, might be more likely to get reviewed for free than a poor site.
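
The behavior described above can be sketched in a few lines of Python. This is only an illustration, not how the Open Directory or any real directory is implemented: the `DirectoryListing` class, the `search_directory` function and the example URLs are all hypothetical. The point it shows is that the search looks only at the submitted description, so editing the page itself changes nothing.

```python
from dataclasses import dataclass


@dataclass
class DirectoryListing:
    """One hypothetical directory entry for an entire site."""
    url: str
    description: str  # short text submitted by the owner or written by an editor


def search_directory(listings, query):
    """Return listings whose description contains every word of the query.

    Only the description is examined; the content of the site's pages
    never enters the search.
    """
    words = query.lower().split()
    if not words:
        return []
    return [
        listing for listing in listings
        if all(word in listing.description.lower().split() for word in words)
    ]


# Made-up listings for illustration.
listings = [
    DirectoryListing("http://example.com/stamps", "Guide to collecting rare postage stamps"),
    DirectoryListing("http://example.org/coins", "Price lists for antique coins"),
]

stamp_hits = search_directory(listings, "rare stamps")
```

Here `stamp_hits` holds only the stamp site, because its description contains both query words. A search for "rare coins" finds nothing, even if the coin site's pages mention rare coins, since "rare" does not appear in its description.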

"Hybrid Search Engines" Or Mixed Results

In the web's early days, a search engine presented either crawler-based results or human-powered listings. Today, it is extremely common for both types of results to be presented. Usually, a hybrid search engine will favor one type of listing over the other. For example, MSN Search is more likely to present human-powered listings from LookSmart. However, it does also present crawler-based results (as provided by Inktomi), especially for more obscure queries.
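
One way to picture a hybrid engine is as a function that prefers directory listings and falls back to crawler results when the directory has little to offer. The sketch below assumes a simple threshold rule (`min_directory_hits`), which is an invention for illustration. It is not a description of how MSN Search, LookSmart or Inktomi actually combined results, and `toy_directory` and `toy_crawler` are stand-ins with made-up data.

```python
def hybrid_search(query, directory_search, crawler_search, min_directory_hits=2):
    """Favor human-powered listings; add crawler results for obscure queries.

    The threshold rule is an assumption made for this sketch.
    """
    directory_hits = directory_search(query)
    if len(directory_hits) >= min_directory_hits:
        return list(directory_hits)
    # Too few directory listings: treat the query as obscure and append
    # crawler results that are not already listed.
    crawler_hits = [url for url in crawler_search(query) if url not in directory_hits]
    return list(directory_hits) + crawler_hits


def toy_directory(query):
    """Stand-in for a human-powered directory (hypothetical data)."""
    listings = {"travel": ["http://example.com/flights", "http://example.com/hotels"]}
    return listings.get(query, [])


def toy_crawler(query):
    """Stand-in for a crawler-based index (hypothetical data)."""
    pages = {
        "travel": ["http://example.com/flights", "http://example.org/blog"],
        "antique barometers": ["http://example.net/barometers"],
    }
    return pages.get(query, [])


popular = hybrid_search("travel", toy_directory, toy_crawler)
obscure = hybrid_search("antique barometers", toy_directory, toy_crawler)
```

The popular query is answered entirely from the directory, while the obscure query, which no editor has listed, is answered from the crawler.
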

The Parts Of A Crawler-Based Search Engine

Crawler-based search engines have three major elements. First is the spider, also called the crawler. The spider visits a web page, reads it, and then follows links to other pages within the site. This is what it means when someone refers to a site being "spidered" or "crawled." The spider returns to the site on a regular basis, such as every month or two, to look for changes.
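
The visit, read and follow-links loop described above might look something like the following minimal sketch. It uses Python's standard `html.parser` and `urllib.parse` modules. To keep it runnable without a network connection, pages come from a small in-memory dictionary (`FAKE_SITE`, made up for this example), and the `fetch` argument is a hypothetical stand-in for a real HTTP download. A production spider would also need politeness delays, robots.txt handling and scheduled revisits, none of which are shown.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch):
    """Visit start_url, read it, and follow links to pages on the same site.

    `fetch` is a hypothetical callable that maps a URL to HTML text,
    or returns None when the page cannot be retrieved.
    """
    site = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    pages = {}  # url -> HTML that the spider read
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links
            if urlparse(link).netloc == site and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


# A made-up three-page site with one link that leads off-site.
FAKE_SITE = {
    "http://example.com/": '<a href="/about.html">About</a> <a href="http://other.example/">Elsewhere</a>',
    "http://example.com/about.html": '<a href="/">Home</a> <a href="contact.html">Contact</a>',
    "http://example.com/contact.html": "<p>Write to us.</p>",
}

crawled = crawl("http://example.com/", FAKE_SITE.get)
```

After the call, `crawled` contains the three pages of the example site. The link to `other.example` is skipped because this sketch only follows links within the starting site.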

Everything the spider finds goes into the second part of the search engine, the index. The index, sometimes called the catalog, is like a giant book containing a copy of every web page that the spider finds. If a web page changes, then this book is updated with new information.

Sometimes it can take a while for new pages or changes that the spider finds to be added to the index. Thus, a web page may have been "spidered" but not yet "indexed." Until it is indexed -- added to the index -- it is not available to those searching with the search engine.
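
The difference between "spidered" and "indexed" can be made concrete with a toy index. The sketch below keeps a copy of each page, as the book analogy suggests, plus a word-to-pages lookup table (an inverted index). The lookup table is an assumption on my part: it is a common way to build such a catalog, but the text above does not say how any particular engine stores its index. The class and method names are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class Index:
    """Toy index: stored page copies plus a word -> URLs lookup table."""

    def __init__(self):
        self.pages = {}                   # url -> stored copy of the page text
        self.postings = defaultdict(set)  # word -> URLs containing that word
        self.pending = []                 # spidered but not yet indexed

    def add_spidered(self, url, text):
        """Record what the spider found; the page is not searchable yet."""
        self.pending.append((url, text))

    def process_pending(self):
        """Move every pending page into the index, replacing older copies."""
        for url, text in self.pending:
            self._remove(url)
            self.pages[url] = text
            for word in set(tokenize(text)):
                self.postings[word].add(url)
        self.pending.clear()

    def _remove(self, url):
        """Drop the previous copy of a page so stale words stop matching."""
        old_text = self.pages.pop(url, None)
        if old_text is None:
            return
        for word in set(tokenize(old_text)):
            self.postings[word].discard(url)

    def lookup(self, word):
        """Return the indexed URLs that contain the given word."""
        return sorted(self.postings.get(word.lower(), set()))


index = Index()
index.add_spidered("http://example.com/", "Rare stamps for sale")
before = index.lookup("stamps")  # spidered, but not yet indexed
index.process_pending()
after = index.lookup("stamps")   # now indexed
```

In this example `before` is empty and `after` contains the page, mirroring the delay between a page being spidered and becoming available to searchers. Running `add_spidered` and `process_pending` again with changed text replaces the old copy, which is the "book is updated" step.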

Search engine software is the third part of a search engine. This is the program that sifts through the millions of pages recorded in the index to find matches to a search and rank them in order of what it believes is most relevant. You can learn more about how search engine software ranks web pages on the aptly-named How Search Engines Rank Web Pages page.
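
As a rough illustration of "find matches and rank them," the sketch below scores each page by counting query words, with words in the title counting more than words in the body. The scoring rule and the `title_weight` value are invented for this example. Real engines keep their ranking methods private and use many more signals, so treat this only as a picture of the general idea that titles and body copy both play a role.

```python
import re


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(page, query_words, title_weight=3):
    """Count query-word occurrences; title hits count title_weight times.

    The weighting is an assumption made for this sketch.
    """
    title_words = tokenize(page["title"])
    body_words = tokenize(page["body"])
    return sum(
        title_weight * title_words.count(word) + body_words.count(word)
        for word in query_words
    )


def rank(pages, query):
    """Return URLs of matching pages, highest score first."""
    query_words = tokenize(query)
    scored = [(score(page, query_words), page["url"]) for page in pages]
    matches = [(points, url) for points, url in scored if points > 0]
    # Sort by descending score, then by URL so that ties are stable.
    matches.sort(key=lambda item: (-item[0], item[1]))
    return [url for _, url in matches]


# Made-up pages standing in for entries in the index.
pages = [
    {"url": "http://example.com/a", "title": "Stamp collecting",
     "body": "All about stamp albums and stamp fairs."},
    {"url": "http://example.com/b", "title": "Coin prices",
     "body": "We also sell one rare stamp."},
    {"url": "http://example.com/c", "title": "Gardening",
     "body": "Roses and tulips."},
]

ranked = rank(pages, "stamp")
```

Page `a` ranks first because "stamp" appears in its title and twice in its body. Page `b` mentions the word once and comes second, and page `c` does not match at all, so it is left out.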


Major Search Engines: The Same, But Different

All crawler-based search engines have the basic parts described above, but there are differences in how these parts are tuned. That is why the same search on different search engines often produces different results. Some of the significant differences between the major crawler-based search engines are summarized on the Search Engine Features Page. Information on this page has been drawn from the help pages of each search engine, along with knowledge gained from articles, reviews, books, independent research, tips from others and additional information received directly from the various search engines.

Now let's look more closely at how crawler-based search engines rank the listings that they gather.

Top 5 Search Engines

1. DuckDuckGo

At first, DuckDuckGo.com looks like Google. But there are many subtleties that make this spartan search engine different. DuckDuckGo has some slick features, like 'zero-click' information (all your answers are found on the first results page). DuckDuckGo also offers disambiguation prompts, which help to clarify what question you are really asking. And there is much less ad spam than on Google. Give DuckDuckGo.com a try... you might really like this clean and simple search engine.

Visit DuckDuckGo here

2. Ask (aka 'Ask Jeeves')

The Ask/AJ/Ask Jeeves search engine is a longtime name on the World Wide Web. The super-clean interface rivals the other major search engines, and the search options are as good as those of Google, Bing or DuckDuckGo. The results groupings are what really make Ask.com stand out. The presentation is arguably cleaner and easier to read than that of Google, Yahoo! or Bing, and the results groups seem to be more relevant. Decide for yourself if you agree... give Ask.com a whirl, and compare it to the other search engines you like.

3. The Internet Archive

The Internet Archive is a favorite destination for longtime Web lovers. The Archive has been taking snapshots of the entire World Wide Web for years now, allowing you and me to travel back in time to see what a web page looked like in 1999, or what the news was like around Hurricane Katrina in 2005. You won't visit the Archive daily, like you would Google, Yahoo! or Bing, but when you do need to travel back in time, use this search site.

4. Yippy (formerly 'Clusty')

Yippy is a Deep Web engine that searches other search engines for you. Unlike the regular Web, which is indexed by robot spider programs, Deep Web pages are usually harder to locate by conventional search. That's where Yippy becomes very useful. If you are searching for obscure hobby interest blogs, obscure government information, tough-to-find obscure news, academic research and otherwise-obscure content, then Yippy is your tool.

5. Yahoo!

Yahoo! is several things: it is a search engine, a news aggregator, a shopping center, an email service, a travel directory, a horoscope and games center, and more. This 'web portal' breadth of choice makes it a very helpful site for Internet beginners. Searching the Web should also be about discovery and exploration, and Yahoo! delivers that in wholesale quantities.
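
Item 4 above describes Yippy as an engine that "searches other search engines for you." The general idea of such a metasearch can be sketched as follows: send the query to several engines, then merge the answers and drop duplicates. This is a generic illustration, not Yippy's actual method. The interleaving rule is an assumption, and `engine_a` and `engine_b` are hypothetical stand-ins that return made-up results.

```python
from itertools import zip_longest


def metasearch(query, engines):
    """Query every engine and interleave the results, dropping duplicates.

    Each engine is a callable that takes a query and returns a list of
    URLs in its own ranked order.
    """
    result_lists = [engine(query) for engine in engines]
    merged, seen = [], set()
    # Take each engine's first result, then each engine's second, and so on.
    for tier in zip_longest(*result_lists):
        for url in tier:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


def engine_a(query):
    """Hypothetical engine with two results."""
    return ["http://example.com/1", "http://example.com/2"]


def engine_b(query):
    """Hypothetical engine with three results, one shared with engine_a."""
    return ["http://example.com/2", "http://example.com/3", "http://example.com/4"]


merged = metasearch("obscure hobby", [engine_a, engine_b])
```

The merged list contains each URL once, even though both engines returned `http://example.com/2`.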
 
