Indices

An index is a search tool built by automated programs that follow the hyperlinks on a web page to other web pages, and in that way move through a large number of web sites with no human intervention. As they wander, these "web crawlers" also build a database of keywords from each web page they encounter. A search engine can then be designed to take advantage of the specific way the database is indexed, which makes searching for all occurrences of your keywords extremely fast. Because crawlers can process enormous numbers of web pages each day, outdated links are continually dropped or updated as new URLs are added, so maintenance of the database is automated as well.
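The crawl-and-index idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "web" is a small hypothetical dictionary held in memory (no real pages are fetched), the URLs and the function name crawl are made up for the example, and words are lowercased for simplicity, which a real engine may not do.

```python
from collections import defaultdict, deque
import re

# Hypothetical in-memory "web": URL -> (page text, list of outgoing links).
PAGES = {
    "a.example/home": ("Mayan civilization history",
                       ["a.example/maya", "b.example/film"]),
    "a.example/maya": ("Mayan calendar and temples", ["a.example/home"]),
    "b.example/film": ("Robert Redford film list", []),
}

def crawl(start, pages):
    """Follow links breadth-first from start; return keyword -> set of URLs."""
    index = defaultdict(set)
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        text, links = pages[url]
        # Record every word on the page under its keyword entry.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
        # Queue each linked page that exists and has not been visited yet.
        for link in links:
            if link in pages and link not in seen:
                seen.add(link)
                queue.append(link)
    return index

index = crawl("a.example/home", PAGES)
print(sorted(index["mayan"]))  # ['a.example/home', 'a.example/maya']
```

Because the database maps each keyword straight to the pages that contain it, a search is a single lookup rather than a scan of every page, which is where the speed comes from.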

One of the most advanced index search engines is Alta Vista, created and run by Digital Equipment Corporation. Within two months of its public debut on December 15, 1995, Alta Vista had grown to include 21 million web pages and 10 billion words, and was handling over 4 million requests per day. Since that time its automated crawler has been examining approximately 2.5 million web pages daily.

Alta Vista has one blank entry field where you can enter your keywords. It generally behaves as if the OR Boolean operator is in effect, giving you the total number of web pages found containing each word, then a total of pages containing all the words, and a listing of the 10 best matches. Thus it does both an AND and an OR search automatically. Putting a "+" sign in front of a word makes it required, somewhat analogous to using the AND operator. Likewise, a "-" sign can be put in front of a word to indicate that it should NOT be included. Finally, phrases can be placed in quotes so that the words are not searched individually.

For instance, if you searched for Robert Redford without quotes, you would get entries for all pages that include Robert and Redford anywhere in the page, such as ones containing the names Robert Jones and Lynn Redford. If, however, you use "Robert Redford" in quotes, you would get only pages that have the two names side by side, as in the famous actor's name. It should be pointed out, though, that when calculating a page's "score" to come up with the 10 best matches, Alta Vista does consider proximity, so sites with the actor's name in them would score higher than the example site with two separate names and would therefore be more likely to appear at the top of your list. While it initially lists only the top 10 pages, a link at the bottom of the results page lets you see more of the results your search turned up.
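The behavior of "+", "-", quoted phrases, and proximity scoring can be made concrete with a small Python sketch. It is not Alta Vista's actual algorithm, which is not described here: the sample pages, the function names, and the scoring rule (one point per matched term, plus one bonus point when two unquoted words appear side by side) are all assumptions chosen for illustration, and matching ignores upper and lower case for simplicity.

```python
import re

# Hypothetical pages; names and text are invented for the example.
PAGES = {
    "actor": "Robert Redford starred in many films",
    "two-names": "Robert Jones met Lynn Redford",
    "other": "Lynn Jones wrote a book",
}

def words(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def parse(query):
    """Split a query into required, excluded, and optional terms (word tuples)."""
    required, excluded, optional = [], [], []
    for sign, phrase, word in re.findall(r'([+-]?)(?:"([^"]*)"|(\S+))', query):
        term = tuple(words(phrase or word))
        if not term:
            continue
        if sign == "+":
            required.append(term)
        elif sign == "-":
            excluded.append(term)
        else:
            optional.append(term)
    return required, excluded, optional

def contains(page_words, term):
    """True if the words of term appear consecutively in page_words."""
    n = len(term)
    return any(tuple(page_words[i:i + n]) == term
               for i in range(len(page_words) - n + 1))

def search(query, pages):
    required, excluded, optional = parse(query)
    results = []
    for name, text in pages.items():
        pw = words(text)
        if any(contains(pw, t) for t in excluded):
            continue  # a "-" term is present
        if not all(contains(pw, t) for t in required):
            continue  # a "+" term is missing
        score = sum(contains(pw, t) for t in required + optional)
        if score == 0:
            continue  # nothing matched at all
        # Assumed proximity bonus: adjacent unquoted words found side by side.
        singles = [t[0] for t in optional if len(t) == 1]
        score += sum(contains(pw, pair) for pair in zip(singles, singles[1:]))
        results.append((score, name))
    return [name for score, name in sorted(results, key=lambda r: (-r[0], r[1]))]

print(search("Robert Redford", PAGES))    # ['actor', 'two-names']
print(search('"Robert Redford"', PAGES))  # ['actor']
print(search("+Robert -Jones", PAGES))    # ['actor']
```

The unquoted search returns both pages but ranks the actor's page first because of the proximity bonus, while the quoted search drops the page with two separate names entirely.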

Activity:
Go to the Alta Vista home page and try a search. If you need more information on using Alta Vista, click on the Help graphic at the top of the screen.

Try using both lower and upper case letters in your search. For example, Mayan Civilization might produce different results from mayan civilization.