Using a robots.txt File to Allow or Deny Search Engines
This article briefly covers how a robots.txt file can allow or deny search engines access to certain files for indexing.
If there are portions of a website that you do not want search engines to index, you can list them in a "robots.txt" file, which tells search engine robots which folders or files they are allowed or disallowed from crawling. Note that compliance is voluntary: well-behaved robots honor the file, but it is a request and not an access control, so it should not be relied on to protect sensitive content.
A robots.txt file supports several directives that explicitly deny or allow specific search bots access to certain folders or files.
The simplest robots.txt file uses two rules:
User-agent: the robot the following rule applies to
Disallow: the URL you want to block
These two lines are considered a single entry in the file. You can include as many entries as you want. You can include multiple Disallow lines and multiple user-agents in one entry.
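As a rough illustration of how entries are interpreted, the sketch below feeds a small robots.txt to Python's standard urllib.robotparser module and asks whether a given robot may fetch a given path. The file contents, the "/private/" folder, and the robot names "ExampleBot" and "SomeCrawler" are made up for this example and are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt with two entries. The folder and robot names
# are illustrative assumptions, not part of the original article.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: ExampleBot
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text and return a parser that can answer queries."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A robot with no entry of its own falls back to the "*" entry.
print(parser.can_fetch("SomeCrawler", "/private/report.html"))  # False
print(parser.can_fetch("SomeCrawler", "/index.html"))           # True

# ExampleBot has its own entry, which blocks the whole site.
print(parser.can_fetch("ExampleBot", "/index.html"))            # False
```

The first entry applies to every robot and blocks one folder, while the second entry blocks a single named robot from the entire site. Real search engines may interpret edge cases differently from this parser, so treat the output as an approximation.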
Please see the following articles, which explain how a robots.txt file works and how to configure one.
http://www.robotstxt.org/
http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=40360
http://en.wikipedia.org/wiki/Robots_Exclusion_Standard