Meta tags were originally intended as a proxy for information about a website’s content. Several of the basic meta tags are listed below, along with a description of their use.
The Meta Robots tag can be used to control search engine crawler activity (for all of the major engines) on a per-page level. There are several ways to use Meta Robots to control how search engines treat a page:
- index/noindex tells the engines whether the page should be kept in their index for retrieval. If you use “noindex,” the page will be excluded from the index. By default, search engines assume they can index all pages, so using the “index” value is generally unnecessary.
- follow/nofollow tells the engines whether links on the page should be crawled. If you use “nofollow,” the engines will disregard the links on the page for discovery, ranking purposes, or both. By default, all pages are assumed to have the “follow” value.
Example: <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
- noarchive is used to restrict search engines from saving a cached copy of the page. By default, the engines will maintain visible copies of all pages they have indexed, accessible to searchers through the cached link in the search results.
- nosnippet informs the engines that they should refrain from displaying a descriptive block of text next to the page’s title and URL in the search results.
- noodp/noydir are specialized tags telling the engines not to grab a descriptive snippet about a page from the Open Directory Project (DMOZ) or the Yahoo! Directory for display in the search results.
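The values listed above can be combined in a single comma-separated content attribute. As a minimal sketch, assuming a site generates its pages from templates, the tag could be built as follows. The helper name `robots_meta_tag` and the `KNOWN_DIRECTIVES` set are hypothetical, chosen only for illustration.

```python
from html import escape

# Values described in the list above (an assumption for this sketch,
# not an exhaustive list of what every engine supports).
KNOWN_DIRECTIVES = {
    "index", "noindex", "follow", "nofollow",
    "noarchive", "nosnippet", "noodp", "noydir",
}


def robots_meta_tag(*directives):
    """Return a Meta Robots tag for the given values (hypothetical helper)."""
    cleaned = [d.strip().lower() for d in directives]
    unknown = [d for d in cleaned if d not in KNOWN_DIRECTIVES]
    if unknown:
        # Fail loudly rather than emit a tag the engines would ignore.
        raise ValueError("unknown directive(s): " + ", ".join(unknown))
    content = ", ".join(cleaned)
    return '<meta name="robots" content="{}">'.format(escape(content, quote=True))


print(robots_meta_tag("noindex", "nofollow"))
# prints: <meta name="robots" content="noindex, nofollow">
```

The resulting tag belongs in the page's head section. Rejecting unrecognized values is a design choice of this sketch: a misspelled value would otherwise be silently ignored.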
The X-Robots-Tag HTTP header directive also accomplishes these same objectives. This technique works especially well for content within non-HTML files, like images.
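Because the header travels with the HTTP response rather than inside the document, it can be attached to files that have no place for a meta tag. The sketch below shows one way a server-side helper might decide when to add it. The function name `robots_headers` and the default of `noindex` are assumptions for illustration; real deployments usually set the header in web server configuration instead.

```python
import mimetypes


def robots_headers(path, directives=("noindex",)):
    """Return extra response headers for a file path (hypothetical helper).

    HTML pages are left alone here on the assumption that they carry
    their own Meta Robots tag; every other file gets the header.
    """
    content_type, _ = mimetypes.guess_type(path)
    if content_type == "text/html":
        return {}
    return {"X-Robots-Tag": ", ".join(directives)}


print(robots_headers("photo.png"))
# prints: {'X-Robots-Tag': 'noindex'}
print(robots_headers("index.html"))
# prints: {}
```

On the wire, the first result corresponds to a response header line reading `X-Robots-Tag: noindex`.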