Web Traffic 101


Web traffic, the performance metrics of the 21st century.

Crawlers, Robots, or Spiders


Many terms are used for the software programs that index websites; the most common are web crawlers, web robots, webbots, and spiders. Many companies use these programs to retrieve information about your website. They are usually automated and are programmed simply to follow hyperlinks throughout the web. Search engines are the best-known users of spiders, but others use them as well. All in all, crawlers, robots, and spiders account for a significant share of the "website visitors" to a well-established site.
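As a rough illustration of what "following hyperlinks" involves, the sketch below pulls the link targets out of one page using only Python's standard library. It is not how any particular search engine's crawler works; the names LinkExtractor and extract_links are made up for this example, and a real crawler would also fetch each page over the network, remember which URLs it has already visited, and pace its requests.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the absolute URL of every <a href="..."> found in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only anchor tags carry the hyperlinks a crawler follows here.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the address of the page.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the list of absolute link targets found in an HTML string."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links
```

A crawler would call extract_links on each page it downloads, add any addresses it has not seen before to a queue, and repeat the process for every address in that queue.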

The Google robot, known as Googlebot, will visit a well-established site several times a day to check for updates. Of all the spiders that traverse the sites for which we have web log access, Googlebot is by far the most active: on average it is around twice as active as the Yahoo! crawler and six times more active than the new msnbot spider. On its website, Google refers to Googlebot as "Google's web crawler" and as "Google's web-crawling robot."

Another common web crawler is ia_archiver, which crawls sites in order to create and maintain a complete historical backup copy of the web, located at Archive.org. It is also used to provide screenshot thumbnail images for Alexa.com, which uses the Google search engine.
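Crawler visits like these can be tallied from a web server log by looking for each crawler's name in the user-agent field. The sketch below is one minimal way to do that. It assumes the log lines include the user-agent string, and the substrings in CRAWLER_TOKENS are assumptions to check against your own logs; count_crawler_hits is a name invented for this example. User-agent strings can be forged, so the counts are only an estimate.

```python
from collections import Counter

# Assumed user-agent substrings, lowercased for matching.
# Verify them against the user agents that appear in your own logs.
CRAWLER_TOKENS = {
    "Googlebot": "googlebot",
    "Yahoo! Slurp": "yahoo! slurp",
    "msnbot": "msnbot",
    "ia_archiver": "ia_archiver",
}


def count_crawler_hits(log_lines):
    """Count log lines per crawler by substring match on the whole line."""
    counts = Counter()
    for line in log_lines:
        lowered = line.lower()
        for name, token in CRAWLER_TOKENS.items():
            if token in lowered:
                counts[name] += 1
                break  # attribute each line to at most one crawler
    return counts
```

Passing an open log file to count_crawler_hits returns a Counter keyed by crawler name, and dividing one crawler's count by another's gives the kind of relative-activity comparison described above.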

Crawlers, robots, or spiders are also used by companies that offer link-checking or link-validation services, and by companies that harvest company contact information or email addresses for sale.

More Information on Crawlers, Robots, or Spiders

The Web Robots Pages

Yahoo! Directory > Computers and Internet > Internet > World Wide Web > Searching the Web > Crawlers, Robots, and Spiders

Robots, Spiders and Other User Agents: a Resource for WebMasters
José Luis Pellicer's searchable database of robots, spiders and other user agents for programs that surf the web.

Search Engine IP Addresses
Lists IP addresses of search engine spiders. Can be searched by IP address. Also links to resources on spiders.
