# See for
# detailed info on robots.
#
# Also, see BrowserStat's periodic list of spiders and odd-ball browsers -
# .

# Known robots
#
# User Agent            Purpose
# --------------------  ---------
# ArchitextSpider       Excite search engine (http://www.excite.com/)
# BackRub               Part of Stanford's Digital Library project
# ESIRover              From Engineering Software Inc (http://www.engsoftware.com/spiders/)
# fido                  Planet Search search engine (http://www.planetsearch.com/)
# InfoSeek Sidewinder   Infoseek search engine (http://www.infoseek.com/)
# LinkWalker            SEVENtwentyfour link checker (http://www.seventwentyfour.com/)
# Lycos_Spider          Lycos search engine (http://www.lycos.com/)
# MOMspider             Link checks (me and probably other sites)
# Poong                 Something out of Stanford Univ
# Scooter               Alta Vista search engine (http://www.altavista.digital.com/)
# Slurp                 Inktomi search engine (http://www.inktomi.com/slurp.html)
# Spider5               Seems related to WiseWire based on host
# Web21 CustomCrawl     100hot search engine (http://www.web21.com/)
# WiseWire              WiseWire search engine (http://www.wisewire.com/)
# wURLwind-Reaper       wURLwind search engine (http://www.wurlwind.com/)

# Commercial spiders don't get any access.

# LinkWalker (http://www.seventwentyfour.com/)
User-agent: LinkWalker
Disallow: /

# Spider5
User-agent: Spider5
Disallow: /

# WiseWire (http://www.wisewire.com/)
User-agent: WiseWire-Spider
Disallow: /

# Stupid spiders don't get any access.

# Excite's spider doesn't recognize outdated links (24-Nov-97)
User-agent: ArchitextSpider
Disallow: /

# Don't really know what it is, but it seems to be part of Internet Explorer v4,
# and when it hits here, it hits pretty hard.
User-agent: MSIECrawler
Disallow: /

# MOMspider can get pretty much everywhere
User-agent: MOMspider
Disallow: /MOMspider

# Disallow access to some directories for everyone.
User-agent: *
Disallow: /cgi-bin
Disallow: /hidden
Disallow: /icons
Disallow: /MOMspider
Disallow: /nogo
Disallow: /server