Friday, May 29, 2009

Robots.txt - More bots than people

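Is your web site getting more hits from bots than from people? You might want to try this in your robots.txt file. It asks a lot of the bots we've seen to stay away, while leaving the major search engine crawlers such as Googlebot alone. Be aware that it does turn away a few search engines you may want to keep (Yandex, Baidu, Naver, Googlebot-Image, msnbot-media), and that robots.txt only works on bots that choose to obey it. Alter as desired, and point the Sitemap line at your own site: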

User-Agent: OnTownsBot
Disallow: /

User-Agent: ServageRobot
Disallow: /

User-Agent: uw_cse_xwc
Disallow: /

User-Agent: ZupeeCrawler
Disallow: /

User-Agent: uberbot
Disallow: /

User-Agent: Axonize-bot
Disallow: /

User-Agent: ips-agent
Disallow: /

User-Agent: RiceComputerArchitecture
Disallow: /

User-Agent: AISearchBot
Disallow: /

User-Agent: flatlandbot
Disallow: /

User-Agent: FairShare
Disallow: /

User-Agent: SapphireWebCrawler
Disallow: /

User-Agent: LocalBot
Disallow: /

User-Agent: LaBot
Disallow: /

User-Agent: Butterfly
Disallow: /

User-Agent: robotgenius
Disallow: /

User-Agent: WillyBot
Disallow: /

User-Agent: GingerCrawler
Disallow: /

User-Agent: larbin
Disallow: /

User-Agent: ru_com_viewer
Disallow: /

User-Agent: Yandex
Disallow: /

User-Agent: yandex
Disallow: /

User-Agent: msnbot-media
Disallow: /

Sitemap: http://www.rainierrhododendrons.com/sitemap.xml

User-Agent: del.icio.us
Disallow: /

User-Agent: Sika
Disallow: /

User-Agent: whois.de
Disallow: /

User-Agent: Isidorus
Disallow: /

User-Agent: Yanga
Disallow: /

User-Agent: MSR-ISRCCrawler
Disallow: /

User-Agent: Snappybot
Disallow: /

User-Agent: Gaisbot
Disallow: /

User-Agent: BobCrawl
Disallow: /

User-Agent: OpenX
Disallow: /

User-Agent: KaloogaBot
Disallow: /

User-Agent: kalooga
Disallow: /

User-Agent: Cazoodle-Bot
Disallow: /

User-Agent: REAP-Crawler
Disallow: /

User-Agent: DotBot
Disallow: /

User-Agent: Gigabot
Disallow: /

User-Agent: NetcraftSurveyAgent
Disallow: /

User-Agent: SurveyBot
Disallow: /

User-Agent: DBLBot
Disallow: /

User-Agent: Charlotte
Disallow: /

User-agent: IntegraTelecom
Disallow: /

User-agent: PSIBots
Disallow: /

User-agent: Websense
Disallow: /

User-agent: HornySexSearch
Disallow: /

User-agent: SnapPreviewBot
Disallow: /

User-agent: Snoopy
Disallow: /

User-agent: libwww-perl
Disallow: /

User-agent: nexen
Disallow: /

User-agent: phpversion
Disallow: /

User-agent: attributor
Disallow: /

User-agent: Java
Disallow: /

User-agent: bsalsa
Disallow: /

User-agent: whoisde.de
Disallow: /

User-agent: envolk
Disallow: /

User-agent: QEAVis
Disallow: /

User-agent: NextGenSearchBot
Disallow: /

User-agent: boitho.com
Disallow: /

User-agent: boitho
Disallow: /

User-agent: Wget
Disallow: /

User-agent: Rankivabot
Disallow: /

User-agent: T-Online Browser
Disallow: /

User-agent: webalta
Disallow: /

User-agent: page_prefetcher
Disallow: /

User-agent: cyberpatrol
Disallow: /

User-agent: sitecat
Disallow: /

User-agent: cyberpatrolcrawler
Disallow: /

User-agent: internetseer
Disallow: /

User-agent: searchme
Disallow: /

User-agent: dcbot
Disallow: /

User-agent: scoutjet
Disallow: /

User-agent: sphsearch
Disallow: /

User-agent: exabot
Disallow: /

User-agent: NaverBot
Disallow: /

User-agent: naverbot
Disallow: /

User-agent: twiceler
Disallow: /

User-agent: zermelo
Disallow: /

User-agent: Moozilla
Disallow: /

User-agent: kyluka
Disallow: /

User-agent: baiduspider
Disallow: /

User-agent: MLBot
Disallow: /

User-agent: worio
Disallow: /

User-agent: turnitinbot
Disallow: /

User-agent: exooba
Disallow: /

User-agent: ViolaBot
Disallow: /

User-agent: speedyspider
Disallow: /

User-agent: becomebot
Disallow: /

# disallow Googlebot-Image
User-agent: Googlebot-Image
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: VWBot
Disallow: /

User-agent: ShopWiki
Disallow: /

User-agent: panscient.com
Disallow: /

User-agent: panscient
Disallow: /

User-agent: sproose
Disallow: /

User-agent: voyager
Disallow: /

User-agent: grub
Disallow: /

User-agent: OmniExplorer_Bot
Disallow: /

User-agent: Twiceler
Disallow: /

User-agent: WebDataCentreBot
Disallow: /

User-agent: OOZBOT
Disallow: /

User-agent: setooz
Disallow: /

User-agent: perl
Disallow: /

User-agent: botmobi
Disallow: /

User-agent: ASPSimply
Disallow: /

User-agent: Python-urllib
Disallow: /

User-agent: voilabot
Disallow: /

User-agent: WGet
Disallow: /

User-agent: obot
Disallow: /

User-agent: libcurl-agent
Disallow: /

User-agent: therarestparser
Disallow: /

User-agent: Jakarta Commons-HttpClient
Disallow: /
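
If you alter the list, it can help to check that it still does what you expect before uploading it. The following is only a rough sketch using Python's standard urllib.robotparser module. The helper name is_blocked and the three-entry excerpt are made up for illustration; paste in your full file to test all of it. Real crawlers may match names differently from Python's parser, so treat the result as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# A short excerpt of the rules above, used here only for illustration.
ROBOTS_TXT = """\
User-agent: MJ12bot
Disallow: /

User-agent: Wget
Disallow: /

User-agent: Googlebot-Image
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str = "/") -> bool:
    """Return True if the given rules tell user_agent not to fetch url.

    is_blocked is a hypothetical helper, not part of any library.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Googlebot is not listed in the excerpt, so it should stay allowed.
    for agent in ("MJ12bot", "Wget/1.21", "Googlebot-Image", "Googlebot"):
        status = "blocked" if is_blocked(ROBOTS_TXT, agent) else "allowed"
        print(agent, status)
```

Run it with any bot name you are unsure about, in particular the search engines you want to keep.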