Search engine visits to idanas in 2007

I have just pulled from our statistics the user agents (the strings that identify the browser or client requesting a page) of the search robots that have visited some of the idanas websites so far in 2007. They may be of interest to someone.

These user agents show which search robots are crawling our sites, although some engines use several user agents.

For example, some of Google's user agents are:

  • AdsBot-Google (+http://www.google.com/adsbot.html)
  • Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.8.0.9) Gecko/20061206 Googlebot 2.1
  • Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
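Because one engine can show up under several user agents, it can help to group the strings by engine when reading the statistics. The following is only a minimal sketch: the `ENGINE_MARKERS` table and the function names are hypothetical, chosen for illustration, and the markers are simply substrings that appear in the user agents quoted in this post. A user agent is supplied by the client, so a match here does not prove the visit really came from that engine.

```python
# Hypothetical table: lowercase substring -> engine name (an assumption for
# illustration, not an official list from any search engine).
ENGINE_MARKERS = {
    "googlebot": "Google",
    "adsbot-google": "Google",
    "msnbot": "MSN",
    "gigabot": "Gigablast",
}


def engine_for(user_agent):
    """Return the engine whose marker appears in the user agent, or None."""
    ua = user_agent.lower()
    for marker, engine in ENGINE_MARKERS.items():
        if marker in ua:
            return engine
    return None


def group_by_engine(user_agents):
    """Group user agent strings under the engine name they match."""
    groups = {}
    for ua in user_agents:
        groups.setdefault(engine_for(ua) or "unknown", []).append(ua)
    return groups


# A few strings taken from this post.
SAMPLE = [
    "AdsBot-Google (+http://www.google.com/adsbot.html)",
    "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.8.0.9) Gecko/20061206 Googlebot 2.1",
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "msnbot/1.0 (+http://search.msn.com/msnbot.htm)",
    "Toplistbot",
]

groups = group_by_engine(SAMPLE)
for engine, agents in groups.items():
    print(engine, len(agents))
```

Substring matching is deliberately crude; it is enough to see that the three Google strings above belong together, but any unlisted robot simply lands under "unknown".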

Here is the list:

AdsBot-Google (+http://www.google.com/adsbot.html)
CazoodleBot/CazoodleBot-0.1 (CazoodleBot Crawler; http://www.cazoodle.com/cazoodlebot; cazoodlebot@cazoodle.com)
CazoodleBot/Nutch-0.9-dev (CazoodleBot Crawler; http://www.cazoodle.com/cazoodlebot; cazoodlebot@cazoodle.com)
Cityreview Robot (+http://www.cityreview.org/crawler/)
EmeraldShield.com Web Spider (http://www.emeraldshield.com/webbot.aspx)
Gaisbot/3.0+(robot06@gais.cs.ccu.edu.tw;+http://gais.cs.ccu.edu.tw/robot.php)
Gigabot/2.0 (http://www.gigablast.com/spider.html)
Gigabot/2.0/gigablast.com/spider.html
Gigabot/2.0att
Gigabot/3.0 (http://www.gigablast.com/spider.html)
GnoZtiK bot/1.0 (http://www.gnoztik.com)
Googlebot-Image/1.0
Googlebot/2.1 (http://www.googlebot.com/bot.html)
GurujiBot/1.0 (+http://www.guruji.com/en/WebmasterFAQ.html)
HMSE_Robot

ICCrawler – ICjobs (http://www.icjobs.de/bot.htm)
IRLbot/3.0 (compatible; MSIE 6.0; http://irl.cs.tamu.edu/crawler)
Jyxobot/1
MELBOT
MJ12bot/v1.1.0 (http://majestic12.co.uk/bot.php?+)
MJ12bot/v1.1.2 (http://majestic12.co.uk/bot.php?+)
MJ12bot/v1.2.0 (http://majestic12.co.uk/bot.php?+)
MQBOT/Nutch-0.9-dev (MQBOT Nutch Crawler; http://vwbot.cs.uiuc.edu; mqbot@cs.uiuc.edu)
MSRBOT (http://research.microsoft.com/research/sv/msrbot)
MSRBOT (http://research.microsoft.com/research/sv/msrbot/)
Melbot WebSpider & RSS News Crawler www.melbot.info (V.2.42 by A.I.C.E.)
MoJoBot/0.1 libwww-perl/5.805
Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; Girafabot; girafabot at girafa dot com; http://www.girafa.com)
Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0; obot)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0) SEOChat::Bot v1.1
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; QihooBot 1.0 qihoobot@qihoo.net)
Mozilla/4.0 compatible ZyBorg/1.0 (wn-14.zyborg@looksmart.net; http://www.WISEnutbot.com)
Mozilla/5.0 (SnapPreviewBot) Gecko/20061206 Firefox/1.5.0.9
Mozilla/5.0 (Twiceler-0.9 http://www.cuill.com/twiceler/robot.html)
Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.8.0.9) Gecko/20061206 Googlebot 2.1
Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.8.1.6) Gecko/20070725 Googlebot2.1
Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.8.1) VoilaBot BETA 1.2 (http://www.voila.com/)
Mozilla/5.0 (compatible; Exabot/3.0; +http://www.exabot.com/go/robot)
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Mozilla/5.0 (compatible; IDBot/1.0; +http://www.id-search.org/bot.html)
Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.5 (like Gecko) (Exabot-Thumbnails)
Mozilla/5.0 (compatible; SnapPreviewBot; en-US; rv:1.8.0.9) Gecko/20061206 Firefox/1.5.0.9
Mozilla/5.0 (compatible; Webbot/0.1; http://www.webbot.ru/bot.html)
Mozilla/5.0 (compatible; archive.org_bot/1.13.1x +http://crawler.archive.org)
Mozilla/5.0 (compatible; heritrix/1.12.0 +http://www.accelobot.com)
Mozilla/5.0 (compatible; jobs.de-Robot +http://www.jobs.de)
Mozilla/6.0 (MSIE 6.0; Windows NT 5.1; RSSMicro.com RSS/Atom Feed Robot)
Myrasoft.com Active Search Engine Robot
NutchCVS/0.7.2 (Nutch; http://lucene.apache.org/nutch/bot.html; nutch-agent@lucene.apache.org)
OmniExplorer_Bot/6.70 (+http://www.omni-explorer.com) WorldIndexer
PlantyNet_WebRobot_V1.9 dhkang@plantynet.com
Robotgenius crawler/Nutch-1.0-dev (http://robotgenius.net; misc at robotgenius dot net)
S4JCrawl/0.7.2 (S4J Search Bot; http://www.flaptor.com/; bot@flaptor.com)
Seekbot/1.0 (http://www.seekbot.net/bot.html) HTTPFetcher/0.3
Seekbot/1.0 (http://www.seekbot.net/bot.html) HTTPFetcher/2.2
Seekbot/1.0 (http://www.seekbot.net/bot.html) RobotsTxtFetcher/1.2
Semager/1.1 (http://www.semager.de/blog/semager-bots/)
Snapbot/1.0
Snapbot/1.0 (+http://www.snap.com)
Snapbot/1.0 (Snap Shots, +http://www.snap.com)
Spirioo Bot (Version: 1.04, powered by www.spirioo.de +http://www.spirioo.de/cgi-bin/catalog.cgi?cmd=pages&page=bot)
Toplistbot
TurnitinBot/2.1 (http://www.turnitin.com/robot/crawlerinfo.html)
VWBOT/Nutch-0.9-dev (VWBOT Nutch Crawler; http://vwbot.cs.uiuc.edu; vwbot@cs.uiuc.edu)
VisBot/2.0 (Visvo.com Crawler; http://www.visvo.com/bot.html; bot@visvo.com)
Yeti/0.01 (nhn/1noon, yetibot@naver.com, check robots.txt daily and follow it)
bot(www.cuwhois.com)
bot/1.0 (bot; http://; bot@bot.bot)
favorstarbot/1.0 (+http://favorstar.com/bot.html)
googlebot 1.0
msnbot-media/1.0 (+http://search.msn.com/msnbot.htm)
msnbot/1.0 (+http://search.msn.com/msnbot.htm)
msnbot/1.0+(+http://search.msn.com/msnbot.htm)
noxtrumbot/1.0 (crawler@noxtrum.com)
nrsbot/5.0(loopimprovements.com/robot.html)
onCHECK-Robot, www.onsearch.de
owsBot/0.1 (Nutch; www.oneworldstreet.com; nutch-agent@lucene.apache.org)
owsBot/0.2 (owsBot; www.oneworldstreet.com; owsBot)
psbot/0.1 (+http://www.picsearch.com/bot.html)
sproose/1.0beta (sproose bot; http://www.sproose.com/bot.html; crawler@sproose.com)
wectarbot
yacybot (i386 Linux 2.6.9-023stab044.4-smp; java 1.6.0_02; Europe/en) http://yacy.net/yacy/bot.html
yacybot (x86 Windows XP 5.1; java 1.6.0_01; Europe/en) http://yacy.net/yacy/bot.html

Regards,

David Antón Asensio, idanas
