小李飞帖 2006-11-29 09:34
A list of search companies' crawlers: have you seen it?
AbachoBOT=Abacho.com
abcdatos_botlink=Abcdatos.com
http://www.abcdatos.com/botlink/=Abcdatos.com
AESOP_com_SPiderMan=Aesop.com
ah-ha.com crawler ([email]crawler@ah-ha.com[/email])=ah-ha.com
ia_archiver=Archive.org
Scooter=Altavista.com
Mercator=Altavista.com
Scooter2_Mercator_3-1.0=Altavista.com
roach.smo.av.com-1.0=Altavista.com
Tv_Merc_resh_26_1_D-1.0=Altavista.com
AltaVista-Intranet=Altavista.co.uk
[email]jan.gelin@av.com[/email]=Altavista.co.uk
FAST-WebCrawler=alltheweb.com
[email]crawler@fast.no[/email]=alltheweb.com
Acoon Robot=acoon.de
antibot=antisearch.net
Atomz=atomz.com
Buscaplus Robi=buscaplus.com
CanSeek/=canseek.ca
[email]support@canseek.ca[/email]=canseek.ca
ChristCRAWLER=christcrawler.com
Crawler=crawler.de
[email]admin@crawler.de[/email]=crawler.de
DaAdLe.com ROBOT/=daadle.com
RaBot=daum.net
Agent-admin/=daum.net
[email]phortse@hanmail.net[/email]=daum.net
contact/jylee@kies.co.kr=kies.co.kr
DeepIndex=deepindex.com
DittoSpyder=ditto.com
Jack=domanova.co.uk
Speedy Spider=entireweb.com
ArchitextSpider=excite.com
ArchitectSpider=excite.com
Arachnoidea=euroseek.net
[email]arachnoidea@euroseek.net[/email]=euroseek.net
EZResult=ezresults.com
Fast PartnerSite Crawler=fastsearch.net
FAST Data Search Crawler=fastsearch.net
KIT-Fireball=fireball.de
FyberSearch=fybersearch.com
GalaxyBot=galaxy.com
geckobot=geckobot.com
GenCrawler=gendoor.com
GeonaBot=geona.com
Googlebot=Google.com
[email]googlebot@googlebot.com[/email]=Google.com
google=Google.com
moget/2.0=goo.ne.jp
[email]moget@goo.ne.jp[/email]=goo.ne.jp
Aranha=girafa.com
Slurp.so/1.0=Yahoo
[email]slurp@inktomi.com[/email]=Yahoo
Slurp/2.0j=Yahoo
www.inktomisearch.com=Yahoo
Slurp/2.0-KiteHourly=Yahoo
Slurp/2.0-OwlWeekly=Yahoo
[email]spider@aeneid.com[/email]=Yahoo
Slurp/3.0-AU=Yahoo
Toutatis 2.5-2=hoppa.com
Hubater=hubat.com
IlTrovatore-Setaccio=iltrovatore.it
IncyWincy=incywincy.com
UltraSeek=infoseek.com
InfoSeek Sidewinder=infoseek.com
Mole2/1.0=intags.de
[email]webmaster@intags.de[/email]=intags.de
MP3Bot=mp3bot.de
C-PBWF-ip3000.com-crawler=ip3000.com
ip3000.com-crawler=ip3000.com
kuloko-bot/0.2=kuloko.com
LNSpiderguy=lexis-nexis.com
NetResearchServer=look.com
MantraAgent=looksmart.com
NetResearchServer=loopimprovements.com
Lycos_Spider_(T-Rex)=lycos.com
JoocerBot=joocer.com
HenryTheMiragoRobot=mirago.co.uk
mozDex/=mozdex.com
MSNBOT/0.1=MSN
Gulliver=northernlight.com
ObjectsSearch/0.01=objectssearch.com
PicoSearch/=picosearch.com
PJspider=portaljuice.com
DIIbot=powerinter.net
nttdirectory_robot=navi.ocn.ne.jp
[email]super-robot@super.navi.ocn.ne.jp[/email]=navi.ocn.ne.jp
griffon=super.navi.ocn.ne.jp
[email]griffon@super.navi.ocn.ne.jp[/email]=super.navi.ocn.ne.jp
Spider/maxbot.com=maxbot.com
[email]admin@maxbot.com[/email]=maxbot.com
gazz/1.0=Unknown Spider
[email]gazz@nttrd.com[/email]=Unknown Spider
NationalDirectory-SuperSpider=nationaldirectory.com
dloader(NaverRobot)/=naver.com
dumrobo(NaverRobot)/=naver.com
Openfind piranha=openfind.com
Shark=openfind.com
[email]robot-response@openfind.com.tw[/email]=openfind.com.tw
Openbot/=openfind.com.tw
psbot=picsearch.org
CrawlerBoy=pinpoint.com
ip3000.com=petersnews.com
AlkalineBOT=AlkalineBOT
Fluffy the spider=searchhippo.com
[email]info@searchhippo.com[/email]=searchhippo.com
Scrubby/=scrubtheweb.com
asterias=singingfish.com
speedfind ramBot xtreme=speedfind.de
Kototoi/0.1=s.u-tokyo.ac.jp
Searchspider/=searchspider.com
SightQuestBot/=sightquest.com
Spider_Monkey/=spidermonkey.ca
Surfnomore Spider v1.1=surfnomore.com
[email]Robot@SuperSnooper.Com[/email]=supersnooper.com
teoma_agent1=teoma.com
[email]teoma_admin@hawkholdings.com[/email]=teoma.com
Teradex_Mapper=mapper.teradex.com
[email]mapper@teradex.com[/email]=mapper.teradex.com
ESISmartSpider=travel-finder.com
Spider TraficDublu=traficdublu.ro
Tutorial Crawler=tutorgig.com
UK Searcher Spider=uksearcher.co.uk
Vivante Link Checker=vivante.com
appie=walhello.com
Nazilla=websmostlinked.com
www.WebWombat.com.au=webwombat.com.au
marvin/infoseek=webseek.de
[email]marvin-team@webseek.de[/email]=webseek.de
MuscatFerret=webtop.com
WhizBang! Lab=whizbanglabs.com
ZyBorg=wisenut.com
WIRE WebRefiner=wire.co.uk
WSCbot=worldsearchcenter.com
Yandex=yandex.com
Yellopet-Spider=yellowpet.com
Iron33=verno.ueda.info.waseda.ac.jp/
ALink=Link Checker
AMeta=Link Checker
ASPSearch URL Checker=Link Checker
BlogBot=Link Checker
BMChecker=Link Checker
Bookmark Buddy=Link Checker
Check&Get=Link Checker
CheckWeb=Link Checker
CNET_Snoop=Link Checker
CSE HTML Validator=Link Checker
DRKSpider=Link Checker
DISCo Watchman=Link Checker
DoctorHTML=Link Checker
Email Extractor=Email Extractor
EmailSiphon=Email Extractor
EmailWolf=Email Extractor
FavOrg=Link Checker
Favorites Sweeper=Link Checker
FreshLinks.exe=Link Checker
Funnel Web Profiler=Link Checker
Html Link Validator=Link Checker
The Informant=Link Checker
The Intraformant=Link Checker
InternetLinkAgent=Link Checker
InternetPeriscope=Link Checker
javElink=Link Checker
jdwhatsnew.CGI=Link Checker
JRTS Check Favorites Utility=Link Checker
Lambda LinkCheck=Link Checker
LinkLint-checkonly=Link Checker
LinkAlarm=Link Checker
Linkbot=Link Checker
Linkman=Link Checker
LinkProver=Link Checker
Links=Link Checker
LinkScan Server=Link Checker
LinkSweeper=Link Checker
Link Valet Online=Link Checker
LinkVerify Spider=Link Checker
LinkWalker=Link Checker
Morning Paper=Link Checker
MoveAnnouncer=Link Checker
NetLookout=Link Checker
NetMechanic=Link Checker
www.elsop.com=Link Checker
NetMind-Minder=Link Checker
NetMonitor=Link Checker
Netprospector JavaCrawler=Link Checker
online link validator=Link Checker
Rational SiteCheck=Link Checker
Robozilla=Link Checker
RPT-HTTPClient=Link Checker
SurfMaster=Link Checker
SyncIT=Link Checker
Watchfire WebXM=Link Checker
WatzNew Agent=Link Checker
WebSite-Watcher=Link Checker
WebTrends Link Analyzer=Link Checker
Weblink Scanner=Link Checker
Xenu's Link Sleuth=Link Checker
W3C_Validator=Link Validator
WDG_Validator/=Link Validator
Tooter=Link Validator
citenikbot/=citenik.co.uk
CLIPS-index=clips-index.imag.fr/
Computer_and_Automation_Research_Institute_Crawler=Research Bot
cosmos=xyleme.com
[email]robot@xyleme.com[/email]=xyleme.com
DiaGem/=DiaGem
Digimarc WebReader=digimarc.com
EchO!/2.0=voila.com
FinaleRobot=expressus.com
[email]robot-master@expressus.com[/email]=expressus.com
Ideare - SignSite=ideare.com
GentleSpider=research.att.com
Gulper Web Bot=Gulper Web Bot
larbin=Unknown Spider
[email]sebastien.ailleret@inria.fr[/email]=inria.fr
[email]ghi@lcs.mit.edu[/email]=Unknown Spider
MultiText=MultiText
NEC Research Agent=NEC Research Agent
OntoSpider=OntoSpider
sherlock_spider=sherlock.com.cn
Steeler=Steeler
ru-robot=rutgers.edu
0.1_hseo(at)cs.rutgers.edu=rutgers.edu
WebGather=WebGather
xyro=xyro
[email]xcrawler@inria.fr[/email]=Unknown Spider
Zao/0.2=Zao
ADSARobot=ADSARobot
AnswerChase=AnswerChase
ASPSeek=ASPSeek
AVSearch=AVSearch
Checkbot=Checkbot
DaviesBot=DaviesBot
deepweb=deepweb.com
GigaBaz=brainbot.com
GigaBazVStheWeb=brainbot.com
[email]crawler@brainbot.com[/email]=brainbot.com
Giskard=oralco.com
InternetSeer=InternetSeer
ipiumBot=ipiumBot
InsumaScout=InsumaScout
Katriona=Katriona
LEIA=LEIA
LexiBot=lexibot.com
metabot=metabot
NetCruiser=NetCruiser
NPBot=nameprotect.com
NetZippy=NetZippy
NZBot=navigationzone.com
Opencola=opencola.com
Oxxbot1=Oxxbot
Pansophica=Pansophica
Phoaks=Phoaks
PICgrabber=PICgrabber
PictureOfInternet=PictureOfInternet
[email]erik@malfunction.org[/email]=Unknown Spider
PintaSpider=PintaSpider
PolyBot=PolyBot
Squid=Squid
Sqworm=Sqworm
TaWWWantula=TaWWWantula
TeraCrawl=TeraCrawl
TurnitinBot=turnitin.com
UCmore=ucmore.com
UdmSearch=mnoGoSearch
unlostBot=unlost.com
URLBlaze=urlblaze.net
UrlScope=UrlScope
Vagabondo=Vagabondo
vspider=vspider
WAVETools=WAVETools
Webbandit=Webbandit
Webclipping.com=Webclipping.com
webcollage=webcollage
WebCompass=WebCompass
WebGenie=WebGenie
Web Magnet=Unknown Spider
WebMiner=Unknown Spider
Webpush=Unknown Spider
WebSymmetrix=Unknown Spider
webrank=Unknown Spider
webwasher=Unknown Spider
WhosTalking=Unknown Spider
AnzwersCrawl/2.0=Anzwers
fido/1.0 Harvest/1.4.pl2=Planet Search
GAIS Robot/1.0B2=seednet
Googlebot/1.0=Google.com
Gulliver/1.2=Northern Light
Infoseek Sidewinder/0.9=Infoseek
KIT_Fireball/2.0=Fireball
lwp-trivial/1.27=Search 4 Free
Lycos_Spider_(T-Rex)/3.0=Lycos
Scooter/1.0=AltaVista
Scooter/1.0 [email]scooter@pa.dec.com[/email]=AltaVista
Scooter/1.1 (custom)=AltaVista
Scooter/2.0 G.R.A.B. X2.0=AltaVista
Scooter/2.0 G.R.A.B. V1.1.0=AltaVista
search.at V1.2=search.at
inktomi=Inktomi Spider
SwissSearch V1.2=SwissSearch
The Informant=The Informant
Ultraseek=Infoseek
WebCrawler/3.0 Robot libwww/5.0a=WebCrawler
WebCrawler-AddURL/2.0=WebCrawler
WiseWire=WiseWire
WiseWire-Alpha-1.0=WiseWire
WiseWire-Alpha-Spider=WiseWire
WiseWire-Alpha12-Spider971219a=WiseWire
WiseWire-Alpha12-Spider(971223a)=WiseWire
WiseWire-HotSpider-1.0=WiseWire
WiseWire-Spider=WiseWire
WiseWire-Spider-1.0=WiseWire
WiseWire-Spider2=WiseWire
WiseWire-Widow-1.0=WiseWire
WiseWire-Widow-1.0r=WiseWire
WiseWire-Widow-1.0-ALPHA12=WiseWire
CherryPickerSE/1.0=Email Extractor
CherryPickerElite/1.0=Email Extractor
Crescent Internet ToolPak HTTP OLE Control v.1.0=Email Extractor
EmailCollector/1.0=Email Extractor
EmailWolf 1.00=Email Extractor
ExtractorPro=Email Extractor
ask jeeves=Ask Jeeves
lycos=Lycos.com
whatuseek=What You Seek
wisenutbot=Looksmart
msnbot=MSN
GigaBlast=Gigablast
Gigabot=Gigablast
archive_org=Archive.org
jeeves=Ask Jeeves
Asterias=Singingfish Spider
Slurp=Inktomi Spider
ZyBorg=LookSmart Bot
baiduspider=Baidu
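
For anyone who wants to put the list to use, here is a rough Python sketch of one way to do it. The post does not say how the entries are meant to be matched, so everything here is an assumption: the sketch treats each line as `pattern=owner`, strips leftover forum tags, and does a case-insensitive substring test against a User-Agent header. The names `parse_crawler_list` and `identify` are made up for illustration, and `SAMPLE` holds only a handful of entries copied from the list.

```python
import re

# A few entries copied from the list above, in its "pattern=owner" form.
SAMPLE = """\
Googlebot=Google.com
Scooter=Altavista.com
[email]slurp@inktomi.com[/email]=Yahoo
baiduspider=Baidu
msnbot=MSN
"""

# Forum markup that may still surround a pattern (assumed tag set).
BBCODE_TAG = re.compile(r"\[/?(?:url|email|wiki)\]", re.IGNORECASE)


def parse_crawler_list(text):
    """Return (pattern, owner) pairs, longest pattern first."""
    pairs = []
    for line in text.splitlines():
        line = BBCODE_TAG.sub("", line).strip()
        if "=" not in line:
            continue  # skip blank lines and anything that is not an entry
        # Split on the last "=" so a pattern may itself contain "=".
        pattern, owner = line.rsplit("=", 1)
        pairs.append((pattern.strip().lower(), owner.strip()))
    # Try longer (more specific) patterns before shorter ones.
    pairs.sort(key=lambda pair: len(pair[0]), reverse=True)
    return pairs


def identify(user_agent, pairs):
    """Return the owner of the first pattern found in user_agent, else None."""
    ua = user_agent.lower()
    for pattern, owner in pairs:
        if pattern and pattern in ua:
            return owner
    return None


PAIRS = parse_crawler_list(SAMPLE)
print(identify("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)", PAIRS))
print(identify("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)", PAIRS))
```

Sorting by pattern length is just one way to stop short entries such as `google` or `Links` from shadowing more specific ones. Keep in mind that a User-Agent header is trivially spoofed, so a match is a hint about who is crawling, not proof.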