Friends, I checked in cPanel where the last 300 visitors had been, and saw that the Yahoo and Google bots had requested robots.txt.
As far as I know, robots.txt is where you enter the URLs you want search engines to index. But I did a little searching on Google and came across this site: http://kamerasistemleri.com/robots.txt
What exactly is it doing here?
Code:
# Robot-Manager was used to generate this file.
# Copyright (c) 2001 by Sophtware.com, Inc. All Rights Reserved.
# http://www.websitemanagementtools.com/

# FULL access (Acoon)
User-agent: Acoon Robot
Disallow:

# FULL access (AllThatNet)
User-agent: ATN_Worldwide
Disallow:

# FULL access (Alta Vista)
User-agent: Scooter
Disallow:

# FULL access (Anzwers)
User-agent: AnzwersCrawl
Disallow:

# FULL access (AustLII)
User-agent: Gromit
Disallow:

# FULL access (CMC/0.01)
User-agent: CMC/0.01
Disallow:

# FULL access (Direct Hit Grabber)
User-agent: grabber
Disallow:

# FULL access (Entire Web)
User-agent: speedy
Disallow:

# FULL access (EuroSeek)
User-agent: Freecrawl
Disallow:

# FULL access (Excite)
User-agent: ArchitextSpider
Disallow:

# FULL access (FAST/AllTheWeb)
User-agent: FAST-WebCrawler
Disallow:

# FULL access (Fireball)
User-agent: KIT-Fireball
Disallow:

# FULL access (Goo)
User-agent: moget
Disallow:

# FULL access (Google)
User-agent: Googlebot
Disallow:

# FULL access (Google Image)
User-agent: Googlebot-Image
Disallow:

# FULL access (Griffon)
User-agent: griffon
Disallow:

# FULL access (Hämähäkki)
User-agent: Hämähäkki
Disallow:

# FULL access (Industry Central)
User-agent: Open Text Site Crawler
Disallow:

# FULL access (InfoSeek.de)
User-agent: marvin
Disallow:

# FULL access (Ingrid)
User-agent: INGRID/0.1
Disallow:

# FULL access (Inktomi)
User-agent: Slurp
Disallow:

# FULL access (Internet Cruiser)
User-agent: Internet Cruiser Robot
Disallow:

# FULL access (Kvasir)
User-agent: solbot
Disallow:

# FULL access (Legs)
User-agent: legs
Disallow:

# FULL access (Lets Find It Now!)
User-agent: elfinbot
Disallow:

# FULL access (Lisa)
User-agent: Voyager/0.0
Disallow:

# FULL access (Lycos)
User-agent: Lycos_Spider_(T-Rex)
Disallow:

# FULL access (MPRM Group Limited)
User-agent: spider_monkey
Disallow:

# FULL access (Mirago)
User-agent: mirago
Disallow:

# FULL access (NetScoop)
User-agent: NetScoop
Disallow:

# FULL access (Northern Light)
User-agent: Gulliver
Disallow:

# FULL access (ODiN)
User-agent: Valkyrie libwww-perl
Disallow:

# FULL access (Openfind)
User-agent: openbot
Disallow:

# FULL access (PlanetSearch)
User-agent: fido
Disallow:

# FULL access (Portal Juice)
User-agent: pjspider
Disallow:

# FULL access (SearchUK)
User-agent: MegaSheep
Disallow:

# FULL access (Suchmaschine21)
User-agent: CoolBot
Disallow:

# FULL access (Thunderstone)
User-agent: T-H-U-N-D-E-R-S-T-O-N-E
Disallow:

# FULL access (TopicLink)
User-agent: topiclink
Disallow:

# FULL access (VietGATE)
User-agent: jcrawler
Disallow:

# FULL access (WhoWhere?)
User-agent: whowhere
Disallow:

# FULL access (e-collector)
User-agent: ecollector
Disallow:

# FULL access (iaNett.com)
User-agent: ParaSite
Disallow:

# FULL access (kensaku.jp)
User-agent: suke
Disallow:

# FULL access (mopilot.com)
User-agent: wapspider
Disallow:

# FULL access (nathan)
User-agent: tarantula
Disallow:

# FULL access (newscan)
User-agent: newscan-online
Disallow:

# FULL access (whatUseek)
User-agent: winona
Disallow:

# PARTIAL access (All Spiders)
User-agent: *
Disallow: /Admin
Disallow: /Config
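
To see how a crawler would likely read the file above, here is a minimal sketch using Python's standard urllib.robotparser module. It feeds the parser a shortened excerpt (only the Googlebot group and the catch-all group). SomeOtherBot and /index.html are made-up names for illustration, and a real crawler may treat edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# Shortened excerpt of the file quoted above: one named robot plus the catch-all group.
ROBOTS_TXT = """\
# FULL access (Google)
User-agent: Googlebot
Disallow:

# PARTIAL access (All Spiders)
User-agent: *
Disallow: /Admin
Disallow: /Config
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse from memory, no network request

BASE = "http://www.kamerasistemleri.com"

# Googlebot has its own group with an empty Disallow, so nothing is blocked for it.
print(parser.can_fetch("Googlebot", BASE + "/Admin"))          # True

# SomeOtherBot (hypothetical) has no group of its own and falls back to "*".
print(parser.can_fetch("SomeOtherBot", BASE + "/Admin"))       # False
print(parser.can_fetch("SomeOtherBot", BASE + "/index.html"))  # True
```

An empty Disallow: means nothing is off limits for that robot. So the long list of named robots gets full access, while every other robot is asked to stay out of /Admin and /Config.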
Quote:
For example, if you do not want the page http://adresiniz/admin to be indexed, the content of robots.txt would be:
User-agent: *
Disallow: /admin
Every search engine that honors robots.txt (some may ignore the file and the rules in it) will then leave the admin directory out of its index.
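
One detail worth checking is that Disallow matches by path prefix, so the rule covers more than the single /admin page. A small sketch with Python's urllib.robotparser, where example.com, AnyBot and the listed paths are placeholders chosen for illustration:

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt from the quote above.
rules = RobotFileParser()
rules.parse(["User-agent: *", "Disallow: /admin"])

# Hypothetical paths on a placeholder host.
PATHS = ("/admin", "/admin/login.php", "/administrator", "/index.html")

# Map each path to whether a compliant robot may fetch it.
results = {
    path: rules.can_fetch("AnyBot", "http://example.com" + path)
    for path in PATHS
}

for path, allowed in results.items():
    print(path, "allowed" if allowed else "blocked")
```

Everything that starts with /admin is reported as blocked, including /administrator, and only /index.html stays allowed. If you want to block just the directory, writing the rule as Disallow: /admin/ is the usual way to narrow it.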
__________________
Zend Certified PHP Engineer
Use User-agent: * with an empty Disallow: line, and crawlers will be free to index all of your pages.
__________________
Domain for sale - $100
And here is Google's own robots.txt:
Code:
User-agent: *
Allow: /searchhistory/
Disallow: /search
Disallow: /groups
Disallow: /images
Disallow: /catalogs
Disallow: /catalog_list
Disallow: /news
Disallow: /nwshp
Disallow: /?
Disallow: /addurl/image?
Disallow: /pagead/
Disallow: /relpage/
Disallow: /sorry/
Disallow: /imgres
Disallow: /keyword/
Disallow: /u/
Disallow: /univ/
Disallow: /cobrand
Disallow: /custom
Disallow: /advanced_group_search
Disallow: /advanced_search
Disallow: /googlesite
Disallow: /preferences
Disallow: /setprefs
Disallow: /swr
Disallow: /url
Disallow: /wml
Disallow: /xhtml
Disallow: /imode
Disallow: /jsky
Disallow: /sprint_xhtml
Disallow: /sprint_wml
Disallow: /pqa
Disallow: /palm
Disallow: /hws
Disallow: /bsd?
Disallow: /linux?
Disallow: /mac?
Disallow: /microsoft?
Disallow: /unclesam?
Disallow: /answers/search?q=
Disallow: /local?
Disallow: /local_url
Disallow: /froogle?
Disallow: /froogle_
Disallow: /print?
Disallow: /scholar?
Disallow: /complete
Disallow: /sponsoredlinks
Disallow: /videosearch?
Disallow: /videopreview?
Disallow: /videoprograminfo?
Disallow: /maps?
Disallow: /translate?
Disallow: /ie?
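
The interesting part of that file is the Allow line: /searchhistory/ starts with /search, so without the exception it would be caught by Disallow: /search. Here is a rough sketch of that interaction using only the first four lines of the file. AnyBot and the /about path are invented for the example. Note that Python's parser applies the first matching rule in file order, while Google documents a most-specific-match rule; for this excerpt both readings give the same answer.

```python
from urllib.robotparser import RobotFileParser

# First four lines of the Google file quoted above.
EXCERPT = [
    "User-agent: *",
    "Allow: /searchhistory/",
    "Disallow: /search",
    "Disallow: /groups",
]

google_rules = RobotFileParser()
google_rules.parse(EXCERPT)

def may_fetch(path: str) -> bool:
    """Return True if a robot obeying the excerpt may fetch the given path."""
    return google_rules.can_fetch("AnyBot", "http://www.google.com" + path)

print(may_fetch("/searchhistory/"))  # True: the Allow exception applies
print(may_fetch("/search"))          # False: caught by Disallow: /search
print(may_fetch("/groups"))          # False: caught by Disallow: /groups
print(may_fetch("/about"))           # True: no rule matches (hypothetical path)
```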
__________________
http://www.esato.info
Yes, of course...
__________________
http://www.esato.info