Thread: Robot.txt file

  1. #1
    Nerd (Junior Member)
    Join Date: Apr 2009
    Posts: 5

    Robot.txt file

    Question no. 1

    If we want to allow all the search engines to index our website, is this how the robots.txt file should look?

    User-agent: *
    Disallow:

    Question no. 2

    Does this code simply mean that only ia_archiver will be blocked from crawling our site, and the rest of the robots can?

    User-agent: ia_archiver
    Disallow: /

    Or do we need to add two more lines, like this:

    User-agent: ia_archiver
    Disallow: /

    User-agent: *
    Disallow:

    so that the other robots can access our website?

    And finally, what types of files on a website do we usually not want the robots to read, so that I can add rules like Disallow: /something/ for them? Thanks for the help, guys.
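
    Both questions can be checked locally. The sketch below uses Python's standard-library urllib.robotparser to read the three rule sets from this post and ask whether a given robot may fetch a page. The helper name allowed, the Googlebot agent string, and the example.com URL are assumptions made for illustration; a real crawler may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, agent, url="http://example.com/page.html"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Question 1: an empty Disallow for every agent blocks nothing.
ALLOW_ALL = "User-agent: *\nDisallow:\n"

# Question 2, short form: only ia_archiver is named.
SHORT_FORM = "User-agent: ia_archiver\nDisallow: /\n"

# Question 2, long form: the same rule plus an explicit allow-all group.
LONG_FORM = SHORT_FORM + "\n" + ALLOW_ALL

for label, rules in (
    ("allow all", ALLOW_ALL),
    ("short form", SHORT_FORM),
    ("long form", LONG_FORM),
):
    for agent in ("ia_archiver", "Googlebot"):
        verdict = "allowed" if allowed(rules, agent) else "blocked"
        print(f"{label:10} {agent:12} {verdict}")
```

    Under this parser, the short and long forms behave the same: ia_archiver is blocked and a robot with no matching group is allowed, so the two extra lines make the intent explicit rather than change the result.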
    Last edited by Nerd; 05-01-2009 at 08:59 AM.

  2. #2
    Arleigh (Senior Member)
    Join Date: May 2008
    Posts: 311

    Basically, the main function of the robots.txt file is to instruct robots not to crawl and index certain files, most particularly password-protected folders and folders which contain only images.
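
    As a rough sketch of that idea, the rules below keep robots out of two folders and leave the rest of the site open. The folder names /private/ and /images/, the Googlebot agent string, and the example.com host are hypothetical placeholders; Python's standard urllib.robotparser is used only to show how such rules are read.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: the folder names are placeholders, not from this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /images/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path, agent="Googlebot"):
    """Return True if `agent` may fetch `path` on the example host."""
    return parser.can_fetch(agent, "http://example.com" + path)


for path in ("/index.html", "/private/login.html", "/images/logo.png"):
    print(path, "allowed" if is_allowed(path) else "blocked")
```

    Note that robots.txt is only a request to well-behaved robots. It does not protect a folder, so anything sensitive still needs real access control such as a password.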

  3. #3
    temi (Facilitator)
    Join Date: Jun 2003
    Location: London, England.
    Posts: 10,304

    I did a blog post about robots.txt files here. You can also generate one automatically these days with help from Google, via your Google Webmaster Tools account.

    * Build a shopping cart for your business with eCommerce software UK
    * BossCart.com can build you a.
    Register your domain names at Velnet
    ::
    Add Eco sites to The Green Directory free of charge.
    Use LBS Free PHP Directory Script . Web Hosting Blog

  4. #4
    azseoguy (Junior Member)
    Join Date: May 2009
    Posts: 12

    Have you found that using Disallow in the robots.txt file actually works to stop your pages from showing in a search index?

    Have you ever used noindex in the robots.txt file? I hear it is only accepted by Google.

  5. #5
    temi (Facilitator)
    Join Date: Jun 2003
    Location: London, England.
    Posts: 10,304

    Robots.txt is a long-standing de facto standard (the Robots Exclusion Protocol)... it applies to all well-behaved crawlers out there, so it is not used by Google only.

  6. #6
    Robdale (Senior Member)
    Join Date: Jan 2009
    Location: Newark, Delaware, USA
    Posts: 378

    Basically, a robots.txt file will not do anything to improve your search engine positioning, but it tells robots which files you do not allow to be crawled and indexed by the search engines. Whenever a robot crawls your site, it looks for the robots.txt file. If it doesn't find one, it automatically assumes that it may crawl and index the entire site.
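
    That "no rules means crawl everything" default can be seen with Python's standard urllib.robotparser. This is only a sketch: an empty rule set stands in for a missing file, and the helper name can_crawl, the ExampleBot agent string, and the example.com host are made up for the illustration. How a real crawler treats a missing file is up to that crawler.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, path, agent="ExampleBot"):
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, "http://example.com" + path)


NO_RULES = ""                                # stands in for a missing file
BLOCK_ALL = "User-agent: *\nDisallow: /\n"   # shown for contrast

print("no rules: ", can_crawl(NO_RULES, "/any/page.html"))
print("block all:", can_crawl(BLOCK_ALL, "/any/page.html"))
```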

  7. #7
    xhan (Design Photo & Graphics Admin)
    Join Date: Jan 2008
    Location: London/Kent
    Posts: 349

    Can you use robots.txt to stop Google caching certain phrases? I don't want my website to be found by googling my email address!

  8. #8
    surreypcsupport (Senior Member)
    Join Date: Nov 2008
    Location: surrey
    Posts: 569

    Quote: Originally posted by xhan
    "Can you use robots.txt to stop Google caching certain phrases? I don't want my website to be found by googling my email address!"

    No. robots.txt rules match URL paths, not phrases, so you can only stop the crawlers from fetching whole pages, which generally keeps their content out of the index.

  9. #9
    azseoguy (Junior Member)
    Join Date: May 2009
    Posts: 12

    Quote: Originally posted by xhan
    "Can you use robots.txt to stop Google caching certain phrases? I don't want my website to be found by googling my email address!"

    In the meta tags for that page you could use noarchive. That tells the search engine not to keep a cached copy of your web page. I don't think it can be targeted to a single phrase, though.

    HTML Code:
    <META content="NOARCHIVE" name="ROBOTS">
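
    To confirm the tag is actually present on a page, something like the following could be used. It is a minimal sketch built on Python's standard html.parser; the class name RobotsMetaParser, the helper robots_directives, and the sample page are invented for this example, and it only reads the tag, it does not tell you what any search engine will do with it.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not values.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for part in attr_map.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html):
    """Return the set of robots meta directives found in `html`."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


page = '<html><head><META content="NOARCHIVE" name="ROBOTS"></head><body>Hi</body></html>'
print(robots_directives(page))
```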

  10. #10
    xhan (Design Photo & Graphics Admin)
    Join Date: Jan 2008
    Location: London/Kent
    Posts: 349

    @surreypcsupport
    @azseoguy

    Cheers guys, I've set it to noarchive. I hope it helps!! :S
