What is Robots.txt?


Robots.txt is a file that webmasters recommend placing on your website hosting if there's something you would prefer not to have crawled by search engines. If you visit a few of your favorite websites and add /robots.txt to the end of the site's address, you will likely be presented with their robots.txt file. It does not contain a lot of information, but it is very important if you want to keep search engines such as Yahoo, Google and Bing away from parts of your site. Keep in mind that it is a request rather than a lock: well-behaved spiders follow it, but it does not password-protect anything.

Example: www.yourfavouritewebsite.com/robots.txt
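
As a rough sketch of the same idea in code, the snippet below works out the robots.txt address for any page on a site. The function name robots_url and the example.com address are made up for illustration, and it assumes the address you pass in starts with http:// or https://.

```python
from urllib.parse import urljoin


def robots_url(page_url):
    """Return the robots.txt address for the site that page_url belongs to.

    Assumes page_url includes a scheme such as https://.
    """
    # The leading slash makes urljoin replace the whole path, so the result
    # always points at the site root no matter how deep the page is.
    return urljoin(page_url, "/robots.txt")


print(robots_url("https://www.example.com/blog/some-post.html"))
# prints: https://www.example.com/robots.txt
```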

Even if you want everything on your website to appear in search engines, it is still good practice to have a robots.txt file, as it lets search engines know that they can crawl the whole website with no restrictions. To do this, create a file named robots.txt, place it in your main website directory (the root), and put the following inside the file.

Access to everything
User-agent: *
Allow: /

Alternatively, if you wanted to stop search engines from having access to everything and effectively turn them away at your front door, you could put the following inside the file.

Restrict access to everything
User-agent: *
Disallow: /
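
If you'd like to check how a spider would read the two files above before uploading them, one option is Python's built-in urllib.robotparser module. The sketch below is only an illustration: the helper name is_allowed and the page /any-page.html are invented, and real search engines may interpret edge cases differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser

# The two robots.txt files shown above, as plain strings.
ALLOW_ALL = """\
User-agent: *
Allow: /
"""

BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_text, spider, path):
    """Return True if the given spider may crawl path under robots_text."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made.
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(spider, path)


print(is_allowed(ALLOW_ALL, "googlebot", "/any-page.html"))  # True
print(is_allowed(BLOCK_ALL, "googlebot", "/any-page.html"))  # False
```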

You can also use robots.txt to give access to some things and not to others. Don't worry though, this is just as simple as the two options above.

Restrict access to some things
User-agent: *
Disallow: /your-folder-name/
Disallow: /your-folder-name2/
Disallow: /your-folder-name3/

The above informs all search engine spiders that they're allowed access to your website but should not crawl the three folders specified. Obviously, you should change these folder names to the ones you want to restrict.
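
To see which pages a set of rules like this would actually block, here is a small sketch using Python's built-in urllib.robotparser module. The folder names (/private/, /drafts/, /tmp/), the page names and the helper blocked_paths are all hypothetical stand-ins for your own.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical folder names standing in for the three placeholders.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/
Disallow: /tmp/
"""


def blocked_paths(robots_text, paths, spider="*"):
    """Return the paths from the list that the spider is asked not to crawl."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return [path for path in paths if not parser.can_fetch(spider, path)]


pages = ["/index.html", "/private/notes.html", "/drafts/", "/privateer.html"]
print(blocked_paths(ROBOTS_TXT, pages))
# prints: ['/private/notes.html', '/drafts/']
```

Notice that /privateer.html is still allowed. Each rule is matched against the start of the path, so the trailing slash in /private/ limits the rule to that folder.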

If there's just one search engine spider that you want to keep out of a certain place, or out of your website completely, you can replace User-agent: * with something like User-agent: search engine spider name.
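
As a sketch of how a spider-specific rule behaves, the example below keeps googlebot out of a made-up /drafts/ folder while leaving every other spider unrestricted, then checks the result with Python's built-in urllib.robotparser module. The folder, the page name and the helper spiders_allowed are assumptions for illustration only.

```python
from urllib.robotparser import RobotFileParser

# One group of rules for googlebot, and a catch-all group for everyone else.
ROBOTS_TXT = """\
User-agent: googlebot
Disallow: /drafts/

User-agent: *
Allow: /
"""


def spiders_allowed(robots_text, path, spiders):
    """Map each spider name to whether it may crawl the given path."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return {spider: parser.can_fetch(spider, path) for spider in spiders}


print(spiders_allowed(ROBOTS_TXT, "/drafts/post.html", ["googlebot", "msnbot", "Slurp"]))
# prints: {'googlebot': False, 'msnbot': True, 'Slurp': True}
```

A spider with its own group follows only that group, so googlebot is still free to crawl anything outside /drafts/.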

Below is a list of the most common search engine spiders, for webmasters.

Google – googlebot
MSN – msnbot
Yahoo – Slurp
Ask Jeeves – Teoma
Altavista – Scooter (now uses Yahoo spiders instead of its own).

Blog Posted: 2014-08-23 15:40:47
Keywords: robots.txt, information,
Blog Post Author: Sycrid




