Hello, what should I put in a robots.txt file?
For example:
User-agent: *
Disallow: /administration
Allow: /
Or rather like this:
User-agent: *
Disallow: /modules
Disallow: /flash
Disallow: /cache
Disallow: /inc
Disallow: /administration
Disallow: /tmp
Disallow: /xmlrpc
Allow: /
Thank you for your advice
Free Dating Site on: http://coolonweb.com |
You can just do this.
User-agent: *
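Disallow: /administration/
(add more Disallow lines as needed)
Allow is the default and does not need to be specified.
Directories should end with a /. Without it, the rule also matches any other path that starts with the same characters.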
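The two points above (allow is the default, and the trailing slash matters) can be checked with a minimal sketch using Python's standard urllib.robotparser. The `build_parser` helper and the /administration.html path are made up for illustration, and other crawlers may interpret rules slightly differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in a string (nothing is fetched)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


with_slash = build_parser("User-agent: *\nDisallow: /administration/\n")
without_slash = build_parser("User-agent: *\nDisallow: /administration\n")

# Paths that match no Disallow rule are allowed by default.
print(with_slash.can_fetch("*", "/index.php"))                 # True
print(with_slash.can_fetch("*", "/administration/index.php"))  # False

# Rules are prefix matches, so the trailing slash changes what is blocked.
print(with_slash.can_fetch("*", "/administration.html"))       # True
print(without_slash.can_fetch("*", "/administration.html"))    # False
```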
https://www.deanbassett.com |
Is this good for a webmaster to have, or what?
Post Reply - if you going to help - No for - bla bla bla bla |
A robots.txt ends up being a matter of preference: it is there so that your site isn't indexed by rogue bots and placed on the get-viagra directory listing.
It's more for controlling which bots index your site, and for limiting bandwidth consumption.
When a GIG is not enough --> Terabyte Dolphin Technical Support - Server Management and Support |
But my question is: is it good to have a robots.txt with these Disallow rules?
Post Reply - if you going to help - No for - bla bla bla bla |
As DosDawg mentioned, it's a matter of preference.
It depends on what you want the crawlers indexing.
Some webmasters, to protect their members' privacy, disallow the indexing of any folders that contain photos.
But again, it's up to you what you want the crawlers going through and adding to the search engines. There is no good or bad. It's what you prefer.
https://www.deanbassett.com |
Yes, it is very good for SEO, because D7 produces a lot of duplicate content for search engines.
Examples:
Template-switching URLs: ?skin=uni, and the same for every other template
The same for languages: ?lang=en
And many, many more...
If you work with Google Webmaster Tools you can see the many variables that are found on your site and could produce duplicate content.
ue30 Mods - http://www.boonex.com/market/posts/ue30 |
I would say it wouldn't hurt to disallow those specific directories, but you can go further and set which bots you wish to allow to crawl your site. I think it's a matter of preference. Of course you don't want bots crawling your error_logs, or for that matter your cache files, since the cache files do contain usernames and encrypted passwords. But also consider that /cache has an .htaccess which would stop a robot from crawling it, and your /inc directory should be set to 555, which would also disallow crawling.
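Here is a rough sketch of the per-bot idea, again checked with Python's standard urllib.robotparser. ExampleBot and OtherBot are made-up crawler names, and the rules are only an illustration of the syntax, not a recommended file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: shut out one named bot entirely, and keep every
# other bot out of /cache/ and /inc/ only.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /cache/
Disallow: /inc/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The named bot is blocked everywhere.
print(parser.can_fetch("ExampleBot", "/index.php"))     # False
# Any other bot falls back to the "*" group.
print(parser.can_fetch("OtherBot", "/index.php"))       # True
print(parser.can_fetch("OtherBot", "/cache/users.db"))  # False
```

This only tells well-behaved bots what you would like them to do. It does not enforce anything on the server side.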
When a GIG is not enough --> Terabyte Dolphin Technical Support - Server Management and Support |
DosDawg, Deano and ue30:
if you were to create one for your own site,
what would you put in the robots.txt file, so I can use it as a recommendation?
Post Reply - if you going to help - No for - bla bla bla bla |
This is not so easy...
You can take a look at mine, http://www.ue30-party-portal.de/robots.txt, but you shouldn't use it as-is:
what you want to block is different for each site.
So register for Google Webmaster Tools, and after a while you can see the variables that are found on your site.
What you should block in any case, because leaving them unblocked is a big mistake, is:
Disallow: /*lang=
Disallow: /*skin=
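To see which URLs the wildcard rules above would catch, here is a small Python sketch of wildcard matching as the major search engines describe it (* matches any run of characters, a trailing $ anchors the end of the URL). The `rule_matches` helper and the sample URLs are hypothetical. Note that Python's own urllib.robotparser does plain prefix matching and does not understand *, and not every crawler does either.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches the given URL path."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal parts and let each "*" match any run of characters.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, which gives the usual prefix behaviour.
    return re.match(regex, path) is not None


print(rule_matches("/*lang=", "/index.php?lang=en"))     # True
print(rule_matches("/*skin=", "/profile.php?skin=uni"))  # True
print(rule_matches("/*lang=", "/index.php"))             # False
```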
ue30 Mods - http://www.boonex.com/market/posts/ue30 |
A quick net search on the subject indicates only good bots pay attention to robots.txt and bad bots will either overlook it or in severe cases actually use the robots.txt file against you to scan the folders you list:
http://www.kloth.net/internet/badbots.php
This page links to another that describes building a bot trap:
http://www.kloth.net/internet/bottrap.php
Any thoughts on this? It seems like you're damned if you do and damned if you don't.
Greater community info on ways to take protective measures would be helpful. Ending up on spam lists has really hurt me.
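For what it's worth, the general bot-trap idea can be sketched roughly like this: list a decoy folder under Disallow that nothing on the site links to, then look in the access log for clients that requested it anyway. This is only a minimal Python sketch under my own assumptions, not the method from the kloth.net pages: the /bot-trap/ path, the `trap_hits` helper and the sample log lines are all hypothetical, and the log is assumed to be in the common Apache format.

```python
import re

# Start of a common-log-format line:
# client address, two ignored fields, [timestamp], "METHOD path ...
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "\S+ (\S+)')


def trap_hits(log_lines, trap_prefix="/bot-trap/"):
    """Return the set of client addresses that requested the decoy folder."""
    hits = set()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and match.group(2).startswith(trap_prefix):
            hits.add(match.group(1))
    return hits


sample_log = [
    '192.0.2.10 - - [10/Oct/2010:13:55:36 +0000] "GET /index.php HTTP/1.1" 200 2326',
    '198.51.100.7 - - [10/Oct/2010:13:56:01 +0000] "GET /bot-trap/ HTTP/1.1" 200 512',
    '198.51.100.7 - - [10/Oct/2010:13:56:02 +0000] "GET /bot-trap/page2 HTTP/1.1" 200 512',
]
print(sorted(trap_hits(sample_log)))  # ['198.51.100.7']
```

What to do with the addresses it finds (block them, rate-limit them, just watch them) is a separate decision.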
Someday, Someway. |