
Sitemap, robots.txt or both



<url>         <-------------------- Does this do the home page only or the whole site?



That would end up covering the entire site, because you're not specifying a specific page. Even with specific pages listed, the crawler will still follow any links it finds on any pages in the sitemap.
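To make the structure concrete, here's a rough Python sketch that builds a sitemap with one <url> entry per page. The build_sitemap helper and the example.com addresses are made up for illustration, not anything Cheetah generates. Each entry carries exactly one address in its <loc>; the file is only a list of hints and does not stop a crawler from visiting other pages.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML with one <url>/<loc> entry per address.

    Hypothetical helper for illustration only.
    """
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for address in urls:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = address
    return ET.tostring(urlset, encoding="unicode")


# example.com is a placeholder domain (assumption).
print(build_sitemap([
    "https://example.com/",
    "https://example.com/page/features",
]))
```

Listing only a few pages this way tells the crawler where to start. It doesn't restrict where the crawler goes from there.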

The point of a sitemap is to help the crawlers find all of the pages in your site. It's meant to help get your entire site listed, which is the opposite of what you want. So disable Cheetah's sitemap. You're not going to want to use it.

The best option in your case is a robots.txt file configured to disallow everything except specific pages.

User-Agent: *
Disallow: /
Allow: /index.php
Allow: /page/about-xxx
Allow: /page/features
Allow: /page/joinus
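If you want to sanity-check rules like these before uploading them, here's a rough Python sketch of the longest-match precedence described in RFC 9309 and in Google's robots.txt documentation: the most specific matching rule wins, and Allow wins a tie. Not every bot is guaranteed to behave this way. RULES and is_allowed are made-up names for illustration, and the sketch skips details such as percent-encoding. As far as I know, Python's built-in urllib.robotparser applies rules in file order instead, which is why it isn't used here.

```python
import re

# The rules from the robots.txt above, as (directive, path pattern) pairs.
RULES = [
    ("disallow", "/"),
    ("allow", "/index.php"),
    ("allow", "/page/about-xxx"),
    ("allow", "/page/features"),
    ("allow", "/page/joinus"),
]


def _to_regex(pattern):
    """Convert a robots.txt pattern ('*' wildcard, trailing '$' anchor) to a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under longest-match precedence."""
    best_len, allowed = -1, True  # no matching rule means the path is allowed
    for directive, pattern in rules:
        if not pattern:
            continue  # simplification: ignore empty patterns
        if _to_regex(pattern).match(path):
            length = len(pattern)
            # A longer pattern is more specific; on equal length, allow wins.
            if length > best_len or (length == best_len and directive == "allow"):
                best_len, allowed = length, directive == "allow"
    return allowed


for path in ["/index.php", "/page/features", "/m/photos/home", "/"]:
    print(path, "->", "allowed" if is_allowed(path) else "blocked")
```

One thing this turns up: the bare root URL / stays blocked, because Allow: /index.php doesn't match it. If the home page is served at /, an extra line such as Allow: /$ may be needed. The $ end anchor is part of RFC 9309, but I'd assume not every bot honours it.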

I have a site with a lot of modules like photos, videos, stories, classifieds etc., but everything is private. Despite this, I want the site to be listed in the search engines.

I've created three (3) information pages and I want these and the home page to be the only pages listed.

Cheetah's Sitemap doesn't seem to offer that control, or does it?

I read a post by Deano on another forum about robots.txt, and it would seem a lot easier if I could just list the four pages and tell the search engine to ignore everything else.

the pages are:

/pages/joinus (home page)

I don't want the crawlers or bots going anywhere else if possible. With this in mind, is the home page meta description included on the home page, or do I need to direct the bots towards that as well?

What's the easiest way to disallow everything else?

As I'm not sure I understand the Cheetah sitemap correctly, I generated one on a free site and removed what I think I don't want:

<?xml version="1.0" encoding="UTF-8"?>


<!-- created with Free Online Sitemap Generator -->

<url>         <-------------------- Does this do the home page only or the whole site?


