I have a site with a lot of modules like photos, videos, stories, classifieds etc., but everything is private. Despite this I want the site to be listed in the search engines.
I've created three (3) information pages and I want these and the home page to be the only pages listed.
Cheetah's Sitemap doesn't seem to offer that control, or does it?
I read a post by Deano on another forum about robots.txt, and it seems it would be a lot easier if I could just list the four pages and tell the search engines to ignore everything else.
The pages are:
/page/about-xxx
/page/features
/page/joinus
mysite.com (home page)
I don't want the crawlers or bots going anywhere else if possible. With this in mind, is the home page meta description picked up from the home page itself, or do I need to direct the bots towards that as well?
What's the easiest way to disallow everything else?
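A rough sketch of what such a robots.txt might look like, assuming the crawlers honour `Allow` and the `$` end-of-path anchor (Google and Bing do, but not every bot does) and that the join page lives at /page/joinus as in the sitemap further down. The sitemap file name on the last line is only a placeholder:

```
# Applies to every crawler that respects robots.txt
User-agent: *
# "$" anchors the match, so this allows the home page only
Allow: /$
Allow: /page/about-xxx
Allow: /page/features
Allow: /page/joinus
# Everything not matched by a longer Allow rule above is blocked
Disallow: /

# Assumed location of the sitemap file
Sitemap: https://xxx.com/sitemap.xml
```

Note that the Allow lines are prefix matches, so anything starting with /page/features would also be allowed, and that robots.txt only controls crawling: a blocked URL can still show up in results if other sites link to it.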
As I'm not sure I understand the Cheetah sitemap correctly, I generated one with a free online tool and removed what I think I don't want:
<?xml version="1.0" encoding="UTF-8"?>
<urlset
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
<!-- created with Free Online Sitemap Generator www.xml-sitemaps.com -->
<url> <!-- Does this do the home page only or the whole site? -->
<loc>https://xxx.com/</loc>
<lastmod>2023-03-15T06:05:27+00:00</lastmod>
<priority>1.00</priority>
</url>
<url>
<loc>https://xxx.com/page/about-xxx</loc>
<lastmod>2023-03-15T06:05:27+00:00</lastmod>
<priority>0.80</priority>
</url>
<url>
<loc>https://xxx.com/page/features</loc>
<lastmod>2023-03-15T06:05:27+00:00</lastmod>
<priority>0.80</priority>
</url>
<url>
<loc>https://xxx.com/page/joinus</loc>
<lastmod>2023-03-15T06:05:27+00:00</lastmod>
<priority>0.80</priority>
</url>
</urlset>