Fixing the ROBOTS Paradox

Since I couldn't find a recently active discussion about this issue, I've decided to start my own thread on it. This is officially my third post here, so I'm not sure this is the right forum; please correct me if it isn't.

That said, I ran across an interesting scenario while developing my site: how to prevent robots and search engines from indexing certain pages (such as the Terms & Conditions).

The key lies in the little-used but often more useful robots meta tag.

As defined at http://www.robotstxt.org/meta.html, this tag lets us tell most legitimate search engines not to index our pages.

Leveraging the robots meta tag, along with the built-in Header Extra Content custom attribute in C5, we can keep certain pages of our site off-limits without blocking the whole site in robots.txt or trying to catalog every combination of alias and/or cID that might lead to the pages we want to protect.
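For contrast, the blunt alternative of blocking robots site-wide via robots.txt looks like this:

```
User-agent: *
Disallow: /
```

That keeps crawlers out of everything, including the pages you do want indexed; the meta tag approach lets us be selective page by page.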

To do this follow these steps:

1. While editing the page in question, click the Properties button in the toolbar at the top of the screen.

2. Select the Custom Attributes tab from the Page Properties window.

3. Choose "Header Extra Content" from the Custom Attributes drop down menu.

4. In the Header Extra Content text area type in an appropriate robots meta tag such as:
<meta name="ROBOTS" content="NOINDEX, NOFOLLOW" />


5. Click save and go about your merry way.
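If everything worked, the rendered source of that page should now contain the tag inside the document head, roughly like this (the surrounding markup here is illustrative):

```html
<head>
  <title>Terms &amp; Conditions</title>
  <!-- injected via the Header Extra Content attribute -->
  <meta name="ROBOTS" content="NOINDEX, NOFOLLOW" />
</head>
```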

Hope this helps someone who finds themselves in the same boat I was in.

 
webhostaz replied:
Another option is to add a bit of code to your theme's PHP header file, like so:

<?php
  // Output a robots meta tag only if this page's meta_robots attribute is set.
  $robots = $c->getCollectionAttributeValue('meta_robots');
  if ($robots) {
    echo '<meta name="ROBOTS" content="' . $robots . '" />';
  }
?>


Then create a custom attribute with the handle meta_robots and follow the procedure above to add that "Meta Robots" attribute to the pages you want to limit.

This way, instead of writing out the entire meta tag each time, you simply add the directives you want to apply, such as:

NOFOLLOW, INDEX
NOINDEX, FOLLOW
etc...
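To make the conditional behavior of the PHP snippet above concrete, here is the same logic as a small Python sketch (the function name is mine, purely illustrative; the real site code is the PHP header snippet):

```python
def robots_meta_tag(attribute_value):
    """Build a ROBOTS meta tag from a page's meta_robots attribute value.

    Mirrors the PHP header snippet: if the attribute is unset or empty,
    emit nothing and let crawlers fall back to their default behavior.
    """
    if not attribute_value:
        return ""
    return '<meta name="ROBOTS" content="%s" />' % attribute_value

# A page tagged "NOINDEX, FOLLOW" gets the full tag; untagged pages get nothing.
print(robots_meta_tag("NOINDEX, FOLLOW"))
```

The point of the guard is that pages without the attribute stay untouched, so search engines index them normally.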