Published: 21/04/2010

Updated: 21/06/2023

How to Noindex parts of a web page?

There are many ways to tell Googlebot not to index pages:

Using the Allow directive in robots.txt
User-Agent: gsa-crawler
Disallow: /folder1/
Allow: /folder1/myfile.html

Using Robots META Tags to Control Access to a Web Page
Using .htaccess
Using a Disallow rule in robots.txt
User-agent: *
Disallow: /tralllala
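To make the interplay of Disallow and Allow in the first example easier to follow, here is a minimal Python sketch. It assumes the crawler resolves conflicts by picking the longest matching path prefix, with Allow winning a tie (the precedence described for Googlebot and in RFC 9309). The function name `is_allowed` and the rule list format are hypothetical, and wildcards are not handled.

```python
def is_allowed(rules, path):
    """Decide whether `path` may be crawled.

    `rules` is a list of (directive, prefix) pairs for one user agent,
    e.g. ("disallow", "/folder1/"). Assumption: the longest matching
    prefix wins, and "allow" wins when two matches have equal length.
    """
    best = None  # (length of matching prefix, allowed?)
    for directive, prefix in rules:
        # An empty prefix matches nothing in this simplified sketch.
        if prefix and path.startswith(prefix):
            candidate = (len(prefix), directive.lower() == "allow")
            # Tuple comparison: longer prefix first, then allow over disallow.
            if best is None or candidate > best:
                best = candidate
    # No rule matched: crawling is allowed by default.
    return True if best is None else best[1]


# The gsa-crawler rules from the example above.
gsa_rules = [
    ("disallow", "/folder1/"),
    ("allow", "/folder1/myfile.html"),
]

print(is_allowed(gsa_rules, "/folder1/myfile.html"))  # the one exception
print(is_allowed(gsa_rules, "/folder1/other.html"))   # rest of the folder
print(is_allowed(gsa_rules, "/index.html"))           # outside the folder
```

Note that Python's built-in `urllib.robotparser` applies the first matching rule instead of the longest one, so it can give a different answer for the same rules.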

But how does it work if you want to keep only part of a page from being crawled?

Excluding unwanted text from the Google index is still not as easy as it should be.

Here is a video answer from Matt Cutts to this question:

Google still doesn’t offer any solution that lets us exclude parts of a page from being indexed, and using noindexed iframes is not a real solution either.

So how do you noindex parts of a web page?

It would be nice if Google offered the same solution it provides for the Google Search Appliance, where you can exclude parts of a page from being indexed.

The Google Search Appliance supports “googleon” and “googleoff” tags, special proprietary HTML tags that can be embedded in the HTML of crawled documents to prevent searching of text between these special tags.

The googleoff/googleon tags disable the indexing of a part of a web page. The result is that those pages do not appear in search results when users search for the tagged word or phrase. For example, some customers use googleoff/googleon tags to comment out a navigation bar in static HTML pages.

You can use googleon/off to tell the Google Search Appliance to ignore portions of a page. Insert the googleoff tag at the point where you want the Google Search Appliance to stop indexing, then insert the googleon tag where you want it to resume indexing the page.
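As a rough illustration of the effect, the following Python sketch strips everything between a googleoff and a googleon tag from a sample page, leaving only the text an appliance would index. The comment syntax `<!--googleoff: index-->` / `<!--googleon: index-->` is how I recall it from the Google Search Appliance documentation, so treat it as an assumption. The function name `strip_unindexed` and the sample page are made up, and this only simulates the behaviour; it is not how the appliance is implemented.

```python
import re

# Matches from a googleoff comment up to the next googleon comment,
# or to the end of the document if indexing is never switched back on.
# Assumption: the tags are HTML comments with the "index" option.
GOOGLEOFF_REGION = re.compile(
    r"<!--\s*googleoff:\s*index\s*-->.*?(?:<!--\s*googleon:\s*index\s*-->|\Z)",
    re.DOTALL | re.IGNORECASE,
)


def strip_unindexed(html: str) -> str:
    """Return the HTML with all googleoff/googleon regions removed."""
    return GOOGLEOFF_REGION.sub("", html)


# Hypothetical page: the navigation bar is wrapped in the tags.
page = """<html><body>
<!--googleoff: index-->
<ul><li>Home</li><li>Contact</li></ul>
<!--googleon: index-->
<p>Main article text.</p>
</body></html>"""

indexable = strip_unindexed(page)
print(indexable)
```

In this sample the navigation list disappears from the indexable text, while the paragraph after the googleon tag stays.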

GoogleOn and GoogleOff tags (which you can see live on adobe.com) are ignored by the regular Google spiders and by other search engines. They make sense only when used in conjunction with the Google Search Appliance or possibly the Google Mini.

http://code.google.com/apis/searchappliance/documentation/46/admin_crawl/Preparing.html

I think Google should offer such GoogleOn and GoogleOff tags not just for the Google Search Appliance; these tags should work for the normal Googlebot too. This would help web pages get indexed more accurately.

Author: Ortwin Oberhauser
INITIATOR OF SEOLOGY & WORLD’S FIRST SEOLOGE

BSc Applied Computer Science
SEM / SEO & Conversion Optimization Geek


Founder of Oberhauser.com
Co Founder of bobdo.com International Film & Digital Solution Agency