Use of robots.txt, noindex, nofollow, canonical URL & 301/302 Redirects

SEO
January 4, 2015
Cheok Lup

Many people assume that page crawling and indexing are a single intertwined function in Google Search, but they are in fact separate mechanisms. A web page can be crawled without being indexed by Google. More broadly, we have seen a lot of ambiguity and confusion around the various SEO implementations among website owners, managers, and even search specialists and SEO practitioners.

Although Google has stated that it does not transfer PageRank or anchor text across nofollowed (rel="nofollow") links, many people mistakenly conclude that Google will not crawl these hyperlinks. In fact, Google may still crawl them, and it therefore recommends using robots.txt to block Googlebot from crawling the targets of the affected nofollowed links.
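To make the robots.txt point concrete, here is a minimal sketch that checks whether a given crawler would be allowed to fetch a URL under a set of robots.txt rules. It uses Python's standard `urllib.robotparser`, which implements basic robots.txt matching and is not guaranteed to mirror Googlebot's exact behavior. The `/go/` path, the `example.com` URLs, and the `can_crawl` helper are assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules: block Googlebot from everything under /go/,
# e.g. a folder holding the targets of nofollowed links.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /go/
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/go/partner-offer",
        "https://example.com/blog/seo-basics",
    ):
        print(url, can_crawl(ROBOTS_TXT, "Googlebot", url))
```

Under these assumed rules, the `/go/` URL is reported as not crawlable while the blog URL is, which is the effect that rel="nofollow" on its own does not provide.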

Google recommends using HTTP 301 redirects to transfer PageRank from an old page to a new page. Googlebot recognizes and follows this server-level redirect to identify and crawl the new page URL. A page-level redirect, i.e. a meta refresh, is not an effective way to transfer most of the PageRank from the old page to the new page. It may also disrupt web analytics tracking on the page, so that the traffic source for the new page is erroneously attributed to direct traffic.
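As a rough illustration of a server-level redirect, the sketch below is a tiny WSGI application that answers requests for an old path with an HTTP 301 status and a Location header. The `REDIRECTS` mapping, the `example.com` URLs, and the `call` helper are hypothetical. A real site would more likely configure redirects in its web server or framework.

```python
# Hypothetical mapping of old paths to their new URLs.
REDIRECTS = {"/old-page": "https://example.com/new-page"}


def app(environ, start_response):
    """Minimal WSGI app: 301 for known old paths, 404 otherwise."""
    path = environ.get("PATH_INFO", "/")
    target = REDIRECTS.get(path)
    if target is not None:
        # The status code and Location header are sent by the server,
        # before any page content, unlike a meta refresh inside the HTML.
        start_response(
            "301 Moved Permanently",
            [("Location", target), ("Content-Length", "0")],
        )
        return [b""]
    body = b"Not Found"
    start_response(
        "404 Not Found",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]


def call(path):
    """Invoke the app directly (no network) and return status, headers, body."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app({"PATH_INFO": path}, start_response))
    return captured["status"], captured["headers"], body


if __name__ == "__main__":
    print(call("/old-page"))
```

To try it over HTTP, the same `app` can be served locally with the standard library's `wsgiref.simple_server.make_server`.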

The table below compares the results of different SEO implementations: blocking Googlebot with robots.txt, noindex, nofollow, canonical URL, and 301/302 redirects.

Action	Can PageRank be passed from other pages to Page A?	Can visitors view Page A?	Can Googlebot crawl Page A?	Can Google index Page A?	Can Page A accumulate PageRank?	Can Page A pass PageRank to other pages?
Block Page A with robots.txt	No	Yes	No	Depends; Google may have already indexed the page before it was blocked with robots.txt	No	No; hyperlinks on Page A are NOT detected or crawled, because Googlebot cannot crawl Page A in the first place
Use rel="nofollow" on hyperlinks to Page A	No	Yes	Yes	Yes	Yes, assuming that there are other "followed" hyperlinks to Page A	Yes
Use the noindex robots meta tag (<meta name="robots" content="noindex" />) on Page A	Yes	Yes	Yes	No	Yes	Yes
Use the nofollow robots meta tag (<meta name="robots" content="nofollow" />) on Page A	Yes	Yes	Yes	Yes	Yes	No; however, Googlebot can still crawl through the hyperlinks on Page A
Use the canonical URL of Page B (<link rel="canonical" href="Page B's URL" />) on Page A	No; PageRank is passed to Page B	Yes	Yes	Yes; Google may choose to serve Page B instead of Page A on the Google SERP	No; PageRank is passed to Page B	Since PageRank on Page A is passed to Page B, there is little or no PageRank left on Page A
Implement an HTTP 301 redirect from Page A to Page B	No; PageRank is passed to Page B	No; the user lands on and views Page B	No; Googlebot will crawl Page B	No	No	No; Page B passes PageRank to other pages
Implement an HTTP 302 redirect from Page A to Page B	No; PageRank is NOT passed to either Page A or Page B	No; the user lands on and views Page B	No; Googlebot will crawl Page B	No	No	No
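
When auditing a page against the table above, it can help to list which of these page-level signals are actually present in its HTML. The following is a minimal sketch using Python's standard `html.parser`. The `DirectiveParser` class and the `PAGE_A` markup are hypothetical, and the sample combines several directives on one page only to exercise the parser, not as a recommended setup.

```python
from html.parser import HTMLParser


class DirectiveParser(HTMLParser):
    """Collect robots meta directives, the canonical URL, and nofollowed links."""

    def __init__(self):
        super().__init__()
        self.robots = set()        # e.g. {"noindex", "nofollow"}
        self.canonical = None      # href of <link rel="canonical">, if any
        self.nofollow_links = []   # hrefs of <a rel="nofollow"> links

    def handle_starttag(self, tag, attrs):
        a = {name: (value or "") for name, value in attrs}
        rel_tokens = a.get("rel", "").lower().split()
        if tag == "meta" and a.get("name", "").lower() == "robots":
            for token in a.get("content", "").split(","):
                if token.strip():
                    self.robots.add(token.strip().lower())
        elif tag == "link" and "canonical" in rel_tokens:
            self.canonical = a.get("href")
        elif tag == "a" and "nofollow" in rel_tokens:
            self.nofollow_links.append(a.get("href"))


# Hypothetical markup for "Page A".
PAGE_A = """
<html><head>
<meta name="robots" content="noindex, nofollow" />
<link rel="canonical" href="https://example.com/page-b" />
</head><body>
<a href="/page-c" rel="nofollow">Page C</a>
<a href="/page-d">Page D</a>
</body></html>
"""

parser = DirectiveParser()
parser.feed(PAGE_A)
print(sorted(parser.robots), parser.canonical, parser.nofollow_links)
```

The parser only reports what the markup declares. How Google acts on each signal is what the table describes.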

References:
Learn about robots.txt files
Block search indexing with meta tags
Use rel="nofollow" for specific links
Use canonical URLs
Change page URLs with 301 redirects
Redirection
