Getting as many of your website’s pages indexed as possible can be very tempting for marketers trying to increase their search engine authority.
While it’s true that publishing more pages relevant to a given keyword (assuming they are also of high quality) will improve your ranking for that keyword, sometimes it is actually worth more to keep certain pages on your website out of a search engine’s index.
… Say what?!
Stay with us, folks. In this post, you will learn why you might want to remove certain web pages from the SERPs (search engine results pages) and exactly how to go about doing it.
De-indexing a page from Google
There are a few times when you might want to exclude a webpage – or part of a webpage – from search engine crawling and indexing, such as:
- To prevent duplicate content (when more than one version of a page is indexed by search engines, such as a printer-friendly version of your content) from being indexed
- To keep admin and login pages that are meant for internal use out of search results, unless they are intended for use by a community
- For a thank you page (i.e. the page a visitor lands on after converting on one of your landing pages) that gives the visitor access to the offer promised by that landing page, e.g. a link to an e-book PDF
For example, this is what the thank you page for our e-book with SEO tips looks like:
You want everyone who lands on your thank you pages to get there because they’ve already filled out a form on a landing page – not because they found your thank you page in search.
Why not? Because anyone who finds your thank you page in search can access your lead generation offers directly, without having to provide their information through your lead capture form. Any marketer who understands the value of landing pages knows the importance of capturing these visitors as leads before they can access your offers.
Bottom line: If your thank you pages are easy to find with a simple Google search, you might be leaving valuable leads behind.
What’s worse, you may even find that some of your highest-ranking pages for some of your long-tail keywords are your thank you pages – which means you’re inviting hundreds of potential leads to bypass your lead capture forms. That’s a pretty compelling reason why you’d want to remove some of your web pages from the SERPs.
So how do you go about “de-indexing” certain pages from search engines? Here are three ways to do this.
3 ways to de-index a web page from search engines
Robots.txt for de-indexing
Use when: You want more control over what you de-index and you have the technical resources to do so.
One way to remove a page from search engine results is to add a robots.txt file to your website. The advantage of this method is that it gives you more control over what bots crawl on your site. The result? You can proactively keep unwanted content out of search results. Keep in mind that robots.txt blocks crawling rather than indexing as such, so a blocked URL can still show up in results if other pages link to it.
In a robots.txt file, you can specify whether you want to block bots from a single page, an entire directory, or even just a single image or file. There is also an option to prevent your website from being crawled while still allowing Google AdSense ads to be displayed, if you have any.
That being said, this one requires the most technical kung fu of the three options available. To learn how to create a robots.txt file, check out this article which explains exactly how to do it.
HubSpot customers: Here’s how to install a robots.txt file on your website, and here’s how to customize the contents of the robots.txt file.
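To make the idea of blocking a single page, a directory, or a single file more concrete, here is a minimal sketch in Python. It uses the standard library's urllib.robotparser to read a set of rules and report which paths a well-behaved bot would be allowed to crawl. The rules and paths are made up for illustration and are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules: one page, one directory, one image.
ROBOTS_TXT = """\
User-agent: *
Disallow: /thank-you/seo-ebook.html
Disallow: /admin/
Disallow: /images/private-chart.png
"""


def is_crawlable(path: str, user_agent: str = "*") -> bool:
    """Return True if the sample rules allow user_agent to crawl path."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in ("/blog/seo-tips",
                 "/thank-you/seo-ebook.html",
                 "/admin/login",
                 "/images/private-chart.png",
                 "/images/logo.png"):
        print(path, "crawlable" if is_crawlable(path) else "blocked")
```

A Disallow rule matches by path prefix, which is why a single line covering /admin/ is enough to cover everything beneath that directory in this sketch.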
If you don’t need total control of a robots.txt file and are looking for a simpler, less technical solution, the next two options are for you.
.htaccess noindex, nofollow for de-indexing
Use when: Your website is running on Apache and mod_headers is enabled. This is a quick fix.
In that case, you could append this single line to your .htaccess file:
Header set X-Robots-Tag "noindex, nofollow"
This signals that your website can be crawled but should never show up in Google search results.
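Once a header like this is in place, it can help to confirm what a crawler would actually read from it. The following is a rough Python sketch, not an official tool: it takes a dictionary of response headers (the sample values are hypothetical) and checks whether the X-Robots-Tag value includes noindex. It deliberately ignores the optional form where a directive is prefixed with a specific bot name.

```python
def parse_x_robots_tag(header_value: str) -> set[str]:
    """Split a value such as "noindex, nofollow" into a set of directives."""
    return {part.strip().lower()
            for part in header_value.split(",")
            if part.strip()}


def blocks_indexing(headers: dict[str, str]) -> bool:
    """Return True if the response headers carry a noindex directive."""
    # HTTP header names are case-insensitive, so normalise them first.
    normalised = {name.lower(): value for name, value in headers.items()}
    directives = parse_x_robots_tag(normalised.get("x-robots-tag", ""))
    return "noindex" in directives


if __name__ == "__main__":
    # Hypothetical headers, as a server with the line above might send them.
    sample = {"Content-Type": "text/html",
              "X-Robots-Tag": "noindex, nofollow"}
    print(blocks_indexing(sample))
```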
Meta noindex, nofollow for de-indexing
Use when: You want a simpler solution to de-indexing an entire webpage and / or de-indexing the links on an entire webpage.
Using a meta tag to keep a page out of the SERPs, and / or to keep search engines from following the links on a page, is both simple and effective. It only takes a tiny bit of technical know-how. In fact, it’s just a copy and paste job if you use the right content management system.
The tags that allow you to do these things are called “noindex” and “nofollow”. Before I get into how to add these tags, let’s take a moment to define and differentiate the two. After all, they are two completely different directives – and they can be used either individually or side by side.
What is a “noindex” tag?
When you add a “noindex” meta tag to a web page, it tells a search engine that while it can crawl the page, it cannot include the page in its search index.
So any page with the “noindex” directive on it will not make it into the search engine’s index and therefore cannot be displayed on the search engine results pages.
What is a “nofollow” tag?
When you add a “nofollow” meta tag to a web page, it prohibits search engines from crawling the links on this page. This also means that any ranking authority the page has in the SERPs will not be passed on to the pages it links to.
On any page with a “nofollow” directive, all links are ignored by Google and other search engines.
As I said earlier, you can add a “noindex” directive either alone or with a “nofollow” directive. You can also add a “nofollow” directive by itself.
When should “noindex” and “nofollow” be used separately?
Add only a “noindex” tag when you do not want a search engine to index your web page in search, but you do want it to follow the links on this page, thus passing ranking authority to the other pages your page links to.
Paid landing pages are a great example of this. You don’t want search engines to index landing pages that people have to pay for, but you may want the linked pages to benefit from their authority.
Add only a “nofollow” tag when you do want a search engine to index your web page in search, but you do not want it to follow the links on this page.
There aren’t many cases where you would add a “nofollow” tag to an entire page without adding a “noindex” tag as well. When you’re figuring out what to do on a particular page, it’s more a question of whether to add your “noindex” tag with or without a “nofollow” tag.
When to use “noindex, nofollow” together
Add both a “noindex” and a “nofollow” tag if you don’t want search engines to index a web page in search and you don’t want them to follow the links on that page.
Thank you pages are a good example of this situation. You don’t want search engines to index your thank you page, nor do you want them to follow the link to your offer and start indexing the contents of that offer.
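The choice between the two directives comes down to two yes-or-no questions, which can be sketched as a small Python helper. The function name and its parameters are invented for this illustration. It only assembles the text of the tag and assumes the standard robots meta tag format.

```python
def robots_meta_tag(index_page: bool, follow_links: bool) -> str:
    """Build the robots meta tag for a page, or "" if none is needed."""
    directives = []
    if not index_page:
        directives.append("noindex")   # keep the page itself out of the index
    if not follow_links:
        directives.append("nofollow")  # do not follow the page's links
    if not directives:
        return ""  # default behaviour: index the page and follow its links
    content = ", ".join(directives)
    return f'<meta name="robots" content="{content}">'


if __name__ == "__main__":
    # Paid landing page: hide the page, but let its links pass authority.
    print(robots_meta_tag(index_page=False, follow_links=True))
    # Thank you page: hide the page and do not follow the link to the offer.
    print(robots_meta_tag(index_page=False, follow_links=False))
```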
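How to add a “noindex” and / or a “nofollow” meta tag
Step 1: Copy one of the following tags.
For “noindex”:
<meta name="robots" content="noindex">
For “nofollow”:
<meta name="robots" content="nofollow">
For both “noindex” and “nofollow”:
<meta name="robots" content="noindex, nofollow">
Step 2: Add the tag to the <head> section of the HTML code of your page, also known as the page header.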
If you are a HubSpot customer, this is easy to do. Scroll down to see the specific instructions for HubSpot users.
If you are not a HubSpot customer, you will have to paste this tag into the code of your web page manually. Don’t worry, it’s very easy. This is how you do it.
First, open the source code of the web page that you want to de-index. Then add the full tag on a new line within the <head> section of your page’s HTML code, known as the page’s header. The screenshots below walk you through it.
The <head> tag marks the beginning of your header:
Here is the meta tag for “noindex” and “nofollow” that was put in the header:
And the </head> tag marks the end of the header:
Boom! That’s it. This tag instructs a search engine to turn around and disappear, leaving the page out of all search results.
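If you want to double-check that the tag really landed inside the header, a short script can do it for you. Below is a minimal Python sketch using the standard library's html.parser. The class name, function name, and sample page are all hypothetical, and the check is deliberately simple: it only looks for a robots meta tag between the opening and closing head tags.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect robots directives from meta tags found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "meta" and self.in_head:
            attributes = dict(attrs)
            if (attributes.get("name") or "").lower() == "robots":
                content = attributes.get("content") or ""
                self.directives.update(
                    part.strip().lower()
                    for part in content.split(",")
                    if part.strip()
                )

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def robots_directives_in_head(html: str) -> set:
    """Return the robots directives declared in the page's header."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives


# Hypothetical thank you page, used only to exercise the check.
SAMPLE_PAGE = """<html>
<head>
<title>Thank you!</title>
<meta name="robots" content="noindex, nofollow">
</head>
<body><a href="/offers/seo-ebook.pdf">Download your e-book</a></body>
</html>"""

if __name__ == "__main__":
    print(sorted(robots_directives_in_head(SAMPLE_PAGE)))
```

An empty result means no robots meta tag was found in the header, which is a hint that the tag was pasted in the wrong place or left out.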
Noindex, nofollow in HubSpot
Adding the noindex and nofollow meta tags is even easier in HubSpot. All you have to do is open the HubSpot tool for the page you want to add these tags to and select the Settings tab.
Next, go to Advanced Options and click on “Head HTML”. Paste the appropriate code snippet in the window below. In the example below, I’ve added both a “noindex” and a “nofollow” tag because it’s a thank you page.
Hit “Save” and you’re all set.
Successfully applying noindex, nofollow to a page
You just magically removed your page from search engine results. Now you can start capturing more of those lost leads again.
Remember, you won’t see results right away. Your changes will not take effect until the next time a search engine crawls your page. Depending on how often you typically publish new pages on your website, this can actually take a few weeks. The more often you publish content, the more often search engines will crawl your website. The best way to keep track of how often Google is visiting your website is to check your crawl stats in Google Search Console (formerly Google Webmaster Tools).
Bottom line: If you find that your page is still showing up in Google’s search results despite the “noindex” tag, it is likely because Google hasn’t crawled your site since you added the tag. You can request that Google crawl your page again using the “Fetch as Google” tool (since replaced by the URL Inspection tool in Google Search Console).
Also, be aware that some search engines’ web crawlers may interpret these instructions differently from Google, so your page may still appear in other search engine results. But it will work fine for Google – once it gets around to crawling your website.
Regardless, you can rest a little easier knowing that you ultimately made your website a better place to do your marketing.
Editor’s note: This article was originally published in July 2016 and has been updated for completeness.