Why isn’t a page in the Google search index?
Why isn’t a page in the Google search index if it is available on a site?
Google search uses a crawling process that may find a page on a site but not put that page into its index.
Site publishers may discover the problem when they search Google for a page on their own site and find that it’s not there. That may seem surprising, but it’s actually quite common.
As with most things Google, experts speculate on why that happens without getting much of an answer from the company. So logic and common sense have to prevail.
The Google Search Console has a report showing various types of “coverage”, meaning how many pages are in the Google search index and how many are not. Unfortunately, the report at this time doesn’t reveal which pages are excluded. Publishers will have to discover which ones on their own.
The report does give some breakdowns of the exclusion categories. They include “Crawled – currently not indexed” and “Discovered – currently not indexed”.
Reasons for the Absence
Possible reasons why a page isn’t in the index include a page that:
- Has a “noindex” tag.
- Duplicates other content on the site.
- Duplicates content on other sites (common).
- Has low value for the Google index.
- Uses content too similar to other sites.
- Is blocked by robots.txt.
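Two items on that list, the “noindex” tag and the robots.txt block, can be checked mechanically. The following is a minimal Python sketch using only the standard library. The function names `has_noindex` and `blocked_by_robots` are made up for illustration, and the sketch assumes the page HTML and the robots.txt text have already been downloaded. It looks only at meta tags, so a “noindex” sent in an `X-Robots-Tag` HTTP header would not be caught.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class _RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            # The content attribute is a comma-separated list, e.g. "noindex, follow".
            self.directives.extend(
                part.strip().lower() for part in attr.get("content", "").split(",")
            )


def has_noindex(html):
    """Return True if the HTML carries a robots meta tag with "noindex"."""
    parser = _RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def blocked_by_robots(robots_txt, url, agent="Googlebot"):
    """Return True if the given robots.txt text disallows the URL for the agent."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return not rules.can_fetch(agent, url)
```

A publisher could run both checks on every page that is missing from the index before spending time on content changes, since either one keeps a page out no matter how good the content is.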
The entire page doesn’t have to duplicate another page. It can simply include a phrase, sentence or paragraph written the same way as on other pages on the site. More likely, though, it is a page with too much similar content that fails to make it into the index.
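One rough way to gauge how much wording two pages share is to compare overlapping runs of words, often called shingles. The Python sketch below is only an illustration: Google does not publish how it measures duplication, and the three-word shingle size and the function names are assumptions chosen for this example.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word runs of the given size."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as a single run.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(text_a, text_b, size=3):
    """Share of shingles the two texts have in common, from 0.0 to 1.0."""
    set_a, set_b = shingles(text_a, size), shingles(text_b, size)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)
```

A score near 1.0 means the two texts are close to identical, while a score near 0.0 means they share almost no phrasing. What score counts as “too similar” is a judgment call for the publisher.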
Likewise, a page may contain content that first appeared elsewhere, either by accident or on purpose. Honestly, taking content from elsewhere and rewriting it is a practice that dates back decades, if not centuries, in print journalism.
In print, it is mainly done for news briefs; professionals who borrow more for longer articles should give credit to the source. Online, a writer or publisher risks a duplicate content penalty by copying content from elsewhere without rewriting it or by giving it only a light rewrite.
Low-value content is often nothing more than a page with a series of links. It may have just a few short paragraphs of content that appears on thousands, if not millions, of pages elsewhere. A common example is directory listings.
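A page that is mostly links can be spotted with a simple count of words against links. The sketch below uses Python’s standard HTML parser. The `words_per_link` name and the idea of using this ratio as a warning sign are assumptions made for illustration, not a measure Google is known to use.

```python
from html.parser import HTMLParser


class _LinkTextCounter(HTMLParser):
    """Counts <a href> links and visible words, ignoring script and style blocks."""

    def __init__(self):
        super().__init__()
        self.links = 0
        self.words = 0
        self._skip_depth = 0  # greater than zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "a" and any(name == "href" for name, _ in attrs):
            self.links += 1
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words += len(data.split())


def words_per_link(html):
    """Return visible words divided by links; infinity if the page has no links."""
    counter = _LinkTextCounter()
    counter.feed(html)
    return counter.words / counter.links if counter.links else float("inf")
```

A very low ratio suggests the page is little more than a list of links, which is the pattern typical of directory listings. Link text itself is counted as words, so a pure link list scores close to the average length of its link labels.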
Ways to Fix the Problem
Identifying a page that is missing is the first step in getting it into the index. For example, a publisher checks the index for a page that seems worthy of a decent ranking, but it isn’t there.
The next step is improving the page by rewriting it to remove possible duplication.
Another possibility is adding content, whether text, photos or graphics, to increase the perceived value of the page.
Does the site have prominent links to the page in question? A review of the link structure may reveal that the page is either hidden from search engines or buried so deeply in the site that search engines may not find it.
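How deeply a page is buried can be estimated by counting the fewest clicks needed to reach it from the home page. The Python sketch below assumes the site’s internal links have already been gathered into a dictionary that maps each page to the pages it links to. The `click_depths` name and the sample paths are hypothetical.

```python
from collections import deque


def click_depths(links, start="/"):
    """Breadth-first search: fewest clicks from the start page to each reachable page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical link map: each key links to the pages in its list.
site_links = {
    "/": ["/about", "/blog"],
    "/blog": ["/blog/2019"],
    "/blog/2019": ["/blog/2019/old-post"],
    "/orphan": ["/"],
}

depths = click_depths(site_links)
```

Here `/blog/2019/old-post` sits three clicks from the home page, and `/orphan` does not appear in the result at all because no page links to it. Pages that are missing from the result, or that have a large depth, are candidates for more prominent links.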
None of the above guarantees a fix to the index problem. But improving the content of a page has other benefits as well, especially to visitors who find it.