Mobile Friendliness

Q. Pages that are not mobile-friendly are still indexed

  • (00:40) Even if website pages are not completely mobile-friendly, they should still be indexed. Mobile-friendliness is a small ranking factor within the mobile search results, but Google definitely still indexes those pages. This kind of issue can sometimes come up temporarily when Google can’t crawl one of the CSS files for a brief time and therefore doesn’t see the full layout. But if the pages look okay when tested manually in the testing tool, then there isn’t really a problem and things will go back to normal eventually.

Title Length

Q. On Google’s side, there are no guidelines on how long a page title should be

  • (03:02) Google doesn’t have any recommendations for the length of a title. John says that it’s fine to pick a number as an editorial guideline based on how much space there is available, but from Google’s search quality and ranking side, there aren’t any guidelines that state some kind of required length. Ranking doesn’t depend on whether the title is shown shorter or slightly different – the length doesn’t matter.

URL Requirements

Q. Length of URL and words contained in URL matter mostly for users, not for SEO

  • (04:53) URL length doesn’t make any difference. John says that it’s good practice to have some words in the URL so that it’s more readable, but it’s not a requirement from an SEO point of view. A URL that contains only a page ID is fine for Google too. Words in the URL are essentially something that only users see. For example, when they copy and paste the URL, they might understand what the page is about from the words in it, whereas a bare number might be confusing for them.

Doorway Pages

Q. If a website has only a small number of similar landing pages, they are not considered doorway pages

  • (14:41) The person asking the question is worried that his seven landing pages, which target similar keywords and have almost duplicate content, would be flagged as doorway pages and de-listed. John explains that with just seven pages, he probably wouldn’t have any problems, even if someone from the Web Spam Team were to look at them manually. They would see that it’s seven pages, not thousands of them. It would be different if, for example, a nationally active company had a separate page for every city in the country. The Web Spam Team would consider that beyond acceptable and problematic, and would need to take action to preserve the quality of the search results.

Reviews Not Showing Up in SERPs

Q. If reviews on a page were not left on the page itself, but were sourced from some other website, they’re not going to show up in SERPs

  • (21:35) For a review to show up in the search results, it needs to be about a specific product on that page and it needs to be something that a user left directly on that page. So if a website owner were to copy reviews over from other sources and post them, Google wouldn’t pick those up as reviews on the structured data side. The reviews can be kept on the page; Google just wouldn’t use the review markup for them.
    It’s a tricky process because Google tries to recognise this situation automatically, and sometimes it fails to do so and shows the review anyway. Some sites have these reviews shown because Google didn’t recognise that they were not left on the site. But from a policy point of view, Google tries not to show reviews that were left somewhere else and copied over to a website.
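As a rough illustration of that policy, the sketch below builds schema.org review markup (JSON-LD) only for reviews that were left directly on the page. The `build_review_jsonld` function, the `left_on_this_page` flag and the sample data are hypothetical names made up for this example; it is not an official Google tool, and how a site tracks the origin of its reviews is an assumption.

```python
import json


def build_review_jsonld(product_name, reviews):
    """Return JSON-LD for a product, marking up only first-party reviews.

    `reviews` is a list of dicts with the hypothetical keys
    'author', 'rating', 'body' and 'left_on_this_page'.
    """
    # Reviews copied over from other websites are skipped for markup.
    first_party = [r for r in reviews if r["left_on_this_page"]]

    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": product_name,
    }
    if first_party:
        data["review"] = [
            {
                "@type": "Review",
                "author": {"@type": "Person", "name": r["author"]},
                "reviewRating": {"@type": "Rating", "ratingValue": r["rating"]},
                "reviewBody": r["body"],
            }
            for r in first_party
        ]
    return json.dumps(data, indent=2)


# Hypothetical sample data: one review left on the page, one copied over.
sample_reviews = [
    {"author": "Alice", "rating": 5, "body": "Works well.", "left_on_this_page": True},
    {"author": "Bob", "rating": 4, "body": "Copied from elsewhere.", "left_on_this_page": False},
]
print(build_review_jsonld("Example Widget", sample_reviews))
```

The copied review can still be displayed on the page as ordinary content; it is only left out of the markup.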

Search Console Verification

Q. It’s possible to have a site verified multiple times in Search Console

  • (24:47) In Search Console, it’s possible to have a site verified multiple times, as well as to have different parts of the site verified individually. No data is lost when the website is verified separately. It’s okay to have both host-level and domain-level verification running in Search Console.

Crawling AMP and non-AMP Pages

Q. Google tries to keep a balance between crawling AMP and non-AMP pages of a website

  • (26:42) Google takes into account all crawling that happens through the normal Googlebot infrastructure, and that includes things like updating the AMP cache for a website. So if a website has normal pages as well as AMP versions hosted on the same server, the overall crawling that Google does on that website is balanced across AMP and non-AMP pages. If the server is already running at its limit with normal crawling and AMP pages are added on top of that, Google has to figure out which part it can crawl at which time. For most websites, that’s not an issue. It’s usually more of an issue for websites with tens of millions of pages, where Google barely gets through crawling all of them; adding a kind of duplicate of everything makes that a lot harder. But for a website with thousands of pages, adding another thousand pages from the AMP versions is not going to throw things off.

Indexing Process

Q. The way Google indexes pages and the way the Request Indexing tool works have changed over the past few years

  • (32:27) In general, the Request Indexing tool in Search Console passes the request on to the right systems, but it doesn’t guarantee that pages will automatically be indexed. In the early days, it was a much stronger signal for indexing, but the problem with this kind of tool is that people take advantage of it and submit all kinds of random stuff. So over time Google’s systems have become more cautious in handling that abuse, and that sometimes makes things slower: not because the systems are doing more work, but because Google tries to be on the cautious side. This can mean that Search Console submissions take a little longer to be processed, and that Google sometimes needs confirmation from crawling, and a kind of natural understanding of a website, before it starts indexing things there.
    Another thing that has changed quite a bit across the web over the last couple of years is that more and more websites are technically okay, in the sense that Google can easily crawl them. On the one hand, this lets Google shift to more natural crawling. On the other hand, it means that Google can crawl and index a lot of what it finds, and because capacity for crawling and indexing is still limited, Google needs to be a little more selective and might not pick things up quickly.

Pages Getting Deindexed

Q. Some pages being deindexed as new pages are added to the website is a natural process

  • (37:02) For the most part, Google doesn’t just remove things from its index; it picks up new things as well. So if new content is added and some things get dropped from the index along the way, that is usually normal and expected. There are pretty much no websites where Google indexes everything. On average, between 30 and 60 percent of a website tends to get indexed. So if hundreds of pages are added per month and some of those pages, or some of the older or less relevant pages, get dropped over time, that is expected.
    To minimise that, the overall value of the website needs to be demonstrated to Google and to users, so that Google decides to keep as much of the website as possible in the index.

Website Migration

Q. A month or two after a website migration, it’s better to remove the sitemap of old URLs

  • (41:58) Usually, when someone migrates a website, they end up redirecting everything to the new website, and sometimes they keep a sitemap file of the old URLs in Search Console so that Google crawls those old URLs a little bit faster and finds the redirects. That’s perfectly fine to do temporarily, but after a month or two, it’s probably worthwhile to take that sitemap out, because a sitemap file also tells Google which URLs are important. Pointing at the old URLs is almost the same as indicating that the old URLs need to be findable in search, and that can lead to a bit of a conflict in Google’s systems: the website owner is pointing at the old URLs but at the same time redirecting them to the new ones, so Google can’t tell which ones are more important to index. It’s better to remove that conflict as much as possible, and that can be done by simply dropping that sitemap file.
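One way to spot that conflict is to compare a sitemap file against the site’s redirects. The following is a minimal sketch, assuming the redirects are already known as a simple old-to-new mapping; the `redirecting_sitemap_urls` function, the `redirects` dictionary and the example URLs are hypothetical, and in practice the mapping would come from the server configuration or from requesting each URL.

```python
import xml.etree.ElementTree as ET

# Namespace used by the standard sitemap protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def redirecting_sitemap_urls(sitemap_xml, redirects):
    """Return (old_url, new_url) pairs for sitemap entries that redirect.

    `redirects` is an assumed dict mapping old URLs to their new targets.
    """
    root = ET.fromstring(sitemap_xml)
    listed = [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)]
    return [(url, redirects[url]) for url in listed if url in redirects]


# Hypothetical old sitemap and redirect map for illustration.
old_sitemap = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://old.example.com/shoes</loc></url>
  <url><loc>https://old.example.com/about</loc></url>
</urlset>
"""
redirect_map = {"https://old.example.com/shoes": "https://new.example.com/shoes"}

for old, new in redirecting_sitemap_urls(old_sitemap, redirect_map):
    print(f"{old} is still listed in the sitemap but redirects to {new}")
```

Once the migration has settled, any sitemap whose entries mostly show up in this list is a candidate for removal.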

Spider Trap

Q. Whenever there are spider trap URLs on a website, Google usually ends up figuring them out

  • (46:06) Take, for example, an infinite calendar on a website, where it’s possible to scroll to March 3000 and keep clicking through to the next day and the next day, and there will always be a calendar page for it. That’s an infinite space. For the most part, because Google crawls incrementally, it’ll start off by finding maybe 10 or 20 of these pages and recognise that there’s not much content there, but think that it will find more if it goes deeper. Then Google crawls maybe 100 of those pages until it starts seeing that all of this content looks the same, and that the pages are all linked in a long chain where someone has to click “next”, “next”, “next” to actually get to them. At some point, Google’s systems see that there’s not much value in crawling even deeper, because they have found a lot of the rest of the website with really strong signals that those pages are actually important compared to the really weird long chain on the other side. Then Google tries to focus on the important pages.
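The general idea of giving up on a long chain of look-alike pages can be sketched as a toy example. This is not Google’s actual algorithm: the `crawl_chain` function, the `max_similar` threshold and the exact-match content comparison are all assumptions made for illustration.

```python
def crawl_chain(start, get_page, max_similar=5):
    """Follow 'next' links, stopping after a run of look-alike pages.

    `get_page(url)` is an assumed callback returning (content, next_url),
    where next_url is None when the chain ends.
    """
    crawled = []
    last_fingerprint = None
    similar_run = 0
    url = start
    while url is not None:
        content, next_url = get_page(url)
        crawled.append(url)

        # Very crude similarity check: identical normalised text.
        fingerprint = content.strip().lower()
        if fingerprint == last_fingerprint:
            similar_run += 1
        else:
            similar_run = 1
        last_fingerprint = fingerprint

        if similar_run >= max_similar:
            break  # the chain keeps producing the same content; stop here
        url = next_url
    return crawled


def infinite_calendar(day):
    """Hypothetical calendar: every day has an empty page and a next day."""
    return "No events on this day", day + 1


print(len(crawl_chain(0, infinite_calendar)), "calendar pages crawled before stopping")
```

A real crawler would weigh many more signals, such as how the pages are linked from the rest of the website, but the sketch shows why an infinite space does not have to mean infinite crawling.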

Multilingual Content

Q. When there is multilingual content, it’s advised to use hreflang to handle that correctly

  • (53:13) In general, if there is multilingual content on a website, hreflang annotations are really useful because they help Google figure out which version of the content should be shown to which user. That’s usually the approach to take.
    The canonical tag, by contrast, tells Google which URL to focus on. So each language version should be its own canonical; one language shouldn’t be the canonical for all languages. For example, the French version has a French canonical and the Hindi version has a Hindi canonical. Canonicals shouldn’t link across languages.
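The combination of hreflang annotations and per-language canonicals can be sketched as follows. The `head_links` function and the example URLs are hypothetical; the sketch assumes each page lists every language version (including itself) as an alternate and points its canonical at its own URL.

```python
from html import escape


def head_links(current_lang, versions):
    """Return <link> tags for one language version of a page.

    `versions` maps a language code to the URL of that language version.
    """
    # The canonical points at this language's own URL, never across languages.
    own_url = escape(versions[current_lang], quote=True)
    lines = [f'<link rel="canonical" href="{own_url}">']

    # hreflang annotations list every language version of the page.
    for lang, url in sorted(versions.items()):
        lines.append(
            f'<link rel="alternate" hreflang="{escape(lang, quote=True)}" '
            f'href="{escape(url, quote=True)}">'
        )
    return lines


# Hypothetical French and Hindi versions of the same page.
page_versions = {
    "fr": "https://example.com/fr/produit",
    "hi": "https://example.com/hi/utpad",
}
for lang in page_versions:
    print(f"<!-- {lang} page -->")
    print("\n".join(head_links(lang, page_versions)))
```

Each page gets the same set of hreflang lines, while the canonical line differs per language.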
