
Google indexing non-existent https pages

Discussion in 'SEO Search Engine Optimisation' started by mat, Nov 17, 2014.

Thread Status:
Not open for further replies.
  1. mat

    mat Well-Known Member

    Joined:
    Apr 2007
    Posts:
    3,861
    Likes Received:
    111
    Hi guys,

    Have had a bit of a problem for the last month on one of my sites.

    For some reason Google started indexing all my pages as "https".

    There are no internal links pointing at https. But annoyingly, if one of the pages is loaded via https, all the internal links are rewritten as https, which has allowed the whole site to be crawled that way!

    So far I have made sure to set a sitewide canonical tag pointing at http even if the page loads as https (sketch at the end of this post). After waiting a while and not seeing much change, I also installed a WordPress plugin which 301 redirects all https to http.

    Is there anything else I can do, or is it now just a case of waiting ages for Google to find the original pages and replace them?

    I also note that although a "site:" search shows the majority of pages as https, you can still in fact find the http variants in Google with manual searches.
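
    For reference, here is a rough sketch of the sort of sitewide canonical snippet I mean, dropped into the theme's functions.php or a small plugin. Only the hooks and helpers (wp_head, home_url, set_url_scheme, esc_url) are standard WordPress core; the rest is illustrative, it assumes WordPress is installed at the domain root, and if an SEO plugin already prints a canonical you would change its setting instead:

    <?php
    // Sketch: always emit a canonical pointing at the http version of the
    // current URL, even when the page happens to be served over https.
    add_action( 'wp_head', function () {
        // Build the URL that was requested, then force the scheme back to http.
        $requested = home_url( $_SERVER['REQUEST_URI'] );
        $canonical = set_url_scheme( $requested, 'http' );
        echo '<link rel="canonical" href="' . esc_url( $canonical ) . '" />' . "\n";
    } );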
     
  3. ian

    ian Well-Known Member

    Joined:
    Jan 2008
    Posts:
    4,154
    Likes Received:
    311
    I wouldn't know the code, but maybe redirect in the .htaccess and robots.txt files.
     
  4. mat

    mat Well-Known Member

    Joined:
    Apr 2007
    Posts:
    3,861
    Likes Received:
    111
    Hi Ian,

    The WordPress plugin will be handling the .htaccess redirect, and I'm keen not to block the https pages in robots.txt as they will need to be crawled for Google to find the original http pages.

    I'm just hoping there is some way to speed up the process, but probably not :-(

    Thanks.
     
  5. Adam H

    Adam H Well-Known Member

    Joined:
    May 2014
    Posts:
    1,725
    Likes Received:
    267
    It only takes someone to link to you using the wrong protocol. If your site isn't https then I'd recommend redirecting all https requests to http:

    http://stackoverflow.com/questions/8371/how-do-you-redirect-https-to-http
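
    If a plugin is easier than editing .htaccess, something along these lines in a small plugin does the same job at the WordPress level. This is only a sketch using core functions (is_ssl, home_url, set_url_scheme, wp_redirect); it assumes WordPress sits at the domain root and that the whole site should live on plain http:

    <?php
    // Sketch: permanently redirect any https request back to its http equivalent.
    add_action( 'template_redirect', function () {
        if ( is_ssl() ) {
            $http_url = set_url_scheme( home_url( $_SERVER['REQUEST_URI'] ), 'http' );
            wp_redirect( $http_url, 301 ); // 301 so Google treats the https copy as moved
            exit;
        }
    } );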

    Then fetch as Googlebot from Webmaster Tools and resubmit all URLs to the index.

    Also make sure your sitemap isn't for some reason including any https URLs; regenerate and resubmit the sitemap.
     
  6. mat

    mat Well-Known Member

    Joined:
    Apr 2007
    Posts:
    3,861
    Likes Received:
    111
    Hi,

    Thanks Adam.

    I think the problem with Fetch as Google in Webmaster Tools is that although it will resubmit the site, Google will still not disregard the indexed https pages.

    I think it is just a case of waiting it out! :(
     
  7. Adam H

    Adam H Well-Known Member

    Joined:
    May 2014
    Posts:
    1,725
    Likes Received:
    267
    You won't need to disregard them if you are 301 redirecting. Once Google crawls the https pages again and sees they 301 redirect to the standard pages, it will fix itself. You could request removal of those pages but I wouldn't recommend it; simply redirect, resubmit the sitemap and fetch as Googlebot, and it will resolve itself without any downside.
     
    • Like x 1
  8. Retired_Member38

    Retired_Member38 Banned

    Joined:
    Jun 2013
    Posts:
    1,742
    Likes Received:
    41
    I wouldn't recommend removal either, as if you do so you might end up with neither version indexed for a period of time, and obviously no incoming traffic because of it.

    If you have redirects and canonical both in place then you don't really need to do anything else. Just wait and it will fix itself.
     
    • Like x 1
  9. mat

    mat Well-Known Member

    Joined:
    Apr 2007
    Posts:
    3,861
    Likes Received:
    111
    Hi Adam,

    The problem is Webmaster Tools will not recrawl the https pages, since you can only request a recrawl of the homepage and any linked pages. The https pages are not part of the site or the sitemap, so they will not be crawled.

    Also, as Monkey says, requesting removal can cause both the https and http versions of the pages to be removed.

    Thanks for your help though.
     
  10. mat

    mat Well-Known Member

    Joined:
    Apr 2007
    Posts:
    3,861
    Likes Received:
    111
    Thought as much. I will update the thread in 6 months' time when Google has updated the pages :p
     
  11. Adam H

    Adam H Well-Known Member

    Joined:
    May 2014
    Posts:
    1,725
    Likes Received:
    267
    Fetch as Googlebot is generally designed for any major site change; it certainly won't hurt to get them to recrawl the site, and it may speed up the process of those redirects being found and resolved.

    Which is exactly why I recommended against requesting removal.
     
  12. murph United Kingdom

    murph Well-Known Member

    Joined:
    Dec 2005
    Posts:
    1,066
    Likes Received:
    7
    I had the exact same issue a few years ago. Used a 301 in .htaccess to solve it. I seem to remember it took quite a while for Google to correct the SERPs, but no real harm done.
     
    • Like x 1