
So-called 'Offline Surfers': how do you block just them?

Discussion in 'Website Design' started by retired_member12, May 17, 2010.

Thread Status:
Not open for further replies.
  1. retired_member12

    retired_member12 Retired Member

    Joined:
    Aug 2006
    Posts:
    1,505
    Likes Received:
    23
    I have a site which is regularly trawled, and I was wondering whether there is any technique to stop it. Obviously I'm keen for the likes of Google, Bing etc. to carry on doing their job, but I've seen an increase in trawling from IP addresses registered to individual companies, mainly Chinese but also UK-based. The pattern is usually very clear once you know what you are looking for: a fresh page load every 3-4 seconds, with each tree explored methodically. I wouldn't mind so much, and I've got used to image theft, but the incidence of trawling seems to have ramped up over the past 3 months or so. Adding individual IP addresses to the .htaccess file for exclusion feels like locking the barn door after the horse has bolted. I'm looking for a solution that is a little more proactive!

    Any ideas anyone?
     
  2. JDubya

    JDubya Active Member

    Joined:
    Apr 2010
    Posts:
    94
    Likes Received:
    7
    I can't think of a simple, complete solution, but you could:

    - Implement geo-tracking (IP geolocation) and block all Chinese IPs.
    - If the IP/user agent alone can't flag the visitor as rogue, block based on the behaviour pattern you mentioned: track the pages visited in a given timescale and cut access if a visitor reaches X pages in Y seconds. You would need a whitelist of user agents to check against so that trusted agents aren't affected.
    - In your robots.txt, disallow access to a file, then put a hidden link to it somewhere early in your content (where a normal user wouldn't see it). Any rogue spider that ignores robots.txt ends up there fairly quickly, before it gets to all your content, and you can block it from further activity.
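
    For what it's worth, here is a rough Python sketch of the second and third suggestions above (rate limiting with a whitelist, plus a robots.txt trap). It is only an illustration under assumptions: the class name CrawlerGuard, the thresholds, the trap URL and the list of trusted agents are all made up for the example, and it keeps everything in memory rather than hooking into any particular web server. The trap path would also need a matching Disallow line in robots.txt.

```python
import time
from collections import defaultdict, deque

# Every name and number here is an illustrative assumption, not a real library API.
MAX_PAGES = 10                              # the "X pages" threshold
WINDOW_SECONDS = 30.0                       # the "Y seconds" window
TRAP_PATH = "/trap/do-not-follow.html"      # hypothetical URL, disallowed in robots.txt
TRUSTED_AGENTS = ("Googlebot", "bingbot")   # user-agent substrings to whitelist


class CrawlerGuard:
    """Decides, per request, whether a visitor should still be served."""

    def __init__(self, max_pages=MAX_PAGES, window=WINDOW_SECONDS,
                 clock=time.monotonic):
        self.max_pages = max_pages
        self.window = window
        self.clock = clock                  # injectable so the logic can be tested
        self.hits = defaultdict(deque)      # ip -> timestamps of recent page loads
        self.blocked = set()                # ips that have been cut off

    def allow(self, ip, user_agent, path):
        """Return True to serve the request, False to refuse it."""
        if ip in self.blocked:
            return False

        # Trap: well-behaved spiders obey robots.txt and never request this
        # path, so anything that does is treated as rogue, whatever it claims.
        if path == TRAP_PATH:
            self.blocked.add(ip)
            return False

        # Whitelisted agents skip the rate limit.
        if any(name in user_agent for name in TRUSTED_AGENTS):
            return True

        # Sliding window: record this hit, then drop hits older than the window.
        now = self.clock()
        recent = self.hits[ip]
        recent.append(now)
        while recent and now - recent[0] > self.window:
            recent.popleft()

        if len(recent) > self.max_pages:
            self.blocked.add(ip)
            return False
        return True
```

    You would call guard.allow(ip, user_agent, path) at the top of each page request and return a 403 when it says False. Bear in mind that the user agent string is trivially faked, so a whitelist based on it alone is weak; checking that the IP really belongs to the search engine (e.g. by reverse DNS) would be a sensible extra step. The thresholds would also need tuning against your own logs so that fast human readers aren't caught.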
     