Buy Sell Discuss UK Domain Names at AcornDomains.co.uk

Today's Drop Dates are: 07-11-2011 or 14-11-2011   All times are GMT. The time now is 06:35:53 PM.

Domain Traffic / Keyword Research Discuss Domain Name traffic and keyword popularity. Overture no longer functions.

Old 30-11-2009, 01:08:56 AM     #1 (permalink)
Administrator
 
admin's Avatar
 
Join Date: Jun 2004
Posts: 8,517

Script Blocking a rogue bot

One of my domains is showing a massive jump in bandwidth, which turns out to be the wise-guys.nl search bot hitting the site hard.

Month       Bandwidth
2009 Aug      163 MB
2009 Sept   1,483 MB
2009 Oct    1,198 MB
2009 Nov    1,395 MB

Robot           Bandwidth   Last visit
Vagabondo       762.75 MB   28 Nov 2009 - 11:55
Unknown robot   409.77 MB   29 Nov 2009 - 04:37

How do I block it?
Old 30-11-2009, 01:16:55 AM     #2 (permalink)

 
Join Date: Jul 2009
Posts: 1,323
retired_member13 has a reputation beyond repute

You could try blocking their IP addresses or ranges in your .htaccess file. A full IP address blocks that specific address; a partial one (the second deny line below) blocks the whole range that starts with it. You should be able to get the bots' IP addresses from your log files.

Additions take the form of:

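# Applies to GET and POST requests (HEAD is covered by GET)
<Limit GET POST>
# Allow rules are evaluated first, then Deny rules; a matching Deny wins
Order Allow,Deny
Allow from all
# Block one specific address
Deny from 193.49.176.139
# Block every address starting with 193.49.177.
Deny from 193.49.177
</Limit>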
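
To check which addresses a set of deny entries would cover before uploading the file, here is a rough Python sketch. It assumes a partial address is matched octet by octet, so `193.49.177` is treated as `193.49.177.0/24`. The function names `prefix_to_network` and `is_blocked` are made up for this example and are not part of Apache.

```python
import ipaddress

def prefix_to_network(partial):
    """Convert a full or partial dotted IPv4 address to a network.

    '193.49.177' becomes 193.49.177.0/24; a full address becomes a /32.
    """
    octets = partial.rstrip(".").split(".")
    if not 1 <= len(octets) <= 4:
        raise ValueError(f"not an IPv4 prefix: {partial!r}")
    padded = octets + ["0"] * (4 - len(octets))
    return ipaddress.ip_network(f"{'.'.join(padded)}/{8 * len(octets)}")

def is_blocked(ip, deny_entries):
    """Return True if ip falls inside any of the deny entries."""
    address = ipaddress.ip_address(ip)
    return any(address in prefix_to_network(entry) for entry in deny_entries)

# The two deny entries from the example above
deny = ["193.49.176.139", "193.49.177"]

print(prefix_to_network("193.49.177"))      # 193.49.177.0/24
print(is_blocked("193.49.177.20", deny))    # True
print(is_blocked("193.49.176.140", deny))   # False
```

This only models the matching rule; the server itself does the real blocking.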
Old 30-11-2009, 02:39:05 PM     #3 (permalink)

 
Skinner's Avatar
 
Join Date: Jul 2008
Location: Manchester
Posts: 2,501

I noticed a massive increase much like yours on my bounce rate experiment (and learned that AWStats counts this traffic rather than just making you aware of it). Bots there are doing 36k+ hits a month, totalling 1.2 GB+.

I put the + marker because I haven't looked in about 5 days, but that was the approximate figure.

Most of mine is from the Google Image search bot by the look of it. They seem to archive a thumbnail of every graphic rather than the whole thing, so they are hammering my bandwidth to fetch the images.

You should be able to block it with IP range blocking, as Ty said. You could also block by identifier (user agent), but the unknown one wouldn't be covered.
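
To see how much traffic comes from named bots and how much from agents that send no usable identifier, the log can be grouped by user agent with the client IPs kept alongside. This is only a sketch, assuming the server writes Apache's Combined Log Format. The sample lines and the `traffic_by_agent` name are invented for illustration.

```python
import re
from collections import defaultdict

# Combined Log Format: ip ident user [time] "request" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (?P<bytes>\d+|-) '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

def traffic_by_agent(lines):
    """Map each user-agent string to [total bytes, set of client IPs]."""
    stats = defaultdict(lambda: [0, set()])
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        size = match.group("bytes")
        entry = stats[match.group("agent")]
        entry[0] += 0 if size == "-" else int(size)
        entry[1].add(match.group("ip"))
    return dict(stats)

# Made-up sample lines, for illustration only
sample = [
    '192.0.2.10 - - [28/Nov/2009:11:55:00 +0000] "GET /a.jpg HTTP/1.1" 200 5000 "-" "ExampleImageBot/1.0"',
    '192.0.2.77 - - [29/Nov/2009:04:37:00 +0000] "GET /b.jpg HTTP/1.1" 200 7000 "-" "-"',
    '192.0.2.78 - - [29/Nov/2009:04:38:00 +0000] "GET /c.jpg HTTP/1.1" 200 1000 "-" "-"',
]

for agent, (total, ips) in traffic_by_agent(sample).items():
    print(repr(agent), total, sorted(ips))
```

Agents logged as "-" cannot be matched by name, so the IP set is what you would feed into a range block.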
Old 01-12-2009, 10:07:32 PM     #4 (permalink)

 
jimm's Avatar
 
Join Date: Feb 2008
Location: North Yorkshire
Posts: 667

Bandwidth is cheap, so why block it unless it is taxing your server?
But the Vagabondo bot does read robots.txt, so block it in there if you really want to.
Old 01-12-2009, 10:22:41 PM     #5 (permalink)

 
accelerator's Avatar
 
Join Date: Apr 2005
Location: England
Posts: 4,764

Don't know how up to date this is, but here's some bot-blocking code from a .htaccess file. You'll have to replace YourSite.co.uk with your own domain:

Code:
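# Hide these files from auto-generated directory listings
IndexIgnore .htaccess */.??* *~ *# */HEADER* */README* */_vti*

# Allow GET and POST from everyone (with Order Deny,Allow a matching Allow wins)
<Limit GET POST>
Order Deny,Allow
Deny from all
Allow from all
</Limit>
# Refuse PUT and DELETE from everyone
<Limit PUT DELETE>
Order Deny,Allow
Deny from all
</Limit>
AuthName YourSite.co.uk


####### kill some bad bots (requires mod_rewrite)
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^Balihoo [OR]
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow
# Send 403 Forbidden to any user agent matched above
RewriteRule .* - [F]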
Rgds
Old 10-02-2010, 02:40:39 PM     #6 (permalink)
Junior Member
 
Join Date: Feb 2010
Posts: 1
2dareis2do is on a distinguished road

wise-guys.nl

Try adding the following to your robots.txt file to see if this makes a difference:

Code:
# Blocking WiseGuys as it is sucking all my bandwidth
# Vagabondo/4.0; webcrawler at wise-guys dot nl; WiseGuys Internet BV, we provide search technology

User-agent: Vagabondo
Disallow: /
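
Before relying on these rules, you can check how a well-behaved crawler should read them with Python's standard `urllib.robotparser` module. This is a quick sketch: the rules are pasted in as a string rather than fetched from a live site, and `SomeOtherBot` is a made-up agent name.

```python
from urllib.robotparser import RobotFileParser

# The same two rules as above, held in a string for testing
ROBOTS_TXT = """\
User-agent: Vagabondo
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Vagabondo is refused everything; agents with no matching rule are allowed
print(parser.can_fetch("Vagabondo/4.0", "/index.html"))     # False
print(parser.can_fetch("SomeOtherBot/1.0", "/index.html"))  # True
```

This only shows what a compliant crawler is supposed to do; it cannot tell you whether the bot actually honours the file.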

Last edited by rjs_essex; 10-02-2010 at 05:32:26 PM. Reason: Removed Manual Sig
Old 14-02-2010, 07:56:20 PM     #7 (permalink)

 
DaveH's Avatar
 
Join Date: Apr 2008
Location: Southern England
Posts: 591

Quote:
Originally Posted by jimm View Post
Bandwidth is cheap, so why block it unless it is taxing your server?
But the Vagabondo bot does read robots.txt, so block it in there if you really want to.

lol you don't run your own servers or a large site then!
  • High CPU utilization
  • Unnecessary Database Queries (more log files)
  • Unnecessary Disk space from webserver log files
  • Unnecessary Disk IO, which causes 99% of performance problems IME


First I'd try the robots file to see if it obeys it. If not, look up its IP address and block the range.
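
One way to see whether a bot obeys the robots file is to look in the access log for requests it made after it last fetched /robots.txt. The sketch below assumes Apache's Combined Log Format, lines in chronological order, and a robots.txt that disallows the whole site for that bot. The sample lines and the `requests_after_robots` name are invented for illustration.

```python
import re

# Combined Log Format: ip ident user [time] "request" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?:\S+) (?P<path>\S+)[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

def requests_after_robots(lines, agent_token):
    """Return paths the agent requested after its latest /robots.txt fetch.

    Returns None if the agent never fetched /robots.txt at all.
    """
    seen_robots = False
    later_paths = []
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or agent_token.lower() not in match.group("agent").lower():
            continue  # unparseable line or a different agent
        if match.group("path") == "/robots.txt":
            seen_robots = True
            later_paths = []  # only count requests after the latest fetch
        elif seen_robots:
            later_paths.append(match.group("path"))
    return later_paths if seen_robots else None

# Made-up sample lines, for illustration only
sample = [
    '192.0.2.10 - - [14/Feb/2010:10:00:00 +0000] "GET /robots.txt HTTP/1.1" 200 60 "-" "Vagabondo/4.0"',
    '192.0.2.10 - - [14/Feb/2010:10:00:05 +0000] "GET /page.html HTTP/1.1" 200 5000 "-" "Vagabondo/4.0"',
    '192.0.2.99 - - [14/Feb/2010:10:00:06 +0000] "GET / HTTP/1.1" 200 900 "-" "Mozilla/5.0"',
]

print(requests_after_robots(sample, "Vagabondo"))
```

An empty list suggests the bot stopped after reading the file. Crawlers may cache robots.txt for a while, so give it some time after changing the rules before concluding the bot ignores them.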


For really large sites that are heavily indexed, I tend to use the agents listed in http://en.wikipedia.org/robots.txt
Old 14-02-2010, 11:23:04 PM     #8 (permalink)

 
jimm's Avatar
 
Join Date: Feb 2008
Location: North Yorkshire
Posts: 667

Quote:
Originally Posted by DaveH View Post
lol you don't run your own servers or a large site then!
You would struggle to get much more wrong to be honest.
Admittedly I have scaled back since I sold a part of my hosting business 18 months ago, but I still have a lot of hardware in use alongside administering some decent-sized sites. I am still a small fish, just not quite as small as you think.

Quote:
Originally Posted by DaveH View Post
  • High CPU utilization
  • Unnecessary Database Queries (more log files)
  • Unnecessary Disk space from webserver log files
  • Unnecessary Disk IO, which causes 99% of performance problems IME
Meh, it can happen, but if I get these issues it's normally because normal use is taking the server towards its designed limit anyway.
Spec your hardware for the peaks and troughs, and a slightly higher peak is nothing to panic about.

Plus I did say
Quote:
Originally Posted by Me
unless it is taxing your server
and admin was talking about bandwidth.
Old 14-02-2010, 11:46:01 PM     #9 (permalink)

 
DaveH's Avatar
 
Join Date: Apr 2008
Location: Southern England
Posts: 591
DaveH has a reputation beyond reputeDaveH has a reputation beyond reputeDaveH has a reputation beyond reputeDaveH has a reputation beyond reputeDaveH has a reputation beyond reputeDaveH has a reputation beyond reputeDaveH has a reputation beyond reputeDaveH has a reputation beyond reputeDaveH has a reputation beyond reputeDaveH has a reputation beyond reputeDaveH has a reputation beyond repute

Fair play - I was just “mythed” initially with that comment due to the amount of headaches I've had in the past with bots and other automated querying.