Block wiki spammer.

Hugh Messenger hugh at alaweb.com
Sun Jul 8 19:50:44 CEST 2007


> Anyone who doesn't have a wiki account and needs one will have to ask one
> of the admins on the list.

Yes please.  :)

I spent quite a while last year working on spam defenses in MW.  Here are a
few suggestions:

1) The most common one is ConfirmEdit at:

http://www.mediawiki.org/wiki/Extension:ConfirmEdit

This provides some extra CAPTCHA options, by default only triggered if the
new text includes any URLs, and an optional 'stronger' CAPTCHA module.
Whatever your personal feelings about CAPTCHA, it remains one of the most
effective defenses, although the serious Black Hats are getting pretty
sophisticated with OCR techniques to defeat it.  But it'll still prevent
about 70% of your basic pharmacy and penile extension type bullplop.

2) To address Alan's pet hate of the "no cleanup method", there's a pretty
good script at:
 
http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/SpamBlacklist/cleanup.php

This script attempts to revert any pages posted to by a subsequently
blacklisted entity.

3) The SpamBlacklist extension, at:

http://www.mediawiki.org/wiki/Extension:SpamBlacklist

This allows you to import blacklists from a variety of sources, including
the 'official' Meta Wiki blacklist.

4) Another extension I've heard good things about is Bad Behavior, at:

http://www.bad-behavior.ioerror.us/2007/01/08/bad-behavior-209/

Originally developed for WordPress, it has since been integrated into MW.
It uses a variety of techniques to identify spam, is said to be very
effective, and will typically catch anything ConfirmEdit doesn't.  I haven't
actually tried this one though.

5) A standard trick for MW spam is the "hidden CSS spam" technique.  This
can be blocked by adding this to LocalSettings.php:

$wgSpamRegex =
"/\<.*style.*?(display|position|overflow|visibility|height)\s*:.*?>/i";

... which blocks most of the suspect style attributes used by spammers (i.e.
those to do with hiding a div, which almost never get used in 'normal'
usage).

6) Another useful trick is to block connections with a blank User-Agent
header (these are almost always spambots).  Just add this to a .htaccess
file:

# Flag requests whose User-Agent header is empty
SetEnvIf User-Agent ^$ spammer=yes
# Allow everyone except the flagged requests
Order allow,deny
Allow from all
Deny from env=spammer
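
If you want to see what the filters in points 5 and 6 would actually catch
before turning them on, here's a rough standalone PHP sketch.  It is not part
of MediaWiki: the two helper functions and the sample strings are made up for
illustration, and it assumes the pattern is applied with a plain preg_match()
against the submitted text.  The User-Agent helper only mimics the ^$ test
from the .htaccess rule; it doesn't replace it.

```php
<?php
// Pattern copied from point 5. Single quotes keep the backslashes literal.
$wgSpamRegex = '/\<.*style.*?(display|position|overflow|visibility|height)\s*:.*?>/i';

// Hypothetical helper: true if the text would trip the pattern.
function trips_spam_regex($regex, $text) {
    return preg_match($regex, $text) === 1;
}

// Hypothetical helper mirroring the ^$ test in point 6:
// true when the User-Agent value is missing or an empty string.
function is_blank_user_agent($ua) {
    return $ua === null || $ua === '';
}

// Made-up sample edits, for illustration only.
$samples = array(
    '<div style="display:none">cheap pills</div>',       // classic hidden div
    '<div style="position : absolute; left:-999px">x</div>',
    '<span style="color: red">hello</span>',             // harmless styling
    '<div style="line-height: 1.5">normal text</div>',   // contains "height"
    'The display: setting is documented elsewhere.',     // no tag at all
);

foreach ($samples as $text) {
    printf("%-5s %s\n", trips_spam_regex($wgSpamRegex, $text) ? 'BLOCK' : 'ok', $text);
}

// Made-up User-Agent values, for illustration only.
foreach (array('', 'Mozilla/5.0') as $ua) {
    printf("%-5s User-Agent: \"%s\"\n", is_blank_user_agent($ua) ? 'BLOCK' : 'ok', $ua);
}
```

One thing this makes visible: because "height" is in the list, a harmless
line-height style trips the pattern too, so it's worth running a few of your
own pages through it first.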


There are dozens of other extensions, and some good 'best practice' docs I
can point you at if you are interested.

   -- hugh





More information about the Freeradius-Devel mailing list