Spamalyser
In short, Spamalyser tries to detect spam posts (not spam registrations) through a number of means and can perform some actions on detected spam.
Note that this plugin isn't exactly "complete", but should be "good enough" to use.

How is this different to other anti-spam plugins?
There are a number of other MyBB plugins which try to prevent spam.  However, I find that most of these can get rather pedantic and can easily block legitimate users with little, if any, means of them working around the block.
The aim of this plugin was to be somewhat more permissive than restrictive, as well as not have to rely on external services (although this plugin is capable of utilising information pulled from them).  This tries to deny as little as possible to legit users - for example, a solution such as blocking link posting for new users means that all users need to have a number of posts before they are able to post links, whereas Spamalyser allows this activity as long as it doesn't look spammy.

The Stop Forum Spam and Fassim plugins, for example, seem to only perform blocks on user registration (before they even register, in fact).  As Spamalyser is more permissive, it only performs checks on posting (this also has the slight benefit of being able to work if guest posting is enabled, although Spamalyser isn't as effective with guests as it is with registered users).  Also, performing spam analysis on posts means that Spamalyser is able to use a greater number of inputs to judge whether a post is spam or not.
Of course, the downside is that Spamalyser will not try to do anything about spam registrations.

Can this be used with other anti-spam plugins?
Most of them, yes.  I'm not sure whether this works with the Akismet plugin (although using both is somewhat pointless, because Spamalyser supports Akismet lookups).
If you use it with the Stop Forum Spam plugin, you may wish to disable SFS lookups in Spamalyser.

How does this plugin work?
When a user tries to post a new thread or reply, or edit a post, the plugin will check whether the user passes a number of thresholds (such as post count) and if so, deems the user "safe" and stops there.
If the user fails to pass the threshold test, the main Spamalyser engine kicks in and analyses the post to determine a "spam weighting" (the likelihood of it being spam).  It then compares this weighting against some configurable action thresholds to decide whether it should do anything to the post.  If the weighting meets a threshold, Spamalyser can currently (depending on what you enable):
  • Report the post,
  • Unapprove the post, or
  • Block the post (displays an error when user tries to submit the post)
All weighting calculations are logged and can be viewed via ACP -> Tools -> Spamalyser Log
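
As a rough illustration of the flow above (a sketch only, not the plugin's actual code): the function name `decide_action`, the config keys and every number except the Unapprove Threshold default of 10 are assumptions made up for this example, as is the choice to apply only the most severe action whose threshold is met.

```php
<?php
// Hypothetical sketch of the flow described above. Names and numbers are
// illustrative assumptions, not Spamalyser's real code or settings.

/**
 * Decide what to do with a post.
 *
 * @param int   $postCount poster's current post count
 * @param float $weighting spam weighting computed for the post
 * @param array $config    'safe_postcount' plus one threshold per action
 *                         (a threshold of null means the action is disabled)
 * @return string one of 'safe', 'none', 'report', 'unapprove', 'block'
 */
function decide_action($postCount, $weighting, array $config)
{
    // Step 1: users who pass the "safe" threshold are not analysed at all.
    if ($postCount >= $config['safe_postcount']) {
        return 'safe';
    }

    // Step 2: compare the weighting against each enabled action threshold,
    // most severe action first.
    foreach (array('block', 'unapprove', 'report') as $action) {
        $threshold = $config[$action];
        if ($threshold !== null && $weighting >= $threshold) {
            return $action;
        }
    }

    return 'none';
}

$config = array(
    'safe_postcount' => 20,   // assumed value
    'report'         => 5,    // assumed value
    'unapprove'      => 10,   // the default Unapprove Threshold
    'block'          => null, // blocking disabled in this example
);
```

With these example values, a post weighted 12 from a brand new user would be unapproved, while the same post from a user with 50 posts would never be analysed.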

How is the "spam weighting" calculated?
By quite a number of means, but most of the code is link analysis.  Spammers ultimately want to post links, so it seems like a good place to start.  Every link posted will add to the weighting, and links with similar keywords or to the same domain get penalised more heavily.
Spamalyser can also make some judgements based on the poster's online time and other factors, and has a number of features to attempt to reduce the number of false positives (for example, by examining the user's previous posts).
External lookups to services such as Stop Forum Spam and Akismet are also supported, and you can specify the amount of weighting to give to these services.
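
To make the link-counting idea concrete, here is a much simplified sketch. It is not Spamalyser's real algorithm: `link_weighting`, its two weights and the regular expression are assumptions for illustration, and it only covers the same-domain case, not keyword similarity.

```php
<?php
// Illustrative only: the weights and the regex are assumptions, not the
// plugin's real algorithm or defaults.

function link_weighting($message, $weightPerLink = 1.0, $repeatDomainPenalty = 0.5)
{
    // Only [url]...[/url] and [url=...]...[/url] MyCode links are considered.
    preg_match_all(
        '#\[url(?:=([^\]]+))?\](.*?)\[/url\]#is',
        $message,
        $matches,
        PREG_SET_ORDER
    );

    $weighting = 0.0;
    $seenDomains = array();

    foreach ($matches as $m) {
        // For [url=X]text[/url] the target is X; for [url]X[/url] it is the body.
        $target = ($m[1] !== '') ? $m[1] : $m[2];
        $host = parse_url(trim($target), PHP_URL_HOST);
        if (!is_string($host) || $host === '') {
            continue; // not something we can extract a domain from
        }
        $host = strtolower($host);

        // Every link adds to the weighting...
        $weighting += $weightPerLink;

        // ...and repeated links to the same domain are penalised more heavily.
        if (isset($seenDomains[$host])) {
            $weighting += $repeatDomainPenalty;
        }
        $seenDomains[$host] = true;
    }

    return $weighting;
}
```

For example, a message with two links to one domain and one link to another would score 1.0 + 1.5 + 1.0 = 3.5 with these assumed weights.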

You can try looking in the Spamalyser settings which show all the methods used by this plugin to detect spam.

Configuring Spamalyser
Spamalyser has quite a number of options, designed to give the administrator a fair amount of control over the spam analysis process.
The defaults are relatively permissive.  From my testing, you can probably reduce the Unapprove Threshold from 10 to 5 and capture a lot more spam.
Note that, by default, external lookups are disabled for localhost installs, but enabled for non-localhost installs.

I know there are a lot of options so it may be difficult to get your head around what they all do.  You can try inspecting the log (clicking on the weighting will show how it's calculated) to see how some posts' weights are determined and perhaps get a rough idea of what some of the options do.

Disable Link Analysis
If you wish to, you can completely disable the internal link analysis routine.  To do so, set the following three settings to 0:
  • Weight Per Simple Link
  • Weight Per Complex Link
  • Duplicate Keyword Bias

Akismet Note
To enable Akismet, you must enter an API key, and you must ensure that this key is valid because Spamalyser won't check it.  Also, your server must be able to make requests via either cURL or fsockopen().
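
If you're unsure whether your server meets that requirement, a quick check along these lines can tell you. This is a standalone sketch; `can_make_remote_requests` is a made-up helper name, not something the plugin or MyBB provides.

```php
<?php
// Hypothetical helper, not part of Spamalyser or MyBB.
// Returns true if at least one of the two request mechanisms is available.
function can_make_remote_requests()
{
    return function_exists('curl_init') || function_exists('fsockopen');
}

echo can_make_remote_requests()
    ? "cURL or fsockopen() is available\n"
    : "Neither cURL nor fsockopen() is available\n";
```

Note that this only tells you the functions exist; it can't tell you whether your host's firewall allows outbound connections.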

Upgrading
I'm not going to bother maintaining upgrade paths for this plugin.  This means that if you wish to upgrade to a newer version, you'll have to uninstall the old version, upload the files for the new version, then install it.
Obviously this means that you'll lose any setting changes you've made, as well as log entries, so you may wish to note down any custom settings you've set before uninstalling.  Do note that the meaning of some settings may change in the future, so refer to the changelog to ensure your custom settings have the same meaning.


Performance Issues
As this plugin performs external lookups by default, for all analysed posts, posting may be slower for posts which are examined.  Although the services queried (SFS, Akismet and Google) should have reasonably fast servers, you can choose to disable these lookups if you wish. [lookups are performed through MyBB's fetch_remote_file() function]
As for internal link analysis, the algorithm isn't exactly fast (still should be significantly faster than MyBB's post parser), but should be acceptable, especially since it's only done during posting.

Limitations
  • The plugin assumes that MyCode is always enabled (so it only detects [url] type links), which is probably in line with the assumptions made by most spammers
  • Google searching behaviour is somewhat erratic - it's a bit difficult to generate a representative search query from a post.  Can be tweaked, though I'm not too worried because spammers can theoretically bypass this by slightly varying their posts across forums
  • Reports currently only go to the database regardless of what reporting medium you have selected
(This post was last modified: 08-18-2011 08:28 PM by ZiNgA BuRgA.)
Download: spamalyser-0.93.7z (25.85 KB)
Plugin Version: 0.93
Last Updated: 08-18-2011, 08:28 PM

Downloads: 3,772
MyBB Compatibility: 1.4.x, 1.6.x
Plugin License: GPLv3
Uploader: ZiNgA BuRgA
J Greig
Post: #51
RE: Spamalyser
Hey all. Love the idea of this plugin, so I have installed it along with a registration security question, to see how they both get on, without adding any other plugins. Now, before I open my forum, I was wondering if any of the settings Spamalyser has in the ACP should be changed? Or are the default settings sufficient the way they are?
09-16-2013 07:44 AM
ZiNgA BuRgA
Post: #52
RE: Spamalyser
The settings there are 100% perfect.
There is absolutely no reason why they should be changed.
The settings exist simply to take up space in the ACP, because, like, every plugin needs settings right?

Similarly, you don't need to run a forum - it will save you a lot of effort if you close down before opening it.

My Blog
09-16-2013 09:36 AM
Sama34
Post: #53
RE: Spamalyser
The `spamalyser_weight_markreport` setting needs to be revised in 1.8, since the reportedposts table has been changed:

PHP Code:
		$numreports = $db->fetch_field($db->query('
			SELECT COUNT(DISTINCT p.pid) AS numreports
			FROM '.TABLE_PREFIX.'posts p
			INNER JOIN '.TABLE_PREFIX.'reportedposts r ON r.pid=p.pid
			WHERE p.uid='.$user['uid'].' AND p.visible=1 AND r.reportstatus!=0'.$qx
		), 'numreports');


Should probably be:

PHP Code:
		// MyBB 1.8 keeps reports in the generic reportedcontent table, so join on
		// r.id and only count rows whose type is 'post' or empty.
		$numreports = $db->fetch_field($db->query('
			SELECT COUNT(DISTINCT p.pid) AS numreports
			FROM '.TABLE_PREFIX.'posts p
			INNER JOIN '.TABLE_PREFIX.'reportedcontent r ON r.id=p.pid
			WHERE p.uid='.$user['uid'].' AND p.visible=1 AND r.reportstatus!=0 AND (r.type=\'post\' OR r.type=\'\')'.$qx
		), 'numreports');


Cheers.


Support PM's will be ignored. Yipi
Plugins: Announcement Bars - Custom Reputation - Mark PM As Unread
09-23-2014 04:57 PM
ZiNgA BuRgA
Post: #54
RE: Spamalyser
Thanks for the code suggestion.
(I hope the lack of indexes on that table doesn't cause too many issues, now that it could contain more stuff...)

So that's the only issue you noticed on 1.8?
If so, that's cool - I imagined that there'd be more stuff that needed changing.

(This post was last modified: 09-23-2014 10:38 PM by ZiNgA BuRgA.)
09-23-2014 10:14 PM
Sama34
Post: #55
RE: Spamalyser
It actually adds two indexes:

SQL Code
$tables[] = "CREATE TABLE mybb_reportedcontent (
  rid int unsigned NOT NULL auto_increment,
  id int unsigned NOT NULL default '0',
  id2 int unsigned NOT NULL default '0',
  id3 int unsigned NOT NULL default '0',
  uid int unsigned NOT NULL default '0',
  reportstatus tinyint(1) NOT NULL default '0',
  reason varchar(250) NOT NULL default '',
  type varchar(50) NOT NULL default '',
  reports int unsigned NOT NULL default '0',
  reporters text NOT NULL,
  dateline int unsigned NOT NULL default '0',
  lastreport int unsigned NOT NULL default '0',
  KEY reportstatus (reportstatus),
  KEY lastreport (lastreport),
  PRIMARY KEY (rid)
) ENGINE=MyISAM;";


Do you think it may need some for id, id2, and/or id3? I'm not actually that sure about how the new report center works.

Quote: So that's the only issue you noticed on 1.8?

The only issue stopping comments in my community with the settings I use, yes. So probably not the only one overall :P

(This post was last modified: 09-24-2014 05:16 PM by Sama34.)
09-24-2014 05:14 PM
ZiNgA BuRgA
Post: #56
RE: Spamalyser
Actually, now that I think about it, it's not really much of a problem - typical boards won't have a lot of unread reports, and there's an index on that.
My bad.

Thanks again for the 1.8 info.

09-26-2014 04:38 PM
Grey Ghost
Post: #57
RE: Spamalyser
(09-26-2014 04:38 PM)ZiNgA BuRgA Wrote:  Actually, now that I think about it, it's not really much of a problem - typical boards won't have a lot of unread reports, and there's an index on that.
My bad.

Thanks again for the 1.8 info.

I'm in the slow process of moving to 1.8, as you know, and I'd appreciate an updated Spamalyser :P
11-10-2014 05:09 PM
Sama34
Post: #58
RE: Spamalyser
While upgrading my installation of MyBB, I got the following error at the final step:

Code:
/home/username/public_html/install/resources///spamalyser.lang.php does not exist


I suppose this is the fault of MyBB loading plugins during that process.


11-23-2014 10:01 PM
ZiNgA BuRgA
Post: #59
RE: Spamalyser
^ Yeah, or rather, who thought hard-coding in magic IDs was a good idea?

PHP Code:
	// Attempt to run an update check
	require_once MYBB_ROOT.'inc/functions_task.php';
	run_task(12);

(from upgrade.php)

So if the update check doesn't happen to have an ID of 12, well, fun ensues...
Luckily simply pressing F5 on the page fixes it, somewhat.
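
For what it's worth, a less fragile approach would be to look the task up by something stable, such as its file name, instead of assuming its ID. A minimal sketch of the idea, using a plain array in place of the tasks table; `find_task_id`, the sample rows and the `versioncheck` file name are assumptions for illustration, not code from MyBB:

```php
<?php
// Hypothetical helper: find a task's ID by its file name.
// $tasks stands in for rows that would normally come from the tasks table.
function find_task_id(array $tasks, $file)
{
    foreach ($tasks as $task) {
        if ($task['file'] === $file) {
            return (int) $task['tid'];
        }
    }
    return null; // no such task; the caller should skip run_task() entirely
}

// Sample rows (made up): here the update check is not task 12.
$tasks = array(
    array('tid' => 3,  'file' => 'dailycleanup'),
    array('tid' => 12, 'file' => 'some_plugin_task'),
    array('tid' => 15, 'file' => 'versioncheck'),
);

$tid = find_task_id($tasks, 'versioncheck');
if ($tid !== null) {
    // In upgrade.php this is where run_task($tid) would be called.
    echo "Would run task #$tid\n";
}
```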


(This post was last modified: 11-27-2014 12:21 PM by ZiNgA BuRgA.)
11-27-2014 12:21 PM
Sama34
Post: #60
RE: Spamalyser
Yep, that's what I thought, thanks! But I don't think F5 "fixes it": because the lock file is not created, the administrator may get the feeling that the upgrade script got stuck. At least that is what I felt until I checked the code to see what actually happened.

(This post was last modified: 12-02-2014 07:46 AM by Sama34.)
12-02-2014 07:46 AM