dbl.spamhaus.org not working causing ticket/comments with several URLs to be rejected as spam #245
Comments
Oh interesting theory, that would indeed explain the posts that have been (wrongly) marked as spam. I ran the suggested
I thought I should also try to run it from the container Trac runs from, but unfortunately that image doesn't come with I'll look into disabling
The service doesn't seem to work, possibly because of the containerization. Refs django#245
I found the name of the setting in the source code and made a PR with a change to the config: #246
Here’s a record of Django spam monitoring logs I’ve worked on over the last few weeks, to help us assess different options to tweak the spam filtering to get better results. With this I’ve identified:
For the two that would still be marked as spam even now, it's because they have low "session" scores of 6, so they are probably first-time users whose writing is detected as spam. Their total content karma is 0, so it would only take a small tweak to get them through as well. But from what I can see, that would also let through 4 more spam entries containing spam links (score was -3, would be 0 without the faulty link checker, and would be more than 0 if we changed the "first time user" session score). I'm very hesitant to spend more time on this personally. It feels like there are ways to tweak and get good results, but the diminishing returns are real. Perhaps there'd be a way to make the spam message more friendly, so users who get into this situation know where to raise the issue or how to work around it?
Thanks, Thibaud. I also saw there could be merit in changing "min_karma" to 0. FYI, in your analysis, I think you misidentified some lines where the user id appeared in the "Quote" field. That's most often when users add themselves to CC. I've also been wondering if tuning the spam filters is worth the effort. If we instead had a list of approved users and a moderation queue for first-time posters, I think that would eliminate all spam and false positives.
I haven't looked into it, but my gut feeling is that changing the error message when spam is detected should be doable. A moderation queue sounds like it would work great for us, but unless our existing plugins support it (or there exists one that's still maintained) then it's most likely a non-starter. A third option could be to show a captcha challenge when spam is detected (I have a vague memory that one of our plugins supports that, but I could be wrong). There's something similar in place for the donation page on djangoproject.com. I will create tickets for 1 and 2+3 together (my reasoning is that what we really want is a moderation queue, but if that's not possible then a captcha could be acceptable).
👍 I’m also surprised there’s so much spam as we only allow participation from authenticated accounts. Is there some honeypot we might want to put in place on the user registration flow to thwart the simplest types of botting? I feel like any incremental improvements here would be great. So even just a help message, if it’s not too much work, would go a long way. @timgraham thank you, not sure what I was thinking with those two! I’ve updated the numbers above accordingly.
It looks to me like any ticket changes are penalized -3 karma per URL that's submitted because dbl.spamhaus.org fails for each one.
Example:
URL's blacklisted by dbl.spamhaus.org (djangopackages.org[255.255.254]), dbl.spamhaus.org (forum.djangoproject.com[255.255.254]), dbl.spamhaus.org (softwarecrafts.uk[255.255.254])
(Domains that are all okay according to https://check.spamhaus.org/.)
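To make the arithmetic concrete, here is a minimal sketch of how the penalty adds up for the example above. It assumes the -3 per URL figure I inferred from the logs; the function name and the idea that the penalties simply sum are my own illustration, not the plugin's actual code.

```python
# Per-URL penalty as observed in the logs for this issue (assumption).
URL_PENALTY = -3


def url_blacklist_karma(urls, penalty=URL_PENALTY):
    """Hypothetical helper: karma contributed by the URL blacklist check
    when every submitted URL is (wrongly) reported as blacklisted."""
    return penalty * len(urls)


# The three domains from the log line above.
flagged = ["djangopackages.org", "forum.djangoproject.com", "softwarecrafts.uk"]
print(url_blacklist_karma(flagged))
```

So a comment linking to three legitimate sites would start out 9 points down before any other filter is applied.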
It may be that dbl.spamhaus.org cannot be queried properly from the djangoproject.com server. Per https://stackoverflow.com/questions/64363090/how-do-you-access-the-public-spamhaus-dbl-service, someone could try
$ dig dbltest.com.dbl.spamhaus.org
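If `dig` isn't available on the machine, roughly the same check can be done from Python. This is only a sketch: the function names are mine, and the address ranges reflect my understanding of Spamhaus's documented return codes (127.0.1.x for real DBL listings, 127.255.255.x for error responses such as 127.255.255.254 for queries arriving via a public/open resolver). The `255.255.254` in the log above looks like that error code rather than a real listing.

```python
import ipaddress
import socket

DBL_ZONE = "dbl.spamhaus.org"

# Assumed ranges, based on Spamhaus's published return codes.
LISTED_NET = ipaddress.ip_network("127.0.1.0/24")
ERROR_NET = ipaddress.ip_network("127.255.255.0/24")


def classify_dbl_answer(answer):
    """Classify the A record returned for '<domain>.dbl.spamhaus.org'.

    `answer` is a dotted-quad string, or None when the name did not resolve.
    """
    if answer is None:
        return "not listed"
    ip = ipaddress.ip_address(answer)
    if ip in ERROR_NET:
        return "error"  # the query was refused, not a genuine listing
    if ip in LISTED_NET:
        return "listed"
    return "unknown"


def query_dbl(domain, zone=DBL_ZONE):
    """Resolve the DBL name; returns None if resolution fails.

    Note: gaierror also covers resolver outages, not only NXDOMAIN.
    """
    try:
        return socket.gethostbyname(f"{domain}.{zone}")
    except socket.gaierror:
        return None


# Example (needs network access):
#   print(classify_dbl_answer(query_dbl("dbltest.com")))
```

An "error" result for dbltest.com from the Trac host would support the theory that the filter is treating a refused query as a blacklist hit.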
Proposed resolution:
Remove "dbl.spamhaus.org" from "URL Blacklists (comma separated):" at https://code.djangoproject.com/admin/spamfilter/external
I believe this is part of the Trac database because I didn't find it in tracenv.ini.