I don't know much about RSS, but it seems like occasionally I get spam posts from biostar in my RSS feed and when I click on them, they're always deleted already. Is there any way to delete them from the RSS feed immediately when they're deleted from biostar so they never show up in the RSS feed?
+1, An additional problem is the BioStar twitter feed. Perhaps it's on the same RSS? I checked twitter last night and saw the nine spam posts and deleted them, but those 9 spam tweets from BioStar are going out to 421 followers.
The issue is with the RSS reader caching the results. Right now if a post is deleted after your RSS reader connects it will stay in your RSS cache even after the post is deleted from the feed.
The amount of spam seems to be increasing - it needs to be fixed at the submission level. I'll add both a "honeypot" form field and a simple extra validation field to the post interface. We could also add either a reCAPTCHA (but I don't like solving those ;-), they make me squint) or a spam classifier to proofread the posts.
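For anyone unfamiliar with the honeypot idea, here is a minimal sketch of the check, not the actual Biostar code. It assumes the form has an extra field that is hidden from humans with CSS, so only a bot that fills in every input will give it a value. The field name `website_url` and the function name are made up for illustration.

```python
def looks_like_bot(form_data, honeypot_field="website_url"):
    """Return True when the hidden honeypot field was filled in.

    form_data is a plain dict of submitted form values. The honeypot
    field name is a hypothetical example, not a real Biostar field.
    """
    value = form_data.get(honeypot_field, "")
    # Humans never see the field, so any non-blank value is suspicious.
    return bool(value.strip())


# A human leaves the hidden field empty; a naive bot fills everything in.
human = {"title": "How do I parse a BAM file?", "website_url": ""}
bot = {"title": "Cheap flights", "website_url": "http://example.com"}

print(looks_like_bot(human))
print(looks_like_bot(bot))
```

A check like this only stops bots that blindly fill in all fields, so it would be one layer among several rather than a complete defense.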
There does seem to be a "wave of spam" recently and it gives a bad impression when feeds become contaminated. Ideally, it would not reach them in the first place. Unfortunately, I don't know if there is an easy semi-automated way to check when users first register. We could require moderation of first posts but that would be a lot of work for moderators and slow things down.
Moderator approval of the first post seems reasonable to me, as long as it can trigger an email notification or something. I'm happy to approve some posts if it means cutting down on this nonsense. If Istvan's new countermeasures don't work out, we could give it a shot.
Strangely, all of them seem to point to Australian domains. I think it is one Aussie bot that has us figured out. Now that we have made our move, let's see what happens.
By the way, I get an error message in the admin console when trying to delete spam user accounts. I know that deletion makes no practical difference, it just makes me feel better :)
No, don't delete those, because then they can just come back - they need to stay banned. If you ban a user, their posts will be destroyed within six hours.
Biostar version 1.2.14 released. Added two levels of spam defense to the New Post page.
New posts need to have an extra field filled in - I am curious to see how this will be handled by bots and how soon until it is routinely defeated
New users (defined as within 6 hours of account creation) may only create 3 posts. This is to give us a chance to ban runaway bots. An error message informs the user when the limit is triggered.
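The new-user rule can be sketched roughly like this. It is only an illustration of the logic described above (6-hour window, 3 posts), not the code that was deployed; the function and constant names are assumptions.

```python
from datetime import datetime, timedelta

# Values taken from the release note above; the names are hypothetical.
NEW_USER_WINDOW = timedelta(hours=6)
MAX_POSTS_FOR_NEW_USERS = 3


def may_create_post(account_created, posts_so_far, now):
    """Return False when a brand new account has used up its post quota."""
    is_new_user = (now - account_created) < NEW_USER_WINDOW
    if is_new_user and posts_so_far >= MAX_POSTS_FOR_NEW_USERS:
        return False
    return True


created = datetime(2012, 1, 1, 12, 0)
# One hour after signup with three posts already made: blocked.
print(may_create_post(created, 3, created + timedelta(hours=1)))
# Seven hours after signup the limit no longer applies.
print(may_create_post(created, 3, created + timedelta(hours=7)))
```

When the function returns False, the site would show the error message mentioned above instead of saving the post.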
Also added a notification on how many votes a user has acquired since their last visit. This will be expanded in a later commit to show the actual posts that generated the votes.
They even fill in the about-me section! Now that I think about it that is also a valuable outlink that makes Biostar contribute to their pagerank! I guess we need to hide the autobiography for banned users.
Also, as I look at it more, this is not the usual spam. The content is too lengthy and verbose. The advice even makes sense in some other context. It looks like something tailored to fool search engines into treating it as legitimate content, and then to use our page-rank to increase theirs. They don't actually expect our users to click/buy their stuff. I will investigate this more closely.
New measures are now in place. I removed the extra field after all; it has already been defeated.
RSS feeds are now on a four-hour delay; this gives us time to delete posts without pushing them to the readers. Banned users will have their about pages and websites removed. Posts by banned users are pruned every six hours (removed beyond deletion). Brand new users may only create two new posts, then have to wait a few hours to post more.
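To make the delay rule concrete, here is a rough sketch of how a feed could be built so that only posts older than four hours, not deleted, and not written by banned users are included. The dictionary keys and function name are assumptions for the example, not the real Biostar schema.

```python
from datetime import datetime, timedelta

FEED_DELAY = timedelta(hours=4)  # delay taken from the post above


def feed_entries(posts, now, delay=FEED_DELAY):
    """Return the posts that are allowed to appear in the RSS feed.

    Each post is a dict with hypothetical keys: 'title', 'created',
    'deleted' and 'author_banned'.
    """
    cutoff = now - delay
    return [
        post
        for post in posts
        if post["created"] <= cutoff      # old enough to have been moderated
        and not post["deleted"]           # removed by a moderator
        and not post["author_banned"]     # author was banned as a spammer
    ]


now = datetime(2012, 1, 1, 12, 0)
posts = [
    {"title": "BLAST question", "created": now - timedelta(hours=5),
     "deleted": False, "author_banned": False},
    {"title": "Cheap flights", "created": now - timedelta(hours=5),
     "deleted": True, "author_banned": True},
    {"title": "Fresh post", "created": now - timedelta(hours=1),
     "deleted": False, "author_banned": False},
]
print([post["title"] for post in feed_entries(posts, now)])
```

The trade-off is the one discussed in this thread: a longer delay gives moderators more time, but legitimate questions also reach RSS readers later.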
I feel like the 4-hour delay is a little long. In the interest of timely answers, can we cut it to 2 hours? Do we have enough international coverage with our moderators to cover spam deletion that quickly when it crops up at odd hours?
It could be too long - let's leave the 4 hours on for a few days until the next update, then I will change it to a shorter time (2h). By then we'll have some observations that help make a decision.
The internal update policy that I am trying to stick to is that, no matter how small the change, all tests need to be run and must pass before deploying, so I like to group changes into batches. Tests include both API tests and functional tests through the browser, where the runner fires up a browser and performs a long series of actions (via Selenium) while someone watches.
Is this old or new spam? Usually the RSS entries are stored in the user's reader (to allow for quick access), so once an entry makes it in there we can't delete it. It will eventually expire. The rule just delays the new entries from entering the feed.
Just to make sure we are talking about the same thing: you can see a post in your feed that does not exist on the site anymore?
An RSS reader is a post collector not a mirror of what is on the site. Once an item enters a reader it will stay there even if the originating site removes it from the feed. There is no way to go back and tell a reader that a post that has been seen in the feed before should not be displayed.
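A toy model may help show why this happens. The class below is not how any particular RSS reader is implemented; it just assumes the reader keeps every item it has ever fetched, keyed by a `guid`, and never removes items that later disappear from the feed.

```python
class ToyReader:
    """A simplified feed reader that only ever adds items to its cache."""

    def __init__(self):
        self.cache = {}  # guid -> item, kept across polls

    def poll(self, feed):
        # New items are stored; items missing from the feed are NOT removed.
        for item in feed:
            self.cache.setdefault(item["guid"], item)

    def visible_titles(self):
        return [item["title"] for item in self.cache.values()]


reader = ToyReader()

# First poll: the spam post is still in the site's feed.
reader.poll([
    {"guid": "1", "title": "BLAST question"},
    {"guid": "2", "title": "Cheap flights"},
])

# A moderator deletes the spam, so the next poll no longer contains it.
reader.poll([{"guid": "1", "title": "BLAST question"}])

# The reader still shows the spam because it was cached on the first poll.
print(reader.visible_titles())
```

This is why the only fix on the site's side is to keep spam out of the feed before any reader fetches it.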
The time at which a post gets removed from an RSS feed depends solely on the settings of the reader and is usually a simple expiration.
This is a good explanation. I think the issue is, for people who like to use RSS, seeing a lot of spam is upsetting and makes them question the utility of the site. Obviously the ideal scenario would be to prevent all or most of it from ever being posted.
Correct, that is what I mean, but I mainly select the content I visit on BioStar by parsing my RSS feed. The fact that it is still polluted by spam is not a function of how fast my RSS reader updates but a function of the failure to stop spambots signing up ;)
We have slowed the spam but not stopped it. I am thinking of a measure where new users would have to pass a mini captcha, possibly with a bioinformatics question rather than an image.
I don't know which of the above remediations are still in play now, but SPAM in RSS is out of control to such a degree that it puts me off skimming biostar... unless I need to get some CBD gummies or go on a Keto diet...
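As one possible shape for such a question, here is a sketch that asks for the reverse complement of a short DNA sequence and checks the reply. The choice of question and the function names are just an example of the idea, not a planned implementation.

```python
# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def check_answer(challenge_seq, answer):
    """Compare the user's answer, ignoring case and surrounding spaces."""
    return answer.strip().upper() == reverse_complement(challenge_seq)


challenge = "AAGC"
print("What is the reverse complement of %s?" % challenge)
print(check_answer(challenge, " gctt "))
print(check_answer(challenge, "AAGC"))
```

A fixed question like this would be easy for a targeted bot to learn, so the challenge sequence would need to vary between signups.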
Thanks for the note. In the new release we forgot about checking spam in the RSS feeds. Looks like we do filter the Latest News feeds, but not the other feeds. We'll add a fix soon.
Glad I'm not the only person plagued by this. I think I had more spam than actual comments from BioStar in my RSS feed today