Stop Forum Spam

#26 2019-06-07 2:10 pm

Papa Parrot
Moderator
From: Mexico
Registered: 2011-08-19
Posts: 1,775
Website

Re: Running a Diary being hit by bots without StopForumSpam

Interesting approach the moronic admins at the OSM site have taken. I was recently disappointed by "startpage.com": it used to be my favorite search engine, until they recently started using a broken captcha method to try to control who can search and how many searches one can do. But I suppose that would be another topic.
  As I see it, the internet is so polluted by bots of all sorts, and nothing is done legally by the world's governments, e.g. long prison sentences for the criminals involved in creating and using the malware that abuses the internet. Anyway, it is all so polluted and corrupted that it is rapidly becoming more and more useless. The worst part is that the excessive advertising, spam, etc. is everywhere: 99% of the websites that show up in a search are basically just full-page advertisements with little or no useful information. And now the search engine is useless too, because every time I try new keywords I have to get past a broken captcha. It is getting to be a real mess, everywhere.

Offline

#27 2019-06-07 11:07 pm

Papa Parrot
Moderator
From: Mexico
Registered: 2011-08-19
Posts: 1,775
Website

Re: Running a Diary being hit by bots without StopForumSpam

I was thinking more about this, and seriously I do think the only real solution to the pollution that the so-called marketing engineers (spammers) are causing on the internet highways is severe penalties. Jail time is not good enough: the problem with putting them in jail or prison is that society then has to support them, so better yet would be severe financial penalties, very high fines. The money raised from the fines could be used to help clean up the mess and to develop better methods of catching and stopping them.
Advertising people, marketing engineers, spammers, whatever you want to call them, have always been scuzz butts and do not respect anything. They have a history of doing this, trying to force everyone to see their garbage everywhere.
  I don't know about over in England and Europe, although I did visit Switzerland, Germany, Italy and Austria when I was very young, and I do remember the difference on the highways over there: there were not so many billboards as I was accustomed to in the U.S. At that time the US highways were filled with billboards, especially in the South West, and there were no more real "scenic" drives anywhere.
Eventually a movement did start, laws were passed, and the billboards started getting removed; the highways became more scenic and enjoyable. It seems to me that the admins at OSM are, in a way, closing the highway instead of removing the billboards. Any website that is a benefit to the public should be legally protected, and it should be a serious crime even to attempt to pollute it with unwanted advertising, or any advertising for that matter. It would be nice if they (the marketing engineers) were restricted to a "commercial" zone, like the yellow pages in the phone book, where they could place their advertising and contact info. Submitting or posting unwanted advertising anywhere else is essentially like graffiti: it defaces the environment. The penalties for doing that need to be severe, and enforced, otherwise it will only continue to get worse and worse. These kinds of people have no morals and respect nothing other than $$$$; if the cost of the abuse is high enough, it would get their attention. Strict licensing, including knowing the true identity of the owner of any server used as a "production" machine for advertisements and marketing propaganda, is probably needed. Producing malware intended for abusing websites and injecting unwanted data should bring not only a very long jail term and a very high fine, but also a ban from ever being a licensed server operator or admin.
Anyway, I guess that is about it for now.

Offline

#28 2019-06-22 6:17 am

Alex Kemp
Moderator
From: Nottingham, England
Registered: 2009-12-02
Posts: 2,093
Website

Re: Running a Diary being hit by bots without StopForumSpam

More help requested. 2 items, so I shall list here & make 2 separate posts, 1 per item

1) How to setup a *Very* Busy site to be able to Manage the SpamBots?

I'm dealing with folks (Admin & Mods) that are technically capable but ignorant to the nth-degree on spam. It is partly educational, but the trick, I think, is to be able to give a cascade of simple-steps that the folks reading can quickly understand and implement. The servers are remotely hosted at UCL (London university) and the admin have full root access.

The servers are currently set up via robots.txt to throttle Search Engine access to most of the site so as to preserve bandwidth (not a lot of money, and potentially vast access numbers). I want to see numbers increase, but that should not happen until the spam is brought under control.

There are hundreds of cumulative years of experience in the webmasters that visit here. In addition, the advice given will be useful to all, not just this one site. I intend to start us off in the next post with my current advice as a starter pack. However, I'm not a current webmaster, and I anticipate that it can be improved.

2) Chinese-language spammers are Filling the Diaries at 1 Post every 5 Secs (10,000/day); What is going on?

The point here is that there are no links, in which case what on earth is going on?! An earlier set of attacks stopped when a bot-prevention was placed into robots.txt, which strongly suggests that it was indeed spam, yet that was also apparently link-free. This all makes zero sense to me, and my Chinese-language skills are also zero. Is this only abuse, or am I missing something?

This is my Post in OSM:
Sigh. Now it is _主管Q (“_Supervisor Q”) Spammers (post deleted by DWG)

There is a sample spam-post in the private part of this site:
https://www.stopforumspam.com/forum/vie … hp?id=8855

Offline

#29 2019-06-22 6:24 am

Alex Kemp
Moderator
From: Nottingham, England
Registered: 2009-12-02
Posts: 2,093
Website

Re: Running a Diary being hit by bots without StopForumSpam

1) How to setup a *Very* Busy site to be able to Manage the SpamBots?

The way to think about stopping spam/abuse:
i) “Open” should not mean “Open to abuse”
ii) You can only reduce bot spam, not stop it

  1. Some idiot(s) each day will attempt to mirror the entire site using an unregulated bot   
    (so employ fast-scrape preventions at the kernel level)

  2. Universally sites prevent spam by preventing anonymous posts

  3. SFS can be used to auto-prevent abuse from known spammers   
    (current hit-rate is ~30% on all email queries)

  4. Give new users restricted rights + place their posts into moderation   
    (meaning all new-user posts are hidden & supervised by humans *before* release to the public; this is common practice to restrict spam)

  5. On the OpenStreetMap site those previous steps need to apply to the map as much as to Diary, Blog, Help, Forum, Wiki, any-other-sub-site

These are survival techniques.
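
As a rough illustration of how steps 2 to 4 above can cascade, here is a minimal Python sketch. The `User` fields, the `decide_post` function and the threshold of 3 approved posts are all hypothetical names and values chosen for the example; they are not taken from OSM or from any particular forum software.

```python
from dataclasses import dataclass

# Hypothetical threshold (an assumption for this sketch): how many
# human-approved posts a user needs before new posts skip moderation.
TRUSTED_AFTER_APPROVED_POSTS = 3


@dataclass
class User:
    name: str
    is_anonymous: bool = False
    approved_posts: int = 0
    # Assumed to be filled in at sign-up, e.g. from a spam-database lookup.
    listed_as_spammer: bool = False


def decide_post(user: User) -> str:
    """Return 'reject', 'moderate' or 'publish' for a new post."""
    if user.is_anonymous:
        # Step 2: no anonymous posts.
        return "reject"
    if user.listed_as_spammer:
        # Step 3: auto-prevent abuse from known spammers.
        return "reject"
    if user.approved_posts < TRUSTED_AFTER_APPROVED_POSTS:
        # Step 4: new users are hidden until a human releases the post.
        return "moderate"
    return "publish"
```

The order of the checks is the point: the cheap automatic rejections come first, so that human moderators only see posts from registered users who are not already listed.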

Offline

#30 2019-06-25 3:20 am

Alex Kemp
Moderator
From: Nottingham, England
Registered: 2009-12-02
Posts: 2,093
Website

Re: Running a Diary being hit by bots without StopForumSpam

I asked for some help earlier in this thread. None needed now (the OSM admin have rejected all help, stating that they do not need it).

I've rewritten many of my posts in this thread, adding " (post deleted by DWG) " and putting strikeout lines through the post-titles (DWG = Data Working Group). That is because the DWG have risen up in arms against my attempts to investigate the spam, find ways to help, and to promote SFS as part of that help. They finally had enough when I raised 6 connected issues in GitHub pointing out that the map static files could never receive a 304 Not Modified & explained how to fix it. It worked & that was probably the worst thing of all since it showed what prats they were (their view) and they were never going to let me get away with it.
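
For readers unfamiliar with the 304 mechanism mentioned above: a server can only answer 304 Not Modified if it previously sent a validator such as Last-Modified, which the browser then returns in If-Modified-Since. The following is a minimal Python sketch of that general mechanism only; the function name `conditional_response` is made up for the example, and this is not the change that was made at OSM.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict, Optional, Tuple


def conditional_response(
    file_mtime: datetime, if_modified_since: Optional[str]
) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a static file.

    file_mtime should be a timezone-aware datetime.
    """
    # HTTP dates have one-second resolution, so drop microseconds.
    last_modified = file_mtime.astimezone(timezone.utc).replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: fall through to a full 200
        if since is not None and since.tzinfo is not None:
            if last_modified <= since:
                # The client's cached copy is still current: no body needed.
                return 304, headers
    return 200, headers
```

Without the Last-Modified header on the first response the browser has nothing to send back, so every request for a static file is answered in full.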

I managed to save one article directly relevant to this thread & have added it at the relevant post in the thread.

The OSM Admin, mods & a cohort of close buddies to those folks are deeply narcissistic & bound as a cohort by their narcissism. My mistake was to finally point out that narcissism plus write good articles that drew a lot of attention.

• 24/06/2019, 22:52: “we have taken the liberty removing some of your comments above that were insulting on purpose.”
• 24/06/2019, 23:00: “we have removed your latest diary entry because it was considered too offensive … we will remove any future derogatory blog entries or comments that you write on the user diaries.”

Without further warning, another 10 articles, all connected to the spam, were deleted across the next few minutes. There were now zero Diary posts about the spam, and I think that we now have the actual reason for those deletions.

The final email was a masterpiece in 1984 rhetoric:

• Re: [Ticket#2019062210000073] Alex: Enough with the Insults and Comdemnation
• 25/06/2019, 00:33:

Dear Alex,

just in case this was not clear: Everybody else in OSM is allowed to make snarky, cynical, or sarcastic comments, and we don't kick people out for a single misstep either. It's just that the rules that apply to you specifically have been tightened because of your past record.

Also just in case this was not clear, this whole situation has long ago left the place where we were discussing what is true and false. Even if everything you have said was factually correct, and everything everyone else said was factually wrong, this would not change our judgement one bit. Therefore we will not be drawn into discussions about wrong or right; no amount of being right can nullify your obnoxious behaviour.

Best regards
<name redacted>

--
OpenStreetMap Foundation
Data Working Group - data@osmfoundation.org

Thank goodness that this is OpenStreetMap. Imagine how bad it might have been if it had been ClosedStreetMap.

I've had a long rest & done some other stuff to allow myself to think about this. I'm an awful long way from being perfect, but the DWG + Admin are attempting to bury all their mistakes & punishing someone for being genuine in offering good advice to mitigate a difficult situation at their end. As best as I can see, the reason for their bile is that they felt that it reflected badly on them to see it published. That is truly vile, and I see no reason why it should stay hidden.

Offline

#31 2019-06-25 5:01 am

Maikuolan
Member
From: Perth, Western Australia
Registered: 2011-08-09
Posts: 743
Website

Re: Running a Diary being hit by bots without StopForumSpam

Though I'm neither directly connected to the aforementioned situation, nor have any desire to be.. from what I've read here, it sounds like they've doomed themselves to spiralling down towards inevitable obscurity and irrelevance (i.e., ignoring, denying the existence of, and therefore refusing to fix a severe problem that degrades the quality of the service or product that the problem affects, which if continued to be ignored into the future, will eventually reach the point where users no longer want to engage with said service or product, thus making said service or product obscure and irrelevant).

Last edited by Maikuolan (2019-06-25 5:02 am)


Some free, open-source packages I wrote:
- phpMussel (protects websites from malicious file uploads)
- CIDRAM (protects websites from unwanted traffic, spammers, bots, cloud services, etc)
- SFS Mass IP Checker (bulk query IP addresses with SFS)

Offline

#32 2019-06-25 11:55 am

Alex Kemp
Moderator
From: Nottingham, England
Registered: 2009-12-02
Posts: 2,093
Website

Re: Running a Diary being hit by bots without StopForumSpam

Hi Maikuolan

It is actually worse than you think. The email from the DWG said "we have removed your latest diary entry because it was considered too offensive." I shall post that article in the private area & folks can  decide for themselves whether that is accurate or not. However, after going through this thread and adding the warning that the 'former link was no longer available', two points finally got home to me about the 12 diary posts that were removed:-

  1. Each removed post was about the OSM spam attack, including one that was purely a set of stats on daily numbers
    (max: 30,000 spam posts)

  2. Those admin & the DWG are actually attacking SFS. I shall not repeat their insulting, poisonous and snarky comments about SFS here — that shall be reserved for the private area — but make no mistake, they were pouring their bile out on SFS at a rate of knots

It has just occurred to me why the DWG are attacking SFS. I'm afraid that it was my fault.

I've recovered 11 of the 12 deleted posts (one of those posts took me 11 hours to research & write). The one that I did *not* recover was titled HiddenStreetMap, and had that moniker because I had discovered that all the Diaries had been removed from all Search Engine (SE) results using robots.txt. That is accurate, but the next assertion is NOT accurate yet was backed up with statements from TomH, the admin at OSM:

I misread robots.txt in my haste & stated that it was removing all Map entries as well as Diary entries (not true - see post#25, yet my Google research showed almost no osm-map entries at that moment so I made a logical, yet wrong, connection between those two facts). However the OSM admin endorsed what I said with the comment that there were "millions of nodes that could be searched" but that the load would be too great. I thus kept my eye out for ways to reduce the load, which led to 6 GitHub posts from me into the OSM Issues section of their website: [1] + 5 others. The linked issue was of a missing Last-Modified header & restoring that header fixed the entire problem for 2 of the 6 issues. Then came my mistake which has led to the OSM hierarchy detesting & bad-mouthing SFS:

You can see the full list of OSM hardware here (90 servers upon, I believe, 60+ physical machines). I really wanted OSM to be able to be as big as Google maps, but how could it be if all the search engines were prevented from indexing it? The admin stated that the problem was server load, and in my innocence I swallowed all this whole, then stated "SFS can handle as many accesses as OSM yet has just one server whilst OSM has 90 servers". Oh god, you idiot. You just told them that SFS's dick is bigger than theirs.

The result was inevitable, I guess.

Offline

#33 2019-06-25 8:07 pm

Alex Kemp
Moderator
From: Nottingham, England
Registered: 2009-12-02
Posts: 2,093
Website

Re: Running a Diary being hit by bots without StopForumSpam

I'm going to post a recovered OSM diary-post next so that you can decide whether it is as bad as OSM have declared it to be. Here is brief background before the post:

On the evening of Saturday, 22 June I received 5 warning emails, including one from GitHub telling me that OpenStreetMap had blocked me due to this content. My spidey senses were tingling and it was clear to me that the writing was on the wall.

The OSM Admin had summarily released osm/diary + osm/user/*/diary within robots.txt a little while before and, sure enough, the spammers had re-commenced flooding the Diary pages. As before, I documented what was happening with a diary post (unfortunately NOT recovered before deletion, but it was called "Sigh. Now it is _主管Q (“_SupervisorQ”) Spammers" and listed the numbers). Hence all the emails.

I'm long enough in the tooth within both life & OSM to know what was in the wind. If I was not to be broken by this I needed to take some kind of control. Therefore, I took inspiration from Hernán Cortés, and burnt all my ships.

This is the 24 June post which the DWG removed on the same day because it was "considered too offensive". Decide for yourself. Words which show strikeout were previously links to diary posts deleted by the DWG:

A Stranger at your Table wrote:

There is a narcissistic cabal that are censoring me in the name of preventing insult (a good example of which is, of course, my naming them as a narcissistic cabal). Well, in the manner of Hernán Cortés I am going to burn my boats by showing what they are like.

The Lead-Up
This site became deluged with spam (beginning & most-recent examples). Attempted abuse from spam-bots is perfectly normal for any webmaster in the 21ˢᵗ Century that runs an internet forum, but the OSM admin were not prepared to take what would (in my view) otherwise be considered to be perfectly-normal steps to prevent it (OSM-specific example: “no-edit, no-diary” & generic steps: “stopping spam/abuse”). Instead, they killed visibility for all Diary pages. Twice.

I still find that previous paragraph perfectly astonishing. The other spam professionals that I have spoken to also are in astonishment at the OSM action, but the OSM admin really have done that. In order to kill the disease they have killed the patient and, in the process, caused profound damage to OSM visibility itself. This last bit is new, so I had better explain.

If you look at the Extreme SEO post you will find the sentence:

The “Disallow: /user/*/diary” line above means that in the approx. 1,100,000 current results for “a site:openstreetmap.org” on google there is not a single result for a Diary page; the results are almost all wiki pages, with a sprinkling of help, blog, etc pages.

The OSM robots.txt deliberately causes most of openstreetmap.org to be de-listed in Google, including all of the Map.

That fact of the map de-listing was true (I spent an hour or so establishing that fact) but the reason I gave was NOT true. At the time that I wrote it I had misread the effect of the final 4 entries within the OSM robots.txt. They are legacy items to prevent ancient values once (but no longer) used within osm.org from being searched. So why was the map almost entirely de-listed? The only thing that makes sense is that it was collateral damage from the recent, multiple robots.txt changes.
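
To make the effect of a rule such as “Disallow: /user/*/diary” concrete, here is a minimal Python sketch of wildcard path matching as crawlers that follow the Robots Exclusion Protocol wildcard rules (`*` for any run of characters, a trailing `$` for end of path) would apply it. It is an illustration under that assumption only: it checks Disallow patterns, ignores Allow lines and user-agent groups, and the function names are made up for the example.

```python
import re
from typing import List


def rule_to_regex(rule_path: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # '*' matches any sequence of characters; everything else is literal.
    body = ".*".join(re.escape(part) for part in rule_path.split("*"))
    # Rules match from the start of the path; '$' also pins the end.
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path: str, disallow_rules: List[str]) -> bool:
    """True if any non-empty Disallow pattern matches the URL path."""
    return any(
        rule_to_regex(rule).match(path) is not None
        for rule in disallow_rules
        if rule
    )


# The two diary patterns mentioned in this thread.
DIARY_RULES = ["/diary", "/user/*/diary"]
```

With those two patterns, every user's diary path is blocked while the same user's profile page and the map paths are untouched, which is why the diary rules alone do not account for the map being de-listed.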

I just checked again (10:26 BST) as the Diary is killed again and, sure enough, just 500,000 results and 4 map results (and about 10 Diary user-pages, even though blocked).

Personal Attack
My efforts to help during these spam-attacks had other collateral damage: me. The OSM admin & members of their cabal did not take kindly to what they increasingly characterised as personal insult. A good RL example would possibly be of villagers throwing stones at the village idiot because he has shouted out that the village hall is burning. As his bruises grow the hall meanwhile burns to the ground.

It most certainly was NOT personal insult nor attack from my point of view. Indeed, I knew from the inside just what it took in terms of discipline to turn up on a regular basis & clean the kind of sh*t that was being thrown at OSM by these bots, having earned my living as a webmaster for 15 years. I’m an old geezer & only sleep ~4 hours each night & with a pension no longer need to earn a living each day. TomH is a young man & still got stuck in at 7am every morning & cleaned the stable, so credit to him.

Does that mean that TomH is perfect? Well of course not, but the strong stream of narcissism that runs within him & joins him to his cohorts means that any comment that does not flatter them is an insult (that fact is one of the characteristic features of narcissistic conduct that allows you to recognise it and yes, I also know that fact from the inside). And by gum!, I’ve been coming out with floods of comments recently that do not flatter any of them. Poor dears. So they are gunning for me.

GitHub Block
On 22/06/2019, 20:33 Firefishy made a comment to one of the Spam posts:

Please refrain from perpetually insulting our voluntee admin team, of which I am a part.

Do notice that there is almost zero explanation here of the insult perceived to have been caused, nor of what it was, meaning that I cannot respond, apologise, or ever change my ways. Thus this is an emotional response unmediated by reason or rationale. It is a classic outburst caused by innate narcissism.

That would have been bad enough, but on the same day it got taken two steps further which, at the time (mistakenly perhaps), I thought were all connected:

First, I got an email from GitHub:

A maintainer of the @openstreetmap organization has blocked you because of this content. For more information please see the community guidelines.

That is vicious behaviour.

Second, I got an email from some one at osmfoundation.org, and CC’d to operations + the Data Working Group. Now they are getting serious.

Alex: Enough with the Insults and Comdemnation

(why is it that these folks can never use a spell-checker?)

What followed was classic passive-aggressive language. It began “Hi Alex” and ended:

Any further unacceptable language or actions from you and I will seek that you are politely asked to leave our project for good.

Kind regards

I mean, ‘Kind regards’. Seriously?

There are broadly 2 accusations:
1) “frequently use(ing) unacceptable poisonous and insulting language against hard working volunteers”
The example given was from the most-recent spam post:

The latest splurge began because on 15 June 2019 at 23:56 TomH unilaterally removed his former unilateral block on these Diary pages within robots.txt

I responded “Of course it is open for someone in TomH’s situation to take emergency action that will, by its very nature, be unilateral in implementation. How could I possibly argue with that? The issue here is that no emergency was at play & his actions would inevitably lead to further spam. It therefore required a second action to protect the site against spam. I informed him of that fact & he ignored me. Thus, the inevitable became reality.”

The response ignored the previous paragraph, but simply doubled-down, again accusing me of using “insulting, poisonous and abusive language.” At this point I realised that I was not having a human interaction here.

I pointed out the facts & got accused of being “poisonous” for doing so. Hmm.

2) you have put our project hosting in danger

On the 16th or 17th June you far exceed what is even remotely acceptable and phoned University College London’s network abuse team because we were not responding to your ticket: https://github.com/openstreetmap/operations/issues/308 The project is greatly indebted to the generous hosting support we receive from UCL and the department heads which support us. Your actions caused a major issue.

This was nonsense on speed.

I gave a point-by-point rebuttal. It did not help:

  1. You have not asked me at any point WHY I did that.
    All your words are suppositions with zero connection to fact.

  2. Where did I say that I contacted UCL because OSM was “not responding”?
    That is nonsense and exists only in your imagination.

  3. Tom said: “Piwiki is nothing to do with this repository.”
    https://github.com/openstreetmap/operations/issues/308#issuecomment-502493325
    I therefore thought that he meant that some other organisation was responsible for delivery of those files. I therefore traced through whois to contact the organisation. I asked to speak to the NoC and the UCL telephone operator put me through.

  4. I have been in business for 40 years & the above is just normal action.
    Tom said “this file is nothing to do with us” so I used the tools available to me to find out who was responsible.
    Explain why that action “exceeds what is even remotely acceptable”. What hyperbole!

  5. Please explain the “major issue” I caused.
    (UCL Noc) informed me via email (in spite of Tom’s statement) that the server was managed by OSM. I thanked him for the info and that was the end of the exchange. Perfectly civil & without rancour in any direction.

The only folks that I have experienced rancour from are people like you.

It would have been wonderful if it could have ended there, but this is someone that doubles-down in his insults (just like that epic narcissist dear Donald), so I got a third + fourth accusation in a follow-up email:

3) repeatly asking for us to compromise users’ privacy

You repeatly asking for us to compromise users’ privacy by sending their data to the 3rd party stopforumspam.com will not make it happen. Are you involvement with this service? Please review OpenStreetMap Privacy Policy: https://wiki.osmfoundation.org/wiki/Privacy_Policy

Here we go with the rebuttal to that nonsense:

Once again, Grant, you have made zero engagement with me to discover what is involved. If you ever do that I will be happy to provide what you need to know. For this moment, let’s just address it like this:

Engagement with the SFS API is a question of one computer making an auto-search into the database of another computer, looking to receive an “is present”/“is NOT present” response.

There are 3 fields available to search upon:

→ Email address
→ IP address
→ Username

My advice would be to concentrate upon the email address during new-user creation to discover whether access is coming from a spammer.

Please explain what is private about any of those fields?

Now I accept that any of the 3 could be ‘private’ if the field is connected to a person’s real-name and/or street address and/or etc.. But it is not. Indeed, none of those 3 queries are even retained by the SFS system (they are retained when reported as a spammer, and hit/no-hit is retained as a cumulative metric, but the fact of the content of a query from any particular address is not retained. What would be the point in that?).

The point is:- there is zero compromise of any privacy for any of your users in using SFS. We are, after all, used by websites in Germany & Sweden who are way more concerned about privacy than you, and they have zero problem with us. Also, recall that the individuals that are checked out will not even be users at the point of search.
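
For anyone who has not seen such a lookup, here is a minimal Python sketch of the query described above. The endpoint URL, the `json` flag and the response fields (`success`, and `appears` nested under the queried field) reflect my understanding of the public SFS API and should be treated as assumptions to check against the current API documentation; the function names are made up for the example, and no network call is made.

```python
import json
from typing import Optional
from urllib.parse import urlencode

# Assumed endpoint; verify against the current SFS API documentation.
SFS_API = "https://api.stopforumspam.org/api"


def build_query(
    email: Optional[str] = None,
    ip: Optional[str] = None,
    username: Optional[str] = None,
) -> str:
    """Build a lookup URL from any of the three searchable fields."""
    fields = (("email", email), ("ip", ip), ("username", username))
    params = {key: value for key, value in fields if value}
    if not params:
        raise ValueError("need at least one of email, ip, username")
    params["json"] = ""  # ask for a JSON response (assumed flag)
    return SFS_API + "?" + urlencode(params)


def appears(response_text: str, field: str) -> bool:
    """Interpret a JSON response as 'is present' / 'is NOT present'."""
    data = json.loads(response_text)
    if data.get("success") != 1:
        raise RuntimeError("lookup did not succeed")
    return bool(data.get(field, {}).get("appears"))
```

A sign-up handler would fetch the URL from `build_query(email=...)` and refuse or moderate the registration when `appears(body, "email")` is true; only the value being checked is sent, nothing else about the applicant.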

It seems to me that you are simply looking for a reason not to have to depart from current practice. That is very sad if true.

I’m sorry that, once again, this is so long. This is the final section.

4) You are repeatedly abusive
As I understand the system, GitHub.openstreetmap invites ‘issue’ posts within https://github.com/openstreetmap/operations/issues as contributions to the conversation, in the same way that Diary posts are part of the OSM conversation to see the way that the affaire can make its way into the future. However, it clearly is not working, especially when TomH closes every issue immediately after it is posted, thus preventing anybody from commenting.

The Meat
(probably) a dynamic GIF file. This report is actually here to say “WHAT THE HELL DO YOU THINK YOU ARE DOING SENDING A WEB-BUG TO MY BROWSER, YOU SCUMBAG?”, but is dressed up as a technical report —

This is NOT not acceptable language. We tire of your trivial tirades against us. We are well aware how web caching works.

Hmm. They are so aware of how web caching works that I had to inform them that they had not bothered to implement it for any static file within the map (catty, but accurate).

The privacy Policy mentioned is 3,432 words and I am not certain that it mentions at any point the webbug dropped on every page.

That file is a 1x1 pixel webbug. I run a JS-blocker on both FireFox & on Chromium. Both are written by different people and each auto-blocks vile JavaScript. Part of the default auto-blocking for each of them is of webbugs, since those are never ever required or desired. The Foundation thinks that it is fine to put webbugs onto everybody’s page, but thinks that asking why is “not acceptable language”?

I’m done with these folks. Justice has no place at their table, and thus neither do I.

Offline

#34 2019-06-26 6:58 am

Maikuolan
Member
From: Perth, Western Australia
Registered: 2011-08-09
Posts: 743
Website

Re: Running a Diary being hit by bots without StopForumSpam

Yeah.. I would just bail on them, if I were in your shoes. Sounds like emotions are running high, views and opinions have already been cemented, and the nature of the interaction between them and you has already escalated to a point where deescalation is extremely unlikely. Sounds like he's dead, Jim.


Offline

#35 2019-06-26 10:58 pm

Alex Kemp
Moderator
From: Nottingham, England
Registered: 2009-12-02
Posts: 2,093
Website

Re: Running a Diary being hit by bots without StopForumSpam

Maikuolan wrote:

the nature of the interaction between them and you has already escalated to a point where deescalation is extremely unlikely

Sadly true.

The Diaries are still dark to the SEs. That keeps them free of bot-abuse but, AFAICT, no-one has a plan to be able to restore the website to normal operation. I think that they are just crossing their fingers & hoping that the spammers go away. That rarely works.

I'm slowly recovering my equilibrium.

Offline

#36 2019-06-28 4:28 pm

Alex Kemp
Moderator
From: Nottingham, England
Registered: 2009-12-02
Posts: 2,093
Website

Re: Running a Diary being hit by bots without StopForumSpam

Alex Kemp wrote:

I'm slowly recovering my equilibrium.

Sometimes you get so ill that it is only when you get well again that you can realise just how ill you were. That has been the case with the effect on me of these attacks from the OSM DWG + Admin. I did not realise just how mentally/emotionally disturbed those guys are.

Take a piece of advice from an old man: never marry a narcissist plus never work for a narcissist. I've done the former once & the latter more than once; I'm clearly prone. It is a truly horrible experience that can take months to recover from. Thank goodness that I have become wise enough to know that you need to run out of the burning building before you turn to a crisp.

One of the defining characteristics of Narcissism is the way that those folks attack any that criticise them or, in their view, appear to criticise them. In particular, anything that contradicts their self-image causes immediate insult & attack (male aspect) or conspiracy & sniping (female aspect) as a return feature.

The previous paragraph is most clearly seen in the way that the admin was authorised by the DWG to remove just one Diary post (A Stranger at Your Table) but went right on to remove all 12 Diary posts that reported on the spam attacks on OSM, including one that was simply a set of stats. What exactly is “insulting, poisonous and abusive” about compiling statistics then reporting the number, timing & rates at which bots were abusing/spamming OSM?

There are a large number of folks that come to SFS that have built up considerable experience & acumen about spam-fighting, and I'm one of them. I thought “well, it is obvious that they do not know how to handle this abuse” (most of the bot-posts did not contain links) “but I do, so I will be able to help here”. Ah, how innocent can you get?

The stats post was one of the 9 deleted posts that I was able to recover. I'll reproduce it in the next post in this thread.

Offline

#37 2019-06-28 7:44 pm

Alex Kemp
Moderator
From: Nottingham, England
Registered: 2009-12-02
Posts: 2,093
Website

Re: Running a Diary being hit by bots without StopForumSpam

In 2019 spam was beginning to be a big issue for the OSM website. The Diaries were suffering regular spam posts + abuse posts. A reporting procedure had been put in place for the Diaries, but more & more spam in the map was also turning up. That came to a head in late April when the OSM Diaries began to be overwhelmed every night by tens of thousands of posts that had every appearance of being bot-posts.

The conclusion in that last sentence is terrifying. Human spam is annoying, but can be dealt with by ordinary human intervention (e.g. reports from users and removal by Mods). Bots are a completely different matter. In order to be able to offer some expert advice it was going to be necessary to acquire some accurate information, such as: “The numbers involved”, “Are they bots?” + “Just how fast are they?”.

That was the reason for beginning a stats post. It is reproduced below. Now try to work out what exactly was the “insulting, poisonous and abusive language” that was given as the reason for deleting it.

Recent Spam Attacks wrote:

Posted by alexkemp on 6 May 2019 in English (English)

Mention was made in my last diary (removed by DWG) and also in Sam Wilson’s diary about the large amounts of spam coming in to overwhelm these Diary pages. In good scientific manner here is a quantification of the issue, obtained by examining ID numbers for all recent surviving Diary posts.

Background
Diary post IDs are incremented serially. Thus, deducting the actual number of surviving posts from the theoretical number of posts (the difference between consecutive End-IDs) gives a measure of how many spammer posts may have been removed.

The Numbers

    Date    End-ID  ---------Posts---------
                   Actual   Theory      Diff
    12 Apr   48187      -              (spam)
    13 Apr   48193      5        6         1
    14 Apr   48195      1        2         1
Mon 15 Apr   48202      2        7         5
    16 Apr   48216      5       14         9
    17 Apr   48223      2        7         5
    18 Apr   48234      5       11         6
    19 Apr   48242      5        8         3
    20 Apr   48252      8       10         2
    21 Apr   48255      2        3         1
Mon 22 Apr   48287      4       32        28
    23 Apr   48378     12       91        79
    24 Apr   48385      1        7         6
    25 Apr   56488      6    8,103     8,097
    26 Apr   74643      8   18,155    18,147
    27 Apr   99519      2   24,876    24,874
    28 Apr  128866      7   29,347    29,340
Mon 29 Apr  140684      3   11,818    11,815
    30 Apr  149349      4    8,665     8,661
     1 May  152912     13    3,563     3,550
     2 May  156826      8    3,914     3,906
     3 May  158835      2    2,009     2,007
     4 May  158837      1        2         1
     5 May  172694      6   13,857    13,851
Mon  6 May  193238      6   20,544    20,538
     7 May  210953      2   17,715    17,713
     8 May  218281      4    7,328     7,324
     9 May  240069      2   21,788    21,786
    10 May  256019      7   15,950    15,943
    11 May  270022      1   14,003    14,002
    12 May  275013      8    4,991     4,983
Mon 13 May  276830      2    1,817     1,815
    14 May  283239      2    6,409     6,407
    15 May  291589      2    8,350     8,348
    16 May  296320      1    4,731     4,730
    17 May  318162      6   21,842    21,836
    18 May  339272      2   21,110    21,108
    19 May  347443      2    8,171     8,169
Mon 20 May  364479      3   17,036    17,033
    21 May  364493      7       14         7
    22 May  364971      4      479       475
    23 May  368657      4    3,686     3,682
    24 May  368669      8       12         4
    25 May  368675      3        6         3
    26 May  368682      4        7         3
Mon 27 May  368691      3        9         6
    28 May  368702      3       11         8
    29 May  368711      2        9         7
    30 May  368716      2        5         3
    31 May  368725      7        9         2
     1 Jun  368726      1        1         0
     2 Jun  368734      2        8         6
Mon  3 Jun  368750      7       16         9
     4 Jun  368753      2        3         1
     5 Jun  368757      2        4         2
     6 Jun              0        0         0
     7 Jun  368766      2        9         7
     8 Jun  368773      3        7         4

Between 25 Apr & 23 May (29 days):
-------------------------  -------
            Total  :  124  320,272
            Daily  :    4   11,044
-------------------------  -------
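The Background calculation can be sketched in a few lines of Python. This is only an illustration: the End-IDs and Actual counts are the 24 Apr to 28 Apr rows copied from the table, and the variable names are made up for the example.

```python
# End-ID of the last surviving Diary post for each day (copied from the table).
end_ids = [
    ("24 Apr", 48385),
    ("25 Apr", 56488),
    ("26 Apr", 74643),
    ("27 Apr", 99519),
    ("28 Apr", 128866),
]

# Number of Diary posts that actually survived on each day (the Actual column).
actual_posts = {"25 Apr": 6, "26 Apr": 8, "27 Apr": 2, "28 Apr": 7}

rows = []
for (_, prev_id), (day, end_id) in zip(end_ids, end_ids[1:]):
    theory = end_id - prev_id          # IDs handed out since the previous day
    diff = theory - actual_posts[day]  # IDs with no surviving post (likely spam)
    rows.append((day, theory, diff))

for day, theory, diff in rows:
    print(f"{day}  theory={theory:>6,}  diff={diff:>6,}")
```

The Theory and Diff values that this prints match the corresponding rows of the table.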

PS
This diary is being posted in the midst of yet another day’s blizzard of spam. Let’s hope that it survives the cull.

Update 8 May:
01:52am BST: I dropped in on the 1st of tonight’s spammers:

Title: translation of ID=210955: Being vomited and vomiting frlse 
Text : 苟颜德缕uwrfh 苟颜德缕uwrfh 苟颜德缕uwrfh..wfgz

09:37am BST Sunday 12 May:
The latest spammer is /user/twuptyoe378/diary/274627 (removed)
The first spammer:

Title: translation of ID=270023: Vomiting
Text : 07633abawl

Update 14 May:
The first 3 posts shortly after 1am BST were the now-classic Bengali (bn) wfgz spam. Here is the very first:

Title: (ID=276831): 暮铣德娜侗cjenp
Text : 肆考韭缕节oqgwr肆考韭缕节oqgwr肆考韭缕节oqgwr..wfgz

After 90 minutes we began to get some Chinese (zh-CN) vip spam, which continues until shortly before 20:42 BST. Once again, here is the very first:

Title: (ID=282971): 北京幸运28官方网站
Text : 北京幸运28官方网站 【导师微信:<redacted>】【网址<redacted>.vip 】【加拿大28稳赢法】…

Update 15 May to discover spam stats:
I put in place a cron-job Monday to save the current Diary top-page every 10 minutes from 01:00 BST until 10:00 BST. I wasn’t in the best emotional state yesterday, so had a rest & investigated it today using egrep & tabulated the listing below.

These are the rates at which the wfgz spammers dropped their spam into these Diaries over the night of 14 May. Each row gives the number of posts made in the preceding 10 minutes. You will see that they hit a maximum rate of 66 posts/minute and averaged 27 posts/minute. The spam began shortly after 3am BST and stopped (presumably due to the intervention of OSM Mods) at about 7am. As best I can tell, all of the posts made between those times were from these wfgz spammers.

Date        1st Post Posts
------------ ------- -----
May 14 07:10  279688     -
May 14 07:00  282939    66
May 14 06:50  282873   424
May 14 06:40  282449   119
May 14 06:30  282330   167
May 14 06:20  282163   234
May 14 06:10  281929   426
May 14 06:00  281503   231
May 14 05:50  281272   129
May 14 05:40  281143   182
May 14 05:30  280961   292
May 14 05:20  280669   237
May 14 05:10  280432   659
May 14 05:00  279773   130
May 14 04:50  279643   190
May 14 04:40  279453   120
May 14 04:30  279333   321
May 14 04:20  279012   186
May 14 04:10  278826   347
May 14 04:00  278479   466
May 14 03:50  278013   223
May 14 03:40  277790   244
May 14 03:30  277546   342
May 14 03:20  277204   342
May 14 03:10  276862    28
May 14 03:00  276834     0
May 14 02:50  276834     1
May 14 02:40  276833     0
------------ ------- -----
            minimum:   119
            maximum:   659
            average:   273
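For anyone wanting to check the summary figures, here is a small Python sketch. It assumes that each "Posts" value is the difference between two consecutive 10-minute snapshots, and that the minimum/maximum/average cover only the 22 full intervals ending 03:20 to 06:50; the IDs are copied from the listing, the names are invented for the example.

```python
from statistics import mean

INTERVAL_MINUTES = 10  # the cron-job saved the top-page every 10 minutes

# "1st Post" IDs from the listing, oldest first: 03:10 through 06:50.
first_post_ids = [
    276862, 277204, 277546, 277790, 278013, 278479, 278826, 279012,
    279333, 279453, 279643, 279773, 280432, 280669, 280961, 281143,
    281272, 281503, 281929, 282163, 282330, 282449, 282873,
]

# Posts made in each 10-minute interval = difference of consecutive snapshots.
posts_per_interval = [b - a for a, b in zip(first_post_ids, first_post_ids[1:])]

min_posts = min(posts_per_interval)
max_posts = max(posts_per_interval)
avg_posts = mean(posts_per_interval)

print(f"minimum: {min_posts}  maximum: {max_posts}  average: {avg_posts:.0f}")
print(f"peak rate   : {max_posts / INTERVAL_MINUTES:.0f} posts/minute")
print(f"average rate: {avg_posts / INTERVAL_MINUTES:.0f} posts/minute")
```

Under those assumptions the sketch reproduces the 119 / 659 / 273 figures, and dividing by 10 gives the per-minute rates quoted in the text.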

It is difficult to believe that professional webmasters such as the OSM admin would not have basic kernel-level preventions in place to stop boy scrapers (youngsters trying to download all 50m+ pages at hi-speed using a bot), but it is possible. In my time as a webmaster the max attempted speed hit >1,000 accesses/second (yes, that really was each second). In the event the peak OSM access was 66 posts/minute. That actually made it worse.

Roughly one access/second (66 posts/minute) means that this was not some amateur event. These were professional spammers deliberately moderating their bot so as to evade any speed-bumps. That suggests that OSM has been introduced into the routines of a professional bot such as xRumer, and the best method that I found to handle that situation was to make use of StopForumSpam.

Offline

#38 2019-07-04 5:15 am

pedigree
uıɐbɐ ʎɐqǝ ɯoɹɟ pɹɐoqʎǝʞ ɐ buıʎnq ɹǝʌǝu ɯ,ı
From: New Zealand
Registered: 2008-04-16
Posts: 6,742

Re: Running a Diary being hit by bots without StopForumSpam

those are some truly horrifying spam numbers.  is recaptcha in there or are these all copy/paste manual spammers?

Offline

#39 2019-07-04 5:53 am

Maikuolan
Member
From: Perth, Western Australia
Registered: 2011-08-09
Posts: 743
Website

Re: Running a Diary being hit by bots without StopForumSpam

is recaptcha in there

I'm not in the loop myself, so I couldn't/wouldn't claim to know, but.. if they're not about to trust SFS with queries due to privacy concerns (and assuming, of course, that they're actually being honest about their reasons, as opposed to just claiming it's to do with privacy, in order to avoid admitting that they've got a grudge against Alex), I doubt they would've implemented reCAPTCHA, seeing as that would involve trusting Google (which is far more lax in regards to privacy concerns than SFS could/would ever be). But.. Alex could likely confirm whether they do or don't implement reCAPTCHA (and of course, if the former, rather than the latter.. there'll certainly be a particular word coming to mind for me.. <lɐɔᴉʇᴉɹɔodʎɥ>).


Some free, open-source packages I wrote:
- phpMussel (protects websites from malicious file uploads)
- CIDRAM (protects websites from unwanted traffic, spammers, bots, cloud services, etc)
- SFS Mass IP Checker (bulk query IP addresses with SFS)

Offline

#40 2019-07-04 12:10 pm

Alex Kemp
Moderator
From: Nottingham, England
Registered: 2009-12-02
Posts: 2,093
Website

Re: Running a Diary being hit by bots without StopForumSpam

I have never experienced Recaptcha during my use of OSM, ever.

Whilst there *are* manual spammers, the ones detailed above are certain to be bots.

Offline

#41 2019-08-20 1:55 am

Dr.Flay
Member
From: Kernow, UK
Registered: 2017-10-12
Posts: 11
Website

Re: Running a Diary being hit by bots without StopForumSpam

Whew, this was a deeper read than I expected.
I am looking forward to the TV adaptation.
Yeah TV not the silver screen because the story still isn't truly finished I guess, until they sort it out or abandon ship.

I can't get over them suggesting that pulling data from here has anything to do with the privacy of the users of their site.
They must know that is BS.
Their site won't show any details from here.


"I am a genius trapped inside an idiot"

Offline

#42 2019-09-13 8:59 pm

PaulBuonopane
Member
From: United States
Registered: 2015-08-26
Posts: 39

Re: Running a Diary being hit by bots without StopForumSpam

Dr.Flay wrote:

I can't get over them suggesting that pulling data from here has anything to do with the privacy of the users of their site.
They must know that is BS.
Their site won't show any details from here.

It's not that they're pulling data--it's that they're submitting data.  When you query SFS by submitting an IP address and email address, SFS discovers that email address A registered at site B with IP address C at time D.  Any middlemen who see the request in an unencrypted state get the same information.  When you use SFS, you're entrusting them with that information; you're also trusting them to choose good middlemen.  It sounds as though OSM doesn't trust SFS and doesn't trust the default middleman that SFS chose.
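To make that concrete, here is a rough Python sketch of what a plain lookup carries. The endpoint and the `ip` / `email` parameter names are my assumption of how the SFS query API is addressed and should be checked against the SFS API documentation; the IP and email address are made-up documentation values, and nothing is actually sent.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the SFS API documentation before use.
SFS_API = "https://api.stopforumspam.org/api"

def build_sfs_query(ip: str, email: str) -> str:
    """Build (but do not send) a lookup URL for one registration attempt."""
    return SFS_API + "?" + urlencode({"ip": ip, "email": email})

# Hypothetical registration attempt at the querying site.
url = build_sfs_query("203.0.113.7", "new.user@example.com")
print(url)
```

Both the IP address and the email address travel in the clear inside the query string, so whoever terminates the TLS connection (the service itself, or a proxy in front of it) sees the pair together with the time of the request.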

Offline

#43 2019-09-13 9:31 pm

Alex Kemp
Moderator
From: Nottingham, England
Registered: 2009-12-02
Posts: 2,093
Website

Re: Running a Diary being hit by bots without StopForumSpam

PaulBuonopane wrote:

It sounds as though OSM doesn't trust SFS and doesn't trust the default middleman that SFS chose.

That is certainly true.

SFS makes use of Cloudflare as a means of DDoS mitigation (the precise problem that OSM was suffering due to all the spam that was being flung at it). Cloudflare is a commercial business, but provides a free service for non-commercial businesses like SFS & OSM.

Offline

#44 2019-09-16 6:11 am

pedigree
uıɐbɐ ʎɐqǝ ɯoɹɟ pɹɐoqʎǝʞ ɐ buıʎnq ɹǝʌǝu ɯ,ı
From: New Zealand
Registered: 2008-04-16
Posts: 6,742

Re: Running a Diary being hit by bots without StopForumSpam

are the OSM mods still unhappy with SFS integration?  would they like to test a k-anonymity API query?
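For readers who have not met the idea, below is a generic Python sketch of a hash-prefix style k-anonymity lookup. It is not the actual SFS API: the hash function, the prefix length and every function name here are assumptions chosen for illustration, and the "server" is simulated locally.

```python
import hashlib

PREFIX_LEN = 5  # hypothetical: number of hex characters revealed to the server

def sha256_hex(value: str) -> str:
    """Hash a normalised (trimmed, lower-cased) value to a hex digest."""
    return hashlib.sha256(value.strip().lower().encode("utf-8")).hexdigest()

def server_lookup(prefix: str, listed_hashes: set[str]) -> list[str]:
    """Simulated server: return every listed hash that shares the prefix."""
    return [h for h in listed_hashes if h.startswith(prefix)]

def is_listed(email: str, listed_hashes: set[str]) -> bool:
    """Client side: send only the prefix, compare full hashes locally."""
    full_hash = sha256_hex(email)
    candidates = server_lookup(full_hash[:PREFIX_LEN], listed_hashes)
    return full_hash in candidates

# Made-up blocklist held by the simulated server.
listed = {sha256_hex("spammer@example.com"), sha256_hex("bot@example.net")}

spammer_hit = is_listed("Spammer@Example.com", listed)
innocent_hit = is_listed("innocent@example.org", listed)
print(spammer_hit, innocent_hit)
```

The point of the design is that the server only ever sees a short prefix shared by many possible addresses, so it cannot tell which address (if any) the querying site was asking about.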

Offline

#45 2019-09-16 5:30 pm

Alex Kemp
Moderator
From: Nottingham, England
Registered: 2009-12-02
Posts: 2,093
Website

Re: Running a Diary being hit by bots without StopForumSpam

I've just sent the OSM admin (TomH) a message. We shall see what response, if any, he gives.

Offline

#46 2019-09-19 11:22 am

Alex Kemp
Moderator
From: Nottingham, England
Registered: 2009-12-02
Posts: 2,093
Website

Re: Running a Diary being hit by bots without StopForumSpam

No response from sole admin TomH; this is the almost-identical message that I've sent to Firefishy:

k-anonymity API queries for SFS wrote:

(I’ve attempted to get this through to TomH [16 September 2019 at 17:28], but of course he does not read his messages; can you forward it or use it?):

The StopForumSpam operator has listened to OSM privacy concerns & has put in place an ultra-private system for making API queries. If you wish to test it or ask questions he may be contacted here (name: ‘pedigree’ if sending a PM, or via support@stopforumspam.com):

Offline
