
#1 2010-08-06 5:13 pm

KeyDog
Member
From: Europe
Registered: 2010-07-18
Posts: 181

Query The URL DB

I've come to the conclusion that, with all the open proxies around, entering just IPs (plus email and username) is a Sisyphean task when you're trying to counter the more talented spam masters. I alone see up to 30 "clean" IPs a day (i.e. with no entry here) that are spamming.

That's why Grez and I over at PunBB have been working on (well, he's been coding and doing all the clever stuff) an extension that queries a list of spam URLs I've put together in .csv format (currently it has just 240 URLs). I'd say the spammers are acquiring 50-80 new "customers" daily with websites they want promoted.

I'd now like to know whether SFS will host a queryable db of URLs. I remember reading recently that you didn't initially? I think that would be a shame... Or maybe someone knows of another place where "bad" URLs are listed?

At the moment our extension checks each post and signature being edited and cross-checks it with the .csv list. A db fed by this community would of course be larger and more up-to-date!
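For anyone curious what such a check looks like, here is a minimal sketch in Python (not the actual PunBB extension). It assumes the .csv has one URL per row in the first column; the names `load_spam_urls`, `normalize` and `find_spam_urls` and the URL regex are made up for illustration.

```python
import csv
import re

# Assumption: a deliberately simple pattern for bare or http(s) URLs in post text.
URL_PATTERN = re.compile(
    r"(?:https?://)?(?:[a-z0-9-]+\.)+[a-z]{2,}(?:/[^\s<>\"']*)?",
    re.IGNORECASE,
)


def normalize(url):
    """Lowercase, drop the scheme and any trailing slash so variants compare equal."""
    url = url.strip().lower()
    url = re.sub(r"^https?://", "", url)
    return url.rstrip("/")


def load_spam_urls(csv_file):
    """Read the first column of every non-empty CSV row into a set."""
    urls = set()
    for row in csv.reader(csv_file):
        if row and row[0].strip():
            urls.add(normalize(row[0]))
    return urls


def find_spam_urls(text, spam_urls):
    """Return every URL in a post or signature that appears in the spam list."""
    found = []
    for match in URL_PATTERN.finditer(text):
        candidate = normalize(match.group(0))
        if candidate in spam_urls:
            found.append(candidate)
    return found
```

A forum hook would call `find_spam_urls` on the post body and the signature and reject the edit if the returned list is non-empty. Exact matching like this is easy to dodge, which is what the rest of the thread is about.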


Project URL Checker:
Database with several thousand website URLs spammers are promoting. Please feel free to contribute obvious spam related URLs to me: keydog@keydogbb.info
Further Plug-In users welcome; currently over 50k daily checks


#2 2010-08-06 7:49 pm

pedigree
uıɐbɐ ʎɐqǝ ɯoɹɟ pɹɐoqʎǝʞ ɐ buıʎnq ɹǝʌǝu ɯ,ı
From: New Zealand
Registered: 2008-04-16
Posts: 7,061

Re: Query The URL DB

Well, the new site, ready to go live now, will collect URLs posted as evidence (if ppl submit the spam, of course), so we'll be able to provide them as a db function shortly :)

I do however have to decide how to handle redirection services, and whether to list full URLs or just entire domain names... tough call

www.yahoo.com

obviously isn't a spamvertised site, however

www.yahoo.com/this/redir?sjdfhjkashdfajklshdf12341234&www.pillz.com is

If a spammer controls a site, then anything after the domain name can be randomized and ignored by the server, but how do we treat that?

www.pillz.com/ahsjkldfhlasjkdhflk295728947928374/asfasdf/asdf/sdf/as/dfas/f/asf/asdf/asd/f

etc.. so do we handle that when the domain is spam?

dfajksdfh9a8sdf7698as7dfasdf.pillz.com
oifoiweuriowuerowiuer.pillz.com

is another example of obscuring spam, when the spammer controls the DNS

some legit providers provide something like

pillz.demon.co.uk

or

www.amazon.com/~pillz
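To make the subdomain cases concrete, here is a rough Python sketch of reducing a URL to the domain you would actually list. The two sets are hypothetical stand-ins: a real implementation would use the full Public Suffix List and a maintained list of shared hosting providers, and the function names are invented for this example.

```python
from urllib.parse import urlsplit

# Hypothetical: a tiny stand-in for the Public Suffix List.
MULTI_LABEL_SUFFIXES = {"co.uk", "org.uk", "com.au", "co.nz"}
# Hypothetical: legit providers that give each customer a subdomain.
SHARED_HOSTS = {"demon.co.uk"}


def host_of(url):
    """Extract the lowercase hostname, accepting URLs without a scheme."""
    if "://" not in url:
        url = "http://" + url
    return (urlsplit(url).hostname or "").lower()


def registrable_domain(host):
    """Keep one label in front of the public suffix (e.g. pillz.com, demon.co.uk)."""
    labels = host.split(".")
    if len(labels) >= 3 and ".".join(labels[-2:]) in MULTI_LABEL_SUFFIXES:
        return ".".join(labels[-3:])
    return ".".join(labels[-2:])


def blocking_key(url):
    """Collapse random subdomains, but keep the customer part on shared hosts."""
    host = host_of(url)
    domain = registrable_domain(host)
    if domain in SHARED_HOSTS:
        return host  # blocking demon.co.uk itself would hit innocent customers
    return domain
```

With this, `dfajksdfh9a8sdf7698as7dfasdf.pillz.com` and `oifoiweuriowuerowiuer.pillz.com` both collapse to `pillz.com`, while `pillz.demon.co.uk` stays as it is. Path-based hosting such as `www.amazon.com/~pillz` is not covered here, since that needs path matching rather than domain matching.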

maybe for every domain added to the database, the api could give different results depending on what the user wants

api?url=www.pillz.com/928428934/swrioweuriowur&l=1

l=1 = just the domain
l=2 = everything

so if the db had pillz.com/343434 listed, then l=1 would return true and l=2 would return false

or should we just do domain names...
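A small Python sketch of how the two levels might behave, assuming l=1 compares only the domain and l=2 compares the whole URL. This is not the SFS API: `split_url` and `is_listed` are hypothetical names, and stripping a leading `www.` is my own simplification.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Return (host, rest) with the scheme, a leading 'www.' and trailing '/' removed."""
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    rest = parts.path.rstrip("/")
    if parts.query:
        rest += "?" + parts.query
    return host, rest


def is_listed(url, listed_urls, level):
    """level 1: match on domain only; level 2: match on the complete URL."""
    host, rest = split_url(url)
    entries = [split_url(entry) for entry in listed_urls]
    if level == 1:
        # Any subdomain of a listed domain counts as a hit.
        return any(host == h or host.endswith("." + h) for h, _ in entries)
    if level == 2:
        return (host, rest) in entries
    raise ValueError("level must be 1 or 2")
```

Usage, with the example from this post:

```python
db = ["pillz.com/343434"]
is_listed("www.pillz.com/928428934/swrioweuriowur", db, level=1)  # True
is_listed("www.pillz.com/928428934/swrioweuriowur", db, level=2)  # False
```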

This dilemma has already been addressed by SURBL, so we just need to see how we can best implement it.


#3 2010-08-06 9:00 pm

KeyDog
Member
From: Europe
Registered: 2010-07-18
Posts: 181

Re: Query The URL DB

*****
EDIT:
FYI: Browsable list of Spam URLs (unclickable for safety)
*****

Well, to answer that, one must consider:

They often post 4 different URLs now in a single post/signature:
1. a profile on some other site - spammed
2. the basic url of the seller of product
3. the basic url plus a specific product name
4. a variation of 3 or an article on the subject

I haven't collected 1, but if l=2 (everything) is used, it wouldn't matter that a legit site was in the db. However, if l=1 (just the domain) was used, a potentially valuable site could be blocked.

I tend to collect 2. and 3. so there l=2 would suit my needs best.

Also noteworthy: some sites like YouTube are used to promote certain videos advertising a product or similar, and as a forum admin/moderator you don't want to block the whole site. There you also need l=2 to block just the specific video.

If they just post one URL, but a long one, I sometimes "cut off" the end, so that the spammers can't just create 100s of different sub-URLs that all lead to the same page in the end.

I also imagine that if human errors occur when adding to the db, it will be easy to get feedback from genuine users trying to post such links, and to correct the entry.
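The "cut off the end" idea amounts to prefix matching on whole path segments. Here is a minimal Python sketch under that assumption; `segments` and `matches_prefix` are invented names, and the URLs are made-up examples.

```python
from urllib.parse import urlsplit


def segments(url):
    """Split a URL into [host, segment1, segment2, ...], ignoring empty segments."""
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return [host] + [s for s in parts.path.split("/") if s]


def matches_prefix(url, listed_prefix):
    """True if url starts with the listed (truncated) URL, compared segment by segment."""
    candidate = segments(url)
    prefix = segments(listed_prefix)
    return candidate[:len(prefix)] == prefix
```

Comparing whole segments means a listed `www.pillz.com/product` catches `www.pillz.com/product/cheap/offer123` but not an unrelated `www.pillz.com/productXYZ`.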

Last edited by KeyDog (2010-08-07 10:36 am)


