#1 2010-08-06 5:13 pm
- KeyDog
- Member
- From: Europe
- Registered: 2010-07-18
- Posts: 181
Query The URL DB
I've come to the conclusion that, with all the open proxies around, submitting just IPs (plus email and username) is a bit of a Sisyphean task when you're trying to counter the more talented spam masters. I alone see up to 30 "clean" IPs a day, i.e. IPs with no entry here, that are spamming.
That's why Grez and I over at PunBB have been working on an extension (well, he's doing the coding and all the clever stuff) that queries a list of spam URLs I've put together in .csv format (currently it has just 240 URLs). I'd say the spammers are acquiring 50-80 new "customers" daily with websites they want promoted.
So I wanted to ask whether SFS will host a queryable db of URLs. I remember reading recently that you didn't initially? I think that would be a shame... Or maybe someone knows of another place where "bad" URLs are listed?
At the moment our extension checks each post and signature being edited and cross-checks it with the .csv list. A db fed by this community would of course be larger and more up-to-date!
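For anyone curious what that kind of check might look like, here is a rough Python sketch (the real extension is a PunBB plug-in, so this is not its code). The CSV layout (one URL per row, first column), the sample entries and all function names are assumptions made up for illustration.

```python
import csv
import io
import re

# Assumed CSV layout: one spam URL per row, in the first column.
SAMPLE_CSV = "www.pillz.com\npillz.demon.co.uk\n"

# Loose pattern for URLs in free text, with or without a scheme.
URL_RE = re.compile(
    r"(?:https?://)?(?:[\w-]+\.)+[a-z]{2,}(?:/[^\s\"'<>\]]*)?",
    re.IGNORECASE,
)

def normalize(url):
    """Lowercase, drop the scheme, a leading 'www.' and any trailing slash."""
    url = url.strip().lower()
    url = re.sub(r"^https?://", "", url)
    if url.startswith("www."):
        url = url[4:]
    return url.rstrip("/")

def load_blocklist(csv_text):
    """Read the first column of every non-empty row into a set."""
    rows = csv.reader(io.StringIO(csv_text))
    return {normalize(row[0]) for row in rows if row and row[0].strip()}

def find_spam_urls(text, blocklist):
    """Return URLs in the text that equal a listed entry or sit beneath it."""
    hits = []
    for match in URL_RE.finditer(text):
        candidate = normalize(match.group(0))
        if any(candidate == entry or candidate.startswith(entry + "/")
               for entry in blocklist):
            hits.append(candidate)
    return hits

blocklist = load_blocklist(SAMPLE_CSV)
print(find_spam_urls("Visit http://www.pillz.com/cheap today", blocklist))
```

The same function would be run over both the post body and the signature before saving.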
Project URL Checker:
Database with several thousand website URLs spammers are promoting. Please feel free to contribute obvious spam related URLs to me: keydog@keydogbb.info
Further Plug-In users welcome; currently over 50k daily checks
#2 2010-08-06 7:49 pm
- pedigree
- uıɐbɐ ʎɐqǝ ɯoɹɟ pɹɐoqʎǝʞ ɐ buıʎnq ɹǝʌǝu ɯ,ı
- From: New Zealand
- Registered: 2008-04-16
- Posts: 6,999
Re: Query The URL DB
Well, the new site, ready to go live now, will collect URLs posted as evidence (if people submit the spam, of course), so we'll be able to provide them as a db function shortly.
I do however have to decide how to handle redirection services, and whether to list full URLs or just entire domain names... tough call
www.yahoo.com
obviously isn't a spamvertized site, however
www.yahoo.com/this/redir?sjdfhjkashdfajklshdf12341234&www.pillz.com is
If a spammer controls a site, then anything after the domain name can be randomized and will be ignored by the server, so how do we treat something like
www.pillz.com/ahsjkldfhlasjkdhflk295728947928374/asfasdf/asdf/sdf/as/dfas/f/asf/asdf/asd/f
etc. So do we handle that when the domain is spam?
dfajksdfh9a8sdf7698as7dfasdf.pillz.com
oifoiweuriowuerowiuer.pillz.com
is another example of obscuring spam, when the spammer controls the DNS
some legit providers offer something like
pillz.demon.co.uk
or
www.amazon.com/~pillz
maybe for every domain added to the database, the api could give different results depending on what the user wants
api?url=www.pillz.com/928428934/swrioweuriowur&l=1
l=1 = just the domain name
l=2 = everything (the full URL)
so if the db had pillz.com/343434 listed, then l=1 would return true and l=2 would return false
or should we just do domain names...
This dilemma has already been addressed by SURBL, so we just need to see how we can best implement it.
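To make the two levels concrete, here is a small Python sketch of how such a lookup could behave. It is only an illustration under assumptions: the function names are invented, the suffix set is a tiny stand-in for a real public-suffix list, and nothing here is the actual SFS API.

```python
from urllib.parse import urlsplit

# Assumption: a tiny stand-in for a real public-suffix list.
MULTI_PART_SUFFIXES = {"co.uk", "co.nz", "com.au"}

def split_url(url):
    """Return (hostname, path) for a URL given with or without a scheme."""
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url.lower())
    return parts.hostname or "", parts.path.rstrip("/")

def registered_domain(host):
    """Reduce a hostname to its registered domain, e.g. a.b.pillz.com -> pillz.com."""
    labels = host.split(".")
    keep = 3 if ".".join(labels[-2:]) in MULTI_PART_SUFFIXES else 2
    return ".".join(labels[-keep:])

def full_key(url):
    """Host (without a leading 'www.') plus path, used for l=2 comparisons."""
    host, path = split_url(url)
    if host.startswith("www."):
        host = host[4:]
    return host + path

def is_listed(url, listed, level):
    """level 1: compare registered domains only; level 2: compare host plus path."""
    if level == 1:
        wanted = registered_domain(split_url(url)[0])
        return any(registered_domain(split_url(e)[0]) == wanted for e in listed)
    return full_key(url) in {full_key(e) for e in listed}

db = ["pillz.com/343434"]
url = "www.pillz.com/928428934/swrioweuriowur"
print(is_listed(url, db, 1), is_listed(url, db, 2))
```

With l=1 the random subdomains above are caught as well, but so would every customer page under demon.co.uk if pillz.demon.co.uk were listed, which is exactly the trade-off. The sketch also ignores the query string, so a redirect like the yahoo.com example is not detected at either level.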
#3 2010-08-06 9:00 pm
- KeyDog
- Member
- From: Europe
- Registered: 2010-07-18
- Posts: 181
Re: Query The URL DB
*****
EDIT:
FYI: Browsable list of Spam URLs (unclickable for safety)
*****
Well, to answer that, one must consider:
They often post 4 different URLs now in a single post/signature:
1. a profile on some other site - spammed
2. the basic url of the seller of product
3. the basic url plus a specific product name
4. a variation of 3 or an article on the subject
I haven't collected type 1, but if l=2 (everything) is used, it wouldn't matter that a legit site was in the db. However, if l=1 (just the domain) was used, a potentially valuable site could be blocked.
I tend to collect 2. and 3. so there l=2 would suit my needs best.
Also noteworthy: some sites like YouTube are used to promote certain videos advertising a product or similar, and as a forum admin/moderator you don't want to block the whole site. There you also need l=2 to block just the specific video.
If they just post one URL, but a long one, I sometimes "cut off" the end, so that the spammers can't just create hundreds of different sub-URLs that all lead to the same page in the end.
I also imagine that if human errors occur when adding to the db, it will be easy to get feedback from genuine users trying to post such links and to correct the entry.
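The "cutting off the end" of long URLs mentioned above could be sketched roughly like this in Python. The function names and the keep_segments parameter are made up for illustration, and how many path segments to keep is a judgment call per URL.

```python
from urllib.parse import urlsplit

def url_parts(url):
    """Return (hostname, list of path segments) for a URL with or without a scheme."""
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url.lower())
    return parts.hostname or "", [s for s in parts.path.split("/") if s]

def truncate_url(url, keep_segments=1):
    """Keep the host and only the first keep_segments path segments."""
    host, segments = url_parts(url)
    return "/".join([host] + segments[:keep_segments])

def matches_prefix(url, stored_prefix):
    """True if the URL equals the stored entry or lies beneath it."""
    host, segments = url_parts(url)
    candidate = "/".join([host] + segments)
    return candidate == stored_prefix or candidate.startswith(stored_prefix + "/")

entry = truncate_url("www.pillz.com/product/abc/def", keep_segments=1)
print(entry)
print(matches_prefix("http://www.pillz.com/product/xyz/123", entry))
```

Matching on whole path segments means www.pillz.com/productx is not caught by an entry for www.pillz.com/product, which keeps accidental over-blocking down.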
Last edited by KeyDog (2010-08-07 10:36 am)