
#26 2009-04-21 2:44 pm

coldwind
Member
Registered: 2009-04-21
Posts: 3

Re: API Rate Limiting

skippybosco wrote:

If you are developing your own SFS checks, you should consider caching data locally as part of the lookup process, similar to what pedigree does in the VB plugin. That lets you query your local data store first before making a round trip to the SFS servers. Setting your code to refresh every 'x' hours/days should give a better experience for you (quicker check times) and for SFS (reduced server load).

Agreed. I have 3 forums on my server and needed to add some protection from spammers, so I wrote a simple bash script that retrieves the CSV file from the server every hour and copies it to each forum. I also added a simple PHP script to index.php (I use IP.Board):

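// Skip the lookup for visitors who already passed the check from this IP.
if (!isset($_COOKIE['clear']) || $_COOKIE['clear'] != md5('anysalt' . $_SERVER['REMOTE_ADDR'])) {
    // One IP per line; trim each entry so a trailing \r or space cannot break the match.
    $blocked = array_map('trim', explode("\n", (string) file_get_contents('antispam/ips.txt')));
    if (in_array($_SERVER['REMOTE_ADDR'], $blocked)) {
        header('HTTP/1.1 403 Forbidden');
        echo 'I suggest you DO NOT SPAM!';
        exit;
    } else {
        // Session cookie, HTTP-only: marks this IP as already checked.
        setcookie('clear', md5('anysalt' . $_SERVER['REMOTE_ADDR']), 0, '/', '', false, true);
    }
}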

There can be some issues with this solution, but some spammers register from one IP (home machine) and spam from another (remote web server).
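
For anyone not on PHP, here is a minimal Python sketch of the same "check the local list first" idea. The function names are hypothetical, and it assumes the downloaded file has one IP per line. Entries are normalized with the standard `ipaddress` module so blank lines, stray `\r` characters, and junk rows are ignored instead of silently failing to match.

```python
import ipaddress


def load_blocklist(lines):
    """Build a set of normalized IP strings from an iterable of text lines."""
    blocked = set()
    for line in lines:
        candidate = line.strip()
        if not candidate:
            continue  # skip blank lines, e.g. after a trailing newline
        try:
            blocked.add(str(ipaddress.ip_address(candidate)))
        except ValueError:
            continue  # skip anything that is not a valid IP address
    return blocked


def is_blocked(ip, blocked):
    """Return True if ip is a valid address present in the blocklist."""
    try:
        return str(ipaddress.ip_address(ip)) in blocked
    except ValueError:
        return False


# Usage (the path is an assumption):
# with open("antispam/ips.txt") as fh:
#     blocked = load_blocklist(fh)
```

A set gives constant-time membership checks, which matters more as the hourly list grows.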

If you have any suggestions, feel free to comment.

P.S. Sorry for posting program code

Offline

#27 2009-04-21 4:31 pm

MysteryFCM
Member
From: Tyneside, UK
Registered: 2008-01-16
Posts: 606
Website

Re: API Rate Limiting

Your code also does not seem to allow for those using proxies (it uses REMOTE_ADDR and not HTTP_X_FORWARDED_FOR). Instead of REMOTE_ADDR, you may want to consider using the account's e-mail address, which shouldn't change regardless of the IP they arrive from.


Regards
Steven Burn
I.T. Mate / hpHosts
it-mate.co.uk / hosts-file.net

Offline

#28 2009-04-21 5:47 pm

coldwind
Member
Registered: 2009-04-21
Posts: 3

Re: API Rate Limiting

MysteryFCM wrote:

Your code also does not seem to allow for those using proxies (it uses REMOTE_ADDR and not HTTP_X_FORWARDED_FOR). Instead of REMOTE_ADDR, you may want to consider using the account's e-mail address, which shouldn't change regardless of the IP they arrive from.

If you look at http://www.stopforumspam.com/ipcheck/94.142.129.32 you'll see that spammers use a single email for one "project".

As far as I remember, if HTTP_X_FORWARDED_FOR exists then its first address is equal to REMOTE_ADDR. Let's look at the IPB forum's code: it saves only one IP in the DB, and that IP is REMOTE_ADDR or the first valid IP in HTTP_X_FORWARDED_FOR. Apache saves only REMOTE_ADDR in its logs by default. This means a forum administrator has only this IP to add to the spammer database, and if the spammer used a proxy, only the proxy would be added, not his REAL IP. So we don't need to analyze HTTP_X_FORWARDED_FOR: his real IP is not in the database.

Offline

#29 2009-04-21 7:40 pm

MysteryFCM
Member
From: Tyneside, UK
Registered: 2008-01-16
Posts: 606
Website

Re: API Rate Limiting

Actually, that's not strictly true. For example, the gateway I run has an IP of x.x.x.x, which is picked up by the server in REMOTE_ADDR. However, the visitor's real IP is passed in HTTP_X_FORWARDED_FOR. This var is accessible regardless of the server software or scripting language, and should always contain the visitor's real IP.

Not foolproof, obviously, as the HTTP vars can be faked (my vURL service, for example (not a proxy, just an analysis service), doesn't pass the visitor's real IP), but it should still be checked anyway.

For further details and examples, see:

http://roshanbh.com.np/2007/12/getting- … n-php.html


Regards
Steven Burn
I.T. Mate / hpHosts
it-mate.co.uk / hosts-file.net

Offline

#30 2009-04-22 4:24 am

coldwind
Member
Registered: 2009-04-21
Posts: 3

Re: API Rate Limiting

MysteryFCM wrote:

Actually, that's not strictly true. For example, the gateway I run has an IP of x.x.x.x, which is picked up by the server in REMOTE_ADDR. However, the visitor's real IP is passed in HTTP_X_FORWARDED_FOR. This var is accessible regardless of the server software or scripting language, and should always contain the visitor's real IP.

Not foolproof, obviously, as the HTTP vars can be faked (my vURL service, for example (not a proxy, just an analysis service), doesn't pass the visitor's real IP), but it should still be checked anyway.

Depending on the server configuration, HTTP_X_FORWARDED_FOR may not exist if the request had no X-Forwarded-For header (Apache+PHP, for example). Just set up Apache+PHP on your local machine and send a request from the command line (Unix):

GET -H 'X-Forwarded-For: 1.1.1.1' http://localhost/test.php

If test.php contains var_dump($_SERVER) you'll see HTTP_X_FORWARDED_FOR=1.1.1.1

I agree that this script depends on server configuration. But with a front-end/back-end architecture, for example, Apache has a module for extracting the real user IP from the front-end's request and putting it into REMOTE_ADDR (I have used it before). Some servers use other environment variables. BUT this MUST be coded individually for your server, and checked as well, if it's not REMOTE_ADDR.

MysteryFCM wrote:

For further details and examples, see:

http://roshanbh.com.np/2007/12/getting- … n-php.html

The article without its comments makes me smile because of its mistakes (I hope you have read at least the first comment).

The solution depends on the task. I think the task is to get the IP of the machine that sent this request to our server and check whether it belongs to a spammer.

If the server has a front-end/back-end architecture (or a specific configuration), then the back-end must skip the front-end's IP and set the correct IP in REMOTE_ADDR (imho).

When a script is meant to be common to many setups (checking HTTP_CLIENT_IP or HTTP_X_FORWARDED_FOR), you may have trouble: these variables can be faked (as you said, and as the comments say).

So if you check both the REMOTE_ADDR and HTTP_X_FORWARDED_FOR addresses, that's great. But if you check only HTTP_X_FORWARDED_FOR and your server hasn't added REMOTE_ADDR to this chain, your script can be fooled. For example, IPB checks only REMOTE_ADDR by default, and if you need to, you can force it to check HTTP_X_FORWARDED_FOR instead of REMOTE_ADDR in the settings. Again, it depends on the server's configuration.
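
To make the "check both" option concrete, here is a rough Python sketch. The helper names are hypothetical. It always keeps REMOTE_ADDR as the first candidate and only adds the forwarded entries on top, so a faked header can add addresses to check but can never hide the connecting IP.

```python
import ipaddress


def candidate_ips(remote_addr, forwarded_for=None):
    """Return REMOTE_ADDR first, then any valid X-Forwarded-For entries."""
    candidates = [remote_addr]
    for part in (forwarded_for or "").split(","):
        part = part.strip()
        try:
            ip = str(ipaddress.ip_address(part))
        except ValueError:
            continue  # ignore empty or malformed header entries
        if ip not in candidates:
            candidates.append(ip)
    return candidates


def any_blocked(remote_addr, forwarded_for, blocked):
    """True if the connecting IP or any forwarded IP is in the blocked set."""
    return any(ip in blocked
               for ip in candidate_ips(remote_addr, forwarded_for))
```

This only covers lookups. Whether forwarded addresses should ever be submitted to a shared database is a separate question.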

The main problem with my script is when a spammer uses a dynamic IP obtained from an ISP.

Offline

#31 2009-04-22 11:07 am

pedigree
uıɐbɐ ʎɐqǝ ɯoɹɟ pɹɐoqʎǝʞ ɐ buıʎnq ɹǝʌǝu ɯ,ı
From: New Zealand
Registered: 2008-04-16
Posts: 7,104

Re: API Rate Limiting

We don't use "forwarded" IPs anywhere in our code. Personally, I think they are seriously unreliable. They could cause IP pollution in the database here if people started auto-submitting them instead of the IP that connected to the server, and they are easily spoofed as well.

Offline

#32 2009-04-22 2:25 pm

nevereven
Member
Registered: 2009-04-21
Posts: 12

Re: API Rate Limiting

Russ wrote:

We will soon be implementing a rate limiting scheme into the API for checking IPs/usernames/email addresses. I hate to but after analyzing the hits there is a handful of hosts who are hammering it constantly, and I want to make sure the server resources are not being hogged because of it.

The limit will probably be around 1000 API queries per day, which is going to be plenty for most everyone. If you need more than that, let me know and something can probably be worked out.

If your script is integrated with the API, and your host exceeds the daily limit, the server will return a 403 HTTP status code and the output will look like this.

<response success="false">
        <error>rate limit exceeded</error>
</response>

You should be able to code sufficient error handling for this case should it happen.

That's a little over 1 inactive forum registration every 2 minutes.  Has this been implemented?  I'll need to add a check for it, if so.
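
For reference, a check along these lines might look like the Python sketch below. It relies only on what Russ describes above (a 403 status plus the XML error body). The function name is hypothetical, and matching on the substring "rate limit" is my assumption about how strict to be.

```python
import xml.etree.ElementTree as ET


def is_rate_limited(status_code, body):
    """Return True if the response looks like the rate-limit error."""
    if status_code != 403:
        return False
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return False  # a 403 without the expected XML is some other error
    error = root.findtext("error", default="")
    return root.get("success") == "false" and "rate limit" in error.lower()
```

When this returns True, the caller should probably fall back to local data and stop querying for a while rather than retry immediately.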

I can't believe, btw, how such high volumes go unnoticed by upstream providers.  Amazing.

Offline

#33 2009-04-22 4:09 pm

kpatz
Member
Registered: 2008-10-09
Posts: 1,437

Re: API Rate Limiting

AFAIK the limit is actually 5,000 queries/day.  No one should see that many new registrations in a day.  Unless you're Google or something. wink


Spam happens when greed meets stupidity.

Offline

#34 2009-04-22 8:26 pm

pedigree
uıɐbɐ ʎɐqǝ ɯoɹɟ pɹɐoqʎǝʞ ɐ buıʎnq ɹǝʌǝu ɯ,ı
From: New Zealand
Registered: 2008-04-16
Posts: 7,104

Re: API Rate Limiting

Yes, 5000 queries per 24 hours.

That's queries, not sets. If you query an IP, then a username for the same registration, that's two queries unless you stack them in the query string:

api.php?username=usertotest&ip=1.23.4.5&email=email@spam.com

That's one query. Hitting the API three times, once per field, is three queries.
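
A small Python sketch of stacking the fields into one request. `build_query` is a hypothetical helper, and the relative `api.php` path is copied from the example above. Note that `urlencode` percent-encodes the `@` in the email address.

```python
from urllib.parse import urlencode


def build_query(username=None, ip=None, email=None):
    """Return one query string carrying every non-empty field."""
    fields = {"username": username, "ip": ip, "email": email}
    present = {name: value for name, value in fields.items() if value}
    return "api.php?" + urlencode(present)


# One request, and so one query against the daily limit:
# build_query("usertotest", "1.23.4.5", "email@spam.com")
```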

Offline

#35 2009-04-23 11:23 am

kpatz
Member
Registered: 2008-10-09
Posts: 1,437

Re: API Rate Limiting

This brings up a suggested enhancement to the API:  allow more than one of each field to be passed to a query (e.g. multiple IPs).  That way, I can check username, email, registration IP, activation IP, and first post IP all with one query (if they're different).

Maybe allow ip1...ip9 in the query, or some kind of delimiter in the field, e.g. ip=1.2.3.4,2.3.4.5,6.7.8.9 or something like that.


Spam happens when greed meets stupidity.

Offline

#36 2009-04-23 11:43 pm

pavemen
Member
Registered: 2008-01-17
Posts: 17

Re: API Rate Limiting

I think the ability to include multiple items for each query attribute is a good thing, but it does little for the day-to-day operation of active checks at registration time.

I think it would be beneficial for those users who wish to go back and check existing registrations, but how many IPs/emails/usernames are supplied at any one registration? Just one each for all forums, as far as I know. Unless the site owner is caching registrations and then making bulk SFS checks, there would be little reduction in API calls, I would think.

Now that I am building a plugin for the MyBB forum that uses SFS, I have found that the best set of changes Russ could implement right now would be:

1) to output the timestamp of an entry in the advanced search results. This would allow me to cache current data rather than only caching data up to the end of yesterday.

2) to accept a full timestamp (or add hours/mins/sec) in the advanced query input values. In conjunction with #1 above, this would stop duplicate entries from being retrieved by having to query whole days.

What my plugin does is query the advanced search for entries up to the end of the previous day and parse them into a database. During registration, the cache is checked (optional), and if the IP/email/username is found and exceeds the minimum number of times, the registration is denied.

If not denied from the cache, the plugin then uses the API to check and parse the XML output.
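
Roughly, the cache-then-API flow looks like the Python sketch below. This is an illustration, not the plugin's actual code (which is PHP for MyBB). All names are hypothetical, and treating "exceeds the minimum" as "at least `min_hits`" is an assumption.

```python
def check_registration(value, cache_counts, min_hits, api_lookup):
    """Deny from the local cache if possible, otherwise ask the API.

    cache_counts maps an IP/email/username to how often it was reported.
    api_lookup is any callable returning True when the API lists the value.
    """
    if cache_counts.get(value, 0) >= min_hits:
        return "denied-by-cache"  # no API query spent
    return "denied-by-api" if api_lookup(value) else "allowed"
```

The point of the ordering is that a cache hit never costs an API query.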

I also included 'task' functionality (scheduled processes via the MyBB admin area) that updates the cache with advanced search output limited to the range from the start of the day after the last update to the end of the prior day. Again, I need to track today's 'yesterday' and the date of the last update's 'yesterday' so I can limit the search time and avoid duplicates, which would throw off the frequency checks.
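
The date bookkeeping for that task can be sketched like this in Python. The function name is hypothetical, and `last_covered` stands for the last whole day already in the cache (the previous update's 'yesterday').

```python
from datetime import date, timedelta


def update_window(last_covered, today):
    """Return (start, end) dates for the next cache update, or None.

    The window runs from the day after the last cached day to yesterday,
    so no whole day is ever fetched twice.
    """
    start = last_covered + timedelta(days=1)
    end = today - timedelta(days=1)
    if start > end:
        return None  # yesterday is already cached, nothing to fetch
    return start, end
```

Running the task twice on the same day returns None the second time, which is what keeps duplicates out of the frequency counts.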

Offline

#37 2012-11-02 5:36 pm

shannona
Member
Registered: 2012-10-29
Posts: 7

Re: API Rate Limiting

It looks like we've hit this limit from atropos.skotos.net (66.211.109.196) using the vBulletin plugin ( http://www.vbulletin.org/forum/showthread.php?t=230921 ). We're a large site and we seem to have come under determined attack this morning (enough that the site's load skyrocketed when the spammers started getting through).

Anything that can be done about this? The service doesn't really work if it could fail any time under normal usage.

Thanks!

Offline

#38 2012-11-02 5:44 pm

John Darkhorse
Member
Registered: 2012-02-19
Posts: 319

Re: API Rate Limiting

shannona wrote:

It looks like we've hit this limit from atropos.skotos.net (66.211.109.196) using the vBulletin plugin ( http://www.vbulletin.org/forum/showthread.php?t=230921 ). We're a large site and we seem to have come under determined attack this morning (enough that the site's load skyrocketed when the spammers started getting through).

Anything that can be done about this? The service doesn't really work if it could fail any time under normal usage.

Thanks!

Are you using ZB Block?

http://www.spambotsecurity.com/zbblock.php

It works a treat!

Offline

#39 2012-11-02 6:37 pm

shannona
Member
Registered: 2012-10-29
Posts: 7

Re: API Rate Limiting

John Darkhorse wrote:

Are you using ZB Block?

http://www.spambotsecurity.com/zbblock.php

It works a treat!

The discussion of it over at vbulletin.org suggests to me that it's just too hit and miss:
http://www.vbulletin.org/forum/showthread.php?t=254331&page=2

The site's too big for me to be constantly chasing false positives.

Hoping to get stopforumspam working again ...

Offline

#40 2012-11-02 8:17 pm

John Darkhorse
Member
Registered: 2012-02-19
Posts: 319

Re: API Rate Limiting

shannona wrote:
John Darkhorse wrote:

Are you using ZB Block?

http://www.spambotsecurity.com/zbblock.php

It works a treat!

The discussion of it over at vbulletin.org suggests to me that it's just too hit and miss:
http://www.vbulletin.org/forum/showthread.php?t=254331&page=2

The site's too big for me to be constantly chasing false positives.

Hoping to get stopforumspam working again ...

It sounds to me like all the problems reported are due to ignorance on the webmaster/admin's part.

Most all of us can attest that ZB Block works great!

I've been running it for months now, and not had a single issue with "good" bots/spiders and moral humans.

I have had a 100% drop in "guests" churning through my site(s) but not doing anything visible (like posting or whatever).

Offline

#41 2012-11-02 8:41 pm

pedigree
uıɐbɐ ʎɐqǝ ɯoɹɟ pɹɐoqʎǝʞ ɐ buıʎnq ɹǝʌǝu ɯ,ı
From: New Zealand
Registered: 2008-04-16
Posts: 7,104

Re: API Rate Limiting

You don't seem to be doing a lot of caching as some usernames are being queried 1000 times in a 24 hour period.  Please check your caching in spam-o-matic...

You are blowing 15% of your daily limits on querying for a blank email address as well

   2840 /api?email=

Check those both out and keep me up to date smile

Offline

#42 2012-11-02 9:43 pm

zaphod
Jägermonster
From: USA
Registered: 2008-11-22
Posts: 2,985
Website

Re: API Rate Limiting

Just a little side note in defense of my brain child...

I have no idea who yotsume is, but I never heard from him, or had a help request on my forum from him. Also that thread for the most part, is pretty old, and many improvements have happened since. ZB Block will always be a WIP, because our enemies never stop developing.

Honestly, most problems can be fixed with a little consultation, and there is no reason if the manual is read, and the whitelisting password is used, that an admin would ever see the 503.

Zap hmm


Get Protected, Stay Protected...
With ZB Block, GNU/GPL Freeware Anti-Spam/Anti-Hack protection for your php based website.

Little boxes in the server farm, little boxes running php...

Offline

#43 2012-11-02 10:55 pm

shannona
Member
Registered: 2012-10-29
Posts: 7

Re: API Rate Limiting

pedigree wrote:

You don't seem to be doing a lot of caching as some usernames are being queried 1000 times in a 24 hour period.  Please check your caching in spam-o-matic...

You are blowing 15% of your daily limits on querying for a blank email address as well

   2840 /api?email=

Check those both out and keep me up to date smile


The vBulletin mod is supposed to cache. I can't test right now because it was emptied out when I saw that the SFS link was down this morning. Sounds to me like something's not working, as there's no way it should have queried 1000 times. At most 48.

However, I have upped the cache time. It was initially set to 30 minutes and I upped it to 1440. I guess we'll see how that works tomorrow ...
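
To show where the 48 comes from, here is a minimal Python sketch of a time-limited cache. It is not the mod's code. The class and parameter names are hypothetical, and the clock is injectable only so the behaviour can be checked without waiting.

```python
import time


class TTLCache:
    """Cache lookup results per key for ttl_seconds."""

    def __init__(self, ttl_seconds, fetch, clock=time.time):
        self.ttl = ttl_seconds
        self.fetch = fetch      # callable doing the real (remote) lookup
        self.clock = clock      # returns the current time in seconds
        self.entries = {}       # key -> (stored_at, value)

    def get(self, key):
        now = self.clock()
        hit = self.entries.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]       # still fresh: no remote query
        value = self.fetch(key)
        self.entries[key] = (now, value)
        return value
```

With a 30-minute TTL a single key can trigger at most one remote lookup per half hour, i.e. 24 * 60 / 30 = 48 per day, however often it is requested. At 1440 minutes that drops to one per day.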

As for the blank email; sounds like a bug with the mod. I'll report it smile.

Last edited by shannona (2012-11-02 10:56 pm)

Offline

#44 2012-11-02 11:19 pm

shannona
Member
Registered: 2012-10-29
Posts: 7

Re: API Rate Limiting

zaphod wrote:

I have no idea who yotsume is, but I never heard from him, or had a help request on my forum from him. Also that thread for the most part, is pretty old, and many improvements have happened since. ZB Block will always be a WIP, because our enemies never stop developing.

I'm more concerned with what percentage of customers will get blocked than admins, but I'll take a look next week if you think it's mightily improved from those discussions ...

Thanks!

Offline

#45 2012-11-03 12:01 am

pedigree
uıɐbɐ ʎɐqǝ ɯoɹɟ pɹɐoqʎǝʞ ɐ buıʎnq ɹǝʌǝu ɯ,ı
From: New Zealand
Registered: 2008-04-16
Posts: 7,104

Re: API Rate Limiting

shannona wrote:

However, I have upped the cache time. It was initially set to 30 minutes and I upped it to 1440. I guess we'll see how that works tomorrow ...

Even at 30 minutes, caching wouldn't hit the figures we are seeing within a 24 hour period. Something is wrong there.

Offline

#46 2012-11-03 12:10 am

shannona
Member
Registered: 2012-10-29
Posts: 7

Re: API Rate Limiting

shannona wrote:

As for the blank email; sounds like a bug with the mod. I'll report it smile.

Yep. The vBulletin mod rather inconveniently checks against the SFS database even if the new user would be otherwise rejected (for no email address, for not answering a human-interface question, etc). And it doesn't even check whether it has an $ip/$email/$username before trying to look those up.

I'm fixing these in my local version, but it's surely generated useless lookups from other folks ...
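
The shape of the fix, sketched in Python rather than the mod's PHP (both function names are hypothetical): drop empty fields before building a lookup, and skip the remote call entirely when the registration has already been rejected locally.

```python
def fields_to_look_up(ip=None, email=None, username=None):
    """Return only the fields that actually have a value to query."""
    raw = {"ip": ip, "email": email, "username": username}
    return {name: value.strip() for name, value in raw.items()
            if value and value.strip()}


def should_query_sfs(rejected_locally, fields):
    """Skip the remote lookup for doomed or empty registrations."""
    return not rejected_locally and bool(fields)
```

With this guard a registration that fails the human-verification question, or arrives with no email, costs zero queries instead of one.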

Offline

#47 2012-11-03 12:13 am

shannona
Member
Registered: 2012-10-29
Posts: 7

Re: API Rate Limiting

pedigree wrote:
shannona wrote:

However, I have upped the cache time. It was initially set to 30 minutes and I upped it to 1440. I guess we'll see how that works tomorrow ...

Even at 30 minutes, caching wouldn't hit the figures we are seeing within a 24 hour period. Something is wrong there.

Oh, I agree. We do have a master/slave setup, and it probably does take a teeny bit of time to get the cached info from the master to the slave, which could cause some multiple lookups, but I wouldn't expect that to increase stuff by a lot.

When we're not locked out from the site, I can see if the caching is getting built right ...

Offline

#48 2012-11-03 1:59 am

shannona
Member
Registered: 2012-10-29
Posts: 7

Re: API Rate Limiting

We're up and running again. As I said, we've now corrected our copy of the vBulletin mod so that it doesn't even try a lookup if there's no email address, which I expect was much of the problem.  Thanks *very* much for the feedback on that. I've also verified our cache seems to be doing the right thing.

If you see any weird behavior coming from atropos.skotos.net (66.211.109.196), please let me know specifics and I'll make sure that gets corrected too.

Thanks for the service too!

Offline
