- Index
- » General Discussion
- » API Rate Limiting
#26 2009-04-21 9:44 am
- coldwind
- Member
- Registered: 2009-04-21
- Posts: 3
Re: API Rate Limiting
If you are developing your own SFS checks, you should consider caching data locally as part of the lookup process, similar to what pedigree does in the VB plugin. That lets you query your local data store first, before making a round trip to the SFS servers. Setting your code to refresh every 'x' hours/days should provide a better experience for you (quicker check times) and for SFS (reduced server load).
Agreed. I have 3 forums on my server and needed some protection from spammers, so I wrote a simple bash script that retrieves the CSV file from the server every hour. The script copies the file for each forum, and I added a simple PHP script to index.php (I use IP.Board):
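// Skip the lookup for visitors who already passed the check from this IP.
if (!isset($_COOKIE['clear']) || $_COOKIE['clear'] != md5('anysalt' . $_SERVER['REMOTE_ADDR'])) {
    // ips.txt holds one IP per line (refreshed hourly by the bash script).
    if (in_array($_SERVER['REMOTE_ADDR'], explode("\n", file_get_contents('antispam/ips.txt')))) {
        header('HTTP/1.1 403 Forbidden');
        echo 'I suggest you DO NOT SPAM!';
        exit;
    } else {
        // Not listed: remember the result in an HttpOnly session cookie.
        setcookie('clear', md5('anysalt' . $_SERVER['REMOTE_ADDR']), 0, '/', null, false, true);
    }
}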
There may be some issues with this solution, but some spammers register from one IP (a home machine) and spam from another (a remote web server).
If you have any suggestions, feel free to comment.
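The "refresh every 'x' hours" idea and the file lookup above could be sketched as follows. This is only an illustration: the helper names `blocklist_is_stale` and `load_blocklist` are hypothetical, and it assumes the local file holds one IP per line, as in the snippet above.

```php
<?php
// Hypothetical helper: does the local copy of the list need refreshing?
// $now can be passed in to make the function easy to test.
function blocklist_is_stale(string $path, int $maxAgeSeconds, ?int $now = null): bool
{
    $now = $now ?? time();
    if (!is_file($path)) {
        return true; // no local copy yet
    }
    $mtime = filemtime($path);
    return $mtime === false || ($now - $mtime) > $maxAgeSeconds;
}

// Hypothetical helper: load one IP per line into an array keyed by IP,
// so a lookup is a single isset() instead of scanning the whole list.
function load_blocklist(string $path): array
{
    if (!is_readable($path)) {
        return [];
    }
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        return [];
    }
    // trim() also removes a stray "\r" left over from Windows line endings.
    return array_fill_keys(array_map('trim', $lines), true);
}
```

With these, the check becomes `isset($blocklist[$_SERVER['REMOTE_ADDR']])`, and a cron job or the script itself can re-download the file only when `blocklist_is_stale()` returns true.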
P.S. Sorry for posting program code
Offline
#27 2009-04-21 11:31 am
- MysteryFCM
- Member

- From: Tyneside, UK
- Registered: 2008-01-16
- Posts: 601
- Website
Re: API Rate Limiting
Your code also does not seem to allow for those using proxies (it uses REMOTE_ADDR and not HTTP_X_FORWARDED_FOR). Instead of REMOTE_ADDR, you may want to consider using the account's e-mail address, which shouldn't change regardless of the IP they arrive from.
Regards
Steven Burn
Ur I.T. Mate Group / hpHosts
it-mate.co.uk / hosts-file.net
Offline
#28 2009-04-21 12:47 pm
- coldwind
- Member
- Registered: 2009-04-21
- Posts: 3
Re: API Rate Limiting
Your code also does not seem to allow for those using proxies (it uses REMOTE_ADDR and not HTTP_X_FORWARDED_FOR). Instead of REMOTE_ADDR, you may want to consider using the account's e-mail address, which shouldn't change regardless of the IP they arrive from.
If you look at http://www.stopforumspam.com/ipcheck/94.142.129.32 you'll see that spammers use a single email for one "project".
As far as I remember, if HTTP_X_FORWARDED_FOR exists then its first address is equal to REMOTE_ADDR. Let's look at the IPB forum code. It saves only one IP in the DB, and this IP is either REMOTE_ADDR or the first valid IP in HTTP_X_FORWARDED_FOR. Apache saves only REMOTE_ADDR in its logs by default. This means a forum administrator can only submit this IP to the spammer database, and if the spammer used a proxy, only the proxy would be added to the database, not his REAL IP. So we don't need to analyze HTTP_X_FORWARDED_FOR: his real IP is not in the database.
Offline
#29 2009-04-21 2:40 pm
- MysteryFCM
- Member

- From: Tyneside, UK
- Registered: 2008-01-16
- Posts: 601
- Website
Re: API Rate Limiting
Actually, that's not strictly true. For example, the gateway I run has an IP of x.x.x.x, which is picked up by the server in REMOTE_ADDR. However, the actual visitor's real IP is passed in HTTP_X_FORWARDED_FOR. This var is accessible regardless of the server software or scripting language, and should always contain the visitor's real IP.
Not foolproof obviously, as the HTTP vars can be faked (my vURL service, for example (not a proxy, just an analysis service), doesn't pass the visitor's real IP), but it should still be checked anyway.
For further details, and examples, see;
Regards
Steven Burn
Ur I.T. Mate Group / hpHosts
it-mate.co.uk / hosts-file.net
Offline
#30 2009-04-21 11:24 pm
- coldwind
- Member
- Registered: 2009-04-21
- Posts: 3
Re: API Rate Limiting
Actually, that's not strictly true. For example, the gateway I run has an IP of x.x.x.x, which is picked up by the server in REMOTE_ADDR. However, the actual visitor's real IP is passed in HTTP_X_FORWARDED_FOR. This var is accessible regardless of the server software or scripting language, and should always contain the visitor's real IP.
Not foolproof obviously, as the HTTP vars can be faked (my vURL service, for example (not a proxy, just an analysis service), doesn't pass the visitor's real IP), but it should still be checked anyway.
Depending on the server configuration, HTTP_X_FORWARDED_FOR may not exist if the request had no X-Forwarded-For header (Apache+PHP, for example). Just set up Apache+PHP on your local machine and send a request from the command line (Unix):
GET -H 'X-Forwarded-For: 1.1.1.1' http://localhost/test.php
If test.php contains var_dump($_SERVER) you'll see HTTP_X_FORWARDED_FOR=1.1.1.1
I agree that this script depends on server configuration. As for a front-end/back-end architecture, Apache, for example, has a module that extracts the real user IP from the front-end's request and puts it into REMOTE_ADDR (I have used it before). Some servers use other environment variables. BUT this MUST be coded individually for your server, and checked as well, if it's not REMOTE_ADDR.
For further details, and examples, see;
The article without its comments makes me smile because of its mistakes (I hope you have read at least the first comment).
The solution depends on the task. I think the task is to get the IP of the machine that sent the request to our server and check whether it belongs to a spammer.
If the server has a front-end/back-end architecture (or a specific configuration), then the back-end must skip the front-end's IP and set REMOTE_ADDR to the correct IP (imho).
When a script is meant to be common to many setups (checking HTTP_CLIENT_IP or HTTP_X_FORWARDED_FOR), you may run into trouble: these variables can be faked (as you said, and as the comments say).
So if you check both the REMOTE_ADDR and HTTP_X_FORWARDED_FOR addresses, that's great. But if you check only HTTP_X_FORWARDED_FOR and your server hasn't added REMOTE_ADDR to that chain, your script can be fooled. For example, IPB checks only REMOTE_ADDR by default, and if you need to, you can force it in the settings to check HTTP_X_FORWARDED_FOR instead of REMOTE_ADDR. Again, it depends on the server's configuration.
The main problem with my script is when a spammer uses a dynamic IP obtained from an ISP.
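For illustration, checking both REMOTE_ADDR and the X-Forwarded-For chain might look like the sketch below. The function names `candidate_ips` and `is_listed` are hypothetical, and `$blocklist` is assumed to be an array keyed by IP. Because the header is supplied by the client, it is only used here to find additional addresses to block, never to excuse a listed REMOTE_ADDR.

```php
<?php
// Hypothetical helper: collect every syntactically valid address for a request.
// $server is expected to look like PHP's $_SERVER array.
function candidate_ips(array $server): array
{
    $candidates = [];
    if (isset($server['REMOTE_ADDR'])) {
        $candidates[] = $server['REMOTE_ADDR'];
    }
    if (isset($server['HTTP_X_FORWARDED_FOR'])) {
        // The header may carry a comma-separated chain of addresses.
        foreach (explode(',', $server['HTTP_X_FORWARDED_FOR']) as $part) {
            $candidates[] = trim($part);
        }
    }
    $valid = [];
    foreach ($candidates as $ip) {
        // Drop garbage and duplicates; the header content is untrusted input.
        if (filter_var($ip, FILTER_VALIDATE_IP) !== false) {
            $valid[$ip] = true;
        }
    }
    return array_keys($valid);
}

// Hypothetical helper: true if any candidate address is in the local list.
function is_listed(array $server, array $blocklist): bool
{
    foreach (candidate_ips($server) as $ip) {
        if (isset($blocklist[$ip])) {
            return true;
        }
    }
    return false;
}
```

Called as `is_listed($_SERVER, $blocklist)`, this blocks a request if either the connecting address or any forwarded address is listed. It does nothing about the dynamic-IP problem mentioned above.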
Offline
#31 2009-04-22 6:07 am
- pedigree
- Administrator
- Registered: 2008-04-16
- Posts: 1,447
Re: API Rate Limiting
We don't use "forwarded" IPs anywhere in our code. Personally, I find them seriously unreliable. They could cause IP pollution in the database here if people started auto-submitting them instead of the connecting server IP, and they are easily spoofed as well.
Offline
#32 2009-04-22 9:25 am
- nevereven
- Member

- Registered: 2009-04-21
- Posts: 12
Re: API Rate Limiting
We will soon be implementing a rate limiting scheme in the API for checking IPs/usernames/email addresses. I hate to, but after analyzing the hits, there are a handful of hosts hammering it constantly, and I want to make sure the server resources are not being hogged because of it.
The limit will probably be around 1000 API queries per day, which is going to be plenty for most everyone. If you need more than that, let me know and something can probably be worked out.
If your script is integrated with the API, and your host exceeds the daily limit, the server will return a 403 HTTP status code and the output will look like this.
<response success="false"> <error>rate limit exceeded</error> </response>
You should be able to code sufficient error handling for this case should it happen.
That's a little over 1 inactive forum registration every 2 minutes. Has this been implemented? I'll need to add a check for it, if so.
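A check for the rate-limit response quoted above could be sketched like this. It assumes only what the announcement states (a 403 status and that XML body). The function name `sfs_rate_limited` is hypothetical, and the SimpleXML extension is assumed to be available.

```php
<?php
// Hypothetical helper: recognise the rate-limit reply described in the announcement.
// $httpStatus is the HTTP status code, $body the raw response text.
function sfs_rate_limited(int $httpStatus, string $body): bool
{
    if ($httpStatus !== 403) {
        return false;
    }
    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($body);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if ($xml === false) {
        return false; // a 403 with an unparseable body is some other failure
    }
    return (string) $xml['success'] === 'false'
        && stripos((string) $xml->error, 'rate limit') !== false;
}
```

When this returns true, a plugin would normally stop querying for a while and fall back to its local cache rather than retrying immediately.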
I can't believe, btw, how such high volumes go unnoticed by upstream providers. Amazing.
At the height of the potato famine, the London Times "looked forward" to a time "when a Celt on the Shannon would be as rare as a red man in Manhattan".
Offline
#33 2009-04-22 11:09 am
- kpatz
- Member
- Registered: 2008-10-09
- Posts: 110
Re: API Rate Limiting
AFAIK the limit is actually 5,000 queries/day. No one should see that many new registrations in a day, unless you're Google or something.
Spammer spam-ur (N) Someone or something who is dumber than a bag of rocks. Commonly known for committing suicidal and idiotic moves such as posting spam on an anti-spam forum.
Offline
#34 2009-04-22 3:26 pm
- pedigree
- Administrator
- Registered: 2008-04-16
- Posts: 1,447
Re: API Rate Limiting
Yes, 5000 queries per 24 hours.
That's queries, not sets. If you query the IP, then the username for the same registration, that's two queries, unless you stack them in the query string:
api.php?username=usertotest&ip=1.23.4.5&email=email@spam.com
That's one query. Hitting the API three times, once per field, is three queries.
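As a rough sketch, a stacked query string can be built in one go with PHP's http_build_query. The function name `sfs_query_string` is hypothetical. The field names and the relative `api.php` path are taken from the example above.

```php
<?php
// Hypothetical helper: put all available fields into ONE query string,
// so a registration check costs a single API query instead of up to three.
function sfs_query_string(array $fields): string
{
    $allowed = ['username', 'ip', 'email']; // the fields shown in the example above
    $params = [];
    foreach ($allowed as $name) {
        if (isset($fields[$name]) && $fields[$name] !== '') {
            $params[$name] = $fields[$name];
        }
    }
    // http_build_query URL-encodes the values (e.g. "@" becomes "%40").
    return 'api.php?' . http_build_query($params);
}
```

For the values in the example, this yields `api.php?username=usertotest&ip=1.23.4.5&email=email%40spam.com`.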
Offline
#35 2009-04-23 6:23 am
- kpatz
- Member
- Registered: 2008-10-09
- Posts: 110
Re: API Rate Limiting
This brings up a suggested enhancement to the API: allow more than one of each field to be passed to a query (e.g. multiple IPs). That way, I can check username, email, registration IP, activation IP, and first post IP all with one query (if they're different).
Maybe allow ip1...ip9 in the query, or some kind of delimiter in the field, e.g. ip=1.2.3.4,2.3.4.5,6.7.8.9 or something like that.
Spammer spam-ur (N) Someone or something who is dumber than a bag of rocks. Commonly known for committing suicidal and idiotic moves such as posting spam on an anti-spam forum.
Offline
#36 2009-04-23 6:43 pm
- pavemen
- Member
- Registered: 2008-01-17
- Posts: 17
Re: API Rate Limiting
I think the ability to include multiple items for each query attribute is a good thing, but it does little for the day-to-day operation of active checks at registration time.
I think it would be beneficial for those users who wish to go back and check existing registrations, but how many IPs/emails/usernames are supplied at any one instance of registration? Just one, for all forums as far as I know... Unless the site owner is caching registrations and then making bulk SFS checks, there would be little reduction in API calls, I would think.
Now that I am building a plugin for the MyBB forum that uses SFS, I have found that the best set of changes Russ could implement right now would be:
1) to output the timestamp of an entry in the advanced search results. This would allow me to cache current data rather than only caching data up to the end of yesterday.
2) to accept a full timestamp (or add hours/mins/secs) in the advanced query input values. In conjunction with #1 above, this would stop duplicate entries from being retrieved by having to query whole days.
What my plugin does is query the advanced search for entries up to the end of the previous day and parse them into a database. During registration, the cache is checked (optional), and if the IP/email/username is found and exceeds the minimum number of times, the registration is denied.
If not denied from the cache, the plugin then uses the API to check and parse the XML output.
I also included 'task' functionality (scheduled processes via the MyBB admin area) that updates the cache with advanced search output limited to the span from the start of the day after the last update to the end of the prior day. Again, I need to track today's 'yesterday' and the date of the last update's 'yesterday' so I can limit the search time and avoid duplicates, which would throw off the frequency checks.
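The cache check and the whole-day update window described above might be sketched as follows. This is not the plugin's actual code: the names `denied_by_cache` and `next_window` are hypothetical, the cache is modelled as a plain array mapping a value to how often it was seen, and "exceeds the minimum" is assumed here to mean "is at least the minimum".

```php
<?php
// Hypothetical helper: deny if any supplied value (IP, email or username)
// appears in the cache at least $minCount times (assumed threshold rule).
function denied_by_cache(array $cache, array $values, int $minCount): bool
{
    foreach ($values as $value) {
        if (($cache[$value] ?? 0) >= $minCount) {
            return true;
        }
    }
    return false;
}

// Hypothetical helper: the next span of whole days to fetch, given the last
// day already cached and today's date (both 'Y-m-d'). Returns null when the
// cache already reaches yesterday, so no day is ever fetched twice.
function next_window(string $lastCoveredDay, string $today): ?array
{
    $tz = new DateTimeZone('UTC');
    $from = (new DateTimeImmutable($lastCoveredDay, $tz))->modify('+1 day');
    $to = (new DateTimeImmutable($today, $tz))->modify('-1 day');
    if ($from > $to) {
        return null;
    }
    return [$from->format('Y-m-d'), $to->format('Y-m-d')];
}
```

If `denied_by_cache()` returns false, the registration would then go on to the live API check, as described above.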
Offline