#1 2009-01-17 12:27 pm

AsciiD
Member
Registered: 2008-12-07
Posts: 49

Long usernames in the spam database?

I have recently noticed usernames in the database that are long enough to push the right side of the page far, far out.

This example is pasted from today's list, wraps once, and counts 134 characters in all.

Региональные новости

Is anyone really allowing usernames with that many characters on their forum?

Is it possible the username field could wrap at a lesser value (say 63 chars)?

Offline

#2 2009-01-17 2:34 pm

pedigree
uıɐbɐ ʎɐqǝ ɯoɹɟ pɹɐoqʎǝʞ ɐ buıʎnq ɹǝʌǝu ɯ,ı
From: New Zealand
Registered: 2008-04-16
Posts: 7,104

Re: Long usernames in the spam database?

It's because of UTF encoding. Each Russian symbol translates to between 3-6 characters in Latin encoding.
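A minimal sketch of the arithmetic, assuming (the thread does not confirm this) that the list stores non-ASCII names as decimal HTML numeric character references such as `&#1056;`. Under that assumption each Cyrillic letter in this name costs 7 ASCII characters, which reproduces the 134 count from the first post; the same text is only 39 bytes in UTF-8.

```python
import html

# The username as it renders in the first post (19 Cyrillic letters + 1 space).
name = "Региональные новости"

# Assumption: the stored form replaces each non-ASCII character with a
# decimal numeric character reference, e.g. "Р" -> "&#1056;".
encoded = name.encode("ascii", "xmlcharrefreplace").decode("ascii")

print(len(name))                   # characters as displayed
print(len(encoded))                # characters in the reference-encoded form
print(len(name.encode("utf-8")))   # bytes in UTF-8

# Decoding the references gives back the original text.
assert html.unescape(encoded) == name
```

That would explain why the string looks alphanumeric on the main page but turns into readable Cyrillic once a browser or forum decodes the references.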

Offline

#3 2009-01-17 4:17 pm

AsciiD
Member
Registered: 2008-12-07
Posts: 49

Re: Long usernames in the spam database?

I have support for Cyrillic and several East Asian code pages.

I can see it is rendered properly as Cyrillic text once posted; however, that does not alter the 134-character string I see on the main page.

The online forum editor also renders it as a 134-character alphanumeric string.

Last edited by AsciiD (2009-01-17 4:38 pm)

Offline

#4 2009-01-17 4:28 pm

AsciiD
Member
Registered: 2008-12-07
Posts: 49

Re: Long usernames in the spam database?

Please see these entries:

1/16/09 7:47 PM - 68.34.1.132
1/16/09 7:48 PM - 95.28.88.36

Offline

#5 2009-01-18 5:42 pm

pedigree
uıɐbɐ ʎɐqǝ ɯoɹɟ pɹɐoqʎǝʞ ɐ buıʎnq ɹǝʌǝu ɯ,ı
From: New Zealand
Registered: 2008-04-16
Posts: 7,104

Re: Long usernames in the spam database?

Cyrillic still has to be encoded, regardless of how a forum supports it. If you look in your database using phpMyAdmin (for example), you will see how it is encoded by your forum/PHP and stored in the database. The string in your first post displays as Russian to me. I'm not sure whether there is any easy way for PHP/MySQL to natively map all language ranges within UTF encoding to numeric/alphabetic Latin characters.

Offline

#6 2009-01-18 9:17 pm

zaphod
Jägermonster
From: USA
Registered: 2008-11-22
Posts: 2,985

Re: Long usernames in the spam database?

Would it not be better to treat the domain of the mail address (right of the @) as a separate field, then take perhaps the first 12 and the last 12 characters of any username, and do a pattern match (PHP substr_count($haystack, $needle)) on both the first 12 AND (or OR, for more paranoid admins) the last 12 of the username, AND the domain?

This of course would require a re-structure of the database, and the interface modules. Perhaps something to keep in mind for the future?
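A rough sketch of that matching idea, written in Python rather than PHP. The function names, the `paranoid` flag, and the choice to search the stored name for the candidate's first and last 12 characters are all assumptions made for illustration; the post does not say which string is the haystack.

```python
def split_email(email):
    """Return (local_part, domain); the domain is everything right of the last @."""
    if "@" not in email:
        return email, ""
    local, _, domain = email.rpartition("@")
    return local, domain.lower()


def name_keys(username, width=12):
    """Return the first and last `width` characters of the username."""
    return username[:width], username[-width:]


def matches(cand_name, cand_email, known_name, known_email, paranoid=False):
    """Compare a candidate against one stored entry (hypothetical schema)."""
    if not cand_name:
        return False  # an empty needle would match every haystack
    head, tail = name_keys(cand_name)
    # str.count(...) > 0 is the same test as PHP substr_count(...) > 0.
    head_hit = known_name.count(head) > 0
    tail_hit = known_name.count(tail) > 0
    # Strict mode needs both ends to match; paranoid mode accepts either.
    name_hit = (head_hit or tail_hit) if paranoid else (head_hit and tail_hit)
    _, cand_domain = split_email(cand_email)
    _, known_domain = split_email(known_email)
    return name_hit and cand_domain != "" and cand_domain == known_domain
```

For example, `matches("cheap-pills-online-XXXX", "a@spam.example", "cheap-pills-online-2009", "b@spam.example")` is false in strict mode because only the first 12 characters match, and true with `paranoid=True`.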

Zap :/


Get Protected, Stay Protected...
With ZB Block, GNU/GPL Freeware Anti-Spam/Anti-Hack protection for your php based website.

Little boxes in the server farm, little boxes running php...

Offline

#7 2009-01-19 3:24 am

AsciiD
Member
Registered: 2008-12-07
Posts: 49

Re: Long usernames in the spam database?

Just to ensure we are not working at cross purposes:

My system and browser support UTF-8 in general and Cyrillic code pages specifically. I understand what UTF-8 is.

My GUI browser (Seamonkey) renders the text (as posted in this thread) recognisably as Cyrillic. My console browsers (elinks, links) render this phrase as "Regional'nye novosti".

I was not concerned with my database, but took issue with how these usernames displayed on your main page.

Although they have both dropped off the main page, a search in your database for 95.28.88.36 still returns the same 134-character string. It seems impossible to reproduce this literally here in this forum, even with "code" tags.

I have come to expect a default placeholder indicating an "unprintable" character in most GUI apps. That the characters were being rendered in their literal alphanumeric form, I thought, might have been an indicator of something.

At the time I had only viewed this string on the main page. Only after I had posted it here in the forum was it rendered as text.

Although for me it is a minor inconvenience, I had thought it a possible indicator of some larger concern at your end.

I do not believe I am the only one this has ever happened to, and am sure it will happen again to someone in the future.

:)

Offline

#8 2009-01-19 2:38 pm

pedigree
uıɐbɐ ʎɐqǝ ɯoɹɟ pɹɐoqʎǝʞ ɐ buıʎnq ɹǝʌǝu ɯ,ı
From: New Zealand
Registered: 2008-04-16
Posts: 7,104

Re: Long usernames in the spam database?

Ah, I get what you're getting at. Yes, the search will return the Latin-encoded form instead of the UTF-encoded Cyrillic. That's because of UTF-7/8 cross-site scripting vulnerabilities: I haven't got to the bit of displaying the text correctly without leaving it vulnerable to XSS. I removed all the XSS that I could when I revised the site's PHP, and that's the only one that I couldn't make work properly, hence why it's like that.

Searching from within a script/API will still return a hit though.
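One possible way to get readable names without reopening the XSS hole, sketched in Python rather than the site's PHP. It assumes the stored value uses HTML numeric character references, and `render_username` and `CONTENT_TYPE` are hypothetical names, not part of the site's code. The idea is to decode the references to real text, escape the HTML-significant characters again, and declare the charset explicitly so the browser does not have to guess the encoding.

```python
import html

# Sent as the Content-Type header so the page is never sniffed as UTF-7.
CONTENT_TYPE = "text/html; charset=utf-8"


def render_username(stored):
    """Turn a reference-encoded name into HTML-safe text for a UTF-8 page."""
    text = html.unescape(stored)           # "&#1056;" -> "Р", "&#60;" -> "<"
    return html.escape(text, quote=True)   # re-escape & < > " ' only


print(render_username("&#1056;&#1077;&#1075;"))
print(render_username("&#60;script&#62;alert(1)&#60;/script&#62;"))
```

Cyrillic letters come out as plain UTF-8 characters, while anything that decodes to markup is escaped again before it reaches the page.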

I should drink more coffee

Offline

#9 2009-01-19 3:49 pm

AsciiD
Member
Registered: 2008-12-07
Posts: 49

Re: Long usernames in the spam database?

;) Strong dark brew for me.

Offline
