Re: Cursing code

From: Templar Viper (templarviper@hotmail.com)
Date: 05/14/03


From: "Jesse Becker" <hawson@temperedweaves.com>
> On Wed, May 14, 2003 at 06:57:41PM +0200, Templar Viper wrote:
> > From: "Thomas Arp" <t_arp@stofanet.dk>
> > > From: "Templar Viper" <templarviper@HOTMAIL.COM>
> > > > A while ago, I pieced this code together. It checks the argument for
> > > > cursing (placed in curse_list), and replaces it with a harmless
> > > > beep. I had to use str_strr, as strstr is case sensitive. However,
> > > > I want to ignore certain characters, such as spaces, full stops and
> > > > more of those.
> > >
> > > I'm a bit unclear on your intent here;
> > > Do you wish to make sure to catch "CUR SE", too? Or do you mean you
> > > have 'multiword' curse words?
> >
> > Yes, I do want to catch "CUR SE".
>
> I had to help out with something like this a few years ago for a
> semi-major online game from a semi-major online game company (both of
> which shall remain nameless <grin>).  We didn't bother to filter
> chat--that was nominally covered by the TOS agreement (<chuckle>), but
> we did want to prevent people from choosing handles based on various
> 'illegal' words.
>
> We did this:
>
> 1)  Generate a list of invalid words (BADWORDS), in plain English.[1]
> 2)  Generate a list of substitutions that could be used, stuff
>     like 'l' => '1', and 'k' => '|<', and 'U' => 'V' or '|_|', etc.
> 3)  Permute BADWORDS against all of our mappings, and generate a HUGE
>     list of 'invalid' names.
> 4)  On each name creation, check the proposed name against this list.
>
>
> Note 1:  This was, perhaps, the single most fun thing I've ever done in
> a paying job. :-)  I also discovered just how twisted the minds of two
> of my co-workers are...

Wow, that looks like fun to do, especially when you get paid for it :) I can
think of worse things to do than coming up with a huge list of bad words. I
see you put in quite some effort to make a watertight system.

> It worked quite well, and the permutation, which was essentially an N*N
> operation, only had to be done whenever we updated the BADWORDS list
> (which wasn't very often...see note 1...).  The new checks added zero
> CPU time overhead.  These were also running on hardware that was already
> old (HP 715 pizza boxes).
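
For reference, the permutation step described above could be sketched
roughly like this in C. This is only my guess at the idea, not the code
that was actually used: the table entries, the names subst_table, expand()
and print_variant(), and the MAX_VARIANT limit are all made up for
illustration, and the words are assumed to be lowercase already.

```c
#include <stdio.h>
#include <string.h>

#define MAX_VARIANT 64   /* assumed upper bound on one expanded name */

/* Hypothetical substitution table: each letter maps to the spellings a
 * player might type instead (the letter itself is listed first). */
struct subst {
  char letter;
  const char *alts[4];   /* NULL-terminated */
};

static const struct subst subst_table[] = {
  { 'l', { "l", "1", NULL } },
  { 'k', { "k", "|<", NULL } },
  { 'u', { "u", "v", "|_|", NULL } },
};

/* Return the alternatives for one letter, or NULL if it has none. */
static const char *const *find_alts(char c)
{
  size_t i;

  for (i = 0; i < sizeof(subst_table) / sizeof(subst_table[0]); i++)
    if (subst_table[i].letter == c)
      return subst_table[i].alts;
  return NULL;
}

/* Example callback: print one generated variant. */
void print_variant(const char *variant)
{
  puts(variant);
}

/* Expand the remaining letters in `rest`, appending to buf[0..len).
 * Calls emit() (if not NULL) for every complete variant and returns the
 * number of variants generated. buf must hold MAX_VARIANT bytes. */
int expand(const char *rest, char *buf, size_t len,
           void (*emit)(const char *))
{
  const char *const *alts;
  const char *fallback[2];
  char self[2];
  int count = 0;

  if (*rest == '\0') {
    buf[len] = '\0';
    if (emit != NULL)
      emit(buf);
    return 1;
  }

  alts = find_alts(*rest);
  if (alts == NULL) {          /* no mapping: keep the letter as it is */
    self[0] = *rest;
    self[1] = '\0';
    fallback[0] = self;
    fallback[1] = NULL;
    alts = fallback;
  }

  for (; *alts != NULL; alts++) {
    size_t n = strlen(*alts);

    if (len + n >= MAX_VARIANT)
      continue;                /* skip variants that would not fit */
    memcpy(buf + len, *alts, n);
    count += expand(rest + 1, buf, len + n, emit);
  }
  return count;
}
```

With this table, calling expand("luk", buf, 0, print_variant) on a
MAX_VARIANT-sized buf generates 2 * 3 * 2 = 12 variants. The count grows
multiplicatively with word length, which is why doing it once per BADWORDS
update rather than per check makes sense.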

I prefer to keep it simpler - KISS (Keep it simple, stupid ;)
I don't intend to create the über-anti-badword system. Instead, I want to
create a simple system to filter out obvious cursing.
If a player _really_ wants to get bad words into chat, he'll do that anyway,
I think.
The system is there to make the curser aware that he shouldn't be cursing. If
he chooses to ignore it, so be it; then it is up to the immortals to get rid
of the pest, just like in any other mud.
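
A minimal sketch of the simple approach, assuming it is enough to throw
away everything that is not a letter or digit, lowercase the rest, and then
run a plain strstr() over it. The curse_list entries here are placeholders,
and squeeze(), contains_curse() and MAX_INPUT are names I made up for this
example; the beep replacement is left out.

```c
#include <ctype.h>
#include <string.h>

#define MAX_INPUT 256   /* assumed maximum argument length */

/* Placeholder entries, lowercase, NULL-terminated. */
static const char *curse_list[] = { "curse", "darn", NULL };

/* Copy src into dst, keeping only letters and digits, lowercased.
 * Spaces, full stops and other punctuation are dropped. */
static void squeeze(const char *src, char *dst, size_t dstsize)
{
  size_t n = 0;

  if (dstsize == 0)
    return;
  for (; *src != '\0' && n + 1 < dstsize; src++)
    if (isalnum((unsigned char)*src))
      dst[n++] = (char)tolower((unsigned char)*src);
  dst[n] = '\0';
}

/* Return 1 if the squeezed argument contains an entry of curse_list. */
int contains_curse(const char *arg)
{
  char buf[MAX_INPUT];
  int i;

  squeeze(arg, buf, sizeof(buf));
  for (i = 0; curse_list[i] != NULL; i++)
    if (strstr(buf, curse_list[i]) != NULL)
      return 1;
  return 0;
}
```

This catches "CUR SE" and "c.u.r.s.e", but because the spaces are gone it
also matches across word boundaries: "occur several" is flagged as well.
For a filter that only has to catch the obvious cases, that may be an
acceptable price.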

> Now, if you want to filter on the fly, you now have, at worst, a linear
> search to perform per word.  Naturally, if you do some clever data
> structure work, you can improve on this.  All of the patterns were for
> matching whole words only, so we didn't have to worry about words like
> "hassle".  Adding a rule to permute your words into spaced out versions
> and adding them to your checklist can also be done.
>
> This might be a fair bit faster than recomputing each word on the fly,
> especially if you have a good searching algorithm.
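
One possible reading of the "clever data structure" remark, sketched with
the standard bsearch() over a sorted, precomputed list. The list contents
and the names invalid_names, cmp_name() and is_invalid_name() are
hypothetical; a hash table or trie would be other options.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical precomputed list. It must stay sorted in strcmp() order
 * ('u' < 'v' < '|' in ASCII), otherwise bsearch() gives wrong answers. */
static const char *const invalid_names[] = {
  "curse",
  "cvrse",
  "c|_|rse",
};

/* bsearch() callback: key is the name, elem points at a list entry. */
static int cmp_name(const void *key, const void *elem)
{
  return strcmp((const char *)key, *(const char *const *)elem);
}

/* Return 1 if name matches a whole entry of invalid_names exactly. */
int is_invalid_name(const char *name)
{
  return bsearch(name, invalid_names,
                 sizeof(invalid_names) / sizeof(invalid_names[0]),
                 sizeof(invalid_names[0]), cmp_name) != NULL;
}
```

Since only whole entries match, "curses" or "hassle" pass this check, and
each lookup costs O(log n) comparisons instead of a linear scan.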

I am not that concerned about spending CPU resources. I don't think the
function will need to process hundreds of lines per second, only a few every
minute. I know the method could be improved, but I prefer to spend time on
other parts of the code in my mud :)

--
   +---------------------------------------------------------------+
   | FAQ: http://qsilver.queensu.ca/~fletchra/Circle/list-faq.html |
   | Archives: http://post.queensu.ca/listserv/wwwarch/circle.html |
   | Newbie List:  http://groups.yahoo.com/group/circle-newbies/   |
   +---------------------------------------------------------------+



This archive was generated by hypermail 2b30 : 06/26/03 PDT