Re: [NEWBIE]Invalid names

From: Daniel Koepke (dkoepke@california.com)
Date: 11/10/99


On Wed, 10 Nov 1999, Strider Aragorn wrote:

> They way I got around this was to include two lists.  One with names that
> couldn't be ANYWHERE in the name:

Well, FWIW, the way I got around it is to use regular expressions for the
bad name list.  This permits both sorts of matching, and others, in a far
more consolidated format.  The other idea I've been playing with is to
replace the list of invalid names with a list of valid names, either
according to some formula (in which case an actual listing would be
unnecessary: checking the name for compliance with the formula's criteria
would be sufficient) or manually compiled from some source.  The latter
might be slightly more work, but would probably generate much higher
quality names.  I don't think there'd be any problem with shortages of
names.  I could probably come up with about 1000 Celtic (or
Celt-influenced) names for characters.
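
For the formula approach, the check might look something like the sketch
below.  The rule itself (a capital letter followed by 2 to 11 lowercase
letters) and the name ValidNameByFormula() are only assumptions for
illustration.  Note that it uses REG_EXTENDED so the {2,11} interval can
be written without backslashes.

```c
#include <regex.h>
#include <stddef.h>

/*
    Hypothetical formula: a capital letter followed by 2 to 11 lowercase
    letters.  Returns 1 if name fits the formula, 0 if not (or on error).
*/
int
ValidNameByFormula (const char *name)
{
    regex_t formula;
    int ok;

    if (regcomp(&formula, "^[A-Z][a-z]{2,11}$", REG_EXTENDED | REG_NOSUB))
        return 0;

    ok = (regexec(&formula, name, 0, NULL, 0) == 0);
    regfree(&formula);
    return ok;
}
```

Compiling the formula on every call is wasteful; a real version would
probably compile it once at boot, the same way a bad names list would be.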

The other part of this system is to not make players GUESS valid names or
make a name a restricting factor.  Instead, new characters pick race,
gender (as it applies to the race, so that not every race is as binary as
"male" or "female"), origins, etc., and from those choices we either give
them a name or allow them to browse through names that fit and are unused.
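
The browsing half could be as simple as filtering a name table.
Everything in this sketch (the name_entry struct, the race and gender
constants, and the in_use flag) is made up for illustration; a real MUD
would more likely check the player index than keep a flag.

```c
#include <stddef.h>

/* Hypothetical race and gender codes; GENDER_ANY fits every gender. */
enum { RACE_HUMAN, RACE_ELF };
enum { GENDER_ANY, GENDER_MALE, GENDER_FEMALE };

struct name_entry {
    const char *name;
    int race;
    int gender;
    int in_use;     /* nonzero once a character has taken the name */
};

/*
    Copy up to max unused names that fit race and gender into out.
    Returns the number of names copied.
*/
size_t
BrowseNames (const struct name_entry *table, size_t count,
             int race, int gender, const char **out, size_t max)
{
    size_t i, found = 0;

    for (i = 0; i < count && found < max; i++) {
        if (table[i].in_use || table[i].race != race)
            continue;
        if (table[i].gender != GENDER_ANY && table[i].gender != gender)
            continue;
        out[found++] = table[i].name;
    }

    return found;
}
```

Handing the player a name outright is then just a matter of picking one
entry from whatever BrowseNames() returns.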

> There is a little bit more to it (loading the list, etc) but copy the code
> already there and it should be easy.

Speaking of which, regular expression support is not difficult to
implement (although it isn't entirely straightforward).  There are existing
libraries that do all of the work for you in a few function calls, so you
shouldn't have too great a problem.  In fact, according to the 'info'
pages, Glibc has regular expression support included.  Basically, there
are two steps to supporting regular expressions.  The first is the step
that prevents it from being completely straightforward: regular
expressions must be compiled (so that matching against them is fast and
easy).  To do this one calls regcomp() (or the appropriate equivalent for
your library); regcomp() is for compiling POSIX regexps.  The second step
is matching a string against the compiled expression, which is done with
regexec().  Both are used in the following manner (Beware, Mailer
Code(tm)):

    #include <regex.h> /* POSIX regcomp(), regexec(), regerror() */

    regex_t * g_badNames = NULL; /* Global Regexps for Bad Names */
    int g_numBadNames = 0; /* Precounted Number of Bad Names */

    .
    .
    .

    /*
        void InitializeBadNames ()

        Initialize the list of bad names by reading every non-comment line
        from the bad names file and compiling them as regular expressions.
        Note that regular expressions are case insensitive (REG_ICASE) and
        that substring indices are not requested (REG_NOSUB).  The latter
        means that the third and fourth arguments of regexec() can be
        arbitrary and are ALWAYS ignored regardless of whether they are
        valid or not.
    */

    void
    InitializeBadNames (void)
    {
        int idx = 0, t, lines = 0;
        char line[256];
        FILE *fp;

        if ((fp = fopen(BADNAMES, "r")) == NULL) {
            log("Could not open %s.  Bad names check disabled.",BADNAMES);
            return;
        }

        /* Count the number of bad names then rewind the file. */
        while (get_line(fp, line)) g_numBadNames++;
        rewind(fp);

        /* Read the bad names and fill in g_badNames. */
        CREATE(g_badNames, regex_t, g_numBadNames+1);

        /* lines tracks the file line (for errors); idx the array slot. */
        while (idx < g_numBadNames && (t = get_line(fp, line))) {
            lines += t;
            if (regcomp(g_badNames+idx, line, REG_ICASE | REG_NOSUB)) {
                log("Error compiling regexp on line %d", lines);
                /* Do better error reporting via regerror(). */
                exit(1);
            }
            idx++;
        }

        fclose(fp);
        log("Loaded %d bad name regular expressions.", idx);
    }


    /*
        bool BadName (const char *)

        Executes all of the bad name regular expressions on the passed
        string to determine if it fits any of these.  Returns TRUE if
        the passed string matches any of the bad name regular expressions
        and FALSE otherwise (even on error!).
    */

    bool
    BadName (const char *name)
    {
        int idx = 0;

        for (idx = 0; idx < g_numBadNames; idx++)
            if (!regexec(g_badNames+idx, name, 0, NULL, 0))
                break; /* Matched! */

        return (idx != g_numBadNames);
    }
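
The "better error reporting" hinted at in InitializeBadNames() might look
something like the sketch below, which also makes it easy to try out
individual patterns.  CompileBadName() and MatchesBadName() are
hypothetical helpers, not part of CircleMUD; regerror() and regfree() are
the standard POSIX calls.

```c
#include <regex.h>
#include <stdio.h>

/*
    Compile one bad name pattern.  On failure, write regerror()'s
    description of the problem into errbuf and return 0.  Returns 1
    on success; the caller should regfree() the regex when done.
*/
int
CompileBadName (regex_t *re, const char *pattern,
                char *errbuf, size_t errlen)
{
    int rc = regcomp(re, pattern, REG_ICASE | REG_NOSUB);

    if (rc != 0) {
        regerror(rc, re, errbuf, errlen);
        return 0;
    }

    return 1;
}

/* Returns 1 if name matches pattern, 0 if not or if pattern is bad. */
int
MatchesBadName (const char *pattern, const char *name)
{
    regex_t re;
    char errbuf[128];
    int hit;

    if (!CompileBadName(&re, pattern, errbuf, sizeof(errbuf))) {
        fprintf(stderr, "Bad regexp '%s': %s\n", pattern, errbuf);
        return 0;
    }

    hit = (regexec(&re, name, 0, NULL, 0) == 0);
    regfree(&re);
    return hit;
}
```

This is where the single list covers both sorts of matching: an
unanchored pattern such as "admin" matches anywhere in the name, so
MatchesBadName("admin", "SuperAdmin") is a match, while an anchored one
such as "^god$" only matches the whole name, so "God" is rejected but
"Godfrey" is not.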

SURGEON GENERAL'S WARNING: Mailer Code(tm), such as the above, has been
proven to cause cancer, hair loss, fatigue, headache, fever, upset
stomach, shortness of or inability to catch breath, impotence,
unpopularity, asteroid impacts, small civil uprisings in third world
countries, revolution, mad cow disease, frivolous lawsuits, electronic
transmission of obscene material over the Internet, the election of bad
politicians, a rise in Pauly Shore's popularity, lewd behavior, a Seinfeld
reunion, a new series on the WB, incest, Jerry Springer "Too Hot For
TV" tapes, minor concussions and other head trauma, moderate to large
sized explosions, outbreak of plague, disease, and famine, and, in some
rare cases, frustration.  You should not use Mailer Code(tm) if you are a
woman or pregnant; if you have a humanoid form; if you feel life is
precious and taking gambles with it is foolish; if you have any sense of
morality; or if you don't want people to come up to you on the street to
remark, "You know, Mailer Code's bad for you."

You have been warned.

(P.S., the redundancy of "woman or pregnant" is intentional and taken from
the television advertisements for Propecia, where it's stated first,
"Women should not use Propecia," and then later, "You should not use
Propecia if you are pregnant or are going to be pregnant."  The whole
thing might seem quite absurd, but a certain company I know very well
includes in its licensing agreement, under section 9, Force Majeure, that
they will not be held responsible for service failures as a result of,
"fire, lighting [sic], explosion, power surge or failure, water, acts of
God, war, revolution, civil commotion or acts of civil or military
authorities or public enemies," along with a number of other rather
bizarre protections.)

Also, dkoepke@california.com will be expiring soon enough.  I'll probably
switch my CircleMUD mailing list subscription to dkoepke@circlemud.org and
set up a temporary e-mail account for personal e-mail.

-dak




