Re: Regular Expressions [was: Invalid names]

From: Daniel A. Koepke (dkoepke@california.com)
Date: 11/11/99


On Thu, 11 Nov 1999, Mark A. Heilpern wrote:

> I do have a question on the Mailer Code(tm) you included...  In the area
> where you're compiling the expression, the first parameter to regcomp() is
> your expression plus an offset i. You don't have a local i defined, though
> you do have idx. However, you aren't using it anywhere beyond
> initialization to 0. Am I correct in assuming that idx should be
> incremented by 1 after each successful call to regcomp()?

Yes -- as always, this is mailer code.  The corrected version (with one
other fix) is below:

     void
     InitializeBadNames (void)
     {
         int idx = 0, t, lines = 0;
         char line[256];
         FILE *fp;

         if ((fp = fopen(BADNAMES, "r")) == NULL) {
             log("Could not open %s.  Bad names check disabled.",BADNAMES);
             return;
         }

         /* Count the number of bad names then rewind the file. */
         g_numBadNames = 0; /* Start from zero in case we're reloading. */
         while (get_line(fp, line)) g_numBadNames++;
         rewind(fp);

         /* Read the bad names and fill in g_badNames. */
         CREATE(g_badNames, regex_t, g_numBadNames+1);

         while ((t = get_line(fp, line))) {
             lines += t;
             if (regcomp(g_badNames+idx, line, REG_ICASE | REG_NOSUB)) {
                 log("Error compiling regexp on line %d", lines);
                 /* Do better error reporting via regerror(). */
                 exit(1);
             }
             idx++; /* Increment it. */
         }

         log("Loaded %d bad name regular expressions.", g_numBadNames);
         fclose(fp); /* Avoid memory leak, etc. */
     }

A few things should be pointed out about this.  "&g_badNames[idx]" is
probably clearer than "g_badNames+idx", so I would recommend that instead
of the above.  Error handling with POSIX regular expressions is done by
regcomp() and regexec() returning error codes.  Both return zero on
success (which, for regexec(), means it found a match) and some REG_xxx
code otherwise; for regexec(), REG_NOMATCH just means the string didn't
match and is not really an error.  The return value can be passed as the
first argument to regerror(), followed by the compiled regexp, a
character buffer, and the size of that character buffer, in that order.
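
To make the error handling above concrete, here is a minimal sketch of
reporting a regcomp() failure through regerror() and of telling
REG_NOMATCH apart from a real regexec() error.  The helper names
compile_bad_name() and matches_bad_name() are made up for illustration;
they are not part of CircleMUD or of the code above.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Compile `pattern` into `re` with the same flags used above.  On failure,
 * fill errbuf with regerror()'s description and return the nonzero REG_xxx
 * code.  (Hypothetical helper, not a CircleMUD function.)
 */
int compile_bad_name(regex_t *re, const char *pattern,
                     char *errbuf, size_t errbuf_size)
{
    int rc = regcomp(re, pattern, REG_ICASE | REG_NOSUB);

    if (rc != 0)
        regerror(rc, re, errbuf, errbuf_size);
    return rc;
}

/*
 * Return 1 if `name` matches, 0 if it does not, and -1 on any other
 * regexec() failure.  (Hypothetical helper.)
 */
int matches_bad_name(const regex_t *re, const char *name)
{
    int rc = regexec(re, name, 0, NULL, 0);

    if (rc == 0)
        return 1;
    if (rc == REG_NOMATCH)
        return 0;
    return -1;
}
```

In the loop above, the buffer filled in by compile_bad_name() could be
handed to log() along with the line number before bailing out.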

Finally, any regular expression you're not dealing with on a full time
basis (i.e., let's say you're going to reload your g_badNames list) should
be deleted with regfree().  Note that regfree() releases a single compiled
regexp, so regfree(g_badNames); would only free the first entry in our
array.  Call it on each element that regcomp() filled in, then free the
array itself (and reset g_numBadNames before reloading).
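
For the array case, a rough sketch of the cleanup might look like the
following.  free_regex_list() is a hypothetical helper, and it assumes
the caller knows how many entries were actually compiled; calling
regfree() on an entry that regcomp() never filled in is not safe.

```c
#include <regex.h>
#include <stdlib.h>

/*
 * Release `count` compiled expressions stored in the array `*list`, then
 * the array itself, and leave `*list` set to NULL.  `count` must be the
 * number of entries regcomp() successfully filled in.
 * (Hypothetical helper, not a CircleMUD function.)
 */
void free_regex_list(regex_t **list, size_t count)
{
    size_t i;

    if (*list == NULL)
        return;
    for (i = 0; i < count; i++)
        regfree(&(*list)[i]);   /* regfree() handles one regexp at a time */
    free(*list);
    *list = NULL;
}
```

Before a reload, this would be called as
free_regex_list(&g_badNames, g_numBadNames); followed by setting
g_numBadNames back to 0.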

    #define FREE(_addr) do { free(_addr); _addr = NULL; } while (0)

Ah, and I just spotted and fixed something else in the Mailer
Code(tm).  The logging of the number of regexps loaded used 'lines'
instead of 'g_numBadNames', which would report how many lines are in the
file, not how many bad name regexps were loaded.

Luckily, I gave fair warning about the volatility of Mailer Code(tm).


,-'~^~`-,._.,-'~^~`-,._.,-'


ObOffTopic:

What do people name their computers?  I've recently had difficulty
thinking up names for the addresses of my computers.  This is obviously a
very important decision (and I can even tie it into CircleMUD, since the
name of a particular computer at JHU became the name of Jeremy's little
MUD project), one which could change the course of my life for years to
come.  Okay, so maybe that's a bit on the melodramatic side.  But, in any
case, I've named two of my computers thus far.  The one I'm sitting at
right now is 'hoodoo' (no, I didn't misspell "voodoo"; fetch a dictionary
and look it up, the OED and OAD have it, at the least).  I think of it as
the same computer I've had for the past five or so years, although I've
changed NEARLY every part of it except the floppy drive and my
SoundBlaster 16.  It's currently a Pentium II running Slackware 7.0 (as
of three days ago).  The other named computer is
'tyro'.  This is the newer computer, a Pentium III.  It doesn't do a whole
lot right now other than bookkeeping and playing of DVDs.  Finally, there
are three other machines that are in need of names (yes, I know, I own
five computers and that makes me strange.  But, really, I own more than 5,
these are just the ones that are part of my house network).  The sound
server, which is in charge of playing MP3s and such; the game computer,
which has a pretty obvious function; and the printer/word processor
computer, which is your standard MS Office setup.  Prospective names are
'mozart' for the sound server; 'checkers' for the gaming computer; and
'redcoat' for the big, bad Win98 computer (which, up until I put together
the Pentium III, was the only computer running just Windows [the game
server runs it and Linux], and thus was analogous to a redcoat: a British
soldier in the American Revolution).  So, how do you other people name
your computers?  Have better names for any of mine?

-dak




