Re: [NEWBIE] [CODE] file_length() and Parsing

From: Sammy (samedi@ticnet.com)
Date: 05/11/00


Erk!  Sorry about the double post.  Ticnet's new webmail is still a little
buggy.

From: "Jarratt Davis" <Jarratt.Davis@CAMTECH.COM.AU>
Sent: Thursday, May 11, 2000 8:14 AM

> On Thu, 11 May 2000, Sammy Should Use His Whole Name Next Time wrote:
> <Some stuff snipped>
> > err = fstat(fileno(fp), &sb);
> >
> > This should be portable across all systems.  I think I've used it in
> > win95/98, slackware and redhat linux (roughly a year old), and probably at
> > least one flavor of bsd.
> >
> > Sam

> Thanks very muchly for pointing that out - I wouldn't have noticed
> till I actually tried to compile it on a UN*X of some flavour.  My comments
> in that file were simply from my poking around in stdio.h in the MSVC5
> includes directory and also /usr/include on a reasonably recent version of
> FreeBSD.  The old brain isn't working too well at this point in time :P
> Next step is to rewrite the parsing system for my ascii files.
> My old one read from the files directly, which of course is a bad way to go.
> Well, on my laptop it is anyway :)  My old function interprets stuff
> like:
> ImAString ("You enter a large room /(My isnt it bright in here for such a
> large indoors room/).  You see a sign that says, /"Welcome adventurer/".")
> The way this was written, it would read in the file byte by byte,
> remembering the byte that came before so it could recognise "control"
> characters such as /" /( /) as well as (" and ").  What I want to know is:
> is this really the way I have to go, or can I do it another way without
> slowing things down to byte-by-byte parsing?  Or should I just make my ASCII
> system simpler? :)

If by reading one byte at a time you mean reading from disk, then you should
definitely move to a system that reads the entire file, or at least a whole
line, into memory first.  If you're talking about parsing from a buffer you've
already read from disk, then don't worry too much about speed.  Parsing strings
in memory is fast, and stock Circle code does it regularly.
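
A minimal sketch of the read-it-all-at-once approach, assuming a POSIX-style
fstat()/fileno() (some Windows compilers prefer the underscore-prefixed
_fstat()/_fileno()).  The name read_whole_file() is made up for this example
and is not taken from diskio.c.  The file is opened in binary mode so the byte
count from fstat() matches what fread() returns.

```c
#define _POSIX_C_SOURCE 200112L  /* ask for fileno()/fstat() declarations */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: read a whole file into a malloc()ed, NUL-terminated
 * buffer.  Returns NULL on any failure; the caller free()s the result. */
char *read_whole_file(const char *path, size_t *len_out)
{
    FILE *fp;
    struct stat sb;
    char *buf;
    size_t want, got;

    assert(path != NULL);

    fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    /* Ask the OS for the file length instead of seeking to the end. */
    if (fstat(fileno(fp), &sb) != 0 || sb.st_size < 0) {
        fclose(fp);
        return NULL;
    }

    want = (size_t) sb.st_size;
    buf = malloc(want + 1);          /* +1 for the terminating NUL */
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }

    got = fread(buf, 1, want, fp);   /* one read for the whole file */
    fclose(fp);                      /* the disk file is no longer needed */

    buf[got] = '\0';                 /* terminate at what was actually read */
    if (len_out != NULL)
        *len_out = got;
    return buf;
}
```

From here on, all parsing works on the returned buffer and never touches the
disk again.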

> It also used to handle, among others:
> ImANumber 12345
> * Hey heres a comment
> ImAFlag TRUE
> I guess what I'm saying is, what is the most efficient way of parsing stuff
> like this that is lumped together, as in one whole mob file read into a
> buffer?  Not looking for code - just some ideas to kickstart this poor
> brain of mine.  I've been thinking about this for a few days and I can't
> think of much.  Just had a brief look through the archives: a few brief
> mentions, but nothing really solid enough to be useful to me...
> Sorry to be such a pain - once I've got a direction to aim, I promise you
> won't hear a peep out of me till I've finished it :)  Unless I code myself
> into a corner, that is :)

If those ImA* labels are part of the file, then you've got something pretty
similar to what I've done.  I used a standard 4-character "tag" (actually 6
characters counting a colon and space).  My file buffering code included its
own get_line() function, and probably a get_string() function for descriptions
and such.

When reading a file, the sequence goes something like:

- open file and get its length
- create a buffer of the exact size and fread() the whole thing at once
- close the disk file
- parse one line
- separate the line into the first 4 characters (the tag) and anything after
the colon/space
- find out what to do with it based on the tag
- read more data if necessary (for multiline data such as descriptions)
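
The parsing steps above can be sketched roughly as follows.  This is an
illustration only, not the code from diskio.c: next_line(), split_tag(),
parse_record(), struct record and the "Name"/"Levl" tags are all hypothetical
names, and multiline data such as descriptions is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record with two fields, just for the example. */
struct record {
    char name[32];
    long level;
};

/* Cut the next line out of the buffer in place and advance the cursor.
 * Returns NULL once the buffer is exhausted. */
char *next_line(char **cursor)
{
    char *line, *eol;

    assert(cursor != NULL);
    line = *cursor;
    if (line == NULL || *line == '\0')
        return NULL;

    eol = strchr(line, '\n');
    if (eol != NULL) {
        *eol = '\0';
        *cursor = eol + 1;
        if (eol > line && eol[-1] == '\r')   /* tolerate CRLF files */
            eol[-1] = '\0';
    } else {
        *cursor = line + strlen(line);       /* last line, no newline */
    }
    return line;
}

/* Split "Tagg: value" into a 4-character tag and a pointer to the value.
 * Returns 0 if the line does not have the tag/colon/space shape. */
int split_tag(char *line, char tag[5], char **value)
{
    assert(line != NULL && tag != NULL && value != NULL);
    if (strlen(line) < 6 || line[4] != ':' || line[5] != ' ')
        return 0;
    memcpy(tag, line, 4);
    tag[4] = '\0';
    *value = line + 6;
    return 1;
}

/* Walk the buffer line by line and act on each tag. */
void parse_record(char *buf, struct record *rec)
{
    char *cursor = buf, *line, *value;
    char tag[5];

    assert(buf != NULL && rec != NULL);

    /* Defaults first, so tags missing from the file keep sane values. */
    rec->name[0] = '\0';
    rec->level = 1;

    while ((line = next_line(&cursor)) != NULL) {
        if (!split_tag(line, tag, &value))
            continue;                        /* comment or junk: skip it */
        if (strcmp(tag, "Name") == 0)
            snprintf(rec->name, sizeof(rec->name), "%s", value);
        else if (strcmp(tag, "Levl") == 0)
            rec->level = strtol(value, NULL, 10);
    }
}
```

Note that the buffer is modified in place (newlines become NULs), so nothing
is copied until a value is actually stored.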

I also set default values for all variables before parsing the buffer, so any
values that match the defaults can be left out of the file write.  This can
save you a *ton* of space over binary files.  If you want to see how I've set
things up, get a copy of my ascii pfiles 2.0b system from the ftp site.  The
file diskio.c has all the file buffering and buffer parsing functions in it
and has been very useful to me any time I need to read/write ascii files since
it handles all the memory allocation and other chores for me.
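
To illustrate the defaults idea on the write side, here is a small sketch that
only emits a tag when the value differs from its default.  The struct, the
DEFAULT_* values, the "Levl"/"Gold" tags and write_record() are assumptions
made up for this example; the real pfiles code may organize this differently.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical defaults; the reader must set the same values before parsing. */
#define DEFAULT_LEVEL 1L
#define DEFAULT_GOLD  0L

/* Hypothetical record, just for the example. */
struct pfile_rec {
    long level;
    long gold;
};

/* Write only the fields that differ from their defaults.
 * Returns the number of lines written. */
int write_record(FILE *fp, const struct pfile_rec *rec)
{
    int lines = 0;

    assert(fp != NULL && rec != NULL);

    if (rec->level != DEFAULT_LEVEL) {
        fprintf(fp, "Levl: %ld\n", rec->level);
        lines++;
    }
    if (rec->gold != DEFAULT_GOLD) {
        fprintf(fp, "Gold: %ld\n", rec->gold);
        lines++;
    }
    return lines;
}
```

A record that is entirely at its defaults writes nothing at all, which is
where the space saving comes from.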

If anyone's interested in a bugfixed copy, I think I can probably dig up my
most recent version.

Sam




