Re: Better string swapping. (fwd)

From: George (greerga@CIRCLEMUD.ORG)
Date: 07/04/98


The aforementioned good ideas.  I'd recommend looking into them as
improvements upon the basic code.

--
George Greer, greerga@circlemud.org | Genius may have its limitations, but
http://patches.van.ml.org/          | stupidity is not thus handicapped.
http://www.van.ml.org/CircleMUD/    |                  -- Elbert Hubbard

---------- Forwarded message ----------
Date: Sat, 4 Jul 1998 21:07:53 +0200 (MET DST)
From: "Erwin S. Andreasen" <erwin@andreasen.com>
To: George <greerga@circlemud.org>
Subject: Re:  Better string swapping.

On Sat, 4 Jul 1998, George wrote:

> I took the time to refine the previous code I had to make it use less disk
> space and save more memory.  This is provided with no guarantee and is
> known to be incompatible with all existing OLC systems.
>
> If you're serious about using this, it might be a good idea to not
> fopen/fclose the file constantly as I do here.  fopen() it once in the
> world_init() function and fclose() it in the world_end() function.  This
> will reduce the overhead, but possibly increase the memory usage by a
> little bit. If you do this, remember to fseek() to 0 where the current
> fclose() calls are.
>
> It is, of course, possible to extend this relatively easily...

I understand you meant this only as an example, but it really breaks my
heart to see this:

+int find_room(room_rnum rnum, FILE *fl)
+{
+  int jump;
+
+  if (rnum < 0 || rnum > top_of_world)
+    return FALSE;
+
+  while (rnum--) {
+    /* Read length of title. */
+    fread(&jump, sizeof(unsigned int), 1, fl);
+    /* Jump past title to room description. */
+    fseek(fl, jump, SEEK_CUR);
+
+    /* Read the length of this room description. */
+    fread(&jump, sizeof(unsigned int), 1, fl);
+    /* Jump offset to next room. */
+    fseek(fl, jump, SEEK_CUR);
+  }
+  return TRUE;

If the file is 400k as you say, that is 100 buffers of 4096 bytes, so
fetching the last room will take at least 1200 system calls (400 reads
and 800 seeks; I think fseek() always causes an lseek()).

That is really inefficient. So, consider making an *index* file at the
same time: before writing to the main data file, write the current
position to the index file.

When everything is done, read that file back into memory, and close it.

When looking up something, find out what position in the data file it has
by looking it up in the index.

Or alternatively, save that index in the room data for each room.
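
The index idea might look roughly like the sketch below. The function
names (write_record, load_index, read_record) and the record layout (an
unsigned length followed by the raw bytes) are assumptions made for
illustration; they are not taken from the original patch.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: append one string to the data file, first noting
 * in the index file the offset at which the record starts. */
int write_record(FILE *data, FILE *index, const char *text, unsigned int len)
{
  long pos = ftell(data);

  if (pos < 0)
    return -1;
  if (fwrite(&pos, sizeof(pos), 1, index) != 1)
    return -1;
  if (fwrite(&len, sizeof(len), 1, data) != 1)
    return -1;
  if (len > 0 && fwrite(text, 1, len, data) != len)
    return -1;
  return 0;
}

/* Hypothetical helper: read `count` offsets back into memory.  The
 * caller may close the index file afterwards. */
long *load_index(FILE *index, size_t count)
{
  long *offsets = malloc(count * sizeof(*offsets));

  if (!offsets)
    return NULL;
  rewind(index);
  if (fread(offsets, sizeof(*offsets), count, index) != count) {
    free(offsets);
    return NULL;
  }
  return offsets;
}

/* Fetch record n with a single seek instead of walking every earlier
 * record.  Returns a malloc()ed, NUL-terminated copy. */
char *read_record(FILE *data, const long *offsets, size_t n)
{
  unsigned int len;
  char *buf;

  if (fseek(data, offsets[n], SEEK_SET) != 0)
    return NULL;
  if (fread(&len, sizeof(len), 1, data) != 1)
    return NULL;
  if (!(buf = malloc((size_t)len + 1)))
    return NULL;
  if (len > 0 && fread(buf, 1, len, data) != len) {
    free(buf);
    return NULL;
  }
  buf[len] = '\0';
  return buf;
}
```

With something like this, the cost of a lookup no longer depends on how
far into the file the room is.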


Yet-another-alternative, that makes the patch much smaller, is to:

1) find a good range of memory that is empty. Set a pointer to that (now
empty) area, call it str_ptr or such.

2) Whenever reading a string and then writing it to the disk, set
room->name / room->description to the current str_ptr, and advance str_ptr
by strlen(string) + 1 bytes (and write JUST the string to the database,
including the NUL though, nothing else).

3) After you are done, mmap() the string file, and make sure you request
that it appears as the location you started str_ptr with (yeah, it's a bit
risky, but I do a similar thing to ensure that pointers in shared memory
modules I use are right).

This results in minimal changes to main code: You don't have to make any
get_room_name/description at all - the strings in the file are
automatically fetched in when you access, and are NUL-terminated.
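
The three steps above could be sketched as follows. This is only one way
to do it: reserving the range with an anonymous PROT_NONE mapping is my
assumption for how to "find" an empty range safely, and the names
(str_region_reserve, str_region_store, str_region_map, STR_REGION_SIZE)
are hypothetical. It assumes a system that provides MAP_ANONYMOUS.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define STR_REGION_SIZE (16UL * 1024 * 1024)  /* assumed upper bound */

static char *str_base;  /* start of the reserved address range */
static char *str_ptr;   /* address the next string will have once mapped */

/* Step 1: reserve an address range that nothing else will be given. */
int str_region_reserve(void)
{
  void *p = mmap(NULL, STR_REGION_SIZE, PROT_NONE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

  if (p == MAP_FAILED)
    return -1;
  str_base = str_ptr = p;
  return 0;
}

/* Step 2: write JUST the string and its NUL to the file, and return the
 * address it will live at.  The pointer must not be dereferenced until
 * str_region_map() has succeeded. */
char *str_region_store(int fd, const char *s)
{
  size_t len = strlen(s) + 1;
  char *future = str_ptr;

  if ((size_t)(str_ptr - str_base) + len > STR_REGION_SIZE)
    return NULL;
  if (write(fd, s, len) != (ssize_t)len)
    return NULL;
  str_ptr += len;
  return future;
}

/* Step 3: map the string file over the reserved range, so the pointers
 * handed out in step 2 become valid. */
int str_region_map(int fd)
{
  size_t used = (size_t)(str_ptr - str_base);

  if (used == 0)
    return 0;
  if (mmap(str_base, used, PROT_READ, MAP_PRIVATE | MAP_FIXED,
           fd, 0) == MAP_FAILED)
    return -1;
  return 0;
}
```

At boot one would call str_region_reserve() once, assign the result of
str_region_store() to each room's name and description while loading,
and call str_region_map() when loading is finished.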

You could actually extend this to other strings loaded at boot time that
will not change.

If the number of swapped-in pages from the file becomes too large, you can
always just munmap() and mmap() the file again, "freeing" the memory that
way.

As a bonus, something like Solaris doesn't seem to show mmap()ed pages in
ps :)

An alternative to this alternative is to start out by opening the file,
and just lseek() and write() one byte at some quite large position. It
becomes a "sparse" file: the space between the beginning of the file and
that one byte does not get allocated until something is actually written
there. You can then mmap() it first, and instead of calling fread() and
then fwrite() to read/write the strings, you fread_string() the strings
from the disk directly into the mmap()ed space, and they later get
automagically flushed.
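
Roughly, and only as a sketch: sparse_map and sparse_store are
hypothetical names, and sparse_store uses a plain memcpy() as a stand-in
for having fread_string() write straight into the mapping.

```c
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: grow fd to `size` bytes by writing one byte at
 * the far end, then map the whole file shared and writable.  Blocks
 * are only allocated on disk as pages are actually written. */
char *sparse_map(int fd, size_t size)
{
  void *p;

  if (size == 0)
    return NULL;
  if (lseek(fd, (off_t)size - 1, SEEK_SET) == (off_t)-1)
    return NULL;
  if (write(fd, "", 1) != 1)
    return NULL;
  p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: place a string (with its NUL) at the current
 * end of the used part of the mapping and advance *used. */
char *sparse_store(char *base, size_t size, size_t *used, const char *s)
{
  size_t len = strlen(s) + 1;
  char *dst;

  if (len > size - *used)
    return NULL;
  dst = base + *used;
  memcpy(dst, s, len);
  *used += len;
  return dst;
}
```

The kernel writes dirty pages back on its own; msync() can be called if
the data has to be on disk at a particular moment.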



 =============================================================================
<erwin@andreasen.com>      Herlev, Denmark              UNIX System Programmer
<URL:http://www.abandoned.org/drylock/>     <*>         (not speaking for) DDE
 =============================================================================




