Re: Circlemud design issues

From: James Turner (turnerjh@XTN.NET)
Date: 04/23/98


George <greerga@CIRCLEMUD.ORG> writes:

> >Compressed storage could easily be added to the current binary
> >system.  Two options -- make a second file, players.cryo, where
> >entries are copied during a playerpurge, or write the char_file_u to a
> >separate file for players who are offline.  Bringing players back
> >would be equally easy.
>
> Now you're increasing complexity where it wouldn't matter for ASCII pfiles.

Increase in complexity, yes.  But there are alternatives for all of
the pluses for ascii pfiles, except perhaps for the easy player
editing... but that could be incorporated into OLC.  Not as nice as
being able to fire emacs or vi or pico or whatever up when you need
it, but still nice.
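
For what it's worth, the players.cryo idea quoted above could look
roughly like the sketch below. It is only an illustration: cryo_record
and cryo_archive_slot() are made-up names, the record is a tiny stand-in
for char_file_u, and compressing the copy or clearing the original slot
is left out entirely.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size record standing in for char_file_u. */
struct cryo_record {
    char name[21];
    long gold;
};

/* Copy the record at index `slot` of `players` to the end of `cryo`.
 * Returns 1 on success, 0 on any I/O failure (including a slot that
 * lies past the end of the players file). */
int cryo_archive_slot(FILE *players, FILE *cryo, long slot)
{
    struct cryo_record rec;

    assert(players != NULL && cryo != NULL && slot >= 0);

    /* Fixed-size records make the offset a simple multiplication. */
    if (fseek(players, slot * (long)sizeof rec, SEEK_SET) != 0)
        return 0;
    if (fread(&rec, sizeof rec, 1, players) != 1)
        return 0;

    /* Append to the archive file. */
    if (fseek(cryo, 0L, SEEK_END) != 0)
        return 0;
    return fwrite(&rec, sizeof rec, 1, cryo) == 1;
}
```

Bringing a player back would be the same copy in the other direction.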

> >ext2fs, as well as many other Unix file systems, work to prevent
> >fragmentation much more efficiently than FAT/VFAT/FAT32/NTFS (yes NTFS
> >tries, but its algorithms are quite bad in many cases).  Besides, such
>
> Now you're arguing the point I had, that the filesystem is optimized to
> handle such cases.

I am arguing that large files don't suffer from fragmentation nearly
as much as they do under DOS, Win95, or NT.  You were arguing that the
large number of files wouldn't be that much more fragmented than a
binary flatfile.  But binary flatfiles aren't that fragmented, whereas
the OS has no way of knowing that this group of text files should be
close to each other on the physical disk.

> You'll have probably 50 people at a time on a MUD, 150k.  Not everyone will
> be logged in at once and the OS need only care about who is actually being
> accessed. (My server keeps ~40 MB of cache anyway.)

Yes, but until a player logs in, their data isn't cached, nor is their
directory entry, so there is overhead in this.  Unless you cache them
all, you can only be guaranteed of having a particular file cached
after the player has logged on.

As for your server keeping 40 meg cached -- how many muds are running
on it?  Is there a webserver on it?  How many hits?  I can point to
any number of particular, high-end machines that will support a point
-- it still isn't proof.

> There's a difference here.  Putting all your headers into one file gives
> virtually no benefit with quite a few disadvantages.  Here, we're getting
> all the advantages of ASCII player files and lose very little.  It's a far
> better tradeoff to have ASCII pfiles than a unified header.

"All the advantages" -- a slight change to binary organization can
gain those advantages as well.  Further, what you are arguing for
clutters the runtime environment of a mud; the header issue only
involves the compile environment.  Very different issues.

> >offsets are read by build_player_index().  There is no need to call
> >load_char() for such information.  There's no similar method for ascii
> >pfiles, unless some data is stored in a binary file that keeps track
> >of this.  But then you can't, say, edit gold in an ascii pfile because
> >it is stored in the index.
>
> So we extend build_player_index() to scan for gold also, since it already
> scans for all the player names.  Trivial.

You're saying run build_player_index() every boot, and have it scan in
each of the text files?  Even the inactive ones?

A hundred or so ascii files in the world file is one thing.  But
thousands of players, each maybe 3k (which I doubt, taking into
account descriptions, equipment, aliases... unless these are separate
files) are another.  You're saying this won't bring normal servers to
a crawl during boot?  It'll thrash the disk like mad, particularly in
a heavily used environment.  Worse, the OS's cache won't be able to
predict the next file to read, so readahead won't help (unlike binary,
where it can readahead in the binary file itself).
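
To make the access pattern concrete, here is a rough sketch of indexing
a binary flatfile in one sequential pass over a single open file.  The
names (demo_record, demo_index_entry, demo_build_index(),
demo_append_record()) are invented for the example and the record is far
smaller than the real char_file_u; this is not CircleMUD's
build_player_index().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAME_LEN 20

/* Hypothetical fixed-size record; the real char_file_u has many more fields. */
struct demo_record {
    char name[NAME_LEN + 1];
    long gold;
};

struct demo_index_entry {
    char name[NAME_LEN + 1];
    long offset;   /* byte offset of the record in the flat file */
    long gold;
};

/* Read every record from one open flat file in a single sequential pass.
 * Returns the number of index entries filled (at most max_entries). */
size_t demo_build_index(FILE *fp, struct demo_index_entry *idx, size_t max_entries)
{
    struct demo_record rec;
    size_t n = 0;

    assert(fp != NULL && idx != NULL);
    rewind(fp);
    while (n < max_entries && fread(&rec, sizeof rec, 1, fp) == 1) {
        memcpy(idx[n].name, rec.name, sizeof idx[n].name);
        idx[n].name[NAME_LEN] = '\0';
        idx[n].offset = (long)(n * sizeof rec);
        idx[n].gold = rec.gold;
        n++;
    }
    return n;
}

/* Append one record to the flat file; returns 1 on success, 0 on failure. */
int demo_append_record(FILE *fp, const char *name, long gold)
{
    struct demo_record rec;

    memset(&rec, 0, sizeof rec);
    strncpy(rec.name, name, NAME_LEN);   /* buffer is zeroed, so it stays terminated */
    rec.gold = gold;
    if (fseek(fp, 0L, SEEK_END) != 0)
        return 0;
    return fwrite(&rec, sizeof rec, 1, fp) == 1;
}
```

The per-file equivalent needs an open, a parse, and a close for every
player, each of which can land anywhere on the disk.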

> >You love those numbers, yet they mean nothing.  As I stated above,
> >they are not very relevant to the ascii pfile situation.
>
> It took 2.6 seconds when nothing (at all, including the directory
> structure, most MUDs would have some cached) was in the cache, .8 seconds
> when everything was in the cache. (Including the overhead of the
> 'bin/circle -c' call and loading the 1 megabyte circlemud binary.)  That is
> still one megabyte per second when uncached, and most of those are small
> files.  I expect ASCII pfiles to be around 1-3k so they would be parsed
> fast enough.  (I have IDE drives if you wonder.)

Again, you're running on a dedicated, high-end system with no other
muds loaded.  Not all muds in the real world are so lucky.
That is why I said the numbers weren't very relevant -- not just
because of the file sizes, but also because of the server type.

We can argue this on any level you want; file system, disk cache,
physical disk, whatever.  A big issue is fragmentation resulting from
there being a large number of ascii files.  Since you untarred the
circle source at one time, there will be a fair amount of physical
nearness on the disk of the files.  That will reduce disk time a fair
bit because seek times won't be an issue.

Try this.  Run updatedb at the same time as booting the mud.  Do it on
a freshly rebooted system.  Maybe have netscape loading at the same
time, and circle compiling.  That will give you an environment more
representative of the real world.

> >> If the interface is the same, there will be no difference to patches unless
> >> they deal directly with the store_to_char/char_to_store type routines.
> >
> >Or unless they add entries to char_data that need to be stored, then
> >patches must be applied in two places (though the binary system is
> >simple except for the issue of database conversion, which is one of
> >ascii pfiles' strong points).
>
> char_data doesn't matter, char_file_u does, and there's spares they should
> use anyway.  Those spares can be automatically taken care of.

Reread the first line -- entries to char_data that _need to be
stored_.  Which implies some kind of storage in char_file_u.

Now, as for using spares -- are you saying the ascii pfiles should
have tags like "Spare01: " and such?  That way they end up
uneditable.  There will have to be something lower than char_data
dealing with all new char_file_u entries.  They will need entries
somewhere in the interface between the ascii pfiles and the
char_file_u translation.
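
A minimal sketch of what I mean by that translation layer, assuming a
simple "Tag: value" line format.  demo_store and demo_parse_tag_line()
are made up for illustration, and the tag names are only examples, not
any actual ascii pfile format.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the stored part of a character. */
struct demo_store {
    long gold;
    int level;
    int spare01;   /* unnamed spare slot, as in the "Spare01: " example */
};

/* Parse one "Tag: value" line into *st.
 * Returns 1 if the tag was recognized, 0 otherwise. */
int demo_parse_tag_line(const char *line, struct demo_store *st)
{
    char tag[32];
    long value;

    assert(line != NULL && st != NULL);

    /* Tag is everything up to the colon; value is a decimal integer. */
    if (sscanf(line, " %31[^:]: %ld", tag, &value) != 2)
        return 0;

    if (strcmp(tag, "Gold") == 0)
        st->gold = value;
    else if (strcmp(tag, "Level") == 0)
        st->level = (int)value;
    else if (strcmp(tag, "Spare01") == 0)
        st->spare01 = (int)value;
    else
        return 0;   /* a newly added field stays unknown until a branch is added here */
    return 1;
}
```

Every field a patch adds to char_file_u needs its own branch here (and a
matching line in the writer), or it falls back to an anonymous spare tag
that nobody can sensibly edit by hand.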

--
James Turner               turnerjh@xtn.net
                           http://www.vuse.vanderbilt.edu/~turnerj1/


     +------------------------------------------------------------+
     | Ensure that you have read the CircleMUD Mailing List FAQ:  |
     | http://democracy.queensu.ca/~fletcher/Circle/list-faq.html |
     +------------------------------------------------------------+


