Re: Circlemud design issues

From: James Turner (turnerjh@XTN.NET)
Date: 04/22/98


George <greerga@circlemud.org> writes:

> >Reading 4k from a disk won't really get the speed boost you're talking
> >about -- in a case like this, the overhead of seeking the position on
> >the drive will far outweigh the decompression time.  It won't speed
> >things up unless these files are huge.
>
> I didn't say it would speed it up dramatically (especially if using SCSI
> drives where you may slow down), but you would still save disk space while
> you're not using the player.

Compressed storage could easily be added to the current binary
system.  Two options -- make a second file, players.cryo, where
entries are copied during a playerpurge, or write the char_file_u to a
separate file for players who are offline.  Bringing players back
would be equally easy.

However, the compression that was being discussed was in the context
of someone who liked doing incremental backups every 10 days, and who
felt text files offered better compression ratios.

> Sure it is, your 'flat' file may be fragmented all over the disk from
> beginning to end.

ext2fs, like many other Unix file systems, works to prevent
fragmentation much more effectively than FAT/VFAT/FAT32/NTFS (yes, NTFS
tries, but its algorithms are quite bad in many cases).  Besides, such
fragmentation isn't so horrible in Unix, where the inode keeps a map
of a file's blocks and can seek to any given block without having to
walk a linked list (i.e. FAT).  The fragmentation would not be nearly
so bad as with a large number of separate files.

> >Again, in a multiuser environment, or on a server with multiple muds,
> >this is adding quite a bit of overhead.  Particularly the highly
> >non-localized disk access.  Hard drives are the real bottleneck.
>
> You're scanning through a 3k file on a system that can do many megabytes of
> transfers per second.  Servers with multiple MUDs have lots of RAM free for
> cache. (If they don't, the administrator has a big problem.)

1. If the server cached every player, each having a 3k file, on a mud
of moderate size (1000 players) then that's 3 meg.  You're saying the
OS should cache 3 meg of data per mud?

2. When accessing small files, the throughput of a drive is a much
smaller factor than seek time.  Ascii pfiles wouldn't get any kind of
burst transfer, whereas the initial scan of a flat file like the one
currently used would.

> I said you can parse the entire CircleMUD world in 9/10 of a second.  Your
> average player file will be about 3k.  The CircleMUD _rooms_ (nothing else)
> total 1.5 megabytes.  The entire world is 2.5 megabytes. 9/10 of a second.
> Player files are 3k (overestimation).  Your computer wouldn't even blink to
> parse that. Please stop arguing for the sake of arguing.  The speed factor
> here is moot.

You're right, it would be a blink of an eye.  But it still would take
orders of magnitude longer than reading straight from a binary
structure.  Once again, it is not an issue of disk throughput.  Your
9/10ths of a second is for files that are (1) contiguous (large files,
all written to the disk at the same time -- i.e. when decompressing the
original tarball) and (2) already in the system's cache.
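
To make the comparison concrete, here is a minimal sketch of the two
read paths.  Everything in it is an assumption for illustration:
struct demo_pfile is a much-reduced stand-in for a real player record,
and the "Key: value" text layout is a hypothetical ascii format, not
one taken from any actual patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, much-reduced player record (not the real char_file_u). */
struct demo_pfile {
    char name[21];
    int  level;
    int  gold;
};

/* Binary path: the record is read with a single fread(). */
int demo_read_binary(FILE *fp, struct demo_pfile *out)
{
    assert(fp != NULL && out != NULL);
    return fread(out, sizeof *out, 1, fp) == 1 ? 0 : -1;
}

/*
 * Ascii path: every line is fetched and matched against each known key.
 * Returns 0 only if all three fields were found.
 */
int demo_read_ascii(FILE *fp, struct demo_pfile *out)
{
    char line[128];
    int seen = 0;

    assert(fp != NULL && out != NULL);
    memset(out, 0, sizeof *out);
    while (fgets(line, sizeof line, fp) != NULL) {
        if (sscanf(line, "Name: %20s", out->name) == 1)
            seen |= 1;
        else if (sscanf(line, "Level: %d", &out->level) == 1)
            seen |= 2;
        else if (sscanf(line, "Gold: %d", &out->gold) == 1)
            seen |= 4;
    }
    return seen == 7 ? 0 : -1;
}
```

Both functions return the same data; the ascii one simply does more
work per field, on top of the per-file open and seek.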

You were rabidly arguing against pushing the limits of circle in a
different thread.  But now you're saying we should increase cpu time
as well as increase the number of random disk accesses.

> >With ascii pfiles, that involves opening, scanning, and closing each
> >file.  That is quite a bit of overhead.  Several orders of magnitude
> >slower than can be done with the current system.  Prohibitively slow
> >on a multiuser system.
>
> I said 'The interfaces can remain exactly as current.'  In order to total
> how much gold is in each player's possession you could _still_ have to call
> load_char() on each one if I kept the interface the same. If you had looked
> at the current code to scan through the pfile you would see that there
> could be no difference in execution.  And as I said, the speed issue is
> moot.

No, you are incorrect.  The totalling of gold and whatnot can take
place during the initial boot process where player names and file
offsets are read by build_player_index().  There is no need to call
load_char() for such information.  There's no similar method for ascii
pfiles, unless some data is stored in a binary file that keeps track
of this.  But then you can't, say, edit gold in an ascii pfile because
it is stored in the index.
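
A rough sketch of the kind of boot-time pass meant here, under stated
assumptions: struct demo_record is a hypothetical cut-down record, and
demo_build_index() is an illustrative function, not the actual
build_player_index() from the Circle source.  It walks a file of
fixed-size records once, remembers each name and file offset, and
totals gold along the way.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, much-reduced stand-in for CircleMUD's char_file_u. */
struct demo_record {
    char name[21];
    int  gold;
};

struct demo_index_entry {
    char name[21];
    long offset;   /* byte offset of the record in the player file */
};

/*
 * Read every fixed-size record from fp in one sequential pass.
 * Fills idx[] (up to max entries), totals gold, returns the record count.
 */
size_t demo_build_index(FILE *fp, struct demo_index_entry *idx,
                        size_t max, long *total_gold)
{
    struct demo_record rec;
    size_t n = 0;

    assert(fp != NULL && idx != NULL && total_gold != NULL);
    *total_gold = 0;
    rewind(fp);
    while (n < max && fread(&rec, sizeof rec, 1, fp) == 1) {
        memcpy(idx[n].name, rec.name, sizeof rec.name);
        idx[n].name[sizeof idx[n].name - 1] = '\0';
        idx[n].offset = (long)(n * sizeof rec);
        *total_gold += rec.gold;
        n++;
    }
    return n;
}

/* Fetch one record later with a single seek and a single read. */
int demo_load_record(FILE *fp, const struct demo_index_entry *e,
                     struct demo_record *out)
{
    if (fseek(fp, e->offset, SEEK_SET) != 0)
        return -1;
    return fread(out, sizeof *out, 1, fp) == 1 ? 0 : -1;
}
```

The total falls out of the same sequential read that builds the index,
so no per-player load is needed to get it.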

> >Not for a binary database.  There is no question of ease of off-line
> >maintenance in an ascii pfile situation.  But in a binary system, that
> >ease is given up in favor of speed and other niceties.  I'm not totally
> >convinced ascii pfiles are worth the transition, particularly if both
> >are supported.
>
> Speed is moot, I can parse 2.5 megabytes in 5 directories and multiple
> files in 9/10 of a second. 3k of a player file will not matter.

You love those numbers, yet they mean nothing.  As I stated above,
they are not very relevant to the ascii pfile situation.

> >Again, patches will have to support both.  There will be the added
> >burden of maintaining both, also.  One or the other should be picked
> >as the standard method.
>
> If the interface is the same, there will be no difference to patches unless
> they deal directly with the store_to_char/char_to_store type routines.

Or unless they add entries to char_data that need to be stored; then
patches must be applied in two places (though the binary system is
simple except for the issue of database conversion, which is one of
ascii pfiles' strong points).

> >The current code is somewhat close to that... but not quite enough to
> >make new systems pluggable.
>
> Well, I have an idea on how to do it, and it won't be hard.

You're right, it won't be.  Circle is in many places already fairly
well modularized.  However, there are exceptions.  The char_file_u
structure shouldn't be accessed directly by the code at any point,
since it is intimately tied to the actual file format.  Tuck it away
behind smaller functions whose specifications can be set forth, and
other storage techniques can then be plugged in behind them.
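
One possible shape for such an interface, as a sketch only: the names
below (demo_player, demo_store_ops, the in-memory backend) are all
hypothetical and do not come from the Circle source.  Game code calls
through a small table of function pointers and never sees the on-disk
layout, so a binary or ascii backend could sit behind the same two
calls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-game view of a player, independent of any file format. */
struct demo_player {
    char name[21];
    int  level;
    int  gold;
};

/* The only thing game code knows about storage: two operations. */
struct demo_store_ops {
    int (*load)(void *ctx, const char *name, struct demo_player *out);
    int (*save)(void *ctx, const struct demo_player *in);
};

/* A trivial in-memory backend, standing in for a binary or ascii one. */
#define DEMO_MAX 8
struct demo_mem_ctx {
    struct demo_player slots[DEMO_MAX];
    size_t used;
};

static struct demo_player *demo_mem_find(struct demo_mem_ctx *m,
                                         const char *name)
{
    size_t i;

    for (i = 0; i < m->used; i++)
        if (strcmp(m->slots[i].name, name) == 0)
            return &m->slots[i];
    return NULL;
}

static int demo_mem_load(void *ctx, const char *name, struct demo_player *out)
{
    struct demo_player *p = demo_mem_find(ctx, name);

    if (p == NULL)
        return -1;
    *out = *p;
    return 0;
}

static int demo_mem_save(void *ctx, const struct demo_player *in)
{
    struct demo_mem_ctx *m = ctx;
    struct demo_player *p = demo_mem_find(m, in->name);

    if (p == NULL) {
        if (m->used == DEMO_MAX)
            return -1;              /* backend is full */
        p = &m->slots[m->used++];
    }
    *p = *in;
    return 0;
}

const struct demo_store_ops demo_mem_ops = { demo_mem_load, demo_mem_save };

/* Example of game code written purely against the interface. */
int demo_add_gold(const struct demo_store_ops *ops, void *ctx,
                  const char *name, int amount)
{
    struct demo_player p;

    assert(ops != NULL);
    if (ops->load(ctx, name, &p) != 0)
        return -1;
    p.gold += amount;
    return ops->save(ctx, &p);
}
```

Swapping storage formats then means supplying a different ops table and
context; demo_add_gold() and anything like it stays untouched.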

--
James Turner               turnerjh@xtn.net
                           http://www.vuse.vanderbilt.edu/~turnerj1/

