Re: [LONG] Buf Switches

From: Jeremy Elson (jelson@circlemud.org)
Date: 02/28/97


> I have recently noticed that show stats is reporting a much higher
> number of buf switches lately. (around 700 after 16 hours uptime)
> 
> After looking through the code, I'm still confused as to exactly
> why this is happening.. or if it's even A Bad Thing?
> 
> Here is what seems to be the relevant section of the code
> (write_to_output)

[code]

> It would seem that the mud is allocating a new buffer to store output
> designated to be sent to a player. (and logging this as a buf switch)
> 
> Is this a sign that many players are losing link and therefore more
> buffers are required to store this info, or is this something I have
> caused via a code addition somewhere along the way?
> 
> If this is allocating a new buffer to store this info, does it destroy
> the buffer and return the memory when it's no longer needed?

The code you quoted is the new output buffering system that I wrote for
Circle in the early (non-released) beta versions of 3.0.  The old Diku
code for queueing output (which stayed with Circle until version 2.20)
wasted both memory and CPU time in many cases (and, in my opinion, did so
even under the normal behavior of most MUDs).

First, I should explain what output queueing is and why it's necessary. 
On each pass through the game_loop(), the MUD performs a number of steps:
check to see if there are any new players connecting, kick out people with
bad links, read input over the network for all players, then process the
input for each player that has sent a complete line over the net.  The
processing step is usually where output is generated because it's where
MUD commands are processed (e.g., "kill" might generate output of "Kill
who?").  When output is generated, it is not immediately sent out to the
player, but instead queued for output in a buffer.  After all players'
commands are processed (and each command generates the appropriate output
for various players), the next step of the game_loop() is to send all the
queued output out over the network.
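
As a rough illustration of the queue-then-flush ordering described above,
here is a minimal sketch.  It is not the actual Circle source: struct
player, queue_output(), process_command(), flush_output() and
game_loop_pass() are hypothetical names, the "sent" field stands in for
the network, and the earlier steps of the loop (accepting connections,
dropping bad links, reading input) are left out.

```c
#include <assert.h>
#include <string.h>

#define QUEUE_SIZE 256

struct player {
    char input[64];           /* one complete input line, or "" if none */
    char queued[QUEUE_SIZE];  /* output waiting to be sent */
    char sent[QUEUE_SIZE];    /* stand-in for the network: last text "written" */
    int writes;               /* number of simulated write() calls */
};

/* Append txt to the player's output queue; nothing is sent yet. */
static void queue_output(struct player *p, const char *txt)
{
    size_t used = strlen(p->queued);
    size_t len = strlen(txt);

    assert(used + len < QUEUE_SIZE);  /* sketch only: no overflow handling */
    memcpy(p->queued + used, txt, len + 1);
}

/* Run one command; any output it generates is only queued. */
static void process_command(struct player *p)
{
    if (p->input[0] == '\0')
        return;
    if (strcmp(p->input, "kill") == 0)
        queue_output(p, "Kill who?\r\n");
    else
        queue_output(p, "Huh?!?\r\n");
    p->input[0] = '\0';
}

/* Send everything queued for this player in a single simulated write. */
static void flush_output(struct player *p)
{
    if (p->queued[0] == '\0')
        return;
    strcpy(p->sent, p->queued);
    p->writes++;
    p->queued[0] = '\0';
}

/* One pass: process every player's command first, then flush all output. */
static void game_loop_pass(struct player *players, int count)
{
    int i;

    for (i = 0; i < count; i++)
        process_command(&players[i]);
    for (i = 0; i < count; i++)
        flush_output(&players[i]);
}
```

With one player whose input is "kill", a single call to game_loop_pass()
leaves "Kill who?\r\n" in that player's sent field after exactly one
simulated write; a player with no input gets no write at all.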

The new output system that Circle now uses allocates a small, fixed-size
buffer (512 bytes) for each descriptor in which output can be queued.
When output is generated (via such functions as send_to_char(), act(),
etc.), it is written to the fixed-size buffer until the buffer fills.
When the buffer fills, we switch over to a larger (12K) buffer instead.
A "buffer switch", therefore, is when the 512-byte fixed buffer overflows.

When a large (12K) buffer is needed, it is taken from a pool of 12K
buffers that have already been created.  It is used for the duration of that
pass through the game_loop() and then returned to the pool immediately
afterwards, when the output is sent to the descriptor.  If a large buffer
is needed but none are in the pool, one is created (thereby increasing the
size of the pool); the "buf_largecount" variable records the current pool
size. 
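
The pool behaviour described above could be sketched roughly as follows.
This is an illustration under assumptions, not the real Circle code: only
"buf_largecount" comes from the text; struct large_buf, bufpool,
large_buf_get() and large_buf_put() are hypothetical names, and the pool
is assumed to be a simple singly linked free list.

```c
#include <assert.h>
#include <stdlib.h>

#define LARGE_BUFSIZE (12 * 1024)

/* One pooled 12K buffer; idle buffers are chained through `next`. */
struct large_buf {
    struct large_buf *next;
    char text[LARGE_BUFSIZE];
};

static struct large_buf *bufpool = NULL;  /* free list of idle large buffers */
static int buf_largecount = 0;            /* number of large buffers created */

/* Take a buffer from the pool, creating one only if the pool is empty. */
static struct large_buf *large_buf_get(void)
{
    struct large_buf *b = bufpool;

    if (b != NULL) {
        bufpool = b->next;          /* reuse an idle buffer */
    } else {
        b = malloc(sizeof *b);      /* pool empty: grow it by one */
        if (b == NULL)
            return NULL;
        buf_largecount++;
    }
    b->next = NULL;
    b->text[0] = '\0';
    return b;
}

/* Return a buffer to the pool once its output has been sent. */
static void large_buf_put(struct large_buf *b)
{
    assert(b != NULL);
    b->next = bufpool;
    bufpool = b;
}
```

A get/put/get sequence hands back the same buffer and leaves
buf_largecount at 1; the count only rises when a second buffer is
requested while the first is still checked out.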

If a player has *already* gone from their small to large buffer, and so
much output is generated that it fills even the large buffer, the
descriptor is changed to the overflow state, meaning that all future
output for the duration of the current pass through the game loop is
discarded.  This is a buffer overflow, and the only state in which output
is lost.
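
To make the three states (small buffer, large buffer, overflow) concrete,
here is a minimal sketch of a write routine.  It is not the code quoted
in the original question: the struct layout, the counters, and the
function names are chosen for illustration and may not match the actual
source, and plain malloc()/free() stand in for taking a large buffer from
the pool and returning it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SMALL_BUFSIZE 512
#define LARGE_BUFSIZE (12 * 1024)

struct descriptor {
    char small_outbuf[SMALL_BUFSIZE]; /* fixed per-descriptor buffer */
    char *large_outbuf;               /* NULL until a buffer switch happens */
    char *output;                     /* where queued text currently lives */
    size_t bufptr;                    /* bytes queued so far */
    int overflowed;                   /* set once the large buffer fills */
};

static int buf_switches = 0;   /* small -> large switches */
static int buf_overflows = 0;  /* times a large buffer filled up */

static void init_descriptor(struct descriptor *d)
{
    d->large_outbuf = NULL;
    d->output = d->small_outbuf;
    d->bufptr = 0;
    d->overflowed = 0;
    d->small_outbuf[0] = '\0';
}

/* Queue txt for d.  Returns 1 if queued, 0 if discarded. */
static int sketch_write_to_output(struct descriptor *d, const char *txt)
{
    size_t len = strlen(txt);
    size_t cap = d->large_outbuf ? LARGE_BUFSIZE : SMALL_BUFSIZE;

    assert(d->output != NULL);
    if (d->overflowed)
        return 0;                     /* discard for the rest of this pass */

    if (d->bufptr + len + 1 > cap && d->large_outbuf == NULL) {
        /* Small buffer would overflow: this is a "buffer switch". */
        char *big = malloc(LARGE_BUFSIZE);  /* stand-in for the pool */
        if (big == NULL)
            abort();
        memcpy(big, d->small_outbuf, d->bufptr + 1);
        d->large_outbuf = d->output = big;
        cap = LARGE_BUFSIZE;
        buf_switches++;
    }

    if (d->bufptr + len + 1 > cap) {
        /* Even the large buffer is full: enter the overflow state. */
        d->overflowed = 1;
        buf_overflows++;
        return 0;
    }

    memcpy(d->output + d->bufptr, txt, len + 1);
    d->bufptr += len;
    return 1;
}

/* After the queued output has been sent: back to the small buffer. */
static void end_of_pass(struct descriptor *d)
{
    free(d->large_outbuf);  /* stand-in for returning it to the pool */
    init_descriptor(d);
}
```

Queueing 100-byte lines into a fresh descriptor, the first five fit in
the small buffer, the sixth triggers the switch, and the descriptor
enters the overflow state once the 12K buffer cannot hold another line.
After end_of_pass() the descriptor accepts output again.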


Now that I've described how the system works, I'll describe the rationale. 
The main purpose for the two-tiered buffer system is to save memory and
reduce CPU usage.  From a memory standpoint: Allocating a fixed 12K buffer
for each socket is a simple scheme (and very easy to code), but on a large
MUD, 100 12K buffers can add up to a lot of wasted memory.  (1.2 megs of
memory used for buffering on a 100-player MUD may not seem like very much,
but keep in mind that one of Circle's big selling points several years
ago, when memory was expensive, was that it had a very small memory
footprint: 3 or 4 megs total!)  And from a CPU standpoint: the original
Diku used a dynamically allocated buffer scheme to queue output, which
unfortunately meant that for *each* player, on *each* pass through the
game loop, dozens of tiny buffers (often one for every line of output,
depending on the code to execute the command) were allocated with
malloc(), *individually* written to the system using individual calls to
write(), and then free()'d.  My system saves hundreds or thousands of
calls per second to malloc() and free(), and reduces the number of system
calls *drastically* (to at most one per player per pass through the game
loop).
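
The memory side of that argument is simple arithmetic, sketched here for
concreteness.  The function names are hypothetical, and the pool size
passed to two_tier_bytes() is whatever the MUD happens to have grown to;
it is an input to the calculation, not a measured figure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SMALL_BUFSIZE ((size_t)512)
#define LARGE_BUFSIZE ((size_t)12 * 1024)

/* Simple scheme: every descriptor owns its own 12K buffer. */
static size_t fixed_scheme_bytes(size_t players)
{
    assert(players <= SIZE_MAX / LARGE_BUFSIZE);  /* guard against overflow */
    return players * LARGE_BUFSIZE;
}

/* Two-tier scheme: 512 bytes per descriptor plus a shared pool of 12K buffers. */
static size_t two_tier_bytes(size_t players, size_t pool_size)
{
    assert(players <= SIZE_MAX / SMALL_BUFSIZE);
    assert(pool_size <= (SIZE_MAX - players * SMALL_BUFSIZE) / LARGE_BUFSIZE);
    return players * SMALL_BUFSIZE + pool_size * LARGE_BUFSIZE;
}
```

For 100 players, fixed_scheme_bytes(100) is 1,228,800 bytes (the "1.2
megs" above).  If the pool had grown to, say, 5 large buffers (an assumed
figure), two_tier_bytes(100, 5) would be 112,640 bytes.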

The trick is to choose the size of the small and large buffers correctly
in order to find the optimal behavior.  I consider "optimal" to mean that
90% of the time, most players stay within the limits of their small buffer
(for example, when wandering through town or mindlessly killing some
monster while watching damage messages go by).  Hopefully, a large buffer
switch is only necessary when a player executes a special command that
generates an unusually large amount of output, such as "who", "read
board", or "where sword".  This critically depends on the fact that not
everyone will be executing such a special large-output command at the same
instant.

For example, imagine you have 10 players on your MUD.  They are all
wandering around town, and every once in a while one of them types "who",
or reads the board, meaning that they are seeing more than 512 bytes of
output at a time.  On such a MUD, I would hope that there would only be a
*single* 12K buffer allocated which gets passed around among all the 10
players as needed.  Now, all players think they can queue up to 12K of
output per command without getting truncated even though only one 12K
buffer actually exists -- they are all sharing it.


But - there's a problem with this.  In certain cases *many* players have
to see a lot of output at the same instant (i.e. on the *same* pass
through the game_loop()); all of them then need a large buffer at the same
time, and the pool gets very big.  This can happen, for example, if an
evil god types "force all who"; or if the MUD lags for several seconds,
then suddenly gets unlagged, causing many commands to be processed at the
same moment; or if 20 people are all trying to kill the same MOB and are
all seeing 20 damage messages (more than 512 bytes) on the same pass
through the game_loop().

Unfortunately, the current patchlevel of Circle has no way to destroy
large buffers, so such cases are pathological and cause wasted memory.
Since I don't run a MUD, I can't actually tell how often this
happens on a real MUD.  (If there are any IMPs out there who run large
MUDs (say, >= 30-50 players on regularly), and you've read this far,
please send me the output of "show stats" after your MUD has been played
for at least several hours.) 

Sometime soon (hopefully for pl12, which I've been working on slowly but
steadily), I want to implement something that will free large buffers if
the MUD thinks doing so is not a waste (for example: if we have 50 large
buffers but we've never simultaneously used more than 15 for the last 10
minutes.)
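
One possible reading of that idea, sketched under assumptions: track the
highest number of large buffers in use at once during a time window, and
at the end of the window free any idle buffers beyond that peak.  Nothing
here is taken from Circle or from pl12; struct buf_pool, pool_get(),
pool_put() and pool_trim() are hypothetical names, and the caller is
assumed to invoke pool_trim() once per window (for example every 10
minutes).

```c
#include <assert.h>
#include <stdlib.h>

#define LARGE_BUFSIZE (12 * 1024)

struct large_buf {
    struct large_buf *next;
    char text[LARGE_BUFSIZE];
};

struct buf_pool {
    struct large_buf *free_list;  /* idle buffers */
    int total;                    /* buffers in existence */
    int in_use;                   /* buffers currently checked out */
    int peak_in_use;              /* high-water mark for the current window */
};

static struct large_buf *pool_get(struct buf_pool *p)
{
    struct large_buf *b = p->free_list;

    if (b != NULL) {
        p->free_list = b->next;
    } else {
        b = malloc(sizeof *b);
        if (b == NULL)
            return NULL;
        p->total++;
    }
    b->next = NULL;
    p->in_use++;
    if (p->in_use > p->peak_in_use)
        p->peak_in_use = p->in_use;
    return b;
}

static void pool_put(struct buf_pool *p, struct large_buf *b)
{
    assert(b != NULL && p->in_use > 0);
    b->next = p->free_list;
    p->free_list = b;
    p->in_use--;
}

/* End of a window: free idle buffers beyond the window's peak usage.
 * Returns the number of buffers freed and starts a new window. */
static int pool_trim(struct buf_pool *p)
{
    int freed = 0;

    while (p->total > p->peak_in_use && p->free_list != NULL) {
        struct large_buf *b = p->free_list;
        p->free_list = b->next;
        free(b);
        p->total--;
        freed++;
    }
    p->peak_in_use = p->in_use;
    return freed;
}
```

If a burst grows the pool to 3 buffers and the following window never
uses more than 1 at a time, pool_trim() at the end of that window frees
2 and leaves 1.  A real implementation would probably want some slack
above the peak so that the pool does not shrink and regrow repeatedly.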

Hopefully this cleared up the way buffers work.

Regards,
Jeremy


