Documentation for File Object API
AweMUD Next Generation
Copyright (C) 2003  AwesomePlay Productions, Inc.
Sean Middleditch <elanthis@awesomeplay.com>
-------------------------------------------------

 *** WARNING: OUT OF DATE INFORMATION ***

1. Introduction
   ============

   The file object API consists of two parts.  The first is the API itself,
   which largely abstracts away the details of the underlying file format
   while retaining the ability to add formatting and other "prettifying"
   constructs to formats that support them.

   The second part is a standard file format.  This format is not the only
   one that can be used with the file object API; it is, however, designed
   to work optimally with the API.  Other backends, including SQL databases,
   XML, and so on could also be plugged in to this API.

2. Object Concepts
   ===============

   The file object concept is very simple.  It is based on these premises:

    - All data is represented as objects.
    - Objects are comprised of attributes and child objects.
    - Objects have a type.
    - Objects optionally have a (possibly unique) name.
    - The definition of an object type is dependent on the parent object.

   That's a lot of ugly stuff.  The last three points probably need the most
   explanation; the rest should hopefully be clear enough.
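
   As a rough illustration of the premises above, an object could be held
   in memory as shown below.  This is only a sketch: the FileObject and
   Attribute types and the count_objects() helper are hypothetical names
   invented for this example, and are not part of the AweMUD API.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical in-memory model; not part of the AweMUD API.
struct Attribute {
  std::string name;
  std::string data;
};

struct FileObject {
  std::string type;                  // every object has a type
  std::string name;                  // optional; empty when unnamed
  std::vector<Attribute> attrs;      // attributes of this object
  std::vector<FileObject> children;  // nested child objects
};

// Count an object together with all of its descendants.
std::size_t count_objects(const FileObject& obj) {
  std::size_t total = 1;
  for (const FileObject& child : obj.children)
    total += count_objects(child);
  return total;
}
```

   How a child object is interpreted is left to whatever code handles the
   parent, which is why the sketch stores the type as a plain string.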

   One thing to look at is the object type.  Continuing the above example, the
   Black Longsword is of type blueprint.  The item instance is of type object
   (object is, confusingly, the name of the item type in AweMUD).  This type is
   also freeform; the player object might see any item objects as inventory
   items.  The blueprint object might see any item objects as a definition of a
   blueprint to include.

   This document does not seek to explain these interactions; those are
   wholly dependent on the core of AweMUD, and this document is meant merely
   to explain how the file object API works, and to document the default
   file format for the API.

3. File Object Format
   ==================

   At one point in AweMUD's history, a format similar to an ini file was
   used.  Each object was begun with a header in brackets ([]), followed by a
   list of attributes in the form name=value, and completed
   with [end].  Ex:
   
    [room]
     name=a_bar
     title=Blood & Flagon
    [end]

   Eventually, as this format was a little hard to grok (the [end] tags in
   particular were hard), and XML was all the craze, AweMUD switched to using
   XML as its primary data store.  While this works, and works well, XML is
   even harder to edit than the original format.  XML is highly verbose, and
   very very strict about its formatting.  The simple example above expands
   into the grotesque:

    <room>
     <name>a_bar</name>
     <title>Blood &amp; Flagon</title>
    </room>

   The important data becomes lost in a sea of tags (including the
   unnecessary end tags) and what should be a simple escape sequence turns
   into a horrendous &amp;.  This format is _not_ human friendly in the
   least.  XML is a machine language, with the side benefit of being
   human-readable when necessary.

   The replacement file-format had three goals:

    - To be as easy or easier to parse than XML.
    - To be very easy to edit by hand.
    - To provide a sane syntax for the object features needed (type, name).

   Looking around at some other formats, a format very similar to the
   original AweMUD file format was decided on.  Example:

    room a_bar {
     title = Blood & Flagon
    }

   The format is incredibly simple.  An object begins with its type (a
   single word).  If a name is to be specified, a space follows, and then the
   name.

   Attributes are in the name=value form.  The name must be a single word.
   Any space before or after the name (but before the =, equals sign) is
   discarded.  Following the = (equals sign) all spaces are consumed.  Once
   a non-space is encountered, the rest of the line is part of the value.
   Whitespace at the end of the line is NOT discarded.  Names may be
   comprised of any character A-Z (upper or lower case), any digit, _
   (underscore), or - (dash).  \ (backslash) escaping allows for use of
   whitespace in names.  Using " (double-quotes) will put the string in
   escape mode until another " (double-quotes) character is encountered.  The
   \ (backslash) escaping in names only indicates to interpret the next
   character literally; no special expansions (such as \n to newline) are
   performed.
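
   The name rules above can be sketched roughly as follows.  This is not
   the actual AweMUD parser; is_name_char() and read_name() are hypothetical
   helpers, and the sketch assumes that a backslash keeps its escaping role
   inside double-quotes, which the description above does not spell out.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// True for characters that may appear unescaped in a name.
bool is_name_char(char c) {
  return std::isalnum(static_cast<unsigned char>(c)) || c == '_' || c == '-';
}

// Read a name from line starting at pos.  On return, pos is left on the
// first character that is not part of the name.
std::string read_name(const std::string& line, std::size_t& pos) {
  std::string name;
  bool quoted = false;
  while (pos < line.size()) {
    const char c = line[pos];
    if (c == '\\' && pos + 1 < line.size()) {
      name += line[pos + 1];  // next character is taken literally
      pos += 2;
    } else if (c == '"') {
      quoted = !quoted;       // enter or leave escape mode
      ++pos;
    } else if (quoted || is_name_char(c)) {
      name += c;
      ++pos;
    } else {
      break;                  // an unescaped delimiter ends the name
    }
  }
  return name;
}
```

   With this sketch, both my\ name and "my name" would be read as the
   name "my name".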

   Values can make use of the standard \ (backslash) escape mechanism.  \n
   will insert a newline into the value.  A \ at the end of the line (no
   spacing after it) will cause the contents of the next line to be
   appended; see below for details.  A \  (backslash followed by a space)
   inserts a space; this can be used at the beginning of a value to stop the
   whitespace discarding rule, allowing you to have whitespace at the
   beginning of your value.

   When the line append escape (\ at end of line) is used, the actual
   newline will be discarded.  Additionally, the whitespace on the beginning
   of the following line is also discarded; the discarding can be broken
   using the \  (backslash followed by a space) escape described above.
   There is no limit to how many lines you can append together.
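
   A minimal sketch of the value rules, assuming the caller hands over the
   raw text that follows the = sign (including any continuation lines,
   separated by real newline characters).  decode_value() is a hypothetical
   helper written for this document, and its treatment of any escape other
   than \n, \<space>, and \<newline> (the next character is copied
   literally) is an assumption.

```cpp
#include <cstddef>
#include <string>

// Decode the raw text following '=' into the attribute value.
std::string decode_value(const std::string& raw) {
  std::string value;
  std::size_t i = 0;
  // Whitespace directly after '=' is consumed.
  while (i < raw.size() && (raw[i] == ' ' || raw[i] == '\t')) ++i;
  while (i < raw.size()) {
    const char c = raw[i];
    if (c == '\n') {
      break;                       // unescaped newline ends the value
    } else if (c != '\\' || i + 1 == raw.size()) {
      value += c;                  // ordinary character; trailing
      ++i;                         // whitespace is kept as-is
    } else if (raw[i + 1] == 'n') {
      value += '\n';               // \n inserts a newline
      i += 2;
    } else if (raw[i + 1] == '\n') {
      i += 2;                      // line append: drop the newline...
      while (i < raw.size() && (raw[i] == ' ' || raw[i] == '\t'))
        ++i;                       // ...and the next line's indentation
    } else {
      value += raw[i + 1];         // covers "\ " and "\\"
      i += 2;
    }
  }
  return value;
}
```

   Because the leading-whitespace skip only looks at plain spaces and tabs,
   a value that starts with \<space> keeps that space, which matches the
   behaviour described above.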

   You may also have comments.  A comment is any line that begins with zero
   or more whitespace, followed by a # (hash/pound sign).  Note that
   comments cannot follow any other directives; a comment must be on its own
   line.
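
   The comment rule is small enough to sketch in a few lines.  The
   is_comment_line() helper is hypothetical, and treating only spaces and
   tabs as leading whitespace is an assumption.

```cpp
#include <cstddef>
#include <string>

// A comment is a line whose first non-whitespace character is '#'.
bool is_comment_line(const std::string& line) {
  const std::size_t first = line.find_first_not_of(" \t");
  return first != std::string::npos && line[first] == '#';
}
```

   Note that a # appearing later in a line, for example inside a value, is
   not treated as a comment by this check.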

   Some examples:

    # room in the castle
    room kings_court {
     title = King's Court
     # exits in the room
     exit 1 {
      title = Main Door
     }
     # items on the floor
     object {
      blueprint = longsword
      condition = bad
     }
     # npcs in the room
     npc {
      blueprint = goblin_warrior
      health = 40%
     }
    }
 
    # goblin blueprint
    blueprint goblin {
     title = ugly goblin
    }
    # goblin warrior blueprint
    blueprint goblin_warrior {
     inherit = goblin
     title = goblin battle warrior
     int strength = 10
    }

4. File Object Reader
   ==================

   The first, and likely most important, part of the file object API is the
   File::Reader class.  This class represents a file object that is being
   read.  The API is incredibly simple; it is based entirely on the pull
   mechanism.

   What this means is, you request the next chunk of data, over and over,
   until you hit the end of the file.  This is just like how you read bytes
   using normal UNIX file operations; the File::Reader class will only
   return whole data chunks, however.

   A data chunk is defined as one of three things: the opening of a new
   object, an attribute, or the closing of an object.

   The File::Reader makes use of the Node object for returning data.  The
   Node object has the following methods:

     const string& get_type(void);
     const string& get_name(void);
     const string& get_data(void);

     size_t get_line(void);

     bool is_attr(void);
     bool is_end(void);
     bool is_begin(void);

   After getting a new node, you should use the is_*() methods to determine
   what kind of node this is.  A begin is the beginning of an object.  An
   end node is the end/closing of an object.  An attr node is an attribute.

   The get_*() accessors are used to query information on the node.  Note
   that none of the get_*() accessors are valid with end nodes.

   The get_line() method returns the line in the file the node began on.
   Some backends may not support this feature, and get_line()'s result
   is undefined for those backends.  You can use this method in error
   reporting (for example, spitting out a warning about an attribute you
   weren't expecting in an object).
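
   The following sketch shows one way to classify a node before touching
   its accessors.  The Node class here is only a stand-in with the same
   method names, so that the example compiles on its own; its constructor,
   its Kind enum, and the describe() helper are hypothetical, and the real
   File::Node may differ.

```cpp
#include <cstddef>
#include <string>

// Stand-in for File::Node; the real class is provided by AweMUD.
class Node {
 public:
  enum class Kind { Begin, Attr, End };

  Node(Kind kind, const std::string& type, const std::string& name,
       const std::string& data, std::size_t line)
      : kind_(kind), type_(type), name_(name), data_(data), line_(line) {}

  const std::string& get_type() const { return type_; }
  const std::string& get_name() const { return name_; }
  const std::string& get_data() const { return data_; }
  std::size_t get_line() const { return line_; }

  bool is_attr() const { return kind_ == Kind::Attr; }
  bool is_end() const { return kind_ == Kind::End; }
  bool is_begin() const { return kind_ == Kind::Begin; }

 private:
  Kind kind_;
  std::string type_, name_, data_;
  std::size_t line_;
};

// Build a short description of a node, e.g. for a warning message.
std::string describe(const Node& node) {
  if (node.is_end())
    return "end";  // no get_*() accessor is used on an end node
  const std::string where = "line " + std::to_string(node.get_line()) + ": ";
  if (node.is_begin()) {
    std::string text = where + "begin " + node.get_type();
    if (!node.get_name().empty()) text += " " + node.get_name();
    return text;
  }
  return where + "attr " + node.get_name() + " = " + node.get_data();
}
```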

   In order to read an object file, you make a File::Reader object.  The
   File::Reader object can be given a file-name in its constructor (for
   filesystem based backends).  Alternatively, the open() method may be
   used on an existing File::Reader object.  open() returns -1 on error, or
   0 on success; errno will be set on error.  You may check if a
   File::Reader is open by using the is_open() method.  You may close the
   File::Reader using close().  This is done automatically when the object
   is destroyed.

     File::Reader(const string& path);
     int open(const string& path);
     void close(void);
     bool is_open(void);

   Reading from the File::Reader object is very simple.  There are only two
   methods, and one is merely a convenience method.

     bool get(Node& node);
     void consume(void);

   The get method takes a node, which will be set to the value of the next
   node in the file stream.  If there are no more nodes, false will be
   returned; reading values can be easily coded as:

     File::Reader reader("path_to_file");
     File::Node node;
     while (reader.get(node)) {
       // process node
     }

   The consume() method will keep reading nodes until the end of file, or
   until the current object is ended.  This will allow you to quickly skip
   the rest of the current object.  It handles nested objects; if a new
   object is found while consuming, it will consume both that object (and
   its children) and the rest of the original object.
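
   The nesting behaviour of consume() amounts to counting depth.  The
   sketch below is not the AweMUD implementation; it models the node stream
   as a plain vector of a hypothetical NodeKind enum, and uses a free
   function named consume() that returns the position just past the end of
   the current object.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the three kinds of node in the stream.
enum class NodeKind { Begin, Attr, End };

// Skip the rest of the object that the caller is currently inside.
// pos is the index of the next unread node; the return value is the index
// of the first node after the current object, or nodes.size() at the end
// of the stream.
std::size_t consume(const std::vector<NodeKind>& nodes, std::size_t pos) {
  std::size_t depth = 1;  // the caller is already inside one object
  while (pos < nodes.size() && depth > 0) {
    if (nodes[pos] == NodeKind::Begin)
      ++depth;            // nested object: must see its end as well
    else if (nodes[pos] == NodeKind::End)
      --depth;
    ++pos;
  }
  return pos;
}
```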

   These operations are not reversible; you cannot seek back in the node
   stream.  If you need the value of a node, you must store it on your own.

5. File Object Writer
   ==================

   The file writer is used to create or store a new file.  You cannot modify
   an existing file; you must write a new one from scratch.  It is possible
   to read in a file, store it in a tree internally, modify this tree, then
   rewrite it to disk.

   The file writer is a bit more low-level than the reader API; methods are
   provided for actual formatting (i.e., adding additional spacing, or
   writing comments).  As the default file format is designed for human
   editing, the thought is that it's nice for AweMUD to add useful
   formatting to make the output easier to work with.  Backends that do not
   support (or need) this extra formatting may just ignore these method
   calls; they should not ever be harmful to the output.

   You may, as with the File::Reader object, open a file by handing a path
   to the constructor, or to the open() method.  The is_open() and close()
   methods also exist, and work the same as in the File::Reader object.

     File::Writer(const string& path);
     int open(const string& path);
     void close(void);
     bool is_open(void);

   The methods you will use to write out data are very, very simple.  You
   simply provide the necessary values to the methods.  Output will be
   automatically sanitized; if you give invalid characters to a parameter
   that must be a plain word in the output, the writer will cut out the
   invalid characters.  For this reason, it's best to be sure you are using
   valid names; otherwise, you may be surprised when reading the file back
   in.  Also note that the data part of the attr() method (used to write out
   attributes) will be automatically escaped for you; if you put in escape
   characters, _those_ will be escaped (so \n will become \\n).  The name
   parameter to begin() and the type parameter to attr() may be empty
   strings.

     void attr(const string& type, const string& name, const string& data);
     void begin(const string& type, const string& name);
     void end(void);
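
   The sanitizing and escaping described above might look roughly like the
   sketch below.  sanitize_word() and escape_data() are hypothetical
   helpers, not the functions the writer actually uses.  Writing a real
   newline character as \n and protecting a leading space with a backslash
   are assumptions based on the format description in section 3.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Drop every character that is not valid in a plain word.
std::string sanitize_word(const std::string& word) {
  std::string out;
  for (char c : word) {
    if (std::isalnum(static_cast<unsigned char>(c)) || c == '_' || c == '-')
      out += c;
  }
  return out;
}

// Escape attribute data so that it can be written as a single value.
std::string escape_data(const std::string& data) {
  std::string out;
  for (std::size_t i = 0; i < data.size(); ++i) {
    const char c = data[i];
    if (c == '\\')
      out += "\\\\";  // a backslash in the data is itself escaped
    else if (c == '\n')
      out += "\\n";   // assumption: real newlines are written as \n
    else if (c == ' ' && i == 0)
      out += "\\ ";   // assumption: keep a leading space from being eaten
    else
      out += c;
  }
  return out;
}
```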

   The following methods are for the pretty formatting.  The bl() method
   just adds a blank line.  The comment() method does the obvious.

     void comment(const string& text);
     void bl(void);

   An example of using the API:

     // open file
     File::Writer writer("my_new_file");
     if (!writer.is_open()) {
       // handle error
     }

     // header
     writer.comment("New File - List of Friends");
     writer.comment("Copyright (C) 2003  Me");
     writer.bl();

     // johny, my best friend
     writer.begin("friend", "johny");
     writer.attr("", "name", "John Smith");
     writer.attr("", "birthday", "June 11th");
     writer.end();

     // susan, my partner
     writer.begin("friend", "susan");
     writer.attr("", "name", "Susan Jones");
     writer.attr("", "birthday", "January 20th");
     writer.end();

     // finish up
     writer.close();
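
   To give an idea of what such calls could produce in the default format,
   here is a small in-memory sketch.  StringWriter is a hypothetical class
   that only mirrors the File::Writer method names; it performs no
   sanitizing or escaping, and its one-space indentation and its placement
   of a non-empty attribute type before the name are assumptions based on
   the examples in section 3.

```cpp
#include <cstddef>
#include <string>

// Hypothetical in-memory writer; not the AweMUD File::Writer class.
class StringWriter {
 public:
  void comment(const std::string& text) { line("# " + text); }
  void bl() { out_ += '\n'; }

  void begin(const std::string& type, const std::string& name) {
    line(name.empty() ? type + " {" : type + " " + name + " {");
    ++depth_;
  }

  void attr(const std::string& type, const std::string& name,
            const std::string& data) {
    const std::string prefix = type.empty() ? std::string() : type + " ";
    line(prefix + name + " = " + data);
  }

  void end() {
    if (depth_ > 0) --depth_;
    line("}");
  }

  // Everything written so far.
  const std::string& str() const { return out_; }

 private:
  // Emit one line, indented by one space per nesting level.
  void line(const std::string& text) {
    out_ += std::string(depth_, ' ') + text + '\n';
  }

  std::string out_;
  std::size_t depth_ = 0;
};
```

   A class along these lines can also be handy for testing code that
   writes objects, since the result can be compared against a string
   without touching the filesystem.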

6. Backends
   ========

   The file object API can be easily extended to other backends.  Possible
   backends include an XML backend, a SQL backend, and so on.
