Readme documentation for CHASM (CHL Assembler)
for the Lionhead Virtual Machine, version 1.0

by James "Anthem" Costlow.


Readme contents:
================
 1. Zip contents
 2. What is CHASM?
 3. What is its status?
 4. What if I find a bug?
 5. Can I get the source?
 6. How do I use it?
 7. The CHASM language
    7.1. Overview
    7.2. Project files
    7.3. Source files
    7.4. Global sections
         7.4.1. Global variables
         7.4.2. Data
         7.4.3. Functions
         7.4.4. Externs
 8. For additional information
    8.1. Included examples
    8.2. Unofficial Lionhead Virtual Machine Reference
 9. Contact information


1. Zip contents
===============

The following files and directories should have been included with this
document in the chasm.zip archive:

  -CHASM.EXE
  -README.TXT (this file)
  -EXAMPLES\
    -PREVIEW\
      -README.TXT
      -MOD.PRJ
      -RESCUEWOMAN.TXT
    -MEGASHEEP\
      -README.TXT
      -MEGASHEEP.PRJ
      -MEGASHEEP.TXT
    -ROCKMOTHER\
      -README.TXT
      -ROCKMOTHER.PRJ
      -ROCKMOTHER.TXT


2. What is CHASM?
=================
 CHASM is a command-line compiler for Lionhead Virtual Machine script files
written in a low-level assembly-like language, which is briefly described later
in this document.


3. What is its status?
======================
 CHASM is not being actively developed or maintained. It was originally created
to assist in reverse-engineering the original Black & White script by Lionhead
Studios, and as such, it has fulfilled its purpose.


4. What if I find a bug?
========================
 If you find a bug in CHASM, you are welcome to notify me of the issue at the
email address provided below. However, as CHASM is not being actively
maintained, no guarantee is provided that it will be resolved in a timely
manner, or at all. If you have programming experience and would like to resolve
the issue yourself, see the following section, "Can I get the source?"


5. Can I get the source?
========================
 Yes. You may send an email to the address below to request the CHASM source
code. It has no license, and you may freely use and redistribute it. I ask
only that, if you redistribute it with modifications, you have the common
courtesy to credit me for my work.

 If you resolve a bug, or simply do something interesting with it, I'd like to
hear about it.


6. How do I use it?
===================
 NOTE: These instructions assume you're familiar with DOS and/or the Windows
       command prompt.

 Run the program with a command in the format:

    >  chasm projectfile output

 where 'projectfile' is the location of the project file (see section 7.2) and
'output' is the location where the compiled CHL file should be saved.
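
 For example, assuming CHASM.EXE can be found on the path and the current
directory is EXAMPLES\MEGASHEEP, the included "Megasheep" example could be
compiled with a command along these lines (the output name MEGASHEEP.CHL is
only an illustration; any location may be given):

    >  chasm MEGASHEEP.PRJ MEGASHEEP.CHL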


7. The CHASM language
=====================
   7.1. Overview
   =============
    Please note that this is only a description of the CHASM-specific aspects
   of CHL programming, and this document is not meant to stand alone. For more
   information, please refer to the Unofficial Lionhead Virtual Machine
   Reference (see section 8.2).

    CHASM produces a compiled script file which can be used with the game
   Black & White by Lionhead Studios.
    CHASM uses two types of files to produce this final product: a project file
   and the source files. These are described in the following two sections.


   7.2. Project files
   ==================
    Project files tell CHASM which source files to compile to produce the final
   compiled CHL file. The format is simple: each line of the file contains the
   path of one source file to compile.

    The sources listed in the project file are compiled and linked in the order
   in which they appear in the project file. This means that CHASM does NOT
   automatically determine source interdependencies. Files that refer to
   elements of other files (by means of the extern keyword, see section 7.4.4)
   must appear AFTER the files they refer to. The "Preview" example included
   with CHASM illustrates this.
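
    As a sketch of this ordering rule, a project file for two hypothetical
   source files, where SECOND.TXT uses an extern naming something defined in
   FIRST.TXT, might look like the following (the file names are invented for
   illustration and are not part of the CHASM distribution):

       FIRST.TXT
       SECOND.TXT

    Swapping the two lines would place the extern before the definition it
   refers to, which the ordering rule above does not allow.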


   7.3. Source files
   =================
    The source files are where all the code appears. Source files allow four
   types of "global section", which are defined and explained in the following
   section.

    CHASM compiles source files and combines (links) them in the order they
   appear in the project file to produce a CHL script file. Source files are
   never specified directly on the CHASM command line.

    All elements of a CHASM source file, including variable names, function 
   names, keywords, etc. are case-sensitive.
    All extra whitespace between elements is ignored, except where noted.


   7.4. Global sections
   ====================
        7.4.1. Global variables
        =======================
         Global variables are variables that are accessible to any function in
        any source file, as long as that source file appears after the one in
        which the global variable is declared.
         Global variables are declared as follows:

            globals
              variablename1
              variablename2
              ...
            endg

         Any number of global variables may be declared in a single 'globals'
        section. Multiple 'globals' sections may appear in a single source
        file, and they may appear anywhere in the file.


        7.4.2. Data
        ===========
         CHL files include a data section, where miscellaneous data may be
        accessed by the script while it's running. Data can be included with
        or without a name.

         Data with a name may be referred to like a variable. But note that
        doing so will not result in the data's value being used, but its
        offset into the data section. This is as it should be, since only
        system calls can access this data, and they require the offset of the
        data to be passed to them as an integer.

         Data without a name is still included in the final CHL file, but
        cannot be directly referred to in a CHASM source file.

         Data can be of four types: string, float, int, and byte. Data sections
        may appear anywhere in the file, and are formatted as follows:

            data
              type name value
            endd

         Where 'type' is the data type, 'name' is the (optional) name for the
        data, and 'value' is the information to be stored. Only one value may
        be specified per line.
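
         As an illustration of named and unnamed data, the following sketch
        declares one named integer and one unnamed string. The name 'maxsheep'
        and both values are invented for this example, and the unnamed form
        assumes that the name is simply left out of the line:

            data
              int maxsheep 12
              string "unnamed text"
            endd

         Under that assumption, 'maxsheep' could be referred to from code (as
        an offset into the data section), while the string could not.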

         String data:
         ------------
         Use 'string' as the type. The 'value' is a string of ASCII characters,
        surrounded by double quotes (").
         String values may contain the following escape sequences:
           \n : newline character (ASCII value 10)
           \r : carriage return character (ASCII value 13)
           \" : double quote character (")
           \\ : backslash character (\)

         String values may also span multiple lines by appending the backslash
        character to the end of an unfinished string. Any characters appearing
        between the end of the string and the backslash, and preceding a
        continuing line, are included with the string.
         Example:

            data
              string twolines "This string spans\
              multiple lines"
            endd

         This produces the string "This string spans multiple  lines"
        (note the two spaces between "multiple" and "lines").


         Float data:
         -----------
         Use 'float' as the type. The 'value' is a 32-bit floating-point
        number, formatted according to the ANSI C printf specification.
         Example:

            data
              float bignumber 123456790.456
            endd

         Integer data:
         -------------
         Use 'int' as the type. The 'value' is a 32-bit integer, specified in
        decimal, octal or hexadecimal formats, according to the ANSI C printf
        specifications.
         Examples:

            data
              int decimal 16
              int octal 020
              int hex 0x10
            endd

         All of these examples produce the decimal value 16 (sixteen).

         Byte data:
         ----------
         Use 'byte' as the type. The 'value' is an 8-bit ASCII character, whose
        value is specified as an integer (as above) or character.
        
         Examples:

            data
              byte decimal 65
              byte octal 0101
              byte hex 0x41
              byte character 'A'
            endd

         All of these examples produce the letter 'A', which has ASCII
        value 65 (decimal).

         Also, multiple bytes of data may be specified with one command,
        and continued on following lines:

            data
              byte multiple 65, 0102,
                            'C', 'D'
            endd

         This example produces the string "ABCD".


        7.4.3. Functions
        ================
         Functions define all the code that goes into the CHL script. Functions
        have a name, take parameters, and can have local variables.
         Functions are defined as follows:

            function FunctionName
            parameters
              parameter1
              parameter2
              ...
            endp
            locals
              local1
              local2
              ...
            endl
            begin
              (initialization)
              free
              (instructions)
            end

         The 'parameters' section appears only if the function must take
        parameters when called, and must precede the 'locals' section, if both
        appear. Any number of parameters may be specified. The name
        'parameters' may be shortened to 'parms' for convenience.
         Parameters are passed by value only, meaning that any changes made to
        a parameter within a function will not be retained in the calling
        function when the called function ends.

         The 'locals' section appears only if the function uses local
        variables. This section must appear after the parameters section, if
        both appear.

         The 'begin'/'end' block defines the function's body, which contains
        all labels and code to be executed. The body of a properly-written
        function consists of three parts:

         1: The function block
         ---------------------
         The function block frames all other code in the function's body. It is
        structured as follows:

            blk CEND

            (function's code)

        CEND:
            endb
            jmp END
            iter
        END:
            ret

         2: The initialization code
         --------------------------
         A function's initialization code pops all parameter values from the
        stack and initializes each local variable to have an initial value.
         Parameter values are stored in the parameter variables with the 'popf'
        instruction. Parameter values should be popped in the order they
        appear within the 'parameters' section of the function.
         Local variables are initialized simply by setting their value with the
        'popf' instruction, after a value has been left on the stack.
         After all parameter and local variables have been initialized, the
        'free' instruction should be executed.
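
         As a sketch of this sequence, consider a hypothetical function with
        parameters 'base' and 'offset' (declared in that order) and one local
        variable 'total'. Its initialization code might read as follows,
        assuming that 'pshf' with an immediate value is what leaves the
        initial value on the stack and that 0.0 is an acceptable way to write
        that value:

            popf base
            popf offset
            pshf 0.0
            popf total
            free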

         3: The function's code
         ----------------------
         This is the code that the function executes to perform its task. To
        exit a function, allow the final instructions of the function block to
        execute.

         Each line of the function's code consists of either one label, or one
        instruction/argument pair.

         Labels consist of the label's name and a colon, as follows:

            labelname:

         Labels mark a point in the code that may be jumped to by the "jz"
        or "jmp" instructions.
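
         For instance, the following fragment uses a hypothetical label 'SKIP'
        to jump over two instructions. The variable 'counter' is assumed to be
        a local variable declared elsewhere in the function, and the immediate
        value 1.0 is only an illustration:

            jmp SKIP
            pshf 1.0
            popf counter
        SKIP: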

         A complete list and description of the various valid instructions for
        the Lionhead Virtual Machine can be found in the Unofficial Lionhead
        Virtual Machine Reference (see section 8.2). The following list is
        meant only to give the instruction names used by the CHASM language to
        refer to the instructions. A complete discussion of using these
        instructions is not within the scope of this document.

        Notes:
         Argument types listed in the Unofficial Lionhead Virtual Machine
        Reference as "ADDRESS" take the name of a label in CHASM source files.
         Any instruction that demands an integer, float, or boolean value may
        take either an immediate value, or a variable that contains such a
        value.
         The keywords 'true' and 'false' may be used for boolean values.

            Opcode | Flag,Type | Instruction
            --------------------------------
            0x00   | 0   , 0   | ret
            -------|-----------|------------
            0x01   | 0/1 , 1   | jz
            -------|-----------|------------
            0x02   | 0/1 , 1   | pshi
            0x02   | 0/1 , 2   | pshf
            0x02   | 0   , 3   | pshc
            0x02   | 0/1 , 4   | psho
            0x02   | 0   , 6   | pshb
            -------|-----------|------------
            0x03   | 0   , 1   | popi
            0x03   | 0/1 , 2   | popf
            0x03   | 0   , 4   | popo
            -------|-----------|------------
            0x04   | 0   , 2   | add
            0x04   | 0   , 3   | addc
            -------|-----------|------------
            0x05   | 0   , 0   | sys
            0x05   | 0   , 2   | sys2
            -------|-----------|------------
            0x06   | 0   , 0   | sub
            0x06   | 0   , 3   | subc
            -------|-----------|------------
            0x07   | 0   , 2   | neg
            -------|-----------|------------
            0x08   | 0   , 2   | mul
            -------|-----------|------------
            0x09   | 0   , 2   | div
            -------|-----------|------------
            0x0B   | 0   , 1   | not
            -------|-----------|------------
            0x0C   | 0   , 1   | and
            -------|-----------|------------
            0x0D   | 0   , 1   | or
            -------|-----------|------------
            0x0E   | 0   , 2   | eq
            -------|-----------|------------
            0x0F   | 0   , 2   | neq
            -------|-----------|------------
            0x10   | 0   , 2   | gte
            -------|-----------|------------
            0x11   | 0   , 2   | lte
            -------|-----------|------------
            0x12   | 0   , 2   | gt
            -------|-----------|------------
            0x13   | 0   , 2   | lt
            -------|-----------|------------
            0x14   | 0/1 , 1   | jmp
            -------|-----------|------------
            0x15   | 0   , 2   | dly
            -------|-----------|------------
            0x16   | 0   , 1   | blk
            -------|-----------|------------
            0x17   | 0   , 1   | int
            0x17   | 0   , 2   | flt
            0x17   | 0   , 3   | crd
            0x17   | 0   , 4   | obj
            0x17   | 0   , 6   | bool
            0x17   | 1   , 2   | zero
            -------|-----------|------------
            0x18   | 0   , 1   | call
            0x18   | 1   , 1   | strt
            -------|-----------|------------
            0x19   | 0   , 1   | endb
            0x19   | 1   , 1   | free
            -------|-----------|------------
            0x1B   | 0   , 1   | iter
            -------|-----------|------------
            0x1C   | 0   , 1   | endc
            -------|-----------|------------
            0x1D   | 0   , 0   | swp
            0x1D   | 0   , 2   | swp4
            --------------------------------
            

        7.4.4. Externs
        ==============
         Externs tell CHASM that a variable, function, or data item referenced
        from within a file is defined in another file. If, during linking,
        CHASM cannot find the external identifier in any source file, it will
        produce an error.
         Externs are specified with a single keyword, extern, as follows:

            extern identifier

         Where 'identifier' is the name of the variable, function, or data,
        that the file must refer to.
         If externs are specified in a file but never used, they are ignored.
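
         As a sketch, suppose a hypothetical source file FIRST.TXT declares a
        global variable (the file names and the name 'score' are invented for
        this example):

            globals
              score
            endg

         A second hypothetical file, SECOND.TXT, listed after FIRST.TXT in the
        project file, could then declare the variable as external:

            extern score

         Instructions inside SECOND.TXT's functions may then name it, for
        example (assuming 'pshf' accepts the immediate value 0.0):

            pshf 0.0
            popf score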


8. For additional information
=============================
   8.1. Included examples
   ======================
    Included in the CHASM zip file in the EXAMPLES directory are three example
   projects. Each includes its own brief documentation.

   8.2. Unofficial Lionhead Virtual Machine Reference
   ==================================================
    As previously stated, this readme intends only to give information specific
   to the CHASM compiler. For more detailed information about the Lionhead
   Virtual Machine and how to program for it, see the Unofficial Lionhead
   Virtual Machine Reference, available in the Documents section of the Team
   Grey Area website:
       http://www.planetblackandwhite.com/greyarea/

9. Contact information
======================
 To contact me about CHASM, obtaining its source, or its documentation, please
email me at:
    anthem2112@verizon.net
or contact Team Grey Area through our forums at our website, listed above.