RISC OS Asm and BASIC Fun : Tutorials.Compiler.Overview
PREV:Nothing Return to Tutorials NEXT:Part 1
Tutorials.Compiler.Overview
A Breif Overview of This Tutorial.

Work In Progress.
WIP

This information should not be viewed in any way as accurate, or complete. While it gives some information it is a Work In Progres.

Overview:
In this tutorial series we will cover the basic concepts of writing a simple recursive descent compiler. Each section will provide a bit more detail on how to do things, with example implementations.

The language to be covered in the initial portions of the tutorial (through part 12) is a subset of BBC BASIC V on RISC OS. That subset is the programming language for this tutorial. Parts 13 through 20 will show how to extend this to add more of the features of BBC BASIC V, going beyond the language as defined for the tutorial. After the tutorial series is completed the compiler will be made available as an open source alternative to !ABC. The supported language will continue to be updated until it reaches a fairly complete implementation of BASIC V. There will also be a few extensions to the BASIC V language supported by the compiler.
Language Definition For Tutorial Compiler:
When implementing a compiler it is important to define the programming language being targeted. To this end this section will define the subset of BASIC V that will be introduced during the tutorial series.

Originally I was doing this in EBNF. I changed that to simpler descriptions that I think more people will understand.

The language structure as a simplified diagram of the priority levels is:

Program:
  Procdef:
    Block
  Block:
    Declare
    Statement:
      Keystatement
      Expression:
        OrExp
          AndExp
            CompareShift
              Additive
                Term
                  Exponent
                    Power
                      Factor
      Assign:
        Expression
      If structure
      Case structure
      While loop
      For loop
      Repeat unitl loop
      Branch Satatement
Even though it may feel a bit small for the BASIC V language, the above actually covers the priority chain for the full language. More details on what each item in this diagram is will be provided from here down.

Program:
The entire program file(s). To note this compiler will support modular compiling, that is source spread among multiple files.

Block:
A block is a section of code containing one or more of the items listed within block in the above diagram. A block may contain any number of these items.

Declare:
A declaration of some kind. This may be the declaration of an array, or the declaration of a block of memory to reserve. A declaration takes the simple form:

DIM ident(cardinal)
Or the form:

DIM ident% cardinal
Where ident is a valid identifier for a variable, and cardinal is an expression that evaluates to a cardinal number.

Statement: A statement is just about anything that actually does something. Think of a verb, action. All of the conditionals, looping, calling, etc are statements.

Assign:
This is an assignment statement. The general form is:

LET lval = expression
Where lval is variable, array ellement, or inderection. Expression is an expression.

Expression:
An expression is anything that represents a value. In some cases this is a series of operations yealding a value, in some cases a function call, in other cases it may be just a value or variable.

A description in somewhat incomplete EBNF is:

Expression = [ lval ( "+=" | "-=" ) ] { OrExp };

OrExp = AndExp { ( "EOR" | "OR" ) AndExp };

AndExp = CompareShift { ( "AND" ) CompareShift };

CompareShift = Additive {
         ( "=" | "<>" | "<" | ">" | "<=" | ">=" | "<<" | ">>" | ">>>")
         Additive };

Additive = Term { ( "+" | "-" ) Term };

Term = Power { ( "*" | "/" | "DIV" | "MOD" ) Power };

Power = Factor { "^" Factor };

Factor = Function | Variable | PseudoVariable | Constant | Indirection;
lval is described above.
Function is either a builtin function or a user function.
Variable, constant, PseudoVariable, and Indirection are exactly what there name implies, and will be described in more detail further down.

Key Statement:
A key statement refers to a statement that starts with a keyword, and pretty much is on its own. Like PRINT, INPUT, etc. The keystatements that are going to be supported by our tutorial subset are:

WIP


On After The Above
Once things get past the above listed state, things will progress to a near complete implementation of BASIC V with a couple of extensions. Those things missing in the end are those that do not make any sence to compile by there nature, that is stuff like EVAL.

Extras to be added include stuctures. I am looking at other implementations of BBC BASIC for inspiration on this. I also intend to implement a means of dealing with pointers to procedures and functions.