
C/C++ GOLD engine v1.8

GOLD (Grammar Oriented Language Developer)
is a free parsing system by Devin Cook for programmers to
develop their own
programming languages, scripting languages and interpreters. It
strives to be a development tool that can be used with numerous
programming languages and on multiple platforms. For more information
about GOLD, what it does and what a GOLD engine is, see the GOLD
website.
My engine is for C/C++ programmers. It comes with a special GOLD
template "C - Kessels Engine grammar.h.pgt"
which generates a "grammar.h" file, containing the exact same information
as a .cgt (Compiled Grammar Table) file. This means that the complete grammar
will literally be compiled into your program. Initialization is very fast
because there is no loading of a .cgt file from disk.
- ANSI-C
- Thread safe
- Compiled grammar
- Unicode (UCS-2)
A second GOLD template "C - Kessels Engine template.c.pgt" is included that
creates the source of a bare-bones interpreter. All you have to
do is fill in the blanks...
Download Gold-C-KesselsEngine-v1.8.zip
(152kb)
Included are:
- "engine.c" and "engine.h"
- GOLD template to generate grammar.h files.
- GOLD template to generate a bare-bones interpreter.
- 4 examples, with sources, grammar, and input.
Installing and usage
- Copy the *.pgt templates to the GOLD templates folder. GOLD will
automatically find and list the templates. On my machine the folder
is:
C:\Program Files\GOLD Parser Builder\Templates
- Create a grammar with GOLD. A small test-grammar has been included
in file "grammar.grm".
- In GOLD select "Tools" -> "Create a Skeleton Program".
  - Select "C - Kessels Engine grammar.h".
  - Click "Create".
  - Select "grammar.h".
  - When asked whether you want to replace it, answer "Yes".
- Write a program that calls the parser and interprets the results.
  The easiest way is to create a template.c with the second GOLD
  template:
  - In GOLD select "Tools" -> "Create a Skeleton Program".
  - Select "C - Kessels Engine template.c".
  - Click "Create".
  - Enter a filename, for example "template.c".
  Alternatively you can write your own program. Example1 calls the
  parser and shows output similar to the parse tree in the GOLD test
  window, Example2 demonstrates the tokenizer, Example3 was generated
  by the template, and Example4 is a small calculator that does some
  actual work.
- Add the "grammar.h", "engine.c", and "engine.h" files to your
  project. No special settings are needed; the files should
  compile as-is on all C/C++ compilers. A simple "makemic"
  file is included to compile the examples with Microsoft C on Windows.
  If you want to compile on Unix/Linux, you can use "makemic"
  as a starting point for your own makefile; adapting it should be easy.
On Windows you can compile by executing the following commands in a
Command Prompt window. The first line sets your environment, the second
runs Microsoft's "nmake" utility.
"C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\vcvars32.bat"
nmake /f makemic
Parser API
The parser is basically just a single function ("Parse()") that takes a
text buffer and two selection parameters as input. Through its last
argument it hands back a struct full of data with the results (which you
clean up by calling "DeleteTokens()"). It boils down to adding the
following lines to your program:
#include "engine.h"
....
Result = Parse(
    InputBuf,           /* Pointer to the input data. */
    strlen(InputBuf),   /* Number of characters in the input. */
    TRIMREDUCTIONS,     /* 0 = don't trim, 1 = trim reductions. */
    DEBUG,              /* 0 = no debug, 1 = print debug info. */
    &Token);            /* The output. */
....
DeleteTokens(Token);
Calling just a single function, can it be that easy? Yes, and no. The parser
does all the hard work: it parses and structures the input data for you.
The next step is interpreting the results and doing whatever work is
necessary, such as calling functions, evaluating expressions, and allocating
variables. The struct contains the following data:
struct TokenStruct {
    int ReductionRule;            /* Index into Grammar.RuleArray[]. */
    struct TokenStruct **Tokens;  /* Array of reduction Tokens. */
    int Symbol;                   /* Index into Grammar.SymbolArray[]. */
    wchar_t *Data;                /* String with data from the input. */
    int Line;                     /* Line number in the input. */
    int Column;                   /* Column in the input. */
};
A token can be one of two things: a "symbol" (ReductionRule = -1) or a
"reduction" (ReductionRule >= 0). A symbol is a character string, for
example "+", "function", "176", and other literal strings from the input.
Symbols use the following fields:
[symbol]
Token->ReductionRule = -1
Token->Symbol = index into Grammar.SymbolArray[]
Token->Data = character string from the input.
The Token->Symbol is an index into the Grammar.SymbolArray[]. The Grammar is a
big struct with lots of arrays and stuff that was compiled by GOLD. Take a
look at "engine.h" for the definitions, and "grammar.h" for the contents.
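Telling a symbol from a reduction and reading a symbol's text can be sketched as follows. This is only an illustration: the struct is a copy of the layout shown above so the snippet compiles on its own (in a real program it comes from "engine.h"), and the helpers IsSymbol() and SymbolToLong() are hypothetical names, not part of the engine.

```c
#include <stddef.h>
#include <wchar.h>

/* Copy of the TokenStruct layout shown above, repeated here only so the
   sketch compiles on its own. Normally this comes from "engine.h". */
struct TokenStruct {
  int ReductionRule;            /* -1 for a symbol, >= 0 for a reduction. */
  struct TokenStruct **Tokens;
  int Symbol;
  wchar_t *Data;
  int Line;
  int Column;
};

/* Hypothetical helper: returns 1 if the token is a symbol, 0 if it is
   a reduction. */
int IsSymbol(const struct TokenStruct *Token) {
  return Token->ReductionRule < 0;
}

/* Hypothetical helper: read a symbol's text as a decimal number.
   Returns 0 for anything that is not a symbol with text. */
long SymbolToLong(const struct TokenStruct *Token) {
  if (!IsSymbol(Token) || Token->Data == NULL) return 0;
  return wcstol(Token->Data, NULL, 10);
}
```

Because Token->Data is a wchar_t string, the wide-character functions from <wchar.h> (wcstol, wcscmp, and so on) are the ones to use on it.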
A reduction is a token that holds pointers to other tokens. It maps to a
rule from your grammar. For example the following rule:
<Calculations> ::= <Calculate> <Calculations>
Will generate a Token with the following content:
[reduction]
Token->ReductionRule = index into Grammar.RuleArray[] of the rule.
Token->Tokens[0] = Token for "<Calculate>"
Token->Tokens[1] = Token for "<Calculations>"
So Token->ReductionRule tells you which rule it is, and the Tokens[]
are the elements of the rule (please note that Token->NextToken is not
involved at this point). Go to Token->Tokens[0] to find another reduction
Token, or perhaps a symbol. Following the tokens down will eventually
bring you to a symbol.
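Walking the tree is a small recursive function. The sketch below is an assumption-laden illustration, not code from the engine: the struct is copied from the layout above, the rule numbers are made up, and RuleSymbolCount[] is a hypothetical stand-in for whatever "grammar.h" records about the number of elements in each rule (look in "engine.h" for the real field).

```c
#include <stdio.h>
#include <wchar.h>

/* Copy of the TokenStruct layout shown above, repeated here only so the
   sketch compiles on its own. Normally this comes from "engine.h". */
struct TokenStruct {
  int ReductionRule;
  struct TokenStruct **Tokens;
  int Symbol;
  wchar_t *Data;
  int Line;
  int Column;
};

/* Hypothetical stand-in for the rule table in "grammar.h": the number of
   elements on the right-hand side of each rule. Rule numbers are made up. */
static const int RuleSymbolCount[] = {
  2,  /* rule 0: <Calculations> ::= <Calculate> <Calculations>   */
  0,  /* rule 1: <Calculations> ::=                              */
  4   /* rule 2: <Calculate> ::= 'print' '(' <Expression> ')'    */
};

/* Walk the tree depth-first, print one indented line per token, and
   return the number of symbols (leaves) that were found. */
int WalkTokens(const struct TokenStruct *Token, int Indent) {
  int i;
  int Count = 0;

  if (Token == NULL) return 0;

  if (Token->ReductionRule < 0) {
    /* A symbol: print its text from the input. */
    printf("%*s%ls\n", Indent, "", Token->Data != NULL ? Token->Data : L"");
    return 1;
  }

  /* A reduction: print the rule number, then visit every element. */
  printf("%*srule %d\n", Indent, "", Token->ReductionRule);
  for (i = 0; i < RuleSymbolCount[Token->ReductionRule]; i++) {
    Count += WalkTokens(Token->Tokens[i], Indent + 2);
  }
  return Count;
}
```

Call it as WalkTokens(Token, 0) on the token handed back by the parser. An interpreter has the same shape, with a switch on Token->ReductionRule in place of the printf.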
And here the lesson ends. The rest is up to you. Take a look at Example1.c
and Example4.c for working examples.
Sample grammar
"Name" = 'Calculator'
"Version" = '1.8'
"Author" = 'Jeroen Kessels'
"About" = 'Simple calculator.'
"Case Sensitive" = True
"Start Symbol" = <Calculations>
Comment Start = '/*'
Comment End = '*/'
Comment Line = '//' | 'REM' | '#' | '--' | ''
DecLiteral = ([123456789]{Number}* | 0)
<Calculations>   ::= <Calculate> <Calculations>
                   |

<Calculate>      ::= 'print' '(' <Expression> ')'

<Expression>     ::= <MultiplyDivide> '+' <Expression>
                   | <MultiplyDivide> '-' <Expression>
                   | <MultiplyDivide>

<MultiplyDivide> ::= <Value> '*' <MultiplyDivide>
                   | <Value> '/' <MultiplyDivide>
                   | <Value>

<Value>          ::= DecLiteral
                   | '(' <Expression> ')'
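
The rules for <Expression> and <MultiplyDivide> are right-recursive, so the operand on the right is grouped first. The sketch below shows that grouping with a hand-written evaluator that mirrors the rules one by one. It is not the GOLD engine (which is table driven) and not Example4. All function names are made up for the illustration, and it skips the 'print' statement, comments, error handling, and the exact DecLiteral definition.

```c
#include <ctype.h>

static const char *Pos;          /* Current position in the input text. */

static long Expression(void);

static void SkipSpaces(void) {
  while (isspace((unsigned char)*Pos)) Pos++;
}

/* <Value> ::= DecLiteral | '(' <Expression> ')' */
static long Value(void) {
  long Result = 0;
  SkipSpaces();
  if (*Pos == '(') {
    Pos++;
    Result = Expression();
    SkipSpaces();
    if (*Pos == ')') Pos++;
    return Result;
  }
  while (isdigit((unsigned char)*Pos)) Result = Result * 10 + (*Pos++ - '0');
  return Result;
}

/* <MultiplyDivide> ::= <Value> '*' <MultiplyDivide>
                      | <Value> '/' <MultiplyDivide>
                      | <Value>                        */
static long MultiplyDivide(void) {
  long Left = Value();
  long Right;
  SkipSpaces();
  if (*Pos == '*') { Pos++; return Left * MultiplyDivide(); }
  if (*Pos == '/') {
    Pos++;
    Right = MultiplyDivide();
    return Right != 0 ? Left / Right : 0;  /* Sketch: x / 0 yields 0. */
  }
  return Left;
}

/* <Expression> ::= <MultiplyDivide> '+' <Expression>
                  | <MultiplyDivide> '-' <Expression>
                  | <MultiplyDivide>                   */
static long Expression(void) {
  long Left = MultiplyDivide();
  SkipSpaces();
  if (*Pos == '+') { Pos++; return Left + Expression(); }
  if (*Pos == '-') { Pos++; return Left - Expression(); }
  return Left;
}

/* Evaluate one expression, grouped the way the sample grammar groups it. */
long Evaluate(const char *Text) {
  Pos = Text;
  return Expression();
}
```

With this grouping Evaluate("10 - 3 - 2") gives 9, because it is read as 10 - (3 - 2), and Evaluate("8 / 4 / 2") gives 4. An interpreter that wants the usual left-to-right result (5 and 1) would need left-recursive rules, or has to regroup the operands while evaluating.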