Structure of bison grammar file
-------------------------------
- Extra statements to include at the beginning of the produced C file (normally
  includes and forward declarations).
- define the default data type of semantic values via YYSTYPE (int by default)
  + it SHOULD be defined in the lexer as well, otherwise the lexer will assume int.
- define how deep the parser stack can grow before reporting an error
  #define YYMAXDEPTH ### (default 10000; the initial stack size, YYINITDEPTH, defaults to 200)
list of tokens returned by the lexical analyzer (see below)
definitions of all nonterminal language symbols (see below)
user code, should include the following functions:
1. a function that is called in case of an error:
   void yyerror(const char *str) { }
2. function main, which calls yyparse()
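The error function from item 1 can be sketched as follows. This is only a minimal illustration: the counter `error_count` and the message prefix are assumptions made for this sketch, not something Bison requires or provides.

```c
#include <stdio.h>

/* Hypothetical counter, not part of Bison: lets the code that calls
   yyparse() see how many syntax errors were reported. */
static int error_count = 0;

/* Called by the generated parser whenever it detects a syntax error. */
void yyerror(const char *str)
{
    error_count++;
    fprintf(stderr, "parse error: %s\n", str);
}
```

A matching main can be as short as `return yyparse();`, since yyparse() returns 0 on success and 1 on failure.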
-------------------------------------------------------------------------
- we could/should define grouping order and operator precedence
%keyword <list of tokens> - all listed tokens have the same grouping priority
%keyword <another list> - tokens in this later list have higher grouping priority
keywords - defining grouping order
%left - left-associative operators ( x + y + z -> (x + y) + z )
%right - right-associative operators ( x + y + z -> x + (y + z) )
%nonassoc - no associativity ( x + y + z -> error )
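The difference between the two grouping orders only shows up for operators where grouping changes the result, such as subtraction. The sketch below is plain C, not generated parser code; the helper names `group_left` and `group_right` are made up for this illustration, and both assume at least one operand.

```c
/* Combine n operands with '-' the way a %left declaration groups them:
   ((v[0] - v[1]) - v[2]) - ... */
int group_left(const int *v, int n)
{
    int acc = v[0];
    for (int i = 1; i < n; i++)
        acc = acc - v[i];
    return acc;
}

/* Combine the same operands the way a %right declaration groups them:
   v[0] - (v[1] - (v[2] - ...)) */
int group_right(const int *v, int n)
{
    int acc = v[n - 1];
    for (int i = n - 2; i >= 0; i--)
        acc = v[i] - acc;
    return acc;
}
```

For the operands 8, 3, 2 the left grouping computes (8 - 3) - 2 and the right grouping computes 8 - (3 - 2), so the declaration chosen for an operator changes what the parsed expression means.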
types - defining non-default token types
%token <type2_name> TOKEN
%type <type1_name> nonterminal /* defining types for non-terminal symbols */
%start <symbol> /* specify the start symbol (by default, the first nonterminal in the grammar) */
%pure_parser /* request a reentrant parser, which allows
                multiple calls from the same program */
%expect # /* suppress the warning when there are exactly # shift/reduce conflicts */
- each nonterminal symbol must have grammatical rules showing how it is made
  out of simpler constructs.
- the start symbol should be defined
- convention: TOKENS in upper case, nonterminals in lower case (a terminal symbol that
  stands for a particular keyword in the language should be named after that
  keyword converted to upper case)
- single-character tokens can be returned from the lexer simply as their ASCII code
- the terminal symbol 'error' is reserved for error recovery.
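The token conventions above can be illustrated with a small hand-written lexer. This is a sketch under stated assumptions: in a real project `NUM` comes from a `%token NUM` declaration and `yylval` is defined by the generated parser, so the definitions here (including the value 258) are stand-ins, and `input_cursor` is a hypothetical input source.

```c
#include <ctype.h>

/* Stand-ins for what Bison would normally provide; the value 258 is
   only an assumption for this sketch. */
enum { NUM = 258 };
int yylval;

/* Hypothetical input source: the lexer scans this string. */
static const char *input_cursor = "";

int yylex(void)
{
    /* Skip blanks and tabs, but not '\n', which may itself be a token. */
    while (*input_cursor == ' ' || *input_cursor == '\t')
        input_cursor++;

    if (*input_cursor == '\0')
        return 0;                 /* 0 tells the parser: end of input */

    if (isdigit((unsigned char)*input_cursor)) {
        yylval = 0;
        while (isdigit((unsigned char)*input_cursor))
            yylval = yylval * 10 + (*input_cursor++ - '0');
        return NUM;               /* named token, value passed in yylval */
    }

    /* Single-character token: its character code is the token code. */
    return (unsigned char)*input_cursor++;
}
```

In the grammar, such single-character tokens are then written as character literals, for example '+' or '\n'.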
sequence of tokens and rules | another sequence | ...
{ optional actions: Most of the time, the purpose of an action is to
compute the semantic value of the whole construct from the semantic values
of its parts }
- Default action: $$ = $1 (see below for the meaning)
<symbol_name_2>: - comma-separated sequence of 0 or more 's' groupings
/* empty, to allow empty string matching */ | s
- something like 'NUM op1 NUM'
- Normally the precedence of a rule (and therefore the precedence of the whole
  nonterminal symbol in the current semantic context) is equal to the
  precedence of the last terminal symbol mentioned in its components.
  However, with the '%prec' keyword it is possible to directly specify the symbol
  whose precedence the whole rule should take (this symbol does not
  have to appear in the rule itself).
| exp '-' exp %prec BINARY_MINUS
| '-' exp %prec UNARY_MINUS
- For recovery there is a special symbol 'error' that can be used in a
  rule; that rule is applied when a syntax error occurs and nothing else matches. Example:
'\n' /* empty line, ignored */
| statement '\n' {} /* statement, processed */
| error '\n' {} /* error, recovery procedure executed */
- Macros to return immediately from yyparse:
YYACCEPT - return with success code (0)
YYABORT - return with error code (1)
YYERROR - cause an immediate syntax error (does not return immediately from yyparse)
- Macros to calculate semantic values:
$$ - where the semantic value of the resulting grouping should be written
  * $<type1_name>$ - for alternative types
$n - contains the semantic value of the nth component of the current rule
  * $<type1_name>n - for alternative types
YYBACKUP (token, value) - unshift a token: installs a look-ahead token
  with token type 'token' and semantic value 'value', then
  discards the value that was going to be reduced by this rule.
  Allowed only for rules that reduce a single value, and only
  when there is no look-ahead token.
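To make $$ and $n more concrete, the sketch below models what happens to the stack of semantic values when a rule with three components and the action `$$ = $1 + $3;` is reduced. It is a simplified model, not Bison's generated code; `reduce_add` and the layout of the stack are assumptions made for this illustration.

```c
/* Simplified model of a parser's value stack (not Bison's actual code).
   'top' points at the value of the last component of the rule, so for a
   rule with 3 components $n is stored (3 - n) slots below 'top'. */
int *reduce_add(int *top)
{
    int result = top[-2] + top[0];   /* $$ = $1 + $3 */
    top -= 2;                        /* pop 3 components, push 1 result */
    *top = result;
    return top;                      /* new top of the value stack */
}
```

After the reduction the three component values are gone and the single value written to $$ sits in their place, ready to be used as a $n of an enclosing rule.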
- Look-ahead macros
yychar - current look-ahead token (see algorithm)
YYRECOVERING - contains '1' while the parser is recovering from a syntax
  error and '0' otherwise.
yyclearin; - discard the current look-ahead token
yyerrok; - resume generating error messages immediately for
  subsequent syntax errors.
@n - structure containing information on the line numbers and
  column numbers of the 'n'th component of the current rule.
int first_line, last_line;
int first_column, last_column;
+ 'yylex' must supply this information (for all or certain tokens)
+ use of this feature makes the parser noticeably slower.
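The location structure can be sketched in C as follows. The field names are the ones listed above; the type name `location_t` is a stand-in for Bison's YYLTYPE, and `span_location` is a hypothetical helper showing how the location of a whole grouping is usually derived: it starts where the first component starts and ends where the last component ends.

```c
/* Stand-in for the location type described above (Bison calls it YYLTYPE). */
typedef struct {
    int first_line, last_line;
    int first_column, last_column;
} location_t;

/* Location of a whole grouping: from the start of its first component
   to the end of its last one. */
location_t span_location(location_t first, location_t last)
{
    location_t result;
    result.first_line   = first.first_line;
    result.first_column = first.first_column;
    result.last_line    = last.last_line;
    result.last_column  = last.last_column;
    return result;
}
```

In an action, the same information is read through @n, for example @1.first_line for the line on which the first component starts.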